Saturday, 29 November 2025

Anthropic aims to uncover how AI models think by 2027

Anthropic CEO Dario Amodei aims to understand how AI models work by 2027 and urges industry-wide action for safety and transparency.

Anthropic’s CEO, Dario Amodei, has shared a clear message: we must better understand how artificial intelligence (AI) models work. In an essay published in April 2025, titled The Urgency of Interpretability, Amodei sets a bold target: by 2027, Anthropic hopes to reliably detect most problems within advanced AI systems. While the task is complex, Amodei argues that it is essential if AI is to be deployed safely and responsibly in society.

Why understanding AI is so important

When you interact with a powerful AI tool, such as a chatbot or summarising assistant, you might assume the developers know exactly how it works. But according to Amodei, that’s not the case. Even the companies creating the most advanced models don’t always understand why those models make certain decisions or why they sometimes make mistakes.

For example, OpenAI recently released two new models called o3 and o4-mini. While they perform better on some tasks, they also tend to “hallucinate” more — in other words, produce false or confusing information. The problem? No one knows precisely why this happens.

Amodei warns that we could face serious risks if we build more powerful AI systems without improving our understanding. He compares the future of AI to “a country of geniuses in a data centre” — brilliant but mysterious and potentially unpredictable.

Chris Olah, Anthropic’s co-founder, adds that today’s AI systems are more grown than built. That means improvements often come from trial and error, not from clear plans or designs. As a result, researchers may create intelligent systems without fully grasping how they function.

What Anthropic is doing about it

Anthropic is a leader in mechanistic interpretability, a field of research that tries to open AI’s “black box.” The company wants to work out exactly how AI systems make decisions and what drives their behaviour.

One promising area of research involves studying “circuits” within AI models. These are patterns that show how models process information. For instance, Anthropic has found a specific circuit that helps AI determine which US cities belong to which states. It’s just one example — researchers estimate millions of such circuits could be in a single model.

In the long run, Amodei says his team hopes to develop something like an “MRI scan” for AI systems. These deep checks would help spot problems such as lying, manipulation, or unexpected behaviour. He believes these scans will be essential for safely testing and launching future AI tools. While this could take 5 to 10 years, the company is already making early progress.

Recently, Anthropic also made its first outside investment in a startup working on AI interpretability, showing its commitment to this mission.

A call for shared responsibility

In his essay, Amodei doesn’t just speak to his team. He encourages others in the AI field, especially OpenAI and Google DeepMind, to invest more in research that explains how AI works. He also suggests governments should get involved, but in a careful way. For instance, they could set light-touch regulations that require companies to disclose their safety practices.

He goes further, saying the US government should control the export of advanced computer chips to China. He worries that without such limits, we might end up in a global AI race where no one is paying enough attention to safety.

Unlike some major tech firms, Anthropic supported California’s AI safety bill, SB 1047, which would have set standards for reporting safety risks in advanced models. While the bill faced pushback, Anthropic offered helpful suggestions, showing its willingness to lead on responsibility.

In the end, Amodei’s message is simple but serious. As AI becomes central to business, defence, and everyday life, we must learn how these systems work. Without that knowledge, we’re building tools that could one day act in ways we don’t understand — a risk we can’t afford to take.
