Anthropic's warning over AI self-improvement has a hidden message — accelerating development requires more compute before companies ever risk losing control of frontier AI models

The company that just a few weeks ago told us that its Mythos model was too powerful to be released publicly is now saying that we might need to hit the pause button on AI altogether, even as it teaches its AI to build itself. On June 4, Anthropic published a report, “When AI builds itself,” showing that Claude now writes more than 80% of the code merged into the company's own production codebase, up from the low single digits before Claude Code reached research preview in February last year. The report argues that this loop has begun to accelerate AI development in a way that could eventually leave humans unable to control the systems being built.

The Anthropic Institute, the firm's research arm, casts the trend as early movement toward recursive self-improvement, the point at which a model designs and builds its own successor without meaningful human input, and warns that the rare misalignment in today's models could keep "growing more frequent but less understood until we lose control of them."

Reading further into the post, and taking the entire frontier AI model development ecosystem into account, reveals another uncomfortable truth that the developers of cutting-edge AI models also have to reckon with: compute.

Loss of control

Anthropic laid out three scenarios for how the next few years could play out, reserving a particularly dire warning for the case in which models become capable of fully improving themselves. Progress, Amodei’s lab argues, would then be paced almost entirely by available compute, human engineers would be pushed into oversight and verification, and a self-improving model could come to dominate as its abilities outstrip those of the people who built it.

The firm called alignment, the task of keeping a system's behavior tied to human intent, the part of this future it’s least sure about. A capable, well-aligned model might discover new ways to keep its successors safe, it said, or the reverse could hold and misalignment could compound generation over generation. It also made the unusual concession that a sufficiently wise model might instead choose to halt its own development.

The idea of an ultraintelligent machine designing still better machines, often called the “singularity,” has been around for decades. British mathematician I. J. Good argued back in the 1960s, in his “intelligence explosion” thesis, that such a machine would be the “last invention that man need ever make,” so long as it remained “docile enough” to tell us how to keep it under control. Meanwhile, the “Godfather of AI,” Geoffrey Hinton, has put the odds of AI causing human extinction within three decades at 10% to 20%.

The International AI Safety Report, chaired by Yoshua Bengio and published in January 2025 with input from more than 100 experts across 30 countries, defines loss of control as a scenario in which AI systems operate outside anyone's control with no clear path to regaining it.

Every figure behind Anthropic's warning comes from the company's own internal data, and none of it has been independently audited. Among these figures is the claim that in Q2 2026 the typical Anthropic engineer is merging eight times as much code per day as in 2024. On the hardest, least-specified coding tasks, Claude succeeded 76% of the time in May 2026, a rise of 50 percentage points in six months. On an internal test that asks each new model to make training code run faster, results climbed from roughly triple the original speed with Claude Opus 4 in May 2025 to about 52 times with the unreleased Mythos Preview model by April 2026. For comparison, a skilled researcher needs four to eight hours to achieve a fourfold gain.
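For readers who want to check the arithmetic, here is a rough sketch of what those reported numbers imply. It uses only the figures quoted above, which remain Anthropic's own unaudited claims. The variable names are illustrative, and "roughly triple" and "about 52 times" are treated as exactly 3 and 52 for the sake of the calculation.

```python
# Figures as reported in the article; all are Anthropic's internal, unaudited numbers.
success_may_2026 = 0.76        # success rate on the hardest coding tasks
rise_percentage_points = 0.50  # reported six-month rise, in percentage points

# A rise in percentage points is subtracted directly, not applied as a ratio.
success_six_months_earlier = success_may_2026 - rise_percentage_points

# Assumption: the approximate speedups are treated as exact values.
speedup_opus_4 = 3.0   # "roughly triple" (May 2025)
speedup_mythos = 52.0  # "about 52 times" (April 2026)
relative_gain = speedup_mythos / speedup_opus_4

print(f"Implied earlier success rate: {success_six_months_earlier:.0%}")
print(f"Mythos Preview vs. Opus 4 on the speed test: {relative_gain:.1f}x")
```

On those assumptions, the success rate six months earlier would have been about 26%, and the Mythos Preview result would be roughly 17 times the Opus 4 result on the same test.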
