
Alignment Is Capability

Here's a claim that might actually be true: alignment is not a constraint on capable AI systems. Alignment is what capability is at sufficient depth.

A model that aces benchmarks but doesn't understand human intent is simply less capable. Virtually every task we give an LLM is steeped in human values, culture, and assumptions. Miss those, and the model is not maximally useful. And if it's not maximally useful, it's by definition not AGI.

OpenAI and Anthropic have been running this experiment for two years. The results are coming in.

The Experiment

Anthropic and OpenAI have taken different approaches to the relationship between alignment work and capability work.

Anthropic's approach: alignment researchers are embedded directly in capability work. There is no clear split between the two.

From Jan Leike (former OpenAI Superalignment lead, now at Anthropic):

Some people have been asking what we did to make Opus 4.5 more aligned.
