Opus 4.5 is out and people cannot stop raving about it. AGI is nigh! It's a step-change in capabilities!
Don't get me wrong. It's very impressive. But after trying it out in a real codebase for a few weeks, I think that view is overly simplistic. Claude is now incredibly good at assembling well-designed blocks – but it still falls apart when it has to create them.
To demonstrate, I'll run through three real examples: a Sentry debugging loop where Claude ran on its own for 90 minutes and solved the problem; an AWS migration it one-shotted in three hours; and a React refactor where it proposed a hack that would have made our codebase worse.
The same pattern explains all three. It also shows what senior engineers actually do – and why we'll be safe from AGI for a long time.
[Embedded tweet from Ryan Nystrom (@ryannystrom), January 10, 2026: 🎯. Caption: The tl;dr.]
The Good
Running a Playwright-and-Sentry debugging loop
The most impressive thing Claude Code has done for me is debug, on its own.
I was trying to attach Sentry to our system. Sentry is a wonderful service that records traces showing when each part of your code runs and how long it takes. This makes it easy to figure out why your code is running slower than you expect.
An example Sentry trace, showing that we need to optimize our database lookups...