The current AI pricing was always going to go away. It just doesn’t make sense.
Microsoft canceled internal Claude Code licenses this week (for whatever reason, even if it’s because they integrated it), Uber blew its entire 2026 AI budget in four months, and GitHub is dropping flat-rate plans across its products.
You’ll see the framing “the AI subsidy era is ending” which is a polite way of what everyone’s been doing when they slap AI features into every tier of their product on a bet that inference costs would keep falling.
They didn’t and the cost curve is bending the wrong way, and the labs have no choice except to pass that along.
Did we collectively forget second order thinking?
Each model generation, costs per token did fall in theory, sometimes 10x less but that was for comparable quality… Lots of people extrapolated and built business models on the extrapolation, which… isn’t how you think about it.
Second-order thinking anyone?
Everyone who deals with road planning knows about is induced demand. Each new capability invents new demand. Highways are the textbook case. Add a lane, you get new commutes. The commutes weren’t there before the lane. AI is the same shape. Cheaper inference doesn’t reduce the bill, it expands what people ask the model to do.
Now my reasoning queries take >4 minutes, where the old ones took 2m… Agentic workflows make 50 calls where the old workflow made one. Unit cost falls, units explode, but still the total spend goes up.
Anyone selling a flat-rate “AI assistant” assumed user behavior wouldn’t change. It did. It always does.
... continue reading