Amazon is the latest hyperscaler where employees have been caught inflating AI token consumption to hit internal usage targets, following similar behavior documented at Meta and Microsoft last month, the Financial Times reports.
The company set targets requiring more than 80% of its developers to use AI tools each week and tracked consumption on internal leaderboards. Some employees told the FT they had been using MeshClaw, an in-house agent platform that can initiate code deployments, triage emails, and interact with Slack, to maximize their token numbers. Amazon said usage statistics would not factor into performance evaluations, but multiple employees said they believed managers were monitoring the data. One said there was "so much pressure to use these tools," while another described how tracking created "perverse incentives."
The practice, dubbed "tokenmaxxing," has become widespread enough to generate its own vocabulary and leaderboards. Beyond the workplace-culture story, it raises a harder question: if a meaningful share of AI consumption is performative, how reliable are the demand figures against which hundreds of billions of dollars in AI infrastructure procurement are being allocated?
Combined 2026 capex from Amazon, Microsoft, Alphabet, and Meta is tracking between $650 billion and $700 billion, with some Wall Street projections exceeding $1 trillion for 2027, and every hyperscaler has told investors that inference capacity is being absorbed as fast as it can be deployed. Internal developer consumption is part of that absorption, and it sits alongside paying external customers in the usage data that informs capacity planning, GPU orders, HBM procurement, and power infrastructure.
Tokenmaxxing doesn't mean the demand is fabricated. Enterprise AI adoption is broadening, and inference workloads are scaling into production. But there's a distinction between adoption and consumption intensity: the former is a durable driver of demand, whereas the latter is gameable, and it's currently being amplified by the very incentive structures these companies built. The water is further muddied by reports that AI usage can cost more than the workers it's meant to assist.
Meta's internal leaderboard lasted only days after public exposure, and Amazon recently restricted visibility of team-wide usage statistics. When the measurement changes, the consumption intensity it incentivized will change with it.
Nvidia CEO Jensen Huang has highlighted per-engineer token consumption as a key metric, saying he would be "deeply alarmed" if a $500,000-a-year engineer were not consuming at least $250,000 in tokens. Every inflated token is still real GPU time, so it registers as demand either way; what Nvidia's inference growth depends on is that consumption being productive workload that persists and compounds.
Angie Jones, formerly VP of engineering for AI tools at Block, told LeadDev she expected the industry to pivot toward measuring efficient token usage rather than celebrating volume. In a cycle where GPU orders and power commitments are being placed years in advance, the quality of the demand projections behind them matters. The hyperscalers are building for a world where every knowledge worker consumes hundreds of thousands of dollars in annual compute. Whether that consumption proves productive or performative will determine how much of this year's $700 billion generates durable returns.