Skip to content
Tech News
← Back to articles

What I'm Finding About LLM Code Style and Token Costs

read original more articles
Why This Matters

This article highlights the growing concern over token costs in large language models (LLMs) like Claude, emphasizing how inefficient code styles and excessive output tokens can significantly increase expenses for developers and organizations. It underscores the importance of optimizing prompt design and code practices to manage costs effectively while maintaining quality, which is crucial as AI integration becomes more widespread in the tech industry and for end-users.

Key Takeaways

What I’m Finding About LLM Code Style and Token Costs

Spending output tokens to share it. Before the price spikes.

Jim

Where This Started

I’ve been working through creating and reviewing features with Claude the past year. It’s been remarkable seeing the tension in token consumption and legacy patterns. Right when I think something is complete, a problem surfaces—regression, edge case, whatever. All the while watching the slow, steady and natural march toward eventual full-price rates. Alongside this phenomenon, my accumulated push to stay at the pragmatic edge of modern Web work. The sweet spot where nearly ubiquitous features remove lines of code and improve quality—the place where I keep wondering: why did I get that output? Why did that line of code appear instead of what’s been available for years? I usually dismiss it with the observable fact that Claude is effectively junior level at best, and a useful approximation of the encyclopedic knowledge asked in interviews.

In trying to make progress on something I am finding myself reviewing my practice and looking at where that outrageous token usage is coming from. Every one of those is output tokens, the ones that cost several times more (3x to 5x!!!) than input tokens in API pricing. Patterns that are longer, more fragile, more insecure, and solving problems the platform already solved–often years ago.

It’s enough to start imagining there’s some conspiracy to take the entire web platform backward, right when Ryan Dahl and separately Alex Russell, Dimitri Glazkov (and many others) made Web Components, etc. They literally made the entire Web platform great again. All to eke out some return on the tokens. So for the sake of conspiracy, this is what I’m finding.

Because my background as human being, who uses language, designed typography, programmed early on, alongside drawing and many other eclectic oddities, I actually consider things like tabs as a remarkable innovation. I can literally reduce indentation to 1 character, not some abstraction I have to go ask someone how to define or get permission to use. (I guess I’m just far too egalitarian to appreciate the exclusionary attitude of the entire software community.) I care about humans, and want things to work within some parsimonious baseline. And multiplying stuff by 4 or some arbitrary number just really doesn’t make sense–to me. I could go on, but maybe this grounds the orientation—someone who’s worked with actual language on actual media and has opinions about when something works and when it doesn’t. That part tends to speak for itself.

I mention this because it colors what I looked into from a purely pragmatic standpoint. I’m not arguing for a specific position where everyone uses tabs (despite that speaking for itself). I’m disclosing background that shaped opinions I’d been sitting on—there was always an economic argument I kept to myself, and it’s now showing up in real API costs. My opinions on convention are not the article. The token usage optimizations are what I came here to share. So you can benefit too. If you want to keep using multiple spaces, I’ll remind myself that the literature said it seemed ok and the LLM doesn’t know any better.

The Easiest Token Optimization on the Planet Is Already in the Runtime

... continue reading