Use Boring Languages with LLMs

I'm Jacob. I run Sancho Studio, a software consulting group that helps companies with technical leadership, strategy, and security.

I keep coming back to the idea that consistency compounds. I’ve noticed it acutely over the last two years as a consultant working across several projects. Large language models amplify the problems of inconsistent technologies and quietly reinforce consistent ones. The languages and ecosystems with the most fragmentation produce the worst agentic output, and the ones with the strongest conventions produce the best. I think this effect will increasingly determine which tools survive in the paradigm of massive models trained on large corpora.

Even if code is cheap, running inference is a gamble. It’s impossible to know when the model will decide to install an unexpected package or produce a bizarre coding pattern from 2019. If we’re gambling with tokens, we should bet on the ecosystems whose conventions are most consistently reinforced in the model’s weights, because those are the ones most likely to produce median output. For software development this is actually ideal, as the median program is typically doing the basics: processing information, reading and writing files, responding to network requests, and so on.

Before AI, engineers complained about languages that reinvented themselves on what felt like an annual basis. These complaints were real but mostly aesthetic, symptomatic of the frustration of having to keep up with needlessly changing ecosystems.

Looking back, the 2024 State of JS survey describes a relatively fragmented ecosystem. For a human, fragmentation is annoying. For a model trained on the public corpus of all of it, fragmentation is closer to a problem that has to be solved in reinforcement learning or in agent harnesses (e.g. when Claude Code leaked, it showed us that Anthropic had hard-coded some bias for JavaScript frameworks).

Python is the same story sung in a different key. Asking a simple question like “which package manager are you using?” produces a matrix of language versions, package manager versions, and OS compatibility that I find completely mind-numbing as a technical lead.

Should one use pip, poetry, or uv? Does your toolchain matter, or are you cross-compiling? How do you know if a Python package silently has a link to a C dependency? Are you using async, or have you reached for a task queue instead? Django or FastAPI?
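To make the "which package manager?" question concrete, here is a minimal sketch that guesses a project's tooling from the marker files sitting in its root. The `MARKERS` table and the `detect_package_tools` helper are my own illustrative names, not part of any tool, and the mapping is deliberately incomplete: a `requirements.txt`, for instance, is only a convention and can be consumed by several tools.

```python
from pathlib import Path

# Hypothetical, incomplete mapping from marker file to the tool it usually implies.
MARKERS = {
    "uv.lock": "uv",
    "poetry.lock": "poetry",
    "Pipfile.lock": "pipenv",
    "requirements.txt": "pip",
}


def detect_package_tools(project_dir):
    """Return a sorted list of tools whose marker files exist in project_dir."""
    root = Path(project_dir)
    found = {tool for name, tool in MARKERS.items() if (root / name).is_file()}
    return sorted(found)
```

A repository where this returns more than one tool is exactly the kind of ambiguity an agent has to guess its way through.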

xkcd 1987

From a model’s standpoint, there are simply too many ways to write any of this, and the corpus reflects every one of them at roughly equal weight unless recency bias is introduced at training time. What this means is straightforward to me, and should be to you as well: languages and ecosystems with low variance in their training corpus are represented better and executed more reliably by coding agents. In high-dimensional vector spaces, cosine similarity in the training data is the substrate on which the model’s attention and MLP layers learn to predict the next token. A consistent corpus produces consistent inference tokens.
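As a rough intuition pump for "low variance in the corpus", the sketch below scores a handful of snippets by their mean pairwise cosine similarity over plain token counts. This is a toy proxy I'm assuming for illustration, not how a model is actually trained or evaluated, and the function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def token_counts(snippet):
    """Crude bag-of-tokens vector: counts of whitespace-separated tokens."""
    return Counter(snippet.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(snippets):
    """Average cosine similarity over every pair of snippets."""
    if len(snippets) < 2:
        raise ValueError("need at least two snippets to compare")
    vectors = [token_counts(s) for s in snippets]
    pairs = list(combinations(vectors, 2))
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Three ways of saying the same thing, versus one convention repeated.
consistent = ["uv add requests", "uv add httpx", "uv add flask"]
fragmented = ["pip install requests", "poetry add httpx", "uv add flask"]
```

Under this toy measure the `consistent` list scores higher than the `fragmented` one, because its snippets share most of their tokens.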
