Software development may become (at least in some aspects) more like witchcraft than engineering. The present enthusiasm for “AI coworkers” is preposterous. Automation can paradoxically make systems less robust; when we apply ML to new domains, we will have to reckon with deskilling, automation bias, monitoring fatigue, and takeover hazards. AI boosters believe ML will displace labor across a broad swath of industries in a short period of time; if they are right, we are in for a rough time. Machine learning seems likely to further consolidate wealth and power in the hands of large tech companies, and I don’t think giving Amazon et al. even more money will yield Universal Basic Income.
Decades ago there was enthusiasm that programs might be written in a natural language like English, rather than a formal language like Pascal. The folk wisdom when I was a child was that this was not going to work: English is notoriously ambiguous, and people are not skilled at describing exactly what they want. Now we have machines capable of spitting out shockingly sophisticated programs given only the vaguest of plain-language directives; the lack of specificity is at least partially made up for by the model’s vast corpus. Is this what programming will become?
In 2025 I would have said it was extremely unlikely, at least with the current capabilities of LLMs. In the last few months it seems that models have made dramatic improvements. Experienced engineers I trust are asking Claude to write implementations of cryptography papers, and reporting fantastic results. Others say that LLMs generate all code at their company; humans are essentially managing LLMs. I continue to write all of my words and software by hand, for the reasons I’ve discussed in this piece—but I am not confident I will hold out forever.
Some argue that formal languages will become a niche skill, like assembly today—almost all software will be written in natural language and “compiled” to code by LLMs. I don’t think this analogy holds. Compilers work because they preserve critical semantics of their input language: one can formally reason about a series of statements in Java, and have high confidence that the Java compiler will preserve that reasoning in its emitted assembly. When a compiler fails to preserve semantics, it is a big deal. Engineers spend lots of time banging their heads against desks to figure out, for example, that the compiler failed to insert the right barrier instructions to preserve a subtle aspect of the JVM memory model.
Because LLMs are chaotic and natural language is ambiguous, LLMs seem unlikely to preserve the reasoning properties we expect from compilers. Small changes in the natural language instructions, such as repeating a sentence, or changing the order of seemingly independent paragraphs, can result in completely different software semantics. Where correctness is important, at least some humans must continue to read and understand the code.
This does not mean every software engineer will work with code. I can imagine a future in which some or even most software is developed by witches, who construct elaborate summoning environments, repeat special incantations (“ALWAYS run the tests!”), and invoke LLM daemons who write software on their behalf. These daemons may be fickle, sometimes destroying one’s computer or introducing security bugs, but the witches may develop an entire body of folk knowledge around prompting them effectively—the fabled “prompt engineering”. Skills files are spellbooks.
I also remember that a good deal of programming is not done in “real” computer languages, but in Excel. An ethnography of Excel is beyond the scope of this already sprawling essay, but I think spreadsheets—like LLMs—are culturally accessible to people who do not consider themselves software engineers, and a tool which people can pick up and use for themselves is likely to be applied in a broad array of circumstances. Take for example journalists who use “AI for data analysis”, or a CFO who vibe-codes a report drawing on Salesforce and DuckLake. Even if software engineering adopts more rigorous practices around LLMs, a thriving periphery of rickety-yet-useful LLM-generated software might flourish.
Executives seem very excited about this idea of hiring “AI employees”. I keep wondering: what kind of employees are they?
Imagine a co-worker who generated reams of code with security hazards, forcing you to review every line with a fine-toothed comb. One who enthusiastically agreed with your suggestions, then did the exact opposite. A colleague who sabotaged your work, deleted your home directory, and then issued a detailed, polite apology for it. One who promised over and over again that they had delivered key objectives when they had, in fact, done nothing useful. An intern who cheerfully agreed to run the tests before committing, then kept committing failing garbage anyway. A senior engineer who quietly deleted the test suite, then happily reported that all tests passed.
You would fire these people, right?