Codegen is not productivity

There is a whole lot to say about generative AI. LLMs generate a bunch of code; this much is certainly true. Should we celebrate that? There is a long tradition of trying to measure software development output, and most of it tells us that lines of code is a poor metric of programmer productivity. I have some thoughts.

I have seen many people talk about the productivity they get from LLMs in terms of the code the LLMs generate for them. I have seen claims of 10,000 lines of code in a day or hundreds of thousands of lines in a week; these often seem like brags, or at least they are presented positively. I do not believe that LLMs and generative AI change anything fundamental about using lines of code as a measure of output or productivity.

This is a rant. This is what I think about when I hear people talking about lines of code, whether generated by an LLM or pouring forth from human hands. I do not think anyone should celebrate code output.

From the preface to the first edition of SICP:

First, we want to establish the idea that a computer language is not just a way of getting a computer to perform operations but rather that it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute. Second, we believe that the essential material to be addressed by a subject at this level is not the syntax of particular programming-language constructs, nor clever algorithms for computing particular functions efficiently, nor even the mathematical analysis of algorithms and the foundations of computing, but rather the techniques used to control the intellectual complexity of large software systems.

In other---worse---words, programming is not about writing code that makes the computer do a specific thing, or at least not exclusively or primarily about that. Programming is an exercise in representing abstract ideas and managing complexity while doing that. Programming is as often an exploration of these things as it is an implementation of them.

I will note that none of the ideas below are new or original. I encourage you to check out the appendix of anecdotes and quotes for many takes on this. For just about as long as we have had programming languages, experts and others have argued that code should be thought of as a liability, not an asset. Some of the anecdotes are about this, and you can find many more online. This is a critical thing to keep in mind.

An average human could probably type about 4,000 lines of code a day. That said, developers do not spend all their time writing code. In fact, developers spend most of their time on activities other than coding.
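To make that kind of estimate concrete, here is a back-of-envelope sketch. Every input is my own illustrative assumption, not a measurement and not necessarily how the 4,000 figure above was reached: 40 words per minute, 5 characters per word, 8 hours of uninterrupted typing, and 24 characters per line. The function name `typed_lines_per_day` is made up for this example.

```python
def typed_lines_per_day(words_per_minute=40, chars_per_word=5,
                        hours_typing=8, chars_per_line=24):
    """Rough upper bound on lines typed per day.

    All defaults are assumed, illustrative values, not measured data.
    """
    # Characters produced in a day of nonstop typing.
    chars_per_day = words_per_minute * chars_per_word * 60 * hours_typing
    # Whole lines of the assumed average length.
    return chars_per_day // chars_per_line


print(typed_lines_per_day())                # 4000
print(typed_lines_per_day(hours_typing=2))  # 1000
```

The point of the sketch is how sensitive the number is to time spent typing: cut the typing time to two hours and the ceiling drops to 1,000 lines, before any thinking, reading, or debugging is accounted for.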

Lines of code (LOC) is a poor predictor of---and is poorly predicted by---other metrics of interest in software development, including defects, effort, and time.
