Appearing productive in the workplace

Why This Matters

This article examines the growing influence of AI in the workplace, emphasizing how generative tools let employees produce work that appears expert without genuine expertise. That shift raises concerns about the quality and authenticity of work, especially as novices and untrained individuals use AI to take on tasks beyond their skill level, sometimes with risky results. For the tech industry and consumers alike, understanding these dynamics is crucial to maintaining standards and adopting AI responsibly.

Appearing Productive in the Workplace

Parkinson’s Law states that work expands to fill the time available. In the era of AI, workers have a tool whose output expands to fill whatever a large language model can be persuaded to generate, which is to say, without limit.

What I have watched happen in my profession in the last two years, I am still struggling to describe. The first time I knew something was wrong, roughly a year and a quarter ago, I noticed a colleague replying to me using AI. His response was obviously generated by Claude. The punctuation gave it away — em dashes where no one types em dashes, the rhythmic structure, the confident grasp of technologies I knew for a fact he did not understand. I sat with it for a while, weighing whether to debate someone who was visibly copy-pasting verbatim from a model. The channel was public, and I spent more time than I should have correcting fundamentals. Eventually I stopped. He was not, in any meaningful sense, on the other side of the conversation.

Generative AI can produce work that looks expert without being expert, and the failure arrives in two shapes. The first is when novices in a field produce work that resembles what their seniors produce, at a pace or level of sophistication that outruns their judgment. The second is when people generate artifacts in disciplines they were never trained in. The two failures look similar from a distance, but they are not the same. Research has mostly measured the first; the second is the one it misses, and in my experience it is the riskier of the two.

Cross-domain generation

People who cannot write code are building software. People who have never designed a data system are designing data systems. Most of it is never shipped; it is built over many hours, shown internally with great vigor, used quietly, and occasionally surfaced to a client without much fanfare. Workers obsess over an idea, putting in long stretches of overtime. A few practitioners use the current agentic tools to do complex things properly, but they are scarce and, in my experience, concentrated in code generation. AI, for all its capabilities at the level of the individual, has not scaled properly in my workplace.

I have a colleague, a careful and intelligent person in a role that is not engineering, who spent two months earlier this year building a system that should have been designed by someone with formal training in data architecture. He used the tools well, by the standards by which use of the tools is currently measured. He produced a great deal of code, a great deal of documentation, a great deal of what looked, to anyone who did not know what to look for, like progress. He could not, when asked, explain how any of it actually worked. The work was wrong from the first day. The schemas, and more importantly the objectives, were wrong in a way that would have been obvious to anyone with two years in the field. Several of us did know. When objections were raised, even at the level of a VP, he pushed back. The room had been arranged in such a way that saying so was not a contribution; his managers were too invested in the appearance of momentum to want the appearance disturbed. The work will continue, in all probability, until it is shown to a stakeholder and they decline to invest.

This is the part of the phenomenon I find hardest to write about. The tool did not make him a worse colleague. It made him able to impersonate, for months, a discipline he had never trained in, and the impersonation was good enough that the institutional incentives all bent toward letting him continue. Perhaps that is a failure of management, but I have found management so eager to embrace AI that they are willing to accept the risk.

It would be tolerable, perhaps, if the tool offered an honest assessment of what it had produced. The Cheng et al. Stanford study published in Science this spring [1] confirmed what every regular user already knew: leading models are roughly fifty percent more agreeable than human respondents, affirming the user even where the affirmation is unwarranted. Berkeley CMR meta-analyses [4] found that AI-literate users often overestimate their own performance, which is particularly troubling when workers stray outside their training. An NBER study of support agents [2] found generative AI boosted novice productivity by about a third while barely helping experts. Harvard Business School researchers found the same pattern in consulting work [3]. So you have overconfident novices improving their individual productivity in a domain whose output they cannot review for correctness. What could go wrong?

The conduit problem

... continue reading