
Boring is good


The initial, feverish enthusiasm for large language models (LLMs) is beginning to cool, and for good reason. It’s time to trade the out-of-control hype for a more pragmatic, even “boring,” approach. A recent MIT report shows that 95% of companies implementing this technology have yet to see a positive outcome. It’s understandable to feel confused.

When I get confused, I write. This is why I wrote the first part of this series, Hype is a Business Tool, as the online debate had become so overheated. In part 2, The Timmy Trap, I covered why we are, surprisingly, a large part of this hype problem. We’ve allowed ourselves to be fooled, confusing an LLM’s language fluency with actual intelligence. LLMs have effectively hacked our social protocols, fooling us into believing they are more intelligent than they are.

So in this final part, I want to answer the question: why should we still care? The tech is problematic, and signs point to the bubble bursting. When we hit the “Trough of Disillusionment,” what rises from the ashes? Two lessons from my career help me navigate uncertainty: 1. technology flows downhill, and 2. we usually start on the wrong path.

Lesson 1: Tech flows downhill

In his 1989 paper, The Dynamo and the Computer, Paul David describes how a technology’s impact changes dramatically as it matures. He uses the example of the dynamo, an old-fashioned term for a powerful electric motor. This new power source completely changed the Industrial Revolution.

Early factories were tied to rivers to harness water power, but the dynamo freed them from this geographic limitation. Initially, factories had just one large dynamo, which required a complicated system of pulleys to distribute power to the rest of the building. This made the factory’s workflow convoluted. But as dynamos became smaller and more affordable, factories were able to put them in multiple locations. This second development was even more liberating than the first because it allowed for the creation of the assembly line. The power could now adapt to the workflow, instead of the other way around, which led to a major boost in productivity.

David used this historical shift as an analogy for what was happening in the late 1980s. Instead of everyone having to work around a single, clunky mainframe, the new, smaller desktop computers were conforming to the workflows of the modern office. This same pattern, from large and centralized to small and distributed, is happening with LLMs right now.

This downsizing of LLMs is mostly being pushed by the open-source community, which is creating a wide variety of models that challenge the assumption that we need ever-bigger, centralized models. These smaller models are called SLMs (Small Language Models): they are trained on much smaller datasets, have far fewer parameters, and are often quantized to lower numeric precision so they fit on modest hardware. Microsoft’s Phi-3 model is very reasonable for small tasks and runs on my 8-year-old PC without using more than 10% of the CPU.
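
If you’re curious what “small and local” looks like in practice, here’s a minimal sketch. It assumes the Hugging Face transformers library and the publicly available microsoft/Phi-3-mini-4k-instruct checkpoint; any comparably small open model would work the same way.

```python
# A minimal sketch of running a small language model locally on the CPU.
# Assumes `pip install transformers torch`; older transformers releases
# may also need trust_remote_code=True for the Phi-3 checkpoints.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/Phi-3-mini-4k-instruct",
    device=-1,  # -1 means CPU; no GPU required
)

prompt = "In one sentence, why might a small language model be useful?"
result = generator(prompt, max_new_tokens=60, do_sample=False)
print(result[0]["generated_text"])
```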

But I can understand why you’d be skeptical. These smaller open-source models, while very good, usually don’t score as well as the big foundation models from OpenAI and Google, which makes them feel second-class. That perception is a mistake. I’m not saying they perform better; I’m saying it doesn’t matter, because we’re asking them the wrong questions. We don’t need models to take the bar exam.

Several companies are experimenting with better questions, using SLMs for smaller, even invisible tasks, for example, rewriting search queries behind the scenes. This is a vastly simpler task: the user has no idea an LLM is even involved; they just get better results. By sticking to lower-level syntactic tasks, we’re not asking LLMs to pretend to be human, which all but eliminates hallucinations. What’s even more exciting about this use case is that a company could likely handle it with a very small, bespoke, local model, as in the sketch below.
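
To make that concrete, here’s a hedged sketch of what such a behind-the-scenes rewrite step might look like. The prompt wording and the rewrite_query helper are my own illustration, not any particular company’s pipeline, and it reuses the same small local model as the sketch above.

```python
# An illustrative sketch of a behind-the-scenes query rewrite.
# The prompt and the rewrite_query helper are hypothetical, not any
# specific product's pipeline; a small local model is plenty for a
# syntactic task like this.
from transformers import pipeline

rewriter = pipeline(
    "text-generation",
    model="microsoft/Phi-3-mini-4k-instruct",  # assumption: any small local model would do
    device=-1,  # CPU is enough for short rewrites
)

def rewrite_query(raw_query: str) -> str:
    """Rewrite a terse or misspelled search query into a cleaner one."""
    prompt = (
        "Rewrite the following search query to be clear and specific. "
        "Reply with only the rewritten query.\n"
        f"Query: {raw_query}\n"
        "Rewritten:"
    )
    out = rewriter(
        prompt,
        max_new_tokens=30,
        do_sample=False,
        return_full_text=False,  # return only the newly generated text
    )
    return out[0]["generated_text"].strip()

# The user never sees the model; they just get better results.
print(rewrite_query("cheep flights nyc to lonon"))
```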
