LLMs aren't world models

I believe that language models aren’t world models. It’s a weak claim — I’m not saying they’re useless, or that we’re done milking them. It’s also a fuzzy-sounding claim — with its trillion weights, who can prove that there’s something an LLM isn’t a model of? But I hope to make my claim clear and persuasive enough with some examples.

A friend who plays better chess than me — and knows more math & CS than me — said that he played some moves against a newly released LLM, and concluded it must be at least as good as him. I said, no way, I’m going to cRRRush it, in my best Russian accent. I make a few moves — but unlike him, I don’t make good moves, which would be opening-book moves it has seen a million times; I make weak moves, which it hasn’t. The thing makes decent moves in response, with cheerful commentary about how we’re attacking this and developing that — until about move 10, when it tries to move a knight which isn’t there, and loses in a few more moves. This was a year or two ago; I’ve just tried this again, and it lost track of the board state by move 9.

When I’m saying that LLMs have no world model, I don’t mean that they haven't seen enough photos of chess knights, or held a knight in their greasy fingers; I don’t mean the physical world, necessarily. And I obviously don’t mean that a machine can’t learn a model of chess, when all leading chess engines use machine learning. I only mean that, having read a trillion chess games, LLMs, specifically, have not learned that to make legal moves, you need to know where the pieces are on the board. Why would they? For predicting the moves or commentary in chess games, which is what they’re optimized for, this would help very marginally, if at all.
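
Here is a minimal sketch of this kind of test, assuming the python-chess library; ask_llm is a hypothetical stand-in for whatever chat API is being probed. The legality check needs exactly the state the LLM never bothers to track: where the pieces currently are.

import chess

def ask_llm(game_so_far: str) -> str:
    """Hypothetical: send the moves played so far, get back one reply move in SAN, e.g. 'Nf3'."""
    raise NotImplementedError

def play_until_illegal(my_moves: list[str]) -> int:
    """Play a scripted (deliberately weak) line; return the move number at which
    the LLM first proposes an illegal move, or -1 if it never does."""
    board = chess.Board()
    game_so_far = ""
    for move_no, my_move in enumerate(my_moves, start=1):
        board.push_san(my_move)            # our move, assumed legal
        game_so_far += f"{move_no}. {my_move} "
        reply = ask_llm(game_so_far).strip()
        try:
            board.push_san(reply)          # raises ValueError on an illegal move,
        except ValueError:                 # e.g. moving a knight that isn't there
            print(f"move {move_no}: {reply!r} is illegal in this position")
            return move_no
        game_so_far += f"{reply} "
    return -1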

Of course, nobody uses LLMs as chess engines — so whatever they did learn about chess, they learned entirely “by accident”, without any effort by developers to improve the process for this kind of data. And the whole argument for LLMs learning about the world is precisely that they have to understand it as a side effect of modeling the distribution of text — which is soundly refuted by their literally failing to learn the first thing about chess. But maybe we could charitably assume that LLMs fail this badly at chess for silly reasons you could easily fix, but nobody bothered. So let’s look at something virtual enough to learn a model of without having greasy fingers to touch it with, but also relevant enough for developers to try to make it work.

So, for my second example, we will consider the so-called “normal blending mode” in image editors like Krita — what happens when you put a layer with some partially transparent pixels on top of another layer? What’s the mathematical formula for blending 2 layers? An LLM replied roughly like so:

In Krita Normal blending mode, colors are not blended using a mathematical formula. The "Normal" mode simply displays the upper layer's color, potentially affected by its transparency, without any interaction or calculation with the base layer's color. (It then said how other blending modes were different and involved mathematical formulas.)

This answer tells us the LLM doesn't know things such as:

Computers work with numbers. A color is represented by a number in a computer.

Therefore, a color cannot be blended by anything other than a mathematical formula — nor can it be “affected” by transparency, another number, without a “calculation” (the actual formula is sketched below).

“Transparency” is when you can see through something.
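
For the record, there is a formula: “normal” mode is ordinary alpha compositing, the “over” operator, in essentially every editor. Below is a minimal Python sketch for straight-alpha channels stored as floats in [0, 1]; a real implementation in Krita also has to deal with bit depths, premultiplied alpha and color management, which this ignores.

# Standard alpha compositing (the "over" operator), which is what "normal"
# blending amounts to. Channels are floats in [0, 1] with straight
# (non-premultiplied) alpha.

def blend_normal(top, bottom):
    """top and bottom are (r, g, b, a) tuples with components in [0, 1]."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)         # the composited transparency is itself computed
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)      # both layers fully transparent
    def ch(t, b):                        # per-channel weighted mix
        return (t * ta + b * ba * (1.0 - ta)) / out_a
    return (ch(tr, br), ch(tg, bg), ch(tb, bb), out_a)

# A half-transparent red layer over opaque blue: the result is a computed mix,
# not "simply the upper layer's color".
print(blend_normal((1.0, 0.0, 0.0, 0.5), (0.0, 0.0, 1.0, 1.0)))
# -> (0.5, 0.0, 0.5, 1.0)

Every blending mode, “normal” included, is a per-pixel calculation on numbers; the modes differ only in which calculation.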
