Generative AI's crippling failure to induce robust models of the world

Synthesized video from Dawid van Straaten (prompt: “Generate me a video of two men playing chess”), in which the player for black reaches across the table and, in the midst of a rather unusual position, moves his opponent’s pawn horizontally, and quite illegally, several squares across the board.

A few weeks ago, I had the singular honor of recording a podcast (to be released soon) with one of my heroes, Garry Kasparov, not only one of the greatest chess players of all time, but also one of the bravest, most foresightful people I know. I wish with all my heart that more people had heeded his warnings about both Russia and the United States.

This essay, which I started in preparation for our recording, looks at chess (though not only chess) as a window into LLMs and one of their most serious, yet relatively rarely commented-upon shortcomings: their inability to build and maintain adequate, interpretable, dynamically updated models of the world — a liability that is arguably even more fundamental than the failures in reasoning recently documented by Apple.

A world model (or cognitive model) is a computational framework that a system (a machine, or a person or other animal) uses to track what is happening in the world. World models are not always 100% accurate, or complete, but I believe that they are absolutely central to both human and animal cognition.

Another of my heroes, the cognitive psychologist Randy Gallistel, has written extensively about how even some of the simplest animals, like ants, use cognitive models, which they regularly update, in tasks such as navigation. A wandering ant, for example, tracks where it is through the process of dead reckoning. An ant uses variables (in the algebraic/computer science sense) to maintain a constantly updated readout of its location, even as it wanders, so that it can return directly to its home.
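
To make the idea of "variables that are constantly updated" concrete, here is a minimal Python sketch of dead reckoning. It is an illustration only, not a model of real ant neurobiology: the class name `PathIntegrator`, its methods, and the choice of headings in degrees are all assumptions introduced for this example.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PathIntegrator:
    """Hypothetical minimal world model: position relative to the nest at (0, 0)."""
    x: float = 0.0
    y: float = 0.0

    def step(self, heading_deg: float, distance: float) -> None:
        # Update the stored position after each leg of the walk.
        heading = math.radians(heading_deg)
        self.x += distance * math.cos(heading)
        self.y += distance * math.sin(heading)

    def home_vector(self) -> Tuple[float, float]:
        # Heading (in degrees, 0 = east, counterclockwise) and distance
        # of the straight line from the current position back to the nest.
        distance = math.hypot(self.x, self.y)
        heading_deg = math.degrees(math.atan2(-self.y, -self.x)) % 360
        return heading_deg, distance


ant = PathIntegrator()
for heading_deg, distance in [(0, 3), (90, 4)]:  # 3 units east, then 4 units north
    ant.step(heading_deg, distance)

heading_home, distance_home = ant.home_vector()
print(f"head {heading_home:.1f} degrees for {distance_home:.1f} units")
```

The point of the sketch is that the two numbers `x` and `y` persist between steps and are revised at every step, so the way home can be read off at any moment without retracing the outbound path.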

In AI, what I would call a cognitive model is often called a world model, which is to say it is some piece of software’s model of the world. I would define world models as persistent, stable, updatable (and ideally up-to-date) internal representations of some set of entities within some slice of the world. One might, for example, use a database to track a set of individuals over time, including, for example, their addresses, telephone numbers, social security numbers, etc. Every physics engine (and video game) has a model of the world, too (tracking, for example, a set of entities and their locations and their properties and their motions).

Here’s the crux: in classical artificial intelligence, and indeed classic software design, the design of explicit world models is absolutely central to the entire process of software engineering. LLMs try — to their peril — to live without classical world models.

As the title of a book by the late Turing Award winner Niklaus Wirth put it, Algorithms + Data Structures = Programs, and world models are central to those data structures. In a video game, a world model (sometimes nowadays implemented as a scene graph) might include detailed maps, information about the location of particular characters, the main character’s inventory, and so on; in a word processor, one could consider the user’s document and the file system to be part of the program’s model of its world, and so on.

In classical AI, models are absolutely central — and pretty much always have been. Alan Turing made a dynamic world model, updated after every move, central to his chess program, now known as Turochamp, written in 1949 (designed, amazingly, even before he had the hardware to try it on).
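
As a rough illustration of what "a dynamic world model, updated after every move" means in code, here is a deliberately tiny Python sketch. It is not Turochamp and makes no claim about how Turing's program was organized; the `Board` class, the `IllegalMove` exception, and the pawns-only, no-captures rule set are all simplifying assumptions made for this example.

```python
class IllegalMove(Exception):
    """Raised when a requested move contradicts the tracked board state."""


class Board:
    """Hypothetical, heavily simplified world model: pawns only, no captures."""

    def __init__(self) -> None:
        # Map square name -> pawn colour, e.g. "e2" -> "white".
        self.pawns = {f"{f}2": "white" for f in "abcdefgh"}
        self.pawns.update({f"{f}7": "black" for f in "abcdefgh"})
        self.to_move = "white"

    def move(self, src: str, dst: str) -> None:
        if len(dst) != 2 or dst[0] not in "abcdefgh" or dst[1] not in "12345678":
            raise IllegalMove(f"{dst} is not a square")
        colour = self.pawns.get(src)
        if colour is None:
            raise IllegalMove(f"no pawn on {src}")
        if colour != self.to_move:
            raise IllegalMove(f"{self.to_move} may not move a {colour} pawn")
        if dst in self.pawns:
            raise IllegalMove(f"{dst} is occupied")
        if src[0] != dst[0]:
            raise IllegalMove("pawns do not move sideways (captures omitted here)")
        direction = 1 if colour == "white" else -1
        start_rank = 2 if colour == "white" else 7
        advance = (int(dst[1]) - int(src[1])) * direction
        allowed = (1, 2) if int(src[1]) == start_rank else (1,)
        if advance not in allowed:
            raise IllegalMove(f"a pawn cannot go from {src} to {dst}")
        if advance == 2 and f"{src[0]}{int(src[1]) + direction}" in self.pawns:
            raise IllegalMove("a pawn cannot jump over another pawn")
        # The update step: stored state changes only after every check passes.
        del self.pawns[src]
        self.pawns[dst] = colour
        self.to_move = "black" if colour == "white" else "white"


board = Board()
board.move("e2", "e4")  # white's legal opening move updates the model

try:
    # Black reaches across and slides white's pawn sideways, as in the video.
    board.move("e4", "h4")
    rejected = False
except IllegalMove:
    rejected = True
```

Because the program consults an explicit record of which piece is where and whose turn it is, the move from the video is refused and the stored position is left unchanged.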

The idea of (world) models was no less central to the thinking of Nobel Laureate and AI co-founder Herb Simon, who entitled his memoir Models of My Life. His 1957 system General Problem Solver, which could dance rings around o3 in solving the Tower of Hanoi, started with (world) models of whatever problem was to be solved.
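
For readers who have not seen the Tower of Hanoi treated as a state-tracking problem, here is a short Python sketch. It uses the standard textbook recursion, not General Problem Solver's own method, and the function name `solve_hanoi` and the peg labels are assumptions made for this example. What it shares with the classical approach is only the starting point: an explicit model of the puzzle that every move is checked against and then updates.

```python
from typing import Dict, List, Tuple


def solve_hanoi(n_disks: int) -> Tuple[List[Tuple[str, str]], Dict[str, List[int]]]:
    # The world model: three pegs, each a stack of disk sizes (largest at the bottom).
    pegs: Dict[str, List[int]] = {"A": list(range(n_disks, 0, -1)), "B": [], "C": []}
    moves: List[Tuple[str, str]] = []

    def move_disk(src: str, dst: str) -> None:
        disk = pegs[src][-1]
        # Check the puzzle's rule against the tracked state before changing it.
        if pegs[dst] and pegs[dst][-1] < disk:
            raise ValueError(f"cannot place disk {disk} on disk {pegs[dst][-1]}")
        pegs[dst].append(pegs[src].pop())
        moves.append((src, dst))

    def transfer(n: int, src: str, dst: str, spare: str) -> None:
        if n == 0:
            return
        transfer(n - 1, src, spare, dst)  # clear the way
        move_disk(src, dst)               # move the largest remaining disk
        transfer(n - 1, spare, dst, src)  # restack the rest on top of it

    transfer(n_disks, "A", "C", "B")
    return moves, pegs


moves, pegs = solve_hanoi(3)
print(len(moves), pegs)
```

Since the pegs are stored explicitly, the program never has to guess which disk is where, and the same code works for any number of disks.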
