

Why This Matters

talkie is a 13B language model trained exclusively on pre-1931 text. It offers a window into historical language and culture, and studying how it anticipates events after its training cutoff lets researchers probe temporal reasoning and the cultural assumptions embedded in training data. Work like this argues for context-aware, historically informed systems in both academic research and consumer applications.


Introducing talkie: a 13B vintage language model from 1930

Nick Levine, David Duvenaud, Alec Radford

April 2026

[Live widget] Claude chats with talkie, a 13B language model trained on pre-1931 text.

This is a 24/7 live feed of Claude Sonnet 4.6 prompting talkie-1930-13b-it to explore its knowledge, capabilities, and inclinations. talkie’s outputs reflect the culture and values of the texts it was trained on, not the views of its authors.

Why vintage language models?

Have you ever daydreamed about talking to someone from the past? What would you ask someone with no knowledge of the modern world? What would they ask you? While we don’t have time machines yet, we can simulate this experience by training, in Owain Evans’s phrase, ‘vintage’ language models: LMs trained only on historical text.

These models are fascinating conversation partners (watch Claude prompt talkie, our 13B 1930 LM, in the widget above). But we are also excited by the possibility that the careful study of the behaviors and capabilities of vintage LMs will advance our understanding of AI in general.

Figure 1. In an early attempt to understand a vintage model’s anticipation of the future, we took nearly 5,000 historical event descriptions from the New York Times’s “On This Day” feature, calculated their surprisingness (measured as bits per byte of text) to our 13B model trained exclusively on pre-1931 text, and binned by decade.

For example, we can evaluate LMs’ ability to predict the future. Inspired by Calcifer Computing’s work on Temporal Language Models, we calculated the surprisingness of short descriptions of historical events to a 13B model trained on pre-1931 text (Figure 1). Surprisingness rises after the knowledge cutoff, most sharply for events of the 1950s and 1960s, and then plateaus. We will continue to develop evals to measure with greater confidence how forecasting performance improves with model size and decays at longer horizons. Training larger vintage language models will allow us to uncover these scaling trends.
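The bits-per-byte surprisingness in Figure 1 can be computed from the per-token log-probabilities any causal LM assigns to a text. A minimal sketch (the function name and interface are our own, not from the post; obtaining the log-probabilities from the model is assumed):

```python
import math

def bits_per_byte(token_logprobs, text):
    """Total surprisal of `text`, normalized per UTF-8 byte.

    token_logprobs: natural-log probabilities the model assigned to each
    token of `text` (hypothetical input; any causal LM scoring API that
    returns per-token log-probs would do).
    """
    total_nats = -sum(token_logprobs)       # total surprisal in nats
    total_bits = total_nats / math.log(2)   # convert nats to bits
    return total_bits / len(text.encode("utf-8"))

# A perfectly confident model (p = 1 for every token) is unsurprised:
assert bits_per_byte([0.0, 0.0], "ab") == 0.0

# Two tokens at p = 1/2 over a 4-byte string: 2 bits / 4 bytes = 0.5.
print(bits_per_byte([math.log(0.5)] * 2, "abcd"))
```

Averaging this quantity over events grouped by decade would reproduce the binning described in the Figure 1 caption.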
