Smallest transformer that can add two 10-digit numbers

AdderBoard

Challenge: Build the smallest transformer that can add two 10-digit numbers with >= 99% accuracy on a held-out 10K test set.

This started with Addition Under Pressure, where I gave Claude Code and Codex the same prompt: train the smallest possible transformer that can do 10-digit addition with at least 99% accuracy. Claude Code came back with 6,080 parameters and Codex came back with 1,644. The community has since pushed this dramatically lower.

Maintained by Dimitris Papailiopoulos (@dimitrispapail).

We track two categories:

Trained — weights learned from data by any training algorithm (SGD, Adam, evolutionary search, etc.). The algorithm must be generic — it should work with any model and dataset, not just this specific problem. This encourages creative ideas around data format, tokenization, curriculum learning, and architecture search.

— weights learned from data by any training algorithm (SGD, Adam, evolutionary search, etc.). The algorithm must be generic — it should work with any model and dataset, not just this specific problem. This encourages creative ideas around data format, tokenization, curriculum learning, and architecture search. Hand-coded — weights set analytically. This is a constructive proof that the architecture can represent addition, regardless of whether SGD would find it.

Both are valid. Both are interesting.

Leaderboard

Hand-Coded Weights (Constructive Proofs)

... continue reading