MacMind
A complete transformer neural network implemented entirely in HyperTalk, trained on a Macintosh SE/30.
MacMind is a 1,216-parameter single-layer single-head transformer that learns the bit-reversal permutation -- the opening step of the Fast Fourier Transform -- from random examples. Every line of the neural network is written in HyperTalk, a scripting language from 1987 designed for making interactive card stacks, not matrix math. It has token embeddings, positional encoding, self-attention with scaled dot-product scores, cross-entropy loss, full backpropagation, and stochastic gradient descent. No compiled code. No external libraries. No black boxes.
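The self-attention step named above can be sketched in a few lines. This is NumPy rather than HyperTalk, and `attention`, `Wq`, `Wk`, and `Wv` are illustrative names, not the stack's actual handlers; the sketch shows the scaled dot-product scores and row-wise softmax that the stack spells out longhand in HyperTalk.

```python
import numpy as np

def attention(x, Wq, Wk, Wv):
    # Single-head scaled dot-product attention (illustrative sketch).
    # x: (sequence_length, model_dim); Wq/Wk/Wv: (model_dim, head_dim).
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)       # scaled dot-product scores
    # Row-wise softmax turns scores into attention weights.
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V                        # weighted sum of values
```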
Option-click any button and read the actual math.
Why This Exists
The same fundamental process that trained MacMind -- forward pass, loss computation, backward pass, weight update, repeat -- is what trained every large language model that exists today. The difference is scale, not kind. MacMind has 1,216 parameters. GPT-4 has roughly a trillion. The math is identical.
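That loop, stripped to its skeleton, fits in a dozen lines. The sketch below trains a toy linear model rather than MacMind's transformer (a hypothetical example, not the stack's code), but the four phases it labels -- forward pass, loss computation, backward pass, weight update -- are the same ones every model repeats.

```python
import random

random.seed(0)
w, b = 0.0, 0.0                         # parameters to learn
lr = 0.05                               # learning rate

for step in range(2000):
    x = random.uniform(-1, 1)           # one random training example
    target = 3 * x + 1                  # the function being learned
    y = w * x + b                       # forward pass
    loss = (y - target) ** 2            # loss computation
    dw = 2 * (y - target) * x           # backward pass (chain rule)
    db = 2 * (y - target)
    w -= lr * dw                        # weight update
    b -= lr * db
```

After enough repetitions, `w` and `b` settle near 3 and 1: the loop recovers the function from examples alone, which is all "training" ever means.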
We are at a moment where AI affects nearly everyone but almost nobody understands what it actually does. MacMind is a demonstration that the process is knowable -- that backpropagation and attention are not magic, they are math, and that math does not care whether it is running on a TPU cluster or a Motorola 68030 from 1987.
Everything is inspectable. Everything is modifiable. Change the learning rate, swap the training task, resize the model -- all from within HyperCard's script editor. This is the engine with the hood up.
What It Learns
The bit-reversal permutation reorders a sequence by reversing the binary representation of each position index. For an 8-element sequence:
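indices (0, 1, 2, 3, 4, 5, 6, 7) map to (0, 4, 2, 6, 1, 5, 3, 7). Index 6, for instance, is 110 in binary; reversed, that is 011, so the element at position 6 moves to position 3. A sketch in Python (a hypothetical helper name -- the stack itself does this in HyperTalk):

```python
def bit_reverse_permutation(seq):
    # Reorder seq by reversing the binary representation of each index.
    # Assumes len(seq) is a power of two, as in the FFT.
    n = len(seq)
    bits = n.bit_length() - 1           # index width in bits (3 for n = 8)
    out = [None] * n
    for i in range(n):
        j = int(format(i, f"0{bits}b")[::-1], 2)  # reversed-bit index
        out[j] = seq[i]
    return out

bit_reverse_permutation(list(range(8)))  # → [0, 4, 2, 6, 1, 5, 3, 7]
```

Reversing the bits twice restores the original order, so the permutation is its own inverse.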