
Show HN: MacMind – A transformer neural network in HyperCard on a 1989 Macintosh

Why This Matters

MacMind demonstrates that fundamental AI processes like backpropagation and attention are purely mathematical and can be implemented on vintage hardware using simple scripting languages. The project highlights the accessibility and transparency of neural network mechanics, emphasizing that AI's complexity is rooted in math rather than mystique, even on a 1989 Macintosh. It serves as a reminder that understanding AI's core principles is achievable and that the technology is more approachable than often perceived.

MacMind

A complete transformer neural network implemented entirely in HyperTalk, trained on a Macintosh SE/30.

MacMind is a 1,216-parameter single-layer single-head transformer that learns the bit-reversal permutation -- the opening step of the Fast Fourier Transform -- from random examples. Every line of the neural network is written in HyperTalk, a scripting language from 1987 designed for making interactive card stacks, not matrix math. It has token embeddings, positional encoding, self-attention with scaled dot-product scores, cross-entropy loss, full backpropagation, and stochastic gradient descent. No compiled code. No external libraries. No black boxes.
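The attention mechanism described above can be sketched in a few lines. This is an illustrative Python/NumPy version of scaled dot-product attention, not MacMind's actual HyperTalk code; the shapes and the toy inputs are assumptions for demonstration.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: softmax(Q @ K.T / sqrt(d_k)) @ V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # scaled dot-product scores
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ V                              # weighted sum of values

# Toy example: 4 positions, head dimension 2
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 2))
K = rng.normal(size=(4, 2))
V = rng.normal(size=(4, 2))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 2)
```

The same arithmetic works in any language with loops and multiplication, which is the point: HyperTalk can do it too, just slowly.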

Option-click any button and read the actual math.

Why This Exists

The same fundamental process that trained MacMind -- forward pass, loss computation, backward pass, weight update, repeat -- is what trained every large language model that exists today. The difference is scale, not kind. MacMind has 1,216 parameters. GPT-4 has roughly a trillion. The math is identical.
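That four-step loop can be shown concretely. The sketch below is a deliberately tiny stand-in (a linear model with softmax cross-entropy on made-up data), not MacMind's transformer, but the loop structure — forward pass, loss computation, backward pass, weight update, repeat — is the one the text describes.

```python
import numpy as np

# Toy data: 64 samples, 4 features; the label is the sign of feature 0.
rng = np.random.default_rng(1)
X = rng.normal(size=(64, 4))
y = (X[:, 0] > 0).astype(int)
W = np.zeros((4, 2))   # all trainable parameters
lr = 0.5               # learning rate

for step in range(200):
    logits = X @ W                                   # forward pass
    logits -= logits.max(axis=1, keepdims=True)      # stability shift
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(64), y]).mean()   # cross-entropy loss
    grad = probs.copy()                              # backward pass:
    grad[np.arange(64), y] -= 1                      # d(loss)/d(logits)
    dW = X.T @ grad / 64                             # d(loss)/d(W)
    W -= lr * dW                                     # SGD weight update

accuracy = (probs.argmax(axis=1) == y).mean()
print(round(loss, 3), accuracy)
```

Swap in more layers and more parameters and this is, structurally, the same loop that trains a large language model.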

We are at a moment where AI affects nearly everyone but almost nobody understands what it actually does. MacMind is a demonstration that the process is knowable -- that backpropagation and attention are not magic, they are math, and that math does not care whether it is running on a TPU cluster or a 68030 processor from 1989.

Everything is inspectable. Everything is modifiable. Change the learning rate, swap the training task, resize the model -- all from within HyperCard's script editor. This is the engine with the hood up.

What It Learns

The bit-reversal permutation reorders a sequence by reversing the binary representation of each position index. For an 8-element sequence:
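As a small sketch of that definition, the permutation for an 8-element sequence (3-bit indices) can be computed like this; the function name is my own, not from MacMind:

```python
def bit_reverse(i, bits):
    """Reverse the low `bits` bits of index i (e.g. 1 = 001 -> 100 = 4)."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (i & 1)  # shift the lowest bit of i into out
        i >>= 1
    return out

perm = [bit_reverse(i, 3) for i in range(8)]
print(perm)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

Note that the permutation is its own inverse: applying it twice restores the original order, which makes it a clean, checkable target for a toy model.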
