
Show HN: How I Topped the HuggingFace Open LLM Leaderboard on Two Gaming GPUs


LLM Neuroanatomy: How I Topped the AI Leaderboard Without Changing a Single Weight

In mid-2024, the HuggingFace Open LLM Leaderboard was the Colosseum for open-weight AI. Thousands of models battled it out there, submitted both by well-funded labs with teams of PhDs and by fine-tuning wizards creating fantastically named models (e.g. Nous-Hermes, Dolphin and NeuralBeagle14-7B…), all fighting for the top spot across six benchmarks: IFEval, BBH, MATH Lvl 5, GPQA, MuSR, and MMLU-PRO.

And there at #1 was dnhkng/RYS-XLarge. Mine.

I didn’t train a new model. I didn’t merge weights. I didn’t run a single step of gradient descent. What I did was much weirder: I took an existing 72-billion-parameter model, duplicated a particular block of seven of its middle layers, and stitched the result back together. No weight was modified in the process. The model simply got extra copies of the layers it seemed to use for thinking.
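In code, the surgery amounts to deep-copying a slice of the decoder stack and splicing the copies back in after the originals. Here is a minimal sketch, assuming a HuggingFace-style decoder whose blocks live at `model.model.layers`; the start index and block length are illustrative placeholders, not the exact layers used in RYS-XLarge:

```python
import copy

import torch.nn as nn


def duplicate_layers(model, start: int, count: int):
    """Depth-expand a decoder by repeating layers [start, start+count).

    The copies are inserted directly after the original block, so every
    weight in the result is an exact copy of an existing weight; nothing
    is retrained or modified.
    """
    layers = model.model.layers
    block = [copy.deepcopy(layers[i]) for i in range(start, start + count)]
    new_layers = list(layers[: start + count]) + block + list(layers[start + count :])
    model.model.layers = nn.ModuleList(new_layers)
    # Keep the config in sync so downstream code sees the new depth.
    model.config.num_hidden_layers = len(new_layers)
    return model
```

With a model loaded via `AutoModelForCausalLM.from_pretrained(...)`, a call like `duplicate_layers(model, start=40, count=7)` would produce a deeper network that can be saved and submitted like any other checkpoint (indices here are hypothetical).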

This is the story of how two strange observations, a homebrew “brain scanner” for Transformers, and months of hacking in a basement led to the discovery of what I call LLM Neuroanatomy, and to a finding about the internal structure of AI that hadn’t been published until now*.

* Because I discovered blogging is way more fun than drafting scientific papers, and this way I can walk you through how the discovery was made :)

Let’s start with how this whole project came into being.

“The most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka!’ but ‘That’s funny…’” — Isaac Asimov

Clue #1: You Can Chat with an LLM in Base64

In late 2023, I was messing about with a bizarre LLM quirk. Try this yourself: take any question, e.g.
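The experiment itself takes two lines: encode your question as Base64, paste the resulting string into the chat, and decode whatever comes back. A minimal sketch (the question shown is my own illustrative example, not the one from the article):

```python
import base64

# Any question works; this one is just an illustration.
question = "What is the capital of France?"

# Encode the prompt and paste this string directly into a chat with the model.
encoded = base64.b64encode(question.encode("utf-8")).decode("ascii")


def decode_reply(b64_text: str) -> str:
    """Decode a Base64 reply from the model back into plain text."""
    return base64.b64decode(b64_text).decode("utf-8")
```

Surprisingly often, a capable model will answer in Base64 too, which is what makes the quirk interesting in the first place.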
