CPUs Aren't Dead. Gemma 2B Just Scored Higher Than GPT-3.5 Turbo on the Test That Made It Famous — Your Laptop Can Run It, or Cloudflare for $5/Mo.
Gemma 2B scored ~8.0 on MT-Bench. GPT-3.5 Turbo scored 7.94. An 87-times-smaller model on a laptop CPU, no GPU anywhere in the stack. We published the full tape — every question, every turn, every score — so anyone can verify it. We found seven failure classes. Not hallucinations. Specific patterns: arithmetic where it computed correctly but committed the wrong number first, logic puzzles where it proved the right answer then shipped the wrong one, constraints it drifted on, personas it broke, qualifiers it ignored. Six surgical fixes, about 60 lines of Python each. One known limitation documented. Score climbed to ~8.2. The hardware was enough all along. What the field has been calling a compute problem is a software engineering problem — and any motivated developer can close that gap in a weekend. The tape, the code, and the fixes are all open. A bot running the raw model — no fixes applied, warts and all — is live on Telegram right now. Talk to it. Push it. Break it. Then read about what you just experienced.
Run it yourself for free, forever:
pip install torch transformers accelerate python chat.py # full script below
Works offline after the first download. No account. No API key. Your laptop. Your data. Nobody else involved.
Want it globally accessible? Cloudflare Containers, $5/month. Scales to zero. Sleeps when idle. Wakes on request. Details below.
Or preview it first — no install needed.
A bot running the raw model — no guardrails, no scaffolding — is live on Telegram right now. The same inference path that produced every score in this article. Give it 30–60 seconds per response. It is thinking on a CPU, not streaming from a GPU cluster.
Live on Telegram now t.me/CPUAssistantBot Go to SeqPU.com, create a free API key, send /connect yourkey.access in Telegram. Every account comes with enough free credits for hundreds of messages.
Real conversation with @CPUAssistantBot — text in, voice in, story out. Nobody else saw this.
... continue reading