
A 10 year old Xeon is all you need (for 26B-A4B MTP Drafters without GPU)

Why This Matters

This article shows that even a decade-old server with modest hardware, such as a Xeon E5-2620 v4 and DDR3 RAM, can run large language model inference without a GPU. That points to cost-effective AI deployment on older hardware, which lowers the barrier to entry, and it underscores how much depends on software that is optimized for the resources already available.


Published on June 01, 2026, 17 minutes read

The previous post covered getting Gemma 4’s MTP drafters quantized and paired with a verifier. This one is about running the result on a machine that has no business running it.

I have a recycled server. To its credit, it has a whopping 128 GB of RAM, but it’s DDR3… That RAM is 5-6 times slower than the current best laptop RAM. It also has a single Intel Xeon E5-2620 v4 from 2016, which is about 5 times slower than my laptop’s CPU…
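As a rough way to see why the RAM speed matters so much, here is a back-of-the-envelope sketch. It assumes token generation is limited by memory bandwidth, meaning every active weight is read from RAM once per generated token, and it reads the "A4B" in the model name as roughly 4 billion active parameters per token. The bandwidth figures and the 4-bit weight size are illustrative placeholders, not measurements from this server, and the function name is made up for this example.

```python
def decode_tokens_per_second_ceiling(bandwidth_gb_s: float,
                                     active_params_billion: float,
                                     bits_per_weight: float) -> float:
    """Upper bound on tokens/second if decoding is purely memory-bandwidth bound.

    Assumes each active weight is streamed from RAM exactly once per token and
    ignores caches, KV-cache traffic, and compute time, so real numbers will be
    lower than this ceiling.
    """
    # Bytes that must be read from RAM to produce one token.
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    # Bandwidth in bytes/second divided by bytes/token gives tokens/second.
    return bandwidth_gb_s * 1e9 / bytes_per_token


if __name__ == "__main__":
    # Illustrative placeholder bandwidths (NOT measured): a slow machine and
    # one with 5.5x the memory bandwidth, matching the "5-6 times" ballpark.
    slow_bandwidth_gb_s = 10.0
    fast_bandwidth_gb_s = 55.0

    for label, bandwidth in [("slow RAM", slow_bandwidth_gb_s),
                             ("fast RAM", fast_bandwidth_gb_s)]:
        ceiling = decode_tokens_per_second_ceiling(
            bandwidth_gb_s=bandwidth,
            active_params_billion=4.0,  # assumed from "A4B" in the model name
            bits_per_weight=4.0,        # assumed 4-bit quantization
        )
        print(f"{label}: at most ~{ceiling:.1f} tokens/s")
```

Under these assumptions the ceiling scales linearly with bandwidth, so RAM that is 5-6 times slower caps generation at 5-6 times fewer tokens per second, whatever the CPU does.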

Oh, and as I did mention, we have no GPU. And no, the Xeon does not have an integrated GPU.

But, just hear me out…

If we were to just break out ollama here, well… as explained in earlier blog posts, we can’t. We’d be lucky if we could in 6 months, when they add support for the model we need, if they ever do. And even then, ollama simply doesn’t expose enough knobs for us to ever make this run well, and neither does standard llama-cpp.

But. Why would that stop us?

I’ve received feedback that some of the previous posts were too high level, so I’ll try to make things as clear as reasonably possible here. If you’re a tech worker, or a Linux enthusiast who has built a computer and used something like ChatGPT, most of this should be approachable.

So, just to set the stage fully, here is the hardware, per lscpu:

CPU: Intel Xeon E5-2620 v4 @ 2.10 GHz
