
768GB of cheap Intel Optane DIMM memory sticks used to run 1-trillion-parameter LLM on a system with a single GPU — local Kimi K2.5 install achieved roughly 4 tokens per second

Why This Matters

Running a trillion-parameter language model on second-hand Intel Optane DIMMs shows a cost-effective route to large-model inference on unconventional hardware. Performance is modest at roughly 4 tokens per second, but the build shows what enthusiasts and researchers can achieve with affordable used components.

Key Takeaways

A Redditor has caused a stir by coaxing a workstation that uses Optane PMem DIMMs as RAM into running a 1-trillion-parameter LLM. In a mini tutorial/guide on the Local LLaMA subreddit, APFrisco explains how they bought used Intel Optane Persistent Memory relatively cheaply to “run a 1 trillion parameter model (in this case Kimi K2.5) locally at ~4 tokens/second” on their Xeon workstation.

Central to the headline feat was the Redditor’s sourcing of six Optane PMem (DCPMM) sticks. The discontinued memory format was designed to bridge the DRAM-SSD divide. While the 768GB of Optane (6x 128GB) offers far lower latency than the best NVMe SSDs, it is still two or three times slower than DRAM. That trade-off still suits LLM inference frameworks well, and the second-hand price was “much less than what the equivalent DRAM capacity would cost.” But, alas, Optane is discontinued, so this remains an exotic solution.

APFrisco’s hardware specs were given as follows:

Intel Xeon Gold 6246 CPU

Tyan S5630GMRE-CGN motherboard

Asus Dual GeForce RTX 3060 OC 12GB GPU

6x 32GB Samsung 2666MHz DDR4 ECC DRAM sticks

6x 128GB Intel Optane DCPMM PC4-2666 NMA1XBD128GQS persistent memory modules

Western Digital WD SN850X 2TB M.2 2280 NVMe SSD
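
To put the listed capacities in perspective, here is a rough back-of-the-envelope sketch in Python. It is not taken from APFrisco's guide. It simply totals the DRAM and Optane modules from the spec list and asks how many bits per parameter a 1-trillion-parameter model could use at most if its weights had to fit in the Optane pool alone. The round parameter count, the 1 GB = 2**30 bytes convention, and the idea that every byte is available for weights are all assumptions made for illustration; the article does not state how the memory was configured or which quantization was used.

```python
# Back-of-the-envelope capacity check using the module counts from the spec list.
# Everything except the module counts and sizes is an illustrative assumption.

DRAM_MODULES = (6, 32)     # 6x 32GB DDR4 ECC sticks
OPTANE_MODULES = (6, 128)  # 6x 128GB Optane DCPMM modules
GPU_VRAM_GB = 12           # RTX 3060 12GB

PARAMS = 1_000_000_000_000  # "1 trillion parameters", taken as a round figure


def total_gb(modules):
    """Return total capacity in GB for a (count, size_gb) pair."""
    count, size_gb = modules
    return count * size_gb


def max_bits_per_param(capacity_gb, params=PARAMS):
    """Upper bound on bits per parameter if weights fill the whole capacity.

    Assumes 1 GB = 2**30 bytes and no space reserved for anything else.
    """
    return capacity_gb * 2**30 * 8 / params


dram_gb = total_gb(DRAM_MODULES)
optane_gb = total_gb(OPTANE_MODULES)

print(f"DRAM total:   {dram_gb} GB")
print(f"Optane total: {optane_gb} GB")
print(f"GPU VRAM:     {GPU_VRAM_GB} GB")
print(f"Max bits/param in Optane alone: {max_bits_per_param(optane_gb):.2f}")
```

Under these assumptions the Optane pool totals 768GB against 192GB of DRAM, which caps a trillion-parameter model at somewhere under 7 bits per parameter before any working memory is set aside.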
