The DGX Spark is a well-rounded toolkit for local AI thanks to solid performance from its GB10 SoC, a spacious 128GB of RAM, and access to the proven CUDA stack. But it's a pricey platform if you don't intend to use its features to the fullest.
The fruits of the AI gold rush thus far have frequently been safeguarded in proprietary frontier models running in massive, remote, interconnected data centers. But as more and more open models with state-of-the-art capabilities are distilled into sizes that can fit into the VRAM of a single GPU, a burgeoning community of local AI enthusiasts has been exploring what’s possible outside the walled gardens of Anthropic, OpenAI, and the like.
Today's hardware hasn't entirely caught up with the resources that AI trailblazers demand, though. Thin-and-light "AI PCs" largely consist of familiar x86 CPUs with lightweight GPUs and some type of NPU bolted on to accelerate machine-learning features like background blur and replacement. These systems rarely come with more than 32GB of RAM, and their relatively anemic integrated GPUs aren't going to churn through inference at the rates enthusiasts and developers expect.
Gaming GPUs bring much more raw compute power to the table, but they still aren't well-suited to running more demanding local models, especially large language models. Even the RTX 5090 "only" has 32GB of memory on board, and it's trivial to exhaust that pool with cutting-edge LLMs. Getting the model's weights into memory is just part of the problem, too. As conversations lengthen and the context window fills, the pressure on VRAM only increases.
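To show why 32GB disappears so quickly, here's a minimal back-of-envelope sketch in Python. It assumes a hypothetical 70B-parameter model with illustrative layer counts, head sizes, and 16-bit weights and KV cache; the numbers are for the arithmetic only, not measurements of any particular model.

```python
# Rough VRAM estimate for running a large LLM locally.
# All architecture figures below are assumed for illustration:
# a hypothetical 70B-parameter model, 80 layers, 8 KV heads of
# dimension 128 (grouped-query attention), 16-bit weights and cache.

def weights_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Memory needed just to hold the model weights."""
    return params_billions * 1e9 * bytes_per_param / 1e9

def kv_cache_gb(context_tokens: int, layers: int = 80, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_value: float = 2.0) -> float:
    """KV cache grows linearly with context: 2 tensors (K and V) per layer."""
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value / 1e9

for ctx in (8_192, 32_768, 131_072):
    total = weights_gb(70) + kv_cache_gb(ctx)
    print(f"{ctx:>7} tokens of context -> ~{total:.0f} GB")
```

Under these assumptions the weights alone come to roughly 140GB before a single token of context is cached, which is why a 32GB gaming card, however fast, simply can't hold the largest open models without aggressive quantization or offloading.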
If you want to get more VRAM on a discrete GPU to hold larger AI models, or more of them at once, you're looking at professional products like an $8500+ RTX Pro 6000 Blackwell and its 96GB of GDDR7 (or two, or three, or four). And that’s not even counting the cost of the exotic host system you’ll need for such a setup. (Can I interest you in a Tinybox for $60K?)
The hunger for RAM extends to other common AI development tasks, like fine-tuning an already trained AI model for better performance on domain-specific data or quantizing an existing model to reduce its resource footprint for less powerful systems.
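To make the "reduce its resource footprint" point concrete, here's a minimal sketch of the weight-memory arithmetic at a few common quantization levels, again assuming the same hypothetical 70B-parameter model and ignoring the small overhead real formats add for scales and metadata.

```python
# Approximate weight-memory footprint at different (assumed) precisions.
# Real quantization formats add a few percent of overhead for scale
# factors and metadata, which this sketch ignores.

PARAMS = 70e9  # hypothetical 70B-parameter model

for label, bits in (("FP16", 16), ("INT8", 8), ("4-bit", 4)):
    gb = PARAMS * bits / 8 / 1e9
    print(f"{label:>5}: ~{gb:.0f} GB")
# FP16: ~140 GB, INT8: ~70 GB, 4-bit: ~35 GB -- roughly why quantization
# is the usual path to fitting big models into consumer-scale memory.
```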
In the face of this endless hunger for RAM, systems with large pools of unified memory have become attractive platforms for those looking to explore the frontiers of local AI. Apple paved the way with its M-series SoCs, which in their latest and greatest forms pair as much as 512GB of LPDDR5 and copious memory bandwidth with powerful GPUs.
And AMD’s Ryzen AI Max+ 395 (aka Strix Halo) platform has found a niche as a somewhat affordable way to get 128GB of RAM alongside a relatively powerful GPU for local LLM tinkering.