Microsoft’s “1‑bit” AI model runs on a CPU only, while matching larger systems
Published on: 2025-04-20 06:46:06
When it comes to actually storing the numerical weights that power an LLM's underlying neural network, most modern AI models rely on the precision of 16- or 32-bit floating point numbers. But that level of precision can come at the cost of large memory footprints (in the hundreds of gigabytes for the largest models) and significant processing resources needed for the complex matrix multiplication used when responding to prompts.
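To give a rough sense of why precision drives memory footprint, here is a back-of-the-envelope sketch. The parameter counts are illustrative examples only (not the sizes of any particular model), and the 1.58-bit figure is simply log2(3), the information-theoretic minimum for storing one of three values; real ternary formats carry some packing overhead on top of that.

```python
# Back-of-the-envelope storage cost for model weights at different precisions.
# Parameter counts below are illustrative examples, not specific models.

def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Storage needed for the weights alone, in gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9

for params in (7e9, 70e9, 400e9):  # 7B, 70B, 400B parameters
    fp32 = weight_memory_gb(params, 32)
    fp16 = weight_memory_gb(params, 16)
    ternary = weight_memory_gb(params, 1.58)  # log2(3) bits per ternary weight
    print(f"{params / 1e9:>5.0f}B params: "
          f"fp32 ~ {fp32:6.1f} GB, fp16 ~ {fp16:6.1f} GB, "
          f"ternary ~ {ternary:5.1f} GB")
```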
Now, researchers at Microsoft's General Artificial Intelligence group have released a new neural network model that works with just three distinct weight values: -1, 0, or 1. Building on previous work Microsoft Research published in 2023, the new model's "ternary" architecture offers a reduction in overall complexity and "substantial advantages in computational efficiency," the researchers write, allowing it to run effectively on a simple desktop CPU. And despite the massive reduction in weight precision, the researchers claim the model can match the performance of larger, full-precision systems.
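To make the ternary idea concrete, the following is a minimal sketch, not Microsoft's implementation: the quantization rule (rounding against the mean absolute weight) and the function names are illustrative assumptions. It shows how a full-precision weight matrix can be collapsed to values in {-1, 0, +1} plus a single scale factor, and why the resulting matrix multiplication can, in principle, be computed with additions and subtractions rather than floating-point multiplies.

```python
import numpy as np

def ternarize(w: np.ndarray):
    """Round a weight matrix to {-1, 0, +1} plus one per-matrix scale factor.

    Illustrative sketch only; the exact quantization rule used by
    Microsoft's model may differ.
    """
    scale = np.mean(np.abs(w)) + 1e-8                    # single scale for the matrix
    w_ternary = np.clip(np.round(w / scale), -1, 1).astype(np.int8)
    return w_ternary, scale

def ternary_matmul(x: np.ndarray, w_ternary: np.ndarray, scale: float) -> np.ndarray:
    """Multiply activations by a ternary weight matrix.

    NumPy still performs a dense float matmul here for simplicity; a
    dedicated kernel could exploit the fact that multiplying by -1, 0,
    or +1 reduces to subtract, skip, or add, leaving only the final
    rescaling by `scale` as a true multiplication.
    """
    return (x @ w_ternary.astype(np.float32)) * scale

# Tiny demo: compare the full-precision product with its ternary approximation.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
x = rng.normal(size=(1, 256)).astype(np.float32)

w_t, s = ternarize(w)
print("mean absolute approximation error:",
      np.abs(x @ w - ternary_matmul(x, w_t, s)).mean())
```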