Today, we're releasing LFM2.5-8B-A1B, an edge model built for fast, reliable tool calling on consumer hardware.
It builds on our LFM2-8B-A1B release from October 2025, with an expanded 128K context window, scaled-up pretraining (from 12T to 38T tokens), and large-scale reinforcement learning. We also doubled its vocabulary to improve tokenization efficiency for non-Latin languages. The result is a model that chains tool calls, achieves tasks, and fits comfortably even on an entry-level laptop.
The base (LFM2.5-8B-A1B-Base) and post-trained (LFM2.5-8B-A1B) models are available today on Hugging Face and our Playground. Check out our docs on how to run and fine-tune them locally.
*AA-Omniscience Index (higher is better) rewards correct answers and penalizes hallucinations. Scores range from -100 to 100. See more results on Artificial Analysis.
Highlights
On-device personal assistant. Designed to power real-life applications, chaining tool calls, and following complex instructions on all devices.
Designed to power real-life applications, chaining tool calls, and following complex instructions on all devices. Compressed performance. Competitive with much larger dense and MoE models on instruction following and agentic tasks.
Competitive with much larger dense and MoE models on instruction following and agentic tasks. Unmatched throughput. Fastest in its size class on both CPU and GPU inference, with day-one support for llama.cpp, MLX, vLLM, and SGLang.
What changed since LFM2-8B-A1B
Compared to LFM2-8B-A1B, this new version expands the context window from 32,768 to 128,000 tokens. This allows the model to process longer documents and reason for longer. Its vocabulary size was also scaled up from 65,536 to 128,000 to tokenize non-Latin scripts more efficiently. We see particularly strong compression gains in Hindi, Thai, Vietnamese, Indonesian, and Arabic. The rest of the architecture follows the same combination of MoE, GQA, and gated short convolution blocks as LFM2-8B-A1B, as shown in the following figure.
... continue reading