SmolLM3: smol, multilingual, long-context reasoner
Published July 8, 2025
Base model: https://hf.co/HuggingFaceTB/SmolLM3-3B-Base
Instruct and reasoning model: https://hf.co/HuggingFaceTB/SmolLM3-3B
Small language models are becoming increasingly important as users seek capable models that can be deployed efficiently. The community has produced a fascinating range of strong small models, each pushing the boundaries of what's possible at this scale. With SmolLM3, we're excited to contribute a new, competitive, fully open 3B model:
SmolLM3 sits in the efficiency sweet spot. Our 3B model outperforms Llama-3.2-3B and Qwen2.5-3B while staying competitive with larger 4B alternatives (Qwen3 and Gemma3). Beyond the performance numbers, we're sharing exactly how we built it, using public datasets and training frameworks.
Model summary:
3B model trained on 11T tokens, state of the art at the 3B scale and competitive with 4B models