We're excited to introduce our Dynamic v2.0 quantization method - a major upgrade to our previous quants. This new method outperforms leading quantization methods and sets new benchmarks on Aider Polyglot, 5-shot MMLU, and KL Divergence.
This means you can now run and fine-tune quantized LLMs while preserving as much accuracy as possible! You can run the Dynamic 2.0 GGUFs on most inference engines, such as llama.cpp and LM Studio.
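For instance, here is a minimal sketch of loading a Dynamic 2.0 GGUF through the llama-cpp-python bindings. The file name, context size, and prompt below are placeholder values, not a prescribed setup; any GGUF-compatible engine works similarly:

```python
# Minimal sketch: run a Dynamic 2.0 GGUF via llama-cpp-python.
# The model file name below is a placeholder; use any downloaded GGUF.
from llama_cpp import Llama

llm = Llama(
    model_path="model-UD-Q4_K_XL.gguf",  # hypothetical Dynamic 2.0 quant file
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to the GPU if one is available
)

out = llm("Explain KL divergence in one sentence.", max_tokens=128)
print(out["choices"][0]["text"])
```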
Figure: Qwen3.5 Perplexity Benchmarks (lower is better)
ℹ️ You asked for tougher benchmarks, so we’re showcasing Aider Polyglot results! Our Dynamic 3-bit DeepSeek V3.1 GGUF scores 75.6%, surpassing many full-precision SOTA LLMs.
A detailed analysis of our benchmarks and evaluation methodology appears further below.
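To make the KL Divergence metric concrete: it compares the quantized model's next-token distribution against the full-precision model's, position by position, then averages. Here is a minimal NumPy sketch, assuming you have already collected logits from both models on the same text; all names and shapes are illustrative:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_kl(fp_logits, q_logits, eps=1e-12):
    """Average per-token KL(P_full || P_quant).

    fp_logits, q_logits: (num_tokens, vocab_size) logits from the
    full-precision and quantized models on the same inputs."""
    p = softmax(fp_logits)
    q = softmax(q_logits)
    kl = (p * (np.log(p + eps) - np.log(q + eps))).sum(axis=-1)
    return kl.mean()

# Toy demo with random logits; in practice both sets come from running
# the two models over the same evaluation text.
rng = np.random.default_rng(0)
fp = rng.normal(size=(8, 32000))
q = fp + rng.normal(scale=0.05, size=fp.shape)  # quantization nudges logits
print(mean_kl(fp, q))  # lower is better: closer to full precision
```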
Figures: DeepSeek-V3.2 Thinking and Llama 4 5-shot MMLU Benchmarks
💡 What's New in Dynamic v2.0?
Revamped Layer Selection for GGUFs + safetensors: Unsloth Dynamic 2.0 now selects which layers to quantize much more intelligently and extensively. Rather than modifying only a handful of layers, we now dynamically adjust the quantization type of every possible layer, and the chosen combination differs per layer and per model.
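As a toy illustration of per-layer selection - not our actual algorithm; the layer names, candidate bit widths, and error threshold are all made up - the idea is to give each layer the cheapest quantization type whose reconstruction error stays within a budget, falling back to the widest type otherwise:

```python
import numpy as np

def fake_quantize(w, bits):
    """Uniform symmetric round-to-nearest quantization, for illustration only."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

def select_quant_types(layers, candidate_bits=(2, 3, 4, 6, 8), max_rel_error=0.05):
    """Pick, per layer, the fewest bits whose relative L2 error is acceptable."""
    choices = {}
    for name, w in layers.items():
        for bits in candidate_bits:  # try the cheapest type first
            err = np.linalg.norm(w - fake_quantize(w, bits)) / np.linalg.norm(w)
            if err <= max_rel_error:
                choices[name] = bits
                break
        else:
            choices[name] = candidate_bits[-1]  # fall back to the widest type
    return choices

# Toy "model": layer name -> weight matrix. A real selector would also use
# activation statistics from calibration data, not just the weights.
rng = np.random.default_rng(0)
layers = {
    "attn.q_proj": rng.normal(size=(64, 64)),
    "attn.k_proj": rng.normal(size=(64, 64)),
    "mlp.down_proj": rng.standard_t(df=3, size=(64, 64)),  # heavier-tailed weights
}
print(select_quant_types(layers))
```

In this toy run the heavier-tailed layer typically ends up with more bits than the well-behaved projections, which is the intuition behind mixing quantization types across layers.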
Currently selected and all future GGUF uploads will use Dynamic 2.0 and our new calibration dataset. The dataset contains over 1.5M tokens (depending on the model) and comprises high-quality, hand-curated, cleaned data, greatly enhancing conversational chat performance.
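One common way calibration data feeds into quantization - sketched below with made-up names, and not our exact pipeline; it mirrors the importance-matrix ("imatrix") idea from llama.cpp - is to weight each input channel's quantization error by the mean squared activation it receives on calibration tokens, so heavily used channels are preserved more faithfully:

```python
import numpy as np

def calibration_weighted_error(w, w_quant, calib_activations):
    """Quantization error weighted by each input channel's calibration activity.

    w, w_quant: (out_features, in_features) weight matrices.
    calib_activations: (num_calib_tokens, in_features) inputs this layer saw
    while running the model over calibration text."""
    channel_importance = (calib_activations ** 2).mean(axis=0)  # (in_features,)
    weighted_sq_err = ((w - w_quant) ** 2) * channel_importance  # broadcasts over rows
    return weighted_sq_err.sum()

# Toy demo: in practice the activations come from the calibration dataset.
rng = np.random.default_rng(1)
w = rng.normal(size=(16, 32))
w_q = w + rng.normal(scale=0.01, size=w.shape)  # stand-in for a quantized copy
acts = rng.normal(size=(1024, 32))
print(calibration_weighted_error(w, w_q, acts))
```

Plugging an error measure like this into the selector above steers more bits toward the layers whose inputs matter most on real chat data.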