
Qwen3.5 Fine-Tuning Guide – Unsloth Documentation


Learn how to fine-tune Qwen3.5 LLMs locally with Unsloth.

You can now fine-tune the Qwen3.5 model family (0.8B, 2B, 4B, 9B, 27B, 35B‑A3B, 122B‑A10B) with Unsloth. Support includes both and text fine-tuning. Qwen3.5‑35B‑A3B bf16 LoRA works on 74GB of VRAM.

Unsloth trains Qwen3.5 1.5× faster while using 50% less VRAM than FlashAttention 2 (FA2) setups.

Qwen3.5 bf16 LoRA VRAM use: 0.8B : 3GB • 2B : 5GB • 4B : 10GB • 9B : 22GB • 27B : 56GB
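The figures above can be wrapped in a small helper to check whether a given GPU can run LoRA on each model size. This is a minimal sketch; the numbers are the approximate values quoted in this guide, not measured limits:

```python
# Approximate bf16 LoRA VRAM requirements (GB) for the dense Qwen3.5
# models, taken from the figures quoted in this guide.
QWEN35_LORA_VRAM_GB = {
    "0.8B": 3,
    "2B": 5,
    "4B": 10,
    "9B": 22,
    "27B": 56,
}

def fits_on_gpu(model_size: str, gpu_vram_gb: float) -> bool:
    """Return True if the quoted LoRA footprint fits in the given VRAM."""
    return QWEN35_LORA_VRAM_GB[model_size] <= gpu_vram_gb
```

For example, `fits_on_gpu("9B", 24)` is true for a 24GB card, while the 27B model would not fit.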

Fine-tune the 0.8B, 2B, and 4B models with bf16 LoRA via our free Google Colab notebooks.
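A typical Unsloth LoRA setup looks like the sketch below. `FastLanguageModel.from_pretrained` and `get_peft_model` are Unsloth's standard entry points; the exact Qwen3.5 repo id is an assumption here, so check the Unsloth model hub for the real name:

```python
from unsloth import FastLanguageModel

# Load the base model in bf16 (not 4-bit), matching the LoRA VRAM
# figures in this guide. The model id below is illustrative.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3.5-4B",  # hypothetical repo id
    max_seq_length=2048,
    load_in_4bit=False,
)

# Attach LoRA adapters to the usual attention and MLP projections.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                 # LoRA rank
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```

From there, training proceeds with a standard `trl` `SFTTrainer` loop, as in the Colab notebooks.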

If you want to preserve reasoning ability, mix reasoning-style examples with direct answers, keeping reasoning at a minimum of 75% of the dataset. Otherwise you can omit reasoning entirely.
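One way to enforce that 75% floor is to cap how many direct-answer examples get mixed in relative to the reasoning set. A minimal sketch (the function name and shuffling strategy are illustrative, not part of Unsloth's API):

```python
import random

def mix_training_examples(reasoning, direct, min_reasoning=0.75, seed=0):
    """Combine reasoning-style and direct-answer examples, capping the
    number of direct answers so reasoning stays >= min_reasoning of
    the mixed dataset."""
    rng = random.Random(seed)
    # Largest direct-answer count that keeps reasoning at the floor:
    # len(reasoning) / (len(reasoning) + max_direct) >= min_reasoning
    max_direct = int(len(reasoning) * (1 - min_reasoning) / min_reasoning)
    mixed = list(reasoning) + rng.sample(direct, min(len(direct), max_direct))
    rng.shuffle(mixed)
    return mixed
```

With 75 reasoning examples and a pool of 100 direct answers, this keeps at most 25 direct answers, so reasoning makes up exactly 75% of the mix.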

Full fine-tuning (FFT) also works; note that it uses roughly 4× more VRAM than LoRA.
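Switching from LoRA to full fine-tuning in Unsloth is a loading-time flag rather than a separate code path. A sketch under the same illustrative model id as above:

```python
from unsloth import FastLanguageModel

# full_finetuning=True trains all weights instead of LoRA adapters;
# budget roughly 4x the LoRA VRAM figures quoted earlier.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3.5-4B",  # hypothetical repo id
    max_seq_length=2048,
    full_finetuning=True,
)
```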

After fine-tuning, you can export to GGUF (for llama.cpp, Ollama, LM Studio, etc.).
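Unsloth exposes a helper for the GGUF export. A minimal sketch, assuming `model` and `tokenizer` come from a finished training run; the output directory and quantization method are illustrative choices:

```python
# Merge the adapters (if any) and write a GGUF file that llama.cpp,
# Ollama, and LM Studio can load directly.
model.save_pretrained_gguf(
    "qwen3.5-finetune",           # output directory (illustrative)
    tokenizer,
    quantization_method="q4_k_m", # one common quantization choice
)
```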

Reinforcement learning (RL) for Qwen3.5 also works via Unsloth inference.

We also have A100 Colab notebooks.
