Tech News

Qwen3.5-397B at 4.74 tok/s using 5.9GB RAM

Why This Matters

Running Qwen3.5-397B at 4.74 tokens per second in 5.9GB of RAM suggests that very large models can be made to run on memory-constrained hardware. If results like this hold up more broadly, they could make large-model inference more accessible and cheaper for both industry and consumers. Further optimization of this kind could bring real-time use of such models within reach in more settings.

Key Takeaways

It ran for ~5 hours and got 1 tok/s. Another ~3 hours of optimizing and it's at 4.74 tok/s using 5.9GB RAM
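To put the reported figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (about 1 tok/s before, 4.74 tok/s after, 5.9GB RAM). Two things are assumptions and not from the source: that "397B" means roughly 397 billion parameters, and the bits-per-weight values used to illustrate weight sizes. The source does not say how the model is stored or loaded.

```python
# Back-of-the-envelope arithmetic on the reported figures.
# Everything marked "assumption" is illustrative and not stated in the source.

BEFORE_TPS = 1.0   # reported: roughly 1 tok/s after the first ~5 hours
AFTER_TPS = 4.74   # reported: 4.74 tok/s after ~3 more hours of optimizing
RAM_GB = 5.9       # reported RAM use
PARAMS = 397e9     # assumption: "397B" means about 397 billion parameters


def seconds_per_token(tokens_per_second: float) -> float:
    """Average wall-clock time spent per generated token."""
    return 1.0 / tokens_per_second


def speedup(before_tps: float, after_tps: float) -> float:
    """Throughput ratio between two measurements."""
    return after_tps / before_tps


def weight_footprint_gb(params: float, bits_per_weight: float) -> float:
    """Size of the weights alone, in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    print(f"speedup: {speedup(BEFORE_TPS, AFTER_TPS):.2f}x")
    print(f"time per token: {seconds_per_token(AFTER_TPS):.3f} s")
    # Hypothetical precisions, chosen only to show the scale involved.
    for bits in (16, 8, 4):
        size = weight_footprint_gb(PARAMS, bits)
        print(f"{bits}-bit weights: {size:.1f} GB ({size / RAM_GB:.0f}x the reported RAM)")
```

Under these assumptions, the weights at any of the illustrated precisions would be far larger than the reported 5.9GB, so presumably only a small part of the model is resident in RAM at any one time. That is an inference from the arithmetic, not something the source states.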

Mar 17, 2026 · 4:16 PM UTC