
Advanced Quantization Algorithm for LLMs

Why This Matters

AutoRound represents a significant advancement in quantization technology for large language and vision-language models, enabling high accuracy at ultra-low bit widths with minimal tuning. Its broad ecosystem compatibility and efficient processing make it a valuable tool for deploying powerful models more cost-effectively and efficiently, benefiting both industry developers and end-users. This innovation can accelerate the deployment of large models in resource-constrained environments, fostering wider adoption and innovation in AI applications.

Key Takeaways


🚀 What is AutoRound?

AutoRound is an advanced quantization toolkit designed for Large Language Models (LLMs) and Vision-Language Models (VLMs). It achieves high accuracy at ultra-low bit widths (2–4 bits) with minimal tuning by leveraging sign-gradient descent and providing broad hardware compatibility. See our papers SignRoundV1 and SignRoundV2 for more details. For usage instructions, please refer to the User Guide.
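The sign-gradient-descent idea behind SignRound can be sketched with a toy example. This is an illustrative reconstruction under simplifying assumptions (symmetric per-tensor scale, a straight-through estimator for the rounding gradient), not the actual AutoRound implementation; all names here (`W`, `X`, `V`, `quantize`) are invented for the demo. Instead of always rounding weights to the nearest grid point, a small per-weight offset `V` in [-0.5, 0.5] is tuned with signed gradients so the quantized layer better matches the full-precision output on calibration data:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))    # full-precision weights
X = rng.normal(size=(16, 8))   # calibration inputs
Y = X @ W.T                    # reference full-precision outputs

bits = 4
qmax = 2 ** (bits - 1) - 1                 # 7 for 4-bit symmetric
scale = np.abs(W).max() / qmax             # simple per-tensor scale

def quantize(weights, offsets):
    # Round-to-nearest, but nudged by a learnable offset per weight.
    q = np.clip(np.round(weights / scale + offsets), -qmax - 1, qmax)
    return q * scale

def loss(offsets):
    return float(np.mean((X @ quantize(W, offsets).T - Y) ** 2))

V = np.zeros_like(W)                       # offsets start at plain RTN
best_V, best_loss = V.copy(), loss(V)
lr = 0.01
for _ in range(200):
    err = X @ quantize(W, V).T - Y
    # Straight-through estimator: treat round() as identity, so the
    # gradient of 0.5*||err||^2 w.r.t. V is proportional to err.T @ X.
    grad = err.T @ X
    # Sign-gradient step: only the sign of the gradient is used.
    V = np.clip(V - lr * np.sign(grad), -0.5, 0.5)
    cur = loss(V)
    if cur < best_loss:                    # keep the best offsets seen
        best_loss, best_V = cur, V.copy()

loss_rtn = loss(np.zeros_like(W))          # plain round-to-nearest baseline
print(best_loss <= loss_rtn)               # True: tuning never hurts here
```

Because the search starts from the round-to-nearest solution and keeps the best offsets found, the tuned quantization can only match or improve the calibration error of plain rounding.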

🆕 What's New

✨ Key Features

✅ Superior Accuracy Delivers strong performance even at 2–3 bits (see example models), with leading results at 4 bits (see benchmark).

✅ Ecosystem Integration Works seamlessly with Transformers, vLLM, SGLang, and more.

✅ Multiple Export Formats Supports the AutoRound, AutoAWQ, AutoGPTQ, and GGUF formats for maximum compatibility. Details are given in export formats.

✅ Fast Mixed Bits/Dtypes Scheme Generation Automatically configures mixed bit/dtype schemes in minutes, with roughly 1.1×–1.5× the model's BF16 RAM size as overhead. See accuracy results and the user guide.
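To make the overhead figure concrete, here is a back-of-envelope calculation. This is illustrative arithmetic only, not an AutoRound API; the 7B parameter count is an assumed example:

```python
# Estimate scheme-generation RAM from the stated 1.1x-1.5x BF16 overhead.

def bf16_size_gb(num_params: float) -> float:
    """Model size in GB at 2 bytes per BF16 parameter."""
    return num_params * 2 / 1e9

params = 7e9                       # e.g. a 7B-parameter model (assumption)
base = bf16_size_gb(params)        # 14.0 GB in BF16
low, high = 1.1 * base, 1.5 * base
print(f"BF16: {base:.1f} GB -> scheme-generation RAM: {low:.1f}-{high:.1f} GB")
```

So for a 7B model (~14 GB in BF16), scheme generation would need roughly 15–21 GB of RAM under the stated overhead range.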
