SVDQuant+NVFP4: 4× Smaller, 3× Faster FLUX with 16-bit Quality on Blackwell GPUs

Published on: 2025-07-13 07:46:18

SVDQuant supports NVFP4 on NVIDIA Blackwell GPUs, with a 3× speedup over BF16 and better image quality than INT4. Try our interactive demo at https://svdquant.mit.edu/! Our code is available at https://github.com/mit-han-lab/nunchaku.

With Moore's law slowing down, hardware vendors are shifting toward low-precision inference. NVIDIA's latest Blackwell architecture introduces a new 4-bit floating-point format (NVFP4) that improves on the previous MXFP4 format. NVFP4 uses more precise scaling factors and a smaller microscaling group size (16 vs. 32), which lets it maintain 16-bit model accuracy at 4-bit precision while delivering 4× higher peak performance.

In our previous blog, we shared a tutorial on setting up a 5090 workspace with the Blackwell architecture. In this blog, we’re excited to announce that SVDQuant now supports NVFP4 on the 5090 GPU, delivering better image quality and performance! Our code and demo are all publicly available!

SVDQuant: Absorbing O ...
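To make the NVFP4 vs. MXFP4 comparison from the introduction more concrete, here is a minimal sketch of group-wise 4-bit floating-point quantization in plain Python. It is an illustration under stated assumptions, not the Nunchaku kernels or NVIDIA's implementation. The eight E2M1 magnitudes are the standard 4-bit float values; everything else is simplified. The "coarse" setting (group size 32, scale rounded up to a power of two) stands in for an MXFP4-style layout, and the "fine" setting (group size 16, unrestricted float scale) stands in for an NVFP4-style layout without simulating the actual low-precision encoding of the scale. The function names `round_to_fp4`, `fake_quantize`, and `mean_squared_error` are hypothetical.

```python
import math
import random

# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit).
FP4_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_MAX = FP4_LEVELS[-1]


def round_to_fp4(x):
    """Round x to the nearest signed E2M1 value (ties go to the smaller magnitude)."""
    mag = min(FP4_LEVELS, key=lambda level: abs(abs(x) - level))
    return math.copysign(mag, x)


def fake_quantize(values, group_size, pow2_scale):
    """Quantize and immediately dequantize `values`, one scale per group.

    pow2_scale=True  -> scale is rounded up to a power of two (assumed
                        stand-in for a coarse, MXFP4-style scale).
    pow2_scale=False -> scale is kept as an ordinary float (assumed
                        stand-in for a more precise, NVFP4-style scale).
    """
    out = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        amax = max(abs(v) for v in group)
        if amax == 0.0:
            out.extend(0.0 for _ in group)
            continue
        # Map the largest magnitude in the group onto the largest FP4 value.
        scale = amax / FP4_MAX
        if pow2_scale:
            # Round up so the largest value still fits after scaling.
            scale = 2.0 ** math.ceil(math.log2(scale))
        out.extend(round_to_fp4(v / scale) * scale for v in group)
    return out


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic Gaussian "weights"; real model tensors will behave differently.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]

mse_coarse = mean_squared_error(weights, fake_quantize(weights, 32, True))
mse_fine = mean_squared_error(weights, fake_quantize(weights, 16, False))
print(f"group=32, power-of-two scale: MSE = {mse_coarse:.5f}")
print(f"group=16, float scale:        MSE = {mse_fine:.5f}")
```

On this synthetic data the finer setting should give the lower reconstruction error, because a smaller group tracks the local value range more closely and an unrestricted scale wastes less of the 4-bit range. The exact numbers depend on the random seed and say nothing about image quality on FLUX.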