TurboDiffusion
This repository provides the official implementation of TurboDiffusion, a video generation acceleration framework that can speed up end-to-end diffusion generation by $100 \sim 200\times$ on a single RTX 5090, while maintaining video quality.
TurboDiffusion primarily uses SageAttention and SLA (Sparse-Linear Attention) for attention acceleration, and rCM for timestep distillation.
Paper: TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times
Note: the checkpoints and paper are not finalized, and will be updated later to improve quality.
Original, E2E Time: 184s
TurboDiffusion, E2E Time: 1.9s
An example of a 5-second video generated by Wan-2.1-T2V-1.3B-480P on a single RTX 5090.
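As a quick sanity check on the two end-to-end timings in this example, the speedup is the baseline time divided by the accelerated time. The sketch below is illustrative only; the helper name `speedup` is hypothetical and not part of this repository.

```python
def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    if accelerated_seconds <= 0:
        raise ValueError("accelerated_seconds must be positive")
    return baseline_seconds / accelerated_seconds


# Timings taken from the example above: 184 s (original) vs 1.9 s (TurboDiffusion).
ratio = speedup(184.0, 1.9)
print(f"{ratio:.1f}x")  # prints "96.8x"
```

For this particular 1.3B, 480p example the ratio comes out at roughly 97x.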
Available Models
Note: All checkpoints support generating videos at 480p or 720p. The "Best Resolution" column indicates the resolution at which the model provides the best video quality.
Installation
Base environment: python>=3.9, torch>=2.7.0. torch==2.8.0 is recommended, as higher versions may cause out-of-memory (OOM) errors.
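Before installing, it can help to confirm that the active environment meets these version constraints. The following is a minimal sketch using only the standard library plus an optional `import torch`; the helper names `version_tuple` and `check_environment` are hypothetical and not part of this repository.

```python
import re
import sys


def version_tuple(version: str) -> tuple:
    """Extract the leading numeric parts, e.g. '2.8.0+cu128' -> (2, 8, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. '3.9') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def check_environment(python_version: str, torch_version: str) -> list:
    """Return a list of messages; an empty list means the constraints are met."""
    messages = []
    if version_tuple(python_version) < (3, 9, 0):
        messages.append(f"python {python_version} is older than 3.9")
    torch_v = version_tuple(torch_version)
    if torch_v < (2, 7, 0):
        messages.append(f"torch {torch_version} is older than 2.7.0")
    elif torch_v > (2, 8, 0):
        # The README recommends 2.8.0 because newer versions may cause OOM.
        messages.append(f"torch {torch_version} is newer than 2.8.0 and may cause OOM")
    return messages


if __name__ == "__main__":
    current_python = ".".join(str(n) for n in sys.version_info[:3])
    try:
        import torch  # only needed to read the installed version
    except ImportError:
        print("torch is not installed")
    else:
        problems = check_environment(current_python, torch.__version__)
        print("\n".join(problems) if problems else "environment looks compatible")
```

The check strips local build suffixes such as `+cu128` before comparing, so CUDA-specific wheels are handled the same way as plain releases.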