Qwen3-Omni: Native Omni AI model for text, image, audio and video
Qwen Chat | Hugging Face | ModelScope | Blog | Cookbooks | Paper
Hugging Face Demo | ModelScope Demo | WeChat (微信) | Discord | API

We release Qwen3-Omni, natively end-to-end, multilingual, omni-modal foundation models. They are designed to process diverse inputs, including text, images, audio, and video, while delivering real-time streaming responses in both text and natural speech.

News 2025