ElevenLabs’ new speech-to-text model Scribe is here with the highest accuracy rate so far (96.7% for English)
Published on: 2025-07-14 00:56:35
ElevenLabs, the highly valued AI voice cloning and generation startup from Palantir alumni, today launched Scribe v1, a new speech-to-text model that reportedly achieves the highest accuracy across multiple languages. Users can try it on the ElevenLabs site.
According to the company’s benchmarks, it outperforms Google’s Gemini 2.0 Flash, OpenAI’s Whisper v3, and Deepgram Nova-3 at accurately converting speech into text on the web, achieving new record-low error rates.
The company claims that Scribe delivers state-of-the-art transcription accuracy in 99 languages, including improved performance in previously underserved languages such as Serbian, Cantonese, and Malayalam.
As Flavio Schneider, ElevenLabs Lead Researcher, wrote on X, Scribe is the “smartest audio understanding model” released by ElevenLabs yet.
“Scribe doesn’t just transcrib
...