GoKawiil - A new, open source text-to-speech model called Dia has arrived to challenge ElevenLabs, OpenAI and more

A two-person startup named Nari Labs has introduced Dia, a 1.6 billion parameter text-to-speech (TTS) model designed to produce naturalistic dialogue directly from text prompts. One of its creators claims it surpasses the performance of competing proprietary offerings from the likes of ElevenLabs and Google’s hit NotebookLM AI podcast generation product. It could also threaten uptake of OpenAI’s recent gpt-4o-mini-tts.

“Dia rivals NotebookLM’s podcast feature while surpassing ElevenLabs Studio and Sesame’s open model in quality,” said Toby Kim, one of the co-creators of Nari and Dia, in a post from his account on the social network X.

In a separate post, Kim noted that the model was built with “zero funding,” and added across a thread: “…we were not AI experts from the beginning. It all started when we fell in love with NotebookLM’s podcast featu ...
