OpenAI upgrades its transcription and voice-generating AI models
Published on: 2025-06-07 01:00:00
OpenAI is bringing new transcription and voice-generating AI models to its API that the company claims improve upon its previous releases.
For OpenAI, the models fit into its broader “agentic” vision: building automated systems that can independently accomplish tasks on behalf of users. The definition of “agent” might be in dispute, but OpenAI Head of Product Olivier Godement described one interpretation as a chatbot that can speak with a business’s customers.
“We’re going to see more and more agents pop up in the coming months,” Godement told TechCrunch during a briefing. “And so the general theme is helping customers and developers leverage agents that are useful, available, and accurate.”
OpenAI claims that its new text-to-speech model, “gpt-4o-mini-tts,” not only delivers more nuanced and realistic-sounding speech but is also more “steerable” than its previous-generation speech-synthesizing models. Developers can instruct gpt-4o-mini-tts on how to say things in natural language, for example ...
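To make the idea of steering speech in natural language concrete, here is a minimal Python sketch of what such a request might look like. Only the model name comes from the article. The endpoint URL and the "input", "voice", and "instructions" fields reflect OpenAI's public speech API as best understood at the time of writing and should be checked against the current API reference; the voice name, the sample text, and the helper functions are illustrative assumptions.

```python
import json
import os
import urllib.request

# Assumption: OpenAI's public text-to-speech endpoint (not named in the article).
SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, instructions, voice="coral"):
    """Build the JSON body for a speech request (hypothetical helper).

    `instructions` carries the natural-language steering, such as tone or pacing.
    The default voice is an illustrative choice, not one the article mentions.
    """
    return {
        "model": "gpt-4o-mini-tts",  # model name as given in the article
        "input": text,
        "voice": voice,
        "instructions": instructions,
    }


def synthesize(payload, api_key, out_path="speech.mp3"):
    """Send the request and save the returned audio bytes (hypothetical helper)."""
    request = urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as audio:
        audio.write(response.read())
    return out_path


if __name__ == "__main__":
    payload = build_speech_request(
        text="Thanks for calling. How can I help you today?",
        instructions="Speak in a calm, friendly customer-support tone.",
    )
    api_key = os.environ.get("OPENAI_API_KEY")
    if api_key:
        # Only touches the network when a key is configured.
        print(synthesize(payload, api_key))
    else:
        # Without a key, just show the request body that would be sent.
        print(json.dumps(payload, indent=2))
```

Keeping the payload builder separate from the network call makes it easy to experiment with different instruction strings before spending any API credits.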