In crowded voice AI market, OpenAI bets on instruction-following and expressive speech to win enterprise adoption

OpenAI is joining an increasingly competitive enterprise voice AI market with its new model, gpt-realtime, which the company says follows complex instructions and offers voices “that sound more natural and expressive.”

As voice AI adoption grows and customers find use cases such as customer service calls and real-time translation, the market for realistic-sounding AI voices that also offer enterprise-grade security is heating up. OpenAI claims its new model provides a more human-like voice, but it still has to compete against companies like ElevenLabs.

The model will be available on the Realtime API, which the company has now made generally available. Alongside gpt-realtime, OpenAI released two new voices on the API, Cedar and Marin, and updated its existing voices to work with the latest model.

OpenAI said in a livestream that it worked with its customers who are building voice applications to train gpt-realtime and “carefully aligned the model to evals that are built on real-world scenarios like customer support and academic tutoring.”

The company touted the model’s ability to create emotive, natural-sounding voices that also align with how developers build with the technology.

Speech-to-speech models
