
OpenAI has new voice models that reason, translate, and transcribe as you speak

Why This Matters

OpenAI's new voice models mark a significant advancement in real-time voice AI, enabling more natural, responsive, and multilingual voice applications for developers. These models enhance live interactions through reasoning, translation, and transcription capabilities, opening new possibilities for voice-driven tech solutions.

Key Takeaways

OpenAI has just released three new realtime voice models that it says will “unlock a new class of voice apps for developers.” Each new voice intelligence model has a unique speciality for different purposes.

Developers can build new app experiences with OpenAI’s 3 new voice models

The three models each target a different task: reasoning, translation, and transcription.

Here’s what the company announced today:

GPT‑Realtime‑2, our first voice model with GPT‑5‑class reasoning that can handle harder requests and carry the conversation forward naturally.

GPT‑Realtime‑Translate, a new live translation model that translates speech from 70+ input languages into 13 output languages while keeping pace with the speaker.

GPT‑Realtime‑Whisper, a new streaming speech-to-text model that transcribes speech live as the speaker talks.

OpenAI explains in more detail what’s new with the GPT-5-class GPT-Realtime-2 voice model with reasoning:

GPT‑Realtime‑2 is built for live voice interactions where the model keeps the conversation moving while it reasons through a request, calls tools, handles corrections or interruptions, and responds in a way that fits the moment.

Meanwhile, the new translation voice model supports “70 input languages and 13 output languages,” the company says.
