OpenAI brings GPT-5-level reasoning to its speech models

By primereports | May 8, 2026

OpenAI launched three new speech-focused models on Thursday: GPT-Realtime-2, its first voice model with what the company calls “GPT-5-class reasoning”; GPT-Realtime-Translate for live translations; and GPT-Realtime-Whisper for fast transcriptions.

GPT-Realtime-2

The first GPT-Realtime model launched in the summer of 2025 as a voice-native model that could interact with users far more naturally than previous models. OpenAI last updated GPT-Realtime in February with the launch of version 1.5.

[Image. Credit: OpenAI.]

Now, with GPT-Realtime-2, the company promises an 11% performance improvement over GPT-Realtime-1.5. OpenAI also extended the context window from 32,000 tokens, which was surely a pain point for developers, to 128,000 tokens. That should let the model sustain longer sessions and handle more complex interactions, which is especially important in the voice-agent workflows OpenAI is targeting.

But what really matters here is that OpenAI is bringing far more powerful reasoning to this class of models. As OpenAI notes in its announcement, “building useful voice products takes more than fast turn-taking and a natural-sounding voice. A voice agent needs to understand what someone means, keep track of context, recover when a request changes, use tools while the conversation continues, and respond in a way that feels appropriate to the moment.”

With this update, developers can now, for example, have Realtime-2 start conversations with short preambles such as “let me check that,” so users know the agent is working. The model can now also make parallel tool calls, just like most modern agentic systems, and tell the user what it is doing.
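The preamble-plus-parallel-tool-calls pattern described above can be sketched in plain Python. This is only an illustration of the control flow, not OpenAI's API: `lookup_order`, `lookup_weather`, `handle_turn`, and the `say` callback are all hypothetical names, and the tools are simulated with short sleeps.

```python
import asyncio


async def lookup_order(order_id: str) -> dict:
    """Hypothetical tool: pretends to query an order system."""
    await asyncio.sleep(0.01)  # stand-in for network latency
    return {"order_id": order_id, "status": "shipped"}


async def lookup_weather(city: str) -> dict:
    """Hypothetical tool: pretends to query a weather service."""
    await asyncio.sleep(0.01)
    return {"city": city, "forecast": "clear"}


async def handle_turn(say) -> list:
    # Speak a short preamble first so the user knows the agent is working.
    say("Let me check that.")
    # Run both tool calls concurrently instead of one after the other.
    results = await asyncio.gather(
        lookup_order("A123"),
        lookup_weather("Berlin"),
    )
    # Tell the user what was found once both calls have finished.
    say(f"Your order is {results[0]['status']}.")
    return list(results)


def run_demo():
    spoken = []  # collects what the agent "says", in order
    results = asyncio.run(handle_turn(spoken.append))
    return spoken, results
```

Calling `run_demo()` returns the spoken lines and the two tool results. `asyncio.gather` preserves argument order, so the order lookup is always the first result.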

By default, the model’s reasoning effort is set to low. Developers can choose among five levels: minimal, low, medium, high, and xhigh.

Developers will pay $32 per 1 million audio input tokens and $64 per 1 million output tokens. That’s the same price the company charged for GPT-Realtime-1.5.
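To get a feel for what those rates mean in practice, here is a rough cost estimate in Python. The two prices come from the figures above. The function name and the sample token counts are illustrative assumptions, and real bills may include other token types that are not covered here.

```python
from decimal import Decimal

# Prices quoted in the article, in USD per 1 million audio tokens.
INPUT_USD_PER_MILLION = Decimal("32")
OUTPUT_USD_PER_MILLION = Decimal("64")


def estimate_audio_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate audio token cost in USD (hypothetical helper, audio tokens only)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    million = Decimal(1_000_000)
    cost = (
        Decimal(input_tokens) * INPUT_USD_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    ) / million
    # Round to a hundredth of a cent for display.
    return cost.quantize(Decimal("0.0001"))
```

For example, a session with 50,000 audio input tokens and 20,000 audio output tokens would come to $1.60 + $1.28 = $2.88 under these rates.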

GPT-Realtime-Translate

As the name implies, GPT-Realtime-Translate is OpenAI’s model for live translations. It can handle over 70 input languages and translate them into 13 output languages.

While OpenAI’s speech models were already able to handle some translation tasks, this is the first time it is offering a dedicated model for this use case.

In the API, developers will pay $0.034 per minute to use this capability.

GPT-Realtime-Whisper

GPT-Realtime-Whisper, meanwhile, is OpenAI’s latest streaming transcription model.

Whisper has long been the company’s brand for speech-to-text models, and it has remained one of the most popular open-weight models for this task since the first version launched in 2022.

The open model hasn’t seen an update in quite a while, though OpenAI has long offered transcription models through its API with gpt-4o-transcribe and gpt-4o-mini-transcribe.

This model is priced at $0.017 per minute.

Building new kinds of apps

In its announcement, OpenAI stresses that it sees three patterns in how developers are using voice AI.

There is voice-to-action, which allows users to describe what they need and then have the system perform a task; system-to-voice for having the AI provide voice-based guidance (“Your inbound flight is delayed, but you can still make your connection.”); and voice-to-voice, arguably the most complex one of the three, for building live, interactive conversations across tasks and changing context.



Before joining The New Stack as its senior editor for AI, Frederic Lardinois was the enterprise editor at TechCrunch, where he covered everything from the rise of the cloud and the earliest days of Kubernetes to the advent of quantum computing…


