Simon Willison · June 12, 2026 · 1 min read

OpenAI WebRTC Audio Session, now with document context


I built the first version of this tool in December 2024 to try out the then-new OpenAI WebRTC API for interacting with their realtime audio models.

Last month OpenAI introduced a brand new model to that API called GPT‑Realtime‑2, which they promoted as "our first voice model with GPT‑5‑class reasoning" - with a Sep 30, 2024 knowledge cut-off.

I've been waiting for that model to show up in the ChatGPT iPhone app but it still hasn't, so I revisited my old playground.

You can now pick the better model, and you can also paste in a big chunk of document context, so you can have an audio conversation in your browser about whatever information you think would be useful to explore conversationally.
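To illustrate how pasted document context might reach the model, here is a minimal TypeScript sketch. It is not taken from the tool's source. It assumes the context is folded into the session instructions and sent as a `session.update` client event over the WebRTC data channel, which is my understanding of the Realtime API's event shape; some API versions may need extra fields. The helper names, the `<document>` wrapper, and the character cap are all hypothetical.

```typescript
// Assumed shape of a Realtime API "session.update" client event (simplified).
interface SessionUpdateEvent {
  type: "session.update";
  session: { instructions: string };
}

// Arbitrary illustrative cap on pasted context; not a documented limit.
const MAX_CONTEXT_CHARS = 50_000;

const BASE_PROMPT = "You are a helpful voice assistant.";

// Hypothetical helper: wraps the pasted document in the system instructions.
function buildInstructions(
  documentText: string,
  maxChars: number = MAX_CONTEXT_CHARS
): string {
  const trimmed = documentText.trim();
  if (trimmed.length === 0) {
    // Nothing pasted: fall back to the plain prompt.
    return BASE_PROMPT;
  }
  // Clip very long documents so the instructions stay bounded.
  const clipped = trimmed.slice(0, maxChars);
  return [
    BASE_PROMPT,
    "Answer questions using the document below when it is relevant.",
    "<document>",
    clipped,
    "</document>",
  ].join("\n");
}

// Hypothetical helper: builds the event payload to send to the model.
function buildSessionUpdate(documentText: string): SessionUpdateEvent {
  return {
    type: "session.update",
    session: { instructions: buildInstructions(documentText) },
  };
}
```

In a browser, once the data channel is open, the payload could be sent with something like `dataChannel.send(JSON.stringify(buildSessionUpdate(pastedText)))`, where `dataChannel` is an `RTCDataChannel` and `pastedText` is the content of a textarea (both names are placeholders).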

Tags: audio, tools, ai, openai, generative-ai, llms, multi-modal-output, webrtc
