OpenAI WebRTC Audio Session, now with document context
Mirrored from Simon Willison for archival readability. Support the source by reading on the original site.
OpenAI WebRTC Audio Session, now with document context
I built the first version of this tool in December 2024 to try out the then-new OpenAI WebRTC API for interacting with their realtime audio models.Last month OpenAI introduced a brand new model to that API called GPT‑Realtime‑2, which they promoted as "our first voice model with GPT‑5‑class reasoning" - with a Sep 30, 2024 knowledge cut-off.
I've been waiting for that model to show up in the ChatGPT iPhone app but it still hasn't, so I revisited my old playground.
You can now pick the better model, and you can also paste in a big chunk of document context so you can have as audio conversation in your browser about whatever information you think would be useful to explore in a conversational way.
Tags: audio, tools, ai, openai, generative-ai, llms, multi-modal-output, webrtc
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.