Simon Willison · 1 min read

OpenAI WebRTC Audio Session, now with document context

Mirrored from Simon Willison for archival readability. Support the source by reading on the original site.

I built the first version of this tool in December 2024 to try out the then-new OpenAI WebRTC API for interacting with their realtime audio models.

Last month OpenAI introduced a brand new model to that API called GPT‑Realtime‑2, which they promoted as "our first voice model with GPT‑5‑class reasoning" - with a Sep 30, 2024 knowledge cut-off.

I've been waiting for that model to show up in the ChatGPT iPhone app but it still hasn't, so I revisited my old playground.

You can now pick the better model, and you can also paste in a big chunk of document context, so you can have an audio conversation in your browser about whatever information you think would be useful to explore in a conversational way.

Screenshot of a web interface titled "OpenAI WebRTC Audio Session" with a gray status dot. Form fields: "OpenAI API Token" showing a masked password of dots, "Voice" dropdown set to "Coral", "Model" dropdown set to "gpt-realtime-2". A collapsible section labeled "▼ Document context (optional — paste text to talk about)" with bold instruction "Paste a document here before starting the session and the model will be able to discuss it with you" above a textarea containing a pasted Markdown document about whether DuckDB can run untrusted SQL as safely as Datasette runs SQLite. Below are a blue "Start Session" button and a gray disabled "Mute Mic" button, then a green success message "Session established successfully!" At the bottom, a dark panel headed "Last transcript" reads: "DuckDB can be made about as safe as SQLite for running untrusted SELECT queries, but only if you lock it down properly. Using read only true by itself is not enough, because SQL can still" (text cut off).
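
As a rough illustration of the document-context idea, here is a minimal sketch of one way pasted text could be folded into a session's instructions before the session starts. This is not the tool's actual code: the `build_instructions` function, the prompt wording, the `<document>` delimiters and the `max_chars` limit are all assumptions made up for this example.

```python
def build_instructions(document: str, max_chars: int = 20_000) -> str:
    """Return instructions for a voice session, embedding a pasted document if present.

    Hypothetical helper: the base prompt, delimiters and size limit are
    illustrative assumptions, not taken from the real tool.
    """
    base = "You are a helpful voice assistant."
    document = document.strip()
    if not document:
        # Nothing pasted: fall back to the plain instructions.
        return base
    if len(document) > max_chars:
        # Keep the prompt bounded by cutting overly long documents.
        document = document[:max_chars]
    return (
        f"{base} The user has pasted the document below. "
        "Use it to answer their questions.\n\n"
        f"<document>\n{document}\n</document>"
    )


if __name__ == "__main__":
    print(build_instructions("# Can DuckDB run untrusted SQL safely?"))
```

Wrapping the document in explicit delimiters is a common way to keep pasted text separate from the instructions around it; the cut-off here is arbitrary and would need tuning for a real model.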

Tags: audio, tools, ai, openai, generative-ai, llms, multi-modal-output, webrtc
