r/LocalLLaMA · May 29, 2026 · 1 min read

We gave a Reachy Mini a real-time voice brain

We attended an event the other day and found this little guy lying on our desk: a Reachy Mini from Hugging Face.

It belongs to the daughter of the event organizer. We got curious about how it worked, and an hour later we'd given it a brain.

The model basically becomes Reachy. It hears through its mic, sees through its camera, talks through its speaker, and calls motion tools to physically react while it talks.

Repo: https://github.com/opper-ai/reachy-voice-realtime

Key things:

Web UI to watch the camera feed, transcript, and tool calls live.
19 motion and perception tools the model calls mid-conversation (emotes, head/antenna/body movement, camera, sound direction).
Mimics you: wave and it waves back, nod and it nods, tilt your head and it tilts.
Runs on GPT Realtime 2, routed through Opper so the model is a one-line swap.
The realtime client and tool layer are separate, so you can also wire it straight to a provider or to a local or open-source realtime model.
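
To make the "separate tool layer" point concrete, here's a minimal sketch of what that kind of split can look like. This is not code from the repo: `ToolRegistry`, `nod`, `tilt_head`, the angle limit, and `robot_log` are all hypothetical names made up for illustration, and the real project may structure things differently. The idea is just that the tool layer only sees a tool name plus a JSON argument string, so whichever realtime client you use can forward calls to it unchanged.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolRegistry:
    """Hypothetical tool layer: maps tool names to plain callables.

    It knows nothing about the model provider or the transport.
    """

    _tools: dict[str, Callable[..., Any]] = field(default_factory=dict)

    def tool(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Decorator that registers a function under a tool name."""

        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[name] = fn
            return fn

        return decorator

    def dispatch(self, name: str, arguments_json: str) -> str:
        """Run one tool call and return a JSON string to hand back to the model."""
        if name not in self._tools:
            return json.dumps({"error": f"unknown tool: {name}"})
        try:
            args = json.loads(arguments_json or "{}")
            result = self._tools[name](**args)
        except (TypeError, ValueError) as exc:
            # Bad JSON, wrong argument names, or a tool rejecting its input.
            return json.dumps({"error": str(exc)})
        return json.dumps({"ok": True, "result": result})


registry = ToolRegistry()
robot_log: list[str] = []  # stand-in for real motor commands (assumption)


@registry.tool("nod")
def nod(times: int = 1) -> str:
    robot_log.append(f"nod x{times}")
    return f"nodded {times} time(s)"


@registry.tool("tilt_head")
def tilt_head(degrees: float) -> str:
    # The +/-30 degree limit is an illustrative value, not a hardware spec.
    if not -30 <= degrees <= 30:
        raise ValueError("degrees must be within [-30, 30]")
    robot_log.append(f"tilt {degrees}")
    return f"tilted {degrees} degrees"
```

A realtime client would call something like `registry.dispatch("nod", '{"times": 2}')` whenever the model emits a tool call and send the returned string back as the tool result. Swapping the provider then only touches the client side.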

Setup's in the README (Python 3.12+), MIT licensed.

We handed it back to the organizer's daughter, so now she can finally talk to her robot.

submitted by /u/facethef