Open Dungeon: local roleplay with Gemma 4 QAT + inline Uncen-FLUX images, running at full 256K context under 8GB RAM (open source)
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
I wanted AI Dungeon but fully local and actually private, so I built it. The narrator is Gemma 4 (QAT Q4) through Ollama, and when a scene is worth showing it draws the picture too, locally, with FLUX. No API keys, no cloud, nothing leaves your machine.

The part that surprised me: you can run the 12B at its full 256K context and it still only sits around 7.7GB of RAM, because Gemma 4 barely grows the KV cache. So the narrator can basically hold the whole story in its head. Old scenes that do scroll out get folded into a running summary so it never forgets what happened in chapter one.

It plays like you would expect: Do / Say / Story modes, Continue, Retry, Erase, edit any line. Pick your model in the UI and it shows you the RAM cost up front.

Mac one-click build in releases, or run from source. MIT, would love for people to break it and tell me what is missing.
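For anyone curious how the "fold old scenes into a running summary" idea can work, here is a rough sketch. It is not the project's actual code. The names `StoryMemory`, `estimate_tokens`, and `naive_summarize` are made up for illustration, the 4-characters-per-token estimate is a crude assumption, and the budget here covers only the recent turns, not the summary. In a real setup the `summarize` callable would presumably ask the local model to compress the evicted scenes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    # Crude assumption: about 4 characters per token (rounded up).
    return (len(text) + 3) // 4


@dataclass
class StoryMemory:
    # summarize(old_summary, evicted_turns) -> new_summary.
    # Hypothetical hook: a real app would call the narrator model here.
    summarize: Callable[[str, List[str]], str]
    budget_tokens: int  # budget for the verbatim recent turns only
    summary: str = ""
    turns: List[str] = field(default_factory=list)

    def add_turn(self, text: str) -> None:
        self.turns.append(text)
        self._compact()

    def _turn_tokens(self) -> int:
        return sum(estimate_tokens(t) for t in self.turns)

    def _compact(self) -> None:
        # Evict the oldest turns until the rest fit, always keeping the newest.
        evicted: List[str] = []
        while len(self.turns) > 1 and self._turn_tokens() > self.budget_tokens:
            evicted.append(self.turns.pop(0))
        if evicted:
            # Fold everything that scrolled out into the running summary.
            self.summary = self.summarize(self.summary, evicted)

    def build_prompt(self) -> str:
        # Summary first, then the recent turns verbatim.
        parts: List[str] = []
        if self.summary:
            parts.append("Story so far: " + self.summary)
        parts.extend(self.turns)
        return "\n".join(parts)


def naive_summarize(old_summary: str, evicted: List[str]) -> str:
    # Stand-in summarizer: keeps the first 20 characters of each evicted turn.
    clipped = " ".join(t[:20] for t in evicted)
    return (old_summary + " " + clipped).strip()
```

Usage would look like `mem = StoryMemory(summarize=naive_summarize, budget_tokens=2000)`, then `mem.add_turn(...)` after every player or narrator line and `mem.build_prompt()` when building the next request. Passing the summarizer in as a callable keeps the eviction logic testable without a model running.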