Llama-Studio, WebUI for llama-server Management
Hey all, I have built myself a WebUI for configuring and managing llama-server sessions, and I want to share the code and concept. Python and a bit of JS. Hack away! Local only.

https://github.com/m94301/llama-studio

The major use case is running various instances of llama-server on fixed ports to act as infrastructure for home development (and entertainment) frameworks. Read: fiddling with settings, comparing experimental builds to mainline, and optimizing. Also good for everyday fooling around.

Configs are saved per model in a JSON file, consisting of all launch args and an optional path to a custom llama-server build.

There is a launch arg browser with search that uses the current llama-server's actual --help output. I hate forgetting a launch arg format and having to open a new terminal to run --help. Spec MTP what? Draft type who?

You can launch to a choice of GPU and monitor VRAM, load, and temperature. There is also a somewhat rudimentary VRAM calculator to help estimate what fits where with which quant.

Last, a reasonable mobile interface to run tests and fiddle with configs on a phone when in a basement or IT closet. Show and hide logs, start, stop, change config. Fewer keystrokes on tiny phone keyboards. Sanity +100.
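For a sense of what a per-model config round trip could look like, here is a minimal sketch. The `configs/` directory, field names, and example flags are my assumptions for illustration, not the repo's actual schema:

```python
import json
from pathlib import Path

# Hypothetical layout -- the actual repo's schema and paths may differ.
CONFIG_DIR = Path("configs")

def save_config(model_name: str, launch_args: dict, server_path: str | None = None) -> None:
    """Persist one model's launch settings: all args plus an optional custom build path."""
    CONFIG_DIR.mkdir(exist_ok=True)
    config = {
        "model": model_name,
        "launch_args": launch_args,        # e.g. {"--port": "8081", "--ctx-size": "8192"}
        "llama_server_path": server_path,  # None -> use the default llama-server binary
    }
    (CONFIG_DIR / f"{model_name}.json").write_text(json.dumps(config, indent=2))

def load_config(model_name: str) -> dict:
    return json.loads((CONFIG_DIR / f"{model_name}.json").read_text())

save_config(
    "qwen2.5-7b-q4_k_m",
    {"--port": "8081", "--ctx-size": "8192", "--n-gpu-layers": "99"},
    server_path="~/builds/llama.cpp-exp/bin/llama-server",
)
```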
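The arg browser idea boils down to scraping the selected binary's --help text so the search reflects the exact build being launched. A minimal sketch, assuming help lines follow llama.cpp's usual "-c, --ctx-size N  description" shape; the parsing is deliberately loose and not the project's actual code:

```python
import re
import subprocess

def scrape_help(server_bin: str = "llama-server") -> list[dict]:
    """Collect flags and their help lines from the given llama-server binary."""
    res = subprocess.run([server_bin, "--help"], capture_output=True, text=True)
    out = res.stdout + res.stderr  # some builds print usage on stderr
    args = []
    for line in out.splitlines():
        # Loose match for lines starting with a flag, e.g. "-c,    --ctx-size N  ..."
        m = re.match(r"\s*(-[\w-]+(?:,\s*--[\w-]+)?)", line)
        if m:
            args.append({"flag": m.group(1), "help": line.strip()})
    return args

def search_args(query: str, args: list[dict]) -> list[dict]:
    """Substring search over flags and descriptions, for the UI search box."""
    q = query.lower()
    return [a for a in args if q in a["flag"].lower() or q in a["help"].lower()]
```

Something like `search_args("draft", scrape_help())` would then surface the speculative-decoding flags without opening a terminal.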
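"Launch to a choice of GPU, monitor VRAM, load, and temp" maps naturally onto CUDA_VISIBLE_DEVICES for pinning and nvidia-smi's query mode for polling. A sketch assuming NVIDIA hardware (an AMD box would swap in rocm-smi); this is my approximation of the mechanism, not the repo's code:

```python
import os
import subprocess

def launch_on_gpu(cmd: list[str], gpu_index: int) -> subprocess.Popen:
    """Pin a llama-server launch to one GPU via CUDA_VISIBLE_DEVICES."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_index))
    return subprocess.Popen(cmd, env=env)

def gpu_stats() -> list[dict]:
    """Poll per-GPU VRAM, utilization, and temperature via nvidia-smi."""
    fields = "index,memory.used,memory.total,utilization.gpu,temperature.gpu"
    out = subprocess.check_output(
        ["nvidia-smi", f"--query-gpu={fields}", "--format=csv,noheader,nounits"],
        text=True,
    )
    stats = []
    for line in out.strip().splitlines():
        idx, used, total, util, temp = (v.strip() for v in line.split(","))
        stats.append({
            "gpu": int(idx),
            "vram_used_mib": int(used),
            "vram_total_mib": int(total),
            "load_pct": int(util),
            "temp_c": int(temp),
        })
    return stats
```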
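And a VRAM calculator, at its most rudimentary, is quantized weight size plus KV cache. A back-of-envelope sketch: the bits-per-weight table is approximate, the model-shape parameters are example values, and the formula ignores compute buffers and runtime overhead, so pad the result:

```python
# Rough bits-per-weight for common GGUF quants; real overhead varies by tensor mix.
BPW = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q6_K": 6.6, "Q8_0": 8.5, "F16": 16.0}

def estimate_vram_gib(params_b: float, quant: str, ctx: int,
                      n_layers: int, n_kv_heads: int, head_dim: int,
                      kv_bytes: int = 2) -> float:
    """Quantized weights plus an f16 KV cache (K and V per layer)."""
    weights = params_b * 1e9 * BPW[quant] / 8                    # bytes
    kv = 2 * n_layers * ctx * n_kv_heads * head_dim * kv_bytes   # bytes
    return (weights + kv) / 1024**3

# e.g. a 7B model at Q4_K_M with 8k context and a Llama-ish shape
# (32 layers, 8 KV heads, head dim 128) comes out around 4.9 GiB:
print(f"{estimate_vram_gib(7, 'Q4_K_M', 8192, 32, 8, 128):.1f} GiB")
```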