Same model, same prompt, 4 different agents
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Setup: one self-hosted Qwen3.6-27B (Q4) on llama.cpp, identical prompt, identical hardware. The only variable is the agent scaffolding.

Agents tested: pi, opencode, hermes, qwen code.

Task: a single-file 2D canvas solar system with scripted orbits and gravity that acts only on user-launched comets. The prompt included an explicit "build incrementally, your context window is small" instruction.

Results: all 4 produced a working sim, but the code quality differs a lot:

opencode: my pick. Cleanest architecture.

pi: most correct. Coordinate-consistent, uses distance softening to avoid singularities, removes comets that hit the Sun, has planet labels, and is the only one with touch support. Less flashy, most robust.

hermes: flashiest, but physically wrong. The only one with real elliptical orbits, plus a nice drag-vector arrow. But it computes planet gravity on comets at a different time step than the one it uses to render the planets, so comets are pulled toward where the planets aren't. Looks best, simulates worst.

qwen code: most minimal. Shortest, and it runs, but it is crude: a huge launch-velocity multiplier flings comets off instantly, and there is no softening and no stars.

Takeaway: with a fixed local model, the agent's scaffolding visibly changes the output (integration strategy, coordinate hygiene, edge-case handling). The prettiest demo (hermes) was the buggiest; the plain-looking one (pi) was the most correct; opencode hit the best balance of clean code and stable physics.

Curious whether others get the same ranking on their own local setups.
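For readers unfamiliar with the physics details mentioned in the results, here is a minimal sketch of distance softening, removal of comets that hit the Sun, and evaluating planet positions at the same time for physics as for drawing. It is not taken from any of the four agents' output, and it is in Python rather than browser canvas code to keep it short. Every name and constant here (`G_M_SUN`, `SOFTENING`, `SUN_RADIUS`, `planet_position`, `softened_accel`, `step_comets`) is hypothetical, and circular scripted orbits and semi-implicit Euler integration are assumptions made for illustration.

```python
import math

# All constants are hypothetical, in arbitrary simulation units.
G_M_SUN = 1000.0    # gravitational parameter of the Sun
SOFTENING = 5.0     # softening length; keeps 1/r^2 finite as r -> 0
SUN_RADIUS = 10.0   # comets ending a step inside this radius are removed


def planet_position(orbit_radius, angular_speed, t):
    """Scripted circular orbit around the Sun (at the origin) at time t."""
    angle = angular_speed * t
    return (orbit_radius * math.cos(angle), orbit_radius * math.sin(angle))


def softened_accel(px, py, sx, sy, gm, eps=SOFTENING):
    """Acceleration on a point at (px, py) from a source at (sx, sy)."""
    dx, dy = sx - px, sy - py
    r2 = dx * dx + dy * dy + eps * eps   # softened squared distance
    inv_r3 = 1.0 / (r2 * math.sqrt(r2))
    return (gm * dx * inv_r3, gm * dy * inv_r3)


def step_comets(comets, planets, t, dt):
    """Advance comets by dt using planet positions at time t.

    Returns the surviving comets; those that end up inside the Sun are dropped.
    """
    sources = [(0.0, 0.0, G_M_SUN)]
    for p in planets:
        x, y = planet_position(p["orbit_radius"], p["angular_speed"], t)
        sources.append((x, y, p["gm"]))

    survivors = []
    for c in comets:
        ax = ay = 0.0
        for sx, sy, gm in sources:
            dax, day = softened_accel(c["x"], c["y"], sx, sy, gm)
            ax += dax
            ay += day
        # Semi-implicit Euler: update velocity first, then position.
        vx = c["vx"] + ax * dt
        vy = c["vy"] + ay * dt
        x = c["x"] + vx * dt
        y = c["y"] + vy * dt
        if math.hypot(x, y) > SUN_RADIUS:
            survivors.append({"x": x, "y": y, "vx": vx, "vy": vy})
    return survivors
```

Under these assumptions, the renderer would call `planet_position` with the same `t` that was passed to `step_comets` for that frame, so comets are attracted toward the planets where they are actually drawn. Leaving out the `eps` term is what lets the acceleration blow up when a comet passes very close to a body.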