The power of structured workflows and small local models
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
A month ago, I experimented with a very basic home-rolled agent loop with a handful of tools and found it worked surprisingly well in spite of how crude it was. Later, I wrote about how addictive developing your own agent loop is, especially once you reach the point where the agent loop is capable of editing itself: https://www.reddit.com/r/LocalLLaMA/comments/1sq7cie/warning_do_not_write_your_own_ai_agent_if_you/

Well, 28 days later, it's been getting out of hand. I've been working until 5am on it because it is so addictive.

Once you have a good agentic setup, you quickly realise that you, as the human, are the main bottleneck. You have a massive todo list, but the agent is sitting idle, waiting for your approvals and reviews.

Not only that: since I am using Qwen3.5 9B as the model, it has limited intelligence and context. I can't just dump hundreds of data files onto it and expect it to crunch them all at once. So I decided to manage the context limits through a map-reduce pattern, breaking tasks down into smaller chunks that can be run in parallel to extract maximum FLOPs out of the GPU while staying within context limits.

Enforcing structured outputs also helps to reduce LLM variability and makes for a smooth reduce step. Lastly, it is helpful to have a database to monitor and track workflows.

I managed to get it up and running today, and I'm happy that small local models can handle this task. My custom agent has now replaced Claude Code for 99% of tasks.
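For anyone wanting to try the same idea, here is a rough sketch of the map-reduce pattern described above. It is not my actual agent code: `fake_model`, `run_workflow`, the two JSON fields, and the `tasks` table are all made up for illustration, and the model call is a stub so the example runs without a GPU. It only assumes the Python standard library.

```python
import json
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a local model call; replies with JSON text."""
    # A real setup would send `prompt` to a local inference server instead.
    body = prompt.split("\n", 1)[1]
    return json.dumps({"line_count": len(body.splitlines()), "char_count": len(body)})


def parse_structured(raw: str) -> dict:
    """Reject any reply that does not match the expected (assumed) schema."""
    data = json.loads(raw)  # raises a ValueError subclass on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    for key in ("line_count", "char_count"):
        if not isinstance(data.get(key), int):
            raise ValueError(f"missing or non-integer field: {key}")
    return data


def map_task(task_id, docs, model):
    """Map step: one small prompt per chunk, so each call fits the context."""
    prompt = "Summarise these documents as JSON:\n" + "\n".join(docs)
    try:
        return task_id, "done", parse_structured(model(prompt))
    except ValueError as exc:
        return task_id, "failed", {"error": str(exc)}


def run_workflow(docs, model=fake_model, chunk_size=2, workers=4, db_path=":memory:"):
    chunks = [docs[i:i + chunk_size] for i in range(0, len(docs), chunk_size)]

    # Track every task in SQLite so the workflow can be monitored.
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS tasks "
               "(id INTEGER PRIMARY KEY, status TEXT, result TEXT)")
    for task_id in range(len(chunks)):
        db.execute("INSERT OR REPLACE INTO tasks VALUES (?, 'pending', NULL)", (task_id,))
    db.commit()

    # Run the map step in parallel; database writes stay on the main thread.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(map_task, i, c, model) for i, c in enumerate(chunks)]
        results = [f.result() for f in futures]

    # Reduce step: structured fields can simply be summed.
    totals = {"line_count": 0, "char_count": 0}
    for task_id, status, payload in results:
        db.execute("UPDATE tasks SET status = ?, result = ? WHERE id = ?",
                   (status, json.dumps(payload), task_id))
        if status == "done":
            for key in totals:
                totals[key] += payload[key]
    db.commit()
    statuses = [row[0] for row in db.execute("SELECT status FROM tasks ORDER BY id")]
    db.close()
    return totals, statuses


if __name__ == "__main__":
    print(run_workflow(["a\nb", "c", "dd"], chunk_size=2))
```

The point of validating each reply before the reduce step is that a malformed answer from a small model marks one task as failed instead of corrupting the combined result, and the failed rows in the table show which chunks need a retry.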