r/LocalLLaMA · May 24, 2026 · 1 min read

How are you all handling agents and sub agents?

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

I currently have LibreChat set up to use DeepSeek v4 pro via OpenRouter as the master planner, with my PC running Qwen 35B locally at around 160 tok/sec and my mini PC running Gemma E2B locally for smaller tasks. The 35B is my worker bee and Gemma handles trivial things, and the two run in parallel. I'm wondering if there are setups out there that use this structure effectively, or better and smaller models with purpose-built roles that you are using. I'm also curious whether there are even smaller, more nimble models built for this type of thing.
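
A minimal sketch of the three-tier structure described above, assuming the planner returns a list of sub-tasks tagged by tier. The `Task` type, the `plan`, `run`, and `orchestrate` functions, and the tier names are all hypothetical, and the model calls are stubbed out rather than sent to OpenRouter or a local server. It only illustrates the routing and the parallel dispatch, not any particular LibreChat feature.

```python
import asyncio
from dataclasses import dataclass

# Model labels come from the post; the tier names are illustrative assumptions.
TIERS = {
    "planner": "DeepSeek v4 pro (OpenRouter)",
    "worker": "Qwen 35B (local PC)",
    "trivial": "Gemma E2B (mini PC)",
}


@dataclass
class Task:
    prompt: str
    tier: str  # "worker" or "trivial"


def plan(goal: str) -> list[Task]:
    """Stand-in for the planner call: splits a goal into tagged sub-tasks.

    A real setup would ask the planner model to produce this list.
    """
    return [
        Task(f"Implement: {goal}", "worker"),
        Task(f"Write a one-line title for: {goal}", "trivial"),
    ]


async def run(task: Task) -> str:
    """Stand-in for an HTTP request to the model serving this tier."""
    await asyncio.sleep(0)  # placeholder for network and inference latency
    return f"[{TIERS[task.tier]}] {task.prompt}"


async def orchestrate(goal: str) -> list[str]:
    # Sub-tasks are dispatched concurrently, so the worker model and the
    # trivial-task model run in parallel. gather() keeps the input order.
    return await asyncio.gather(*(run(t) for t in plan(goal)))


if __name__ == "__main__":
    for line in asyncio.run(orchestrate("parse a CSV file")):
        print(line)
```

Swapping the body of `run` for a real request to each endpoint would keep the same shape: the planner decides the tier, and the dispatcher only needs a lookup from tier to endpoint.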

submitted by /u/Honest-Kangaroo-1830