Agent recommendations

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Hi,

I have a Strix Halo setup with 128GB that runs a couple of models (GPT-OSS 120b, Qwen3.5-122b, Gemma-4-31b) on llama-swap. GPT-OSS and Qwen run quite fast at 40-50 T/s, while Gemma is slow at 4-5 T/s but seems to have the best quality.

I'd like to vibe code a personal web project in Python, using PyCharm.

What would be a good setup, i.e. software stack, to help me create the app? I got to a certain level using GPT-OSS 120b, but it was quite tedious, as I had to test extensively to catch even basic errors. So I am hoping there are ways to have one model create a plan and then execute it, with another model doing the testing.

But I have no idea how I would get going with that. What are my options?

submitted by /u/MatthKarl