Holding off on a machine upgrade while waiting for a model?
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Hey guys.
This is the sub I spend most of my Reddit time in, so I decided to post here to find out whether I'm alone in this, or whether other people are also waiting for a specific open-weight model to be released before taking the next step and upgrading their machine.
If so, what would your next setup be, and which model would push you to go for it?
In my case I have a 48GB M4 Max. I'm having fun with Qwen3.6 35B-A3B, sometimes running the 27B (whose prompt processing is sometimes painfully slow on my code base), and occasionally I do some runs on the 122B through OpenRouter just to flirt.
My upgrade would be a 128GB M5 Max, if a Qwen3.7/3.6 122B is released and performs the way I expect it to.
How about you guys?