We need an 80-160B model urgently. The unified memory device market needs more models.
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Hello guys,
I will keep this short.
There are so many people who have a lot of "slow" RAM, just not enough of it for the massive models.
Anybody with an Apple device with >96GB
Anybody with a Ryzen AI 395 device with >96GB
Anybody with a DGX Spark
Even people with RTX 6000 Pros or 4x3090s or other configurations.
Or people with 128GB of DDR4/5 RAM
Yet the models that came out in the last 3 months
were made specifically for high-speed, low-capacity machines
(27B Qwen, 31B Gemma)
or sit at the other extreme as massive models
(GLM 5.2, Deepseek V4 Pro, Kimi 2.7, Mimo 2.5 Pro, MiniMax M3)
Those of us with unified memory devices or other 80-128GB configurations
have to either use older models that no longer hold up now that the frontier has moved on
(GLM 4.5 Air, GPT OSS 120B, Qwen 3.5 122B, Nemotron 3 Super 120B, Qwen 3 Coder Next 80B)
or use small models because of our low-bandwidth RAM/VRAM
(Qwen 3.6 35B or Gemma 4 26B)
We need something in the range of 100B total with 10B active (sparse). Something that people with an AMD 9700 AI Pro or an RTX 3090/5090 and 64GB VRAM could use. Something that DGX Spark users, AI 395+ users, Apple users, etc. could run.
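To put rough numbers on why this size class fits these machines, here is a back-of-the-envelope sketch. It reads the target as about 100B total parameters with about 10B active per token, and it assumes 4-bit weights and a memory bandwidth of 250 GB/s. Both values are illustrative placeholders, not measured specs of any device, and the function names are made up for this example. The estimate ignores KV cache, activations, and runtime overhead, so real speeds will be lower than this ceiling.

```python
# Rough sizing sketch. All constants below are assumptions for illustration.

def weight_footprint_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes) for a model
    with `total_params_b` billion parameters. Ignores KV cache and overhead."""
    return total_params_b * bits_per_weight / 8


def decode_ceiling_tok_s(active_params_b: float,
                         bits_per_weight: float,
                         bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed, assuming every active weight is read
    from memory exactly once per generated token (bandwidth-bound decoding)."""
    gb_read_per_token = active_params_b * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


BITS_PER_WEIGHT = 4.0    # assumed 4-bit quantization
BANDWIDTH_GB_S = 250.0   # placeholder bandwidth, not a real device spec

# Sparse model: 100B total, 10B active per token.
sparse_size = weight_footprint_gb(100, BITS_PER_WEIGHT)                    # 50.0 GB
sparse_speed = decode_ceiling_tok_s(10, BITS_PER_WEIGHT, BANDWIDTH_GB_S)   # 50.0 tok/s

# Dense model of the same total size: all 100B weights are read per token.
dense_speed = decode_ceiling_tok_s(100, BITS_PER_WEIGHT, BANDWIDTH_GB_S)   # 5.0 tok/s

print(f"sparse 100B/10B: {sparse_size:.1f} GB, <= {sparse_speed:.1f} tok/s")
print(f"dense 100B:      {sparse_size:.1f} GB, <= {dense_speed:.1f} tok/s")
```

Under these assumptions the weights take about 50 GB, which leaves room for context on a 96-128GB machine, while the low active parameter count keeps the bandwidth-bound ceiling ten times higher than a dense model of the same size.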
Something like GPT OSS 120B V2, Gemma 4 122B, Qwen 3.6/3.7 122B, GLM 5.2 Air, Deepseek V4 Mini with 100B, Mimo 2.5 Mini with 100B or anything similar to that class of models. Or heck, even a Qwen 3.6 Coder 80B would be something people would love.
I really hope we are going to get something. Otherwise I am left with Qwen 3.5 122B on my Spark for now.
Cheers.