Running GLM5.2 on budget hardware < $2500.
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Too many times I hear people whine about not being able to run SOTA models, or claim it would require $50k or $100k.
https://www.ebay.com/itm/398079051468 EPYC motherboard & CPU - $460
https://www.ebay.com/itm/206374955959 P40 24GB - $230 each, get 2 - $460
https://www.ebay.com/itm/318489798853 512GB DDR4 - $1000
Total = $1920.
You also need a PSU, storage, and fans for the P40s. You can source those for $350 easily, but let's go ahead and budget $580 to put the total for everything at $2500.
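The budget arithmetic above can be checked with a few lines of Python. This is only a sketch using the prices quoted in the post; the listings may change, and the variable names are made up for illustration.

```python
# Prices in USD as quoted in the post; eBay listings can change at any time.
parts = {
    "EPYC motherboard + CPU": 460,
    "2x P40 24GB": 2 * 230,
    "512GB DDR4": 1000,
}

core_total = sum(parts.values())

# PSU, storage, and P40 fans: the post says ~$350 is realistic
# but budgets $580 to land on a round total.
extras_realistic = 350
extras_budget = 580

total = core_total + extras_budget
slack = extras_budget - extras_realistic  # cushion left in the budget

print(f"core parts: ${core_total}")
print(f"total with extras budget: ${total}")
print(f"slack if extras really cost ${extras_realistic}: ${slack}")
```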
You can run GLM5.2 Q2/Q3/Q4 variants on this with llama.cpp and its -cmoe (--cpu-moe) option, which keeps the MoE expert weights in system RAM and leaves the GPUs for the rest of the model. Sure, it would be slow, but it's yours! If you have money, or when you get more, you can replace the P40s with faster GPUs (4080, 3090, etc.). For about $500, a bit more than the $460 for the P40s, you could source two 22GB 2080 Ti GPUs from China. If you are willing to be resourceful, you can make things happen for you. This will also run KimiK2.6, DeepSeek, MiniMax, etc.
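To get a feel for which quant might fit, here is a coarse back-of-the-envelope check in Python. Everything model-specific in it is an assumption: the post does not give a parameter count for GLM5.2, so HYPOTHETICAL_PARAMS_B is a placeholder, the bits-per-weight figures are rough averages that vary between GGUF files, and the 48 GB reserve for OS, KV cache, and buffers is a guess. Only the 512GB of RAM and the two 24GB P40s come from the parts list.

```python
# Rough average bits per weight for llama.cpp quant families.
# ASSUMPTION: ballpark values only; real GGUF files differ by recipe.
APPROX_BPW = {"Q2": 2.7, "Q3": 3.5, "Q4": 4.5}

RAM_GB = 512       # from the parts list
VRAM_GB = 2 * 24   # two P40s


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         reserve_gb: float = 48) -> bool:
    """Coarse check: weights plus a reserve must fit in RAM + VRAM.

    ASSUMPTION: reserve_gb (OS, KV cache, compute buffers) is a guess.
    """
    needed = weights_gb(params_billion, bits_per_weight) + reserve_gb
    return needed <= RAM_GB + VRAM_GB


# HYPOTHETICAL parameter count in billions; replace with the real figure
# from the model card of whatever you plan to run.
HYPOTHETICAL_PARAMS_B = 750

for name, bpw in APPROX_BPW.items():
    size = weights_gb(HYPOTHETICAL_PARAMS_B, bpw)
    print(f"{name}: ~{size:.0f} GB, fits={fits(HYPOTHETICAL_PARAMS_B, bpw)}")
```

The simplest real-world check is still the size of the GGUF file itself: if the file plus some headroom is smaller than RAM plus VRAM, it should load.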
Yes, the trade-off is that it's slow. You will not be running agents with these huge models, but you can spin it up for planning and serious debugging. They can take away Fable, Mythos, or whatever the F model, and you still will not be counted among the have-nots.
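How slow is "slow"? A common rule of thumb is that token generation is memory-bandwidth bound: every generated token has to read the active weights once, so bandwidth divided by active bytes per token gives an optimistic ceiling. The sketch below applies that rule. The channel count, memory speed, and active parameter count are all hypothetical inputs, not figures from the post, and real throughput will be well below this ceiling.

```python
def ddr4_bandwidth_gb_s(channels: int, mt_per_s: int) -> float:
    """Theoretical peak DDR4 bandwidth: 8 bytes per transfer per channel."""
    return channels * mt_per_s * 8 / 1000


def decode_ceiling_tok_s(active_params_billion: float,
                         bits_per_weight: float,
                         bandwidth_gb_s: float) -> float:
    """Optimistic tokens/s ceiling if decoding is purely bandwidth bound."""
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


# HYPOTHETICAL inputs: check your board's channel count and DIMM speed,
# and the model card for the active (per-token) parameter count.
bandwidth = ddr4_bandwidth_gb_s(channels=8, mt_per_s=3200)
ceiling = decode_ceiling_tok_s(active_params_billion=40,
                               bits_per_weight=4.5,
                               bandwidth_gb_s=bandwidth)

print(f"peak RAM bandwidth: {bandwidth:.1f} GB/s")
print(f"decode ceiling: {ceiling:.1f} tok/s (real numbers will be lower)")
```

This is why MoE models are the ones worth trying on a box like this: only the active experts are read per token, so the ceiling depends on active parameters rather than total parameters.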