GPU Prices. Buy now, or buy later?
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
If the community could sound off on this, I'd be grateful.
Do you think GPU prices are going to stop skyrocketing? Is it FOMO and hype driving the adoption of local inference, or is this mass-market adoption a long-term trend that will last for years? If I wait 6 months, will I regret it because prices kept climbing? And I don't know about RAM pricing... is that temporary?
Backstory:
I bought an M3 Max MacBook Pro in Nov 2023 (128GB RAM, 4TB SSD, 16-core CPU / 40-core GPU).
I use it as a desktop, with 20TB of external storage.
It runs 5 different production workflows across about a dozen daily cron jobs (everything from BERT models to 30B LLMs in prod, with RSLoRA adapters I've trained for specific tasks).
It also runs 3 different agent harnesses (2 custom ones and Hermes). I still hit OpenRouter (glm-5.1/minimax) for orchestration, and even Anthropic for heavy coding tasks.
I'm sitting on the fence about buying a 1x 5090 rig that is expandable to 3 GPUs and plug-and-play with a Pro 6000. But $10k is hard to swallow.
This would allow me to run Qwen3.6-35B-A3B-4bit and 27b-4bit in production for sub-agent delegation (4 concurrent sub-agents with sufficient KV cache).
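For anyone who wants to sanity-check the "sufficient KV cache" part, here is a rough back-of-envelope sketch. It assumes a plain transformer where every layer keeps full-precision keys and values, which may not match how these models actually handle attention. The layer count, KV head count, head dimension, and context length in the example are placeholder values I made up for illustration, not the real config of either model; swap in the numbers from the model's config file.

```python
def kv_cache_gib(num_layers, num_kv_heads, head_dim,
                 context_tokens, concurrent_seqs, bytes_per_elem=2):
    """Estimate KV cache size in GiB for a standard attention stack.

    Each token stores one key and one value vector per layer, hence the
    leading factor of 2. bytes_per_elem=2 assumes an fp16/bf16 cache.
    """
    bytes_per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem
    total_bytes = bytes_per_token * context_tokens * concurrent_seqs
    return total_bytes / 2**30


def weights_gib(num_params, bits_per_weight):
    """Naive weight footprint in GiB, ignoring quantization metadata."""
    return num_params * bits_per_weight / 8 / 2**30


# Placeholder architecture numbers (hypothetical, for illustration only).
kv = kv_cache_gib(num_layers=48, num_kv_heads=4, head_dim=128,
                  context_tokens=32768, concurrent_seqs=4)
w = weights_gib(num_params=35e9, bits_per_weight=4)

print(f"KV cache for 4 concurrent sequences: {kv:.1f} GiB")  # 12.0 GiB
print(f"Naive 4-bit weight footprint: {w:.1f} GiB")
print(f"Total before runtime overhead: {kv + w:.1f} GiB of 32 GiB VRAM")
```

With those made-up numbers the cache alone is 12 GiB for 4 sub-agents at 32k context each, so the real answer depends heavily on the actual model config and on how much context each sub-agent needs.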
Plan to run this headless as an inference server:
Build: ~$10k
AMD Ryzen 9 9950X 4.3GHz 16 Core 170W
64GB (2x DDR5 32GB)
NVIDIA GeForce RTX 5090 32GB
2TB NVMe PCIe Gen5 M.2 SSD
Fractal Design Define 7 XL case
Super Flower LEADEX Titanium 1700W
Asetek 624S-M2 240mm CPU Cooler
Case Fans Upgrade Kit (PWM Ramping)
Be kind. lol