Local LLMs on Refurb M4 Max vs new M5 Max
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Hoping the community can guide me on this one. I'm on the fence about the following purchase:
Refurbished 16-inch MacBook Pro, Apple M4 Max chip with 16-core CPU and 40-core GPU, 64 GB RAM, 1 TB drive, for $3,479.00
vs
The new 16-inch MacBook Pro, Apple M5 Max chip with 18-core CPU and 40-core GPU, 64 GB RAM, 2 TB drive, for $4,599.00
I'm drawn to the refurb due to price.
I'm going to be using it for work (data scientist & intelligence analyst), but I also want to run models like Gemma 4 31B and Qwen3.6-27B, both at Q8. Mainly for data work (derivation, data element extraction, etc.). I've been using local models for a while, but I keep hitting the resource ceiling of 24 GB of shared RAM.
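For a rough sense of whether those models fit, here is a back-of-the-envelope sketch. It assumes Q8_0-style quantization at about 8.5 bits per weight (8-bit values plus one 16-bit scale per block of 32 weights) and takes the nominal parameter counts from the model names. The function name `weight_gb` is made up for illustration, and the estimate covers weights only, not KV cache, context, or what the OS needs.

```python
# Rough weight-memory estimate for a quantized model.
# Assumption: Q8_0-style quantization at ~8.5 bits per weight.
BITS_PER_WEIGHT_Q8 = 8.5


def weight_gb(params_billion: float, bits_per_weight: float = BITS_PER_WEIGHT_Q8) -> float:
    """Approximate weight size in GB (10^9 bytes), weights only."""
    return params_billion * bits_per_weight / 8


# Nominal sizes taken from the model names in the post (an assumption).
for name, params_b in [("Gemma 4 31B", 31.0), ("Qwen3.6-27B", 27.0)]:
    size = weight_gb(params_b)
    print(f"{name} at Q8: ~{size:.1f} GB of weights "
          f"(fits in 24 GB: {size < 24}, fits in 64 GB: {size < 64})")
```

Under these assumptions both models overshoot 24 GB on weights alone and leave headroom on a 64 GB machine, though the usable amount will be lower than 64 GB once context and the rest of the system are counted.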
There's a big price difference ($1,120), so I just wanted to check myself. Is the M5's better prefill, plus any other enhancements, worth the difference? Reviews seem to indicate the M4 Max can run hot.
Thanks in advance.
Edit:
New info that may help shape advice:
The M5 has better prefill.
Memory Bandwidth:
- M4 Max 40-core GPU: 546 GB/s
- M5 Max 40-core GPU: 614 GB/s
=> roughly a 12.5% bandwidth increase.
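To put those numbers side by side, here is a small sketch that checks the bandwidth and price deltas from the figures above. It also applies a common rule of thumb: token generation is roughly memory-bound, so bandwidth divided by the size of the weights gives a loose upper bound on tokens per second. The ~33 GB weight size is an assumed figure for a 31B model at Q8, and the helper names are made up for illustration. Real throughput will land below this ceiling, and prefill is mostly compute-bound, so this says nothing about the prefill difference.

```python
# Figures quoted in the post.
M4_MAX_BW_GBS = 546.0
M5_MAX_BW_GBS = 614.0
M4_MAX_PRICE = 3479.0
M5_MAX_PRICE = 4599.0

# Assumption: ~33 GB of weights for a 31B model at Q8 (about 8.5 bits per weight).
ASSUMED_WEIGHTS_GB = 33.0


def pct_increase(old: float, new: float) -> float:
    """Percentage increase from old to new."""
    return (new - old) / old * 100


def decode_ceiling_tps(bandwidth_gbs: float, weights_gb: float) -> float:
    """Loose upper bound on decode tokens/s if every token reads all weights once."""
    return bandwidth_gbs / weights_gb


print(f"Bandwidth increase: {pct_increase(M4_MAX_BW_GBS, M5_MAX_BW_GBS):.1f}%")
print(f"Price increase:     {pct_increase(M4_MAX_PRICE, M5_MAX_PRICE):.1f}%")
for chip, bw in [("M4 Max", M4_MAX_BW_GBS), ("M5 Max", M5_MAX_BW_GBS)]:
    print(f"{chip}: decode ceiling ~{decode_ceiling_tps(bw, ASSUMED_WEIGHTS_GB):.1f} tok/s")
```

On this estimate the decode gain tracks the bandwidth gain, so most of the case for the M5 Max would have to come from prefill speed and the larger drive rather than generation speed.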