I guess 4 units wasn’t enough.
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
I don’t think this thing is going to work out. If anyone wants a 4U GPU server complete with half a terabyte of RAM, hit me up. (/s)