Mimo 2.5 Pro - 40t/s on 8x Nvidia Spark/GB10 cluster
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
I got Mimo 2.5 Pro running on my 8x Asus Nvidia GB10 cluster using mtp-2. Single-user request, coding workload:
40 t/s at 1k context,
32 t/s at 30k context,
25 t/s at 125k context,
17 t/s at 250k context.
With 2 parallel requests it reached 60 t/s, and with 4 parallel requests it reached 83 t/s. Not bad for a 1T model.
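For anyone who wants to put the parallel numbers in perspective, here is a rough sketch of the arithmetic. It assumes the 60 t/s and 83 t/s figures are aggregate throughput across all requests, and that the 40 t/s short-context result is the fair single-request baseline; the post states neither, so treat the output as illustrative. The function names are made up for this example.

```python
# Figures reported in the post, in tokens per second.
single_user_tps = {1_000: 40, 30_000: 32, 125_000: 25, 250_000: 17}

# ASSUMPTION: the parallel figures are aggregate throughput, and the
# 1-request baseline is the 1k-context result (the post does not say).
aggregate_tps = {1: 40, 2: 60, 4: 83}


def per_request_tps(aggregate: float, parallel: int) -> float:
    """Average speed each request sees if the aggregate is shared evenly."""
    return aggregate / parallel


def scaling_vs_single(aggregate: float, baseline: float) -> float:
    """How many times more total tokens per second than one request alone."""
    return aggregate / baseline


def context_retention(tps_at_context: float, tps_short: float) -> float:
    """Fraction of short-context speed that remains at a longer context."""
    return tps_at_context / tps_short


if __name__ == "__main__":
    baseline = aggregate_tps[1]
    for n, agg in aggregate_tps.items():
        print(
            f"{n} parallel: {agg} t/s total, "
            f"{per_request_tps(agg, n):.2f} t/s per request, "
            f"{scaling_vs_single(agg, baseline):.2f}x vs single"
        )
    short = single_user_tps[1_000]
    for ctx, tps in single_user_tps.items():
        print(f"{ctx:>7} ctx: {tps} t/s ({context_retention(tps, short):.0%} of 1k speed)")
```

Under those assumptions, going from 1 to 4 parallel requests roughly doubles total throughput while each individual request runs at about half the single-user speed.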
https://forums.developer.nvidia.com/t/mimo-2-5-pro-nvfp4-on-8xgb10-cluster/370803