r/LocalLLaMA

Mimo 2.5 Pro - 40t/s on 8x Nvidia Spark/GB10 cluster

I got Mimo 2.5 Pro running on my 8x Asus Nvidia GB10 cluster using mtp-2. Throughput for a single user request on a coding task:
40 t/s at 1k context,
32 t/s at 30k context,
25 t/s at 125k context,
17 t/s at 250k context.
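
To put the slowdown with context length in perspective, here is a minimal sketch that expresses each figure as a fraction of the 1k-context speed. The four numbers are the ones quoted above; the variable names are only illustrative.

```python
# Single-user throughput from the post: context length (tokens) -> tokens/s
measured = {1_000: 40, 30_000: 32, 125_000: 25, 250_000: 17}

# Use the shortest context as the baseline.
baseline = measured[1_000]
relative = {ctx: tps / baseline for ctx, tps in measured.items()}

for ctx, frac in relative.items():
    print(f"{ctx:>7} ctx: {measured[ctx]} t/s ({frac:.1%} of the 1k-context speed)")
```

By this measure, throughput at 250k context is a bit over 40% of the 1k-context figure.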

With 2 parallel requests it reached 60 t/s, and with 4 parallel requests 83 t/s. Not bad for a 1T model.
https://forums.developer.nvidia.com/t/mimo-2-5-pro-nvfp4-on-8xgb10-cluster/370803
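
A rough way to read the parallel numbers is sketched below. It assumes that the 60 t/s and 83 t/s figures are totals across all concurrent requests and that they are comparable to the 40 t/s single-request figure at 1k context. The post states neither, so treat the output as a back-of-the-envelope estimate. The `scaling` helper is hypothetical.

```python
SINGLE_TPS = 40.0  # single request at 1k context, from the post

# ASSUMPTION: these are aggregate tokens/s summed over all parallel requests.
parallel_total = {2: 60.0, 4: 83.0}

def scaling(n, total_tps, single_tps=SINGLE_TPS):
    """Return (per-request t/s, speedup vs. single, efficiency per request slot)."""
    per_request = total_tps / n
    speedup = total_tps / single_tps
    efficiency = speedup / n
    return per_request, speedup, efficiency

for n, total in parallel_total.items():
    per_request, speedup, efficiency = scaling(n, total)
    print(f"{n} parallel: {per_request:.2f} t/s each, "
          f"{speedup:.2f}x total, {efficiency:.1%} efficiency")
```

Under those assumptions, each request slows down as concurrency rises, while total throughput roughly doubles at 4 parallel requests.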

submitted by /u/ciprianveg