NVIDIA Developer Blog · · 1 min read

Achieving Peak System and Workload Efficiency on NVIDIA GB200 NVL72 with Slurm Block Scheduling

Mirrored from NVIDIA Developer Blog for archival readability. Support the source by reading on the original site.

NVIDIA GB200 NVL72 introduces a fundamentally new way to build GPU clusters by extending NVIDIA NVLink coherence across an entire rack. This design enables...

NVIDIA GB200 NVL72 introduces a fundamentally new way to build GPU clusters by extending NVIDIA NVLink coherence across an entire rack. This design enables exascale performance, but it also changes the assumptions that many scheduling systems were built on. As a result, “rack-scale locality” becomes a hard constraint. When workloads cross domain boundaries, performance drops sharply…

Source

Discussion (0)

Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.

Sign in →

No comments yet. Sign in and be the first to say something.

More from NVIDIA Developer Blog