Virtualizing NVIDIA HGX B200 GPUs with Open Source
Benjamin Satzger, Principal Software Engineer
HGX B200 Hardware Overview
HGX is NVIDIA's reference platform for dense, server-side GPU compute. Instead of using PCIe cards connected through the host's PCIe bus, HGX systems use SXM modules - GPUs mounted directly onto a shared baseboard. NVIDIA's earlier-generation GPUs such as Hopper came in both SXM and PCIe versions, but the B200 ships only in the SXM form factor.
And even where H100 GPUs use SXM modules, their HGX baseboard layout differs from the B200's. Within an HGX system, GPUs communicate over NVLink, which provides high-bandwidth GPU-to-GPU connectivity. NVSwitch modules merge these links into a uniform all-to-all fabric, so every GPU can reach every other GPU with consistent bandwidth and latency. The result is a tightly integrated multi-GPU complex rather than a collection of independent devices.
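One way to observe this fabric from software is to query each GPU's NVLink state through NVML. The snippet below is a minimal sketch, assuming the nvidia-ml-py (pynvml) Python bindings and a working NVIDIA driver; it simply counts the active NVLink links per GPU. On an HGX baseboard you would expect every GPU to report its full set of links up, since all GPU-to-GPU traffic is carried through the NVSwitch fabric.

# Minimal sketch: count active NVLink links per GPU via NVML.
# Assumes the nvidia-ml-py package (pynvml) is installed and an NVIDIA driver is present.
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)

        active_links = 0
        for link in range(pynvml.NVML_NVLINK_MAX_LINKS):
            try:
                if pynvml.nvmlDeviceGetNvLinkState(handle, link) == pynvml.NVML_FEATURE_ENABLED:
                    active_links += 1
            except pynvml.NVMLError:
                # Link index not populated or not supported on this GPU; skip it.
                continue

        print(f"GPU {i} ({name}): {active_links} NVLink links active")
finally:
    pynvml.nvmlShutdown()

Running this on a non-NVLink system (for example, a single PCIe GPU) would report zero active links, which is exactly the contrast with the HGX baseboard that matters for the virtualization discussion below.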
In short, the B200 HGX platform’s uniform, high-bandwidth architecture is excellent for performance - but less friendly to virtualization than discrete PCIe GPUs.