AMD Strix Halo RDMA Cluster Setup Guide
This guide details how to configure a two-node AMD Strix Halo cluster linked via Intel E810 (RoCE v2) for distributed vLLM inference using Tensor Parallelism.
On Both Nodes:
Preparation: Install/Update Fedora 43 and the E810 NICs (Check firmware: ethtool -i <iface>).
BIOS/Kernel: Set iGPU to 512MB and apply kernel params (iommu=pt, pci=realloc, etc.).
SSH: Configure passwordless SSH between nodes.
Networking: Assign static IPs (192.168.100.1 & .2), set MTU 9000, and trust the interface in firewall.
Install Toolbox: Run ./refresh_toolbox.sh (this automatically installs the container with RDMA support and the custom librccl.so patch).
Run Cluster: Run start-vllm-cluster.
  Select "2. Start Ray Cluster" (Follow prompts using the TUI).
  Select "4. Launch VLLM Serve" and choose your model. (Export HF_TOKEN first for gated models!)
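Before starting the cluster, it can help to confirm that the kernel parameters and MTU from the steps above actually took effect on each node. The following is a minimal pre-flight sketch, not part of the guide's tooling: the function names are hypothetical, only the two kernel parameters named above are checked (the "etc." is deliberately left unfilled), and it relies on the standard Linux locations /proc/cmdline and /sys/class/net/<iface>/mtu.

```python
from __future__ import annotations

from pathlib import Path

# Only the two parameters the guide names explicitly; extend as needed.
REQUIRED_PARAMS = ("iommu=pt", "pci=realloc")
EXPECTED_MTU = 9000


def missing_kernel_params(cmdline: str, required=REQUIRED_PARAMS) -> list[str]:
    """Return the required parameters absent from a kernel command line.

    This is an exact token match, so a combined form such as
    "pci=realloc,nocrs" would be reported as missing.
    """
    present = set(cmdline.split())
    return [param for param in required if param not in present]


def read_mtu(iface: str, sysfs_net: Path = Path("/sys/class/net")) -> int:
    """Read the current MTU of a network interface from sysfs."""
    return int((sysfs_net / iface / "mtu").read_text().strip())


def preflight(
    iface: str,
    cmdline_path: Path = Path("/proc/cmdline"),
    sysfs_net: Path = Path("/sys/class/net"),
) -> list[str]:
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    missing = missing_kernel_params(cmdline_path.read_text())
    if missing:
        problems.append("missing kernel params: " + ", ".join(missing))
    mtu = read_mtu(iface, sysfs_net)
    if mtu != EXPECTED_MTU:
        problems.append(f"{iface} MTU is {mtu}, expected {EXPECTED_MTU}")
    return problems
```

On a node, calling preflight with the name of the E810 interface (the same <iface> used with ethtool) returns a list of problems to fix before launching start-vllm-cluster.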
Key Note: The refresh_toolbox.sh script detects your InfiniBand/RDMA devices and automatically configures the container to expose them.
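To see what such detection would find on a node, the sketch below lists the RDMA devices the kernel has registered and the device nodes a container would typically need access to. This is an illustration only: the internals of refresh_toolbox.sh are not shown here, so the approach and function names are assumptions. It relies on the standard Linux locations /sys/class/infiniband and /dev/infiniband.

```python
from __future__ import annotations

from pathlib import Path


def list_rdma_devices(sysfs_ib: Path = Path("/sys/class/infiniband")) -> list[str]:
    """Names of RDMA devices registered with the kernel (empty if none)."""
    if not sysfs_ib.is_dir():
        return []
    return sorted(entry.name for entry in sysfs_ib.iterdir())


def list_rdma_device_nodes(dev_ib: Path = Path("/dev/infiniband")) -> list[str]:
    """Paths of the RDMA character devices (for example uverbs nodes)."""
    if not dev_ib.is_dir():
        return []
    return sorted(str(node) for node in dev_ib.iterdir())
```

If list_rdma_devices() returns an empty list on the host, the RDMA driver for the NIC is probably not loaded, and the container will have nothing to expose.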