In the era of foundation models, multimodal AI, LLMs, and ever-larger datasets, access to raw compute is still one of the biggest bottlenecks for researchers, founders, developers, and engineers. While the cloud offers scalability, building a personal AI workstation delivers complete control over your environment, lower latency, custom configurations, and the privacy of running all workloads locally.
This post covers our version of a four-GPU workstation powered by the new NVIDIA RTX 6000 Pro Blackwell Max-Q GPUs. This build pushes the limits of desktop AI computing with 384GB of VRAM (96GB per GPU), all in a chassis that fits under your desk.
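As a first sanity check after assembly, a few lines of PyTorch confirm that all four cards and their combined VRAM are visible to your frameworks. This is a minimal sketch assuming a CUDA-enabled PyTorch install; the device names and counts simply reflect whatever hardware is actually present:

```python
# Enumerate visible GPUs and total their VRAM.
# Assumes a CUDA-enabled PyTorch install; this is an illustrative
# sanity check, not part of the build itself.
import torch

total_gb = 0.0
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    gb = props.total_memory / 1024**3
    total_gb += gb
    print(f"GPU {i}: {props.name}, {gb:.0f} GB VRAM")

print(f"Total VRAM: {total_gb:.0f} GB")  # expect ~384 GB on this build
```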
Why Build This Workstation?
Training, fine-tuning, and running inference on modern AI models require massive VRAM capacity and bandwidth, high CPU throughput, and ultra-fast storage. Running these workloads in the cloud can introduce latency, setup overhead, slow data transfers, and privacy tradeoffs.
By building a workstation around enterprise-grade GPUs with full PCIe 5.0 x16 connectivity, we get:
- Maximum GPU-to-CPU bandwidth: No bottlenecks from PCIe switches or shared lanes (see the verification sketch after this list).
- Enterprise-class VRAM: Each RTX 6000 Pro Blackwell Max-Q provides 96GB of VRAM, enabling dense training runs and large model inference without quantization. Each card consumes only 300W at peak (Max-Q version).
- 8TB of NVMe 5.0 storage: 4x 2TB NVMe PCIe 5.0 x4 modules.
- 256GB of total ECC DDR5 RAM.
- Surprising efficiency: Despite its scale, the workstation pulls 1650W at peak, low enough to run on a standard 15-amp / 120V household circuit (which supplies up to 1800W).
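To verify that each card actually negotiates a full PCIe 5.0 x16 link and reports the expected 300W Max-Q power limit, you can poll nvidia-smi's query interface. Below is a minimal sketch wrapping that call in Python; the field names are documented nvidia-smi query-gpu properties, though exact output formatting can vary across driver versions:

```python
# Query PCIe link capability and board power limit for each GPU.
# A sketch using nvidia-smi's --query-gpu interface via subprocess;
# requires the NVIDIA driver's nvidia-smi tool to be on PATH.
import subprocess

fields = "index,name,pcie.link.gen.max,pcie.link.width.max,power.limit"
result = subprocess.run(
    ["nvidia-smi", f"--query-gpu={fields}", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
for line in result.stdout.strip().splitlines():
    print(line)  # expect Gen 5, x16, and a ~300 W limit per Max-Q card
```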