
Use your NVIDIA GPU's VRAM as swap space on Linux

Why This Matters

This article describes a way to use NVIDIA GPU VRAM as swap space on Linux, aimed at laptops with soldered RAM and no upgrade path. A user-space daemon serves the VRAM as a block device over the NBD protocol, so memory that spills out of RAM lands in VRAM before it reaches zram or the SSD. The approach needs no custom kernel module and does not depend on NVIDIA kernel symbols, so it keeps working across kernel and driver updates. It is most useful for memory-hungry workloads on hardware that cannot be upgraded.

Key Takeaways

Use your NVIDIA GPU's VRAM as swap space on Linux.

Built for laptops with soldered memory and no upgrade path. If you have an RTX card sitting there with 8GB of VRAM and you're getting swapped to SSD, this puts that VRAM to work.

Tested on: RTX 3070 Laptop (GA104M, 16 GB physical RAM, 8 GB VRAM), driver 580.159.03, kernel 6.17, Pop!_OS, with 7 GB of VRAM allocated for swap. Together with zram and SSD swap, the total comes to ~46 GB, roughly triple the physical RAM. The overflow order is: RAM fills first, then VRAM absorbs the spill (fast, over PCIe), then zram compresses the rest (on the CPU), and the SSD is used only once everything else is exhausted.
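On Linux, an overflow order like this is usually expressed through swap priorities: the kernel fills the highest-priority swap device first. The source does not say how the project configures this, so the sketch below is only an assumption about one way to do it. The device names and the starting priority are hypothetical; `swapon --priority` is the standard util-linux option.

```python
def swapon_commands(tiers, top_priority=100, step=10):
    """Build one `swapon` command per swap device.

    `tiers` lists devices fastest first. Each later tier gets a lower
    priority, so the kernel only spills into it once the faster ones fill.
    """
    commands = []
    priority = top_priority
    for device in tiers:
        commands.append(["swapon", "--priority", str(priority), device])
        priority -= step
    return commands


if __name__ == "__main__":
    # Hypothetical device names: VRAM-backed NBD, then zram, then an SSD partition.
    for cmd in swapon_commands(["/dev/nbd0", "/dev/zram0", "/dev/nvme0n1p3"]):
        print(" ".join(cmd))
```

The commands are only printed here; running them requires root and devices that have already been initialized with mkswap.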

How it works

A small daemon allocates VRAM via the CUDA driver API, then serves it as a block device using the NBD (Network Block Device) protocol over a Unix socket. The kernel's built-in nbd driver connects to it and exposes /dev/nbdX. From there it's a normal swap device.

Data path: kernel swap subsystem -> /dev/nbdX -> nbd kernel driver -> Unix socket -> nbd-vram daemon -> cuMemcpyHtoD/DtoH -> GPU VRAM.
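To make the daemon's side of this path concrete, here is a minimal sketch of handling one NBD transmission-phase request. It is not the project's code. A `bytearray` stands in for the VRAM allocation, where a real daemon would call cuMemcpyHtoD for writes and cuMemcpyDtoH for reads, and the name `handle_request` is hypothetical. The magic numbers and header layouts follow the NBD protocol's simple request and reply format.

```python
import struct

NBD_REQUEST_MAGIC = 0x25609513
NBD_SIMPLE_REPLY_MAGIC = 0x67446698
NBD_CMD_READ, NBD_CMD_WRITE = 0, 1
NBD_EINVAL = 22

# Big-endian: magic, command flags, type, cookie, offset, length (28 bytes).
REQUEST = struct.Struct(">IHHQQI")
# Big-endian: magic, error, cookie (16 bytes).
REPLY = struct.Struct(">IIQ")


def handle_request(store, header, payload=b""):
    """Apply one NBD request to `store` and return the reply bytes.

    `store` is a bytearray standing in for GPU memory (an assumption for
    this sketch); reads return the reply header followed by the data.
    """
    magic, _flags, cmd, cookie, offset, length = REQUEST.unpack(header)
    if magic != NBD_REQUEST_MAGIC:
        raise ValueError("bad NBD request magic")
    if offset + length > len(store):
        return REPLY.pack(NBD_SIMPLE_REPLY_MAGIC, NBD_EINVAL, cookie)
    if cmd == NBD_CMD_READ:
        data = bytes(store[offset:offset + length])  # real daemon: DtoH copy
        return REPLY.pack(NBD_SIMPLE_REPLY_MAGIC, 0, cookie) + data
    if cmd == NBD_CMD_WRITE:
        if len(payload) != length:
            raise ValueError("payload length does not match request")
        store[offset:offset + length] = payload  # real daemon: HtoD copy
        return REPLY.pack(NBD_SIMPLE_REPLY_MAGIC, 0, cookie)
    # Other commands (flush, trim, disconnect) are left out of this sketch.
    return REPLY.pack(NBD_SIMPLE_REPLY_MAGIC, NBD_EINVAL, cookie)
```

A real server also has to complete the NBD handshake and loop over the socket. This fragment covers only the per-request step, where block offsets map onto the memory buffer.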

No kernel module to write or maintain. No NVIDIA kernel symbols. Survives kernel and driver updates without rebuilding anything.

Why not the NVIDIA P2P API?

The "obvious" approach is nvidia_p2p_get_pages_persistent, which pins VRAM pages in BAR1 so the CPU can access them directly via ioremap_wc. Every existing project that tried this route hit the same wall: the NVIDIA driver returns EINVAL on consumer GeForce GPUs. This holds for both the persistent and non-persistent variants and for both flag values. The call is gated at the RM level for Quadro/datacenter SKUs only, regardless of driver version.

The other approach, calling ioremap_wc directly on the BAR1 physical address without going through the P2P API, also doesn't work. The GPU's internal page tables have only ~16 MiB of BAR1 mapped (just the display framebuffer), and reads from the rest return zeros. mkswap appears to succeed, but swapon then fails because the swap header isn't actually there.
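This failure mode can be caught by reading the first page of the device back after mkswap. A Linux swap area carries the signature SWAPSPACE2 in the last 10 bytes of its first page, so a page that reads back as all zeros cannot be activated. The check below is an illustrative sketch, not part of the project. The function name is hypothetical and the 4096-byte page size is an assumption that holds on typical x86-64 systems.

```python
SWAP_MAGIC = b"SWAPSPACE2"


def has_swap_signature(first_page, page_size=4096):
    """Return True if `first_page` ends with the Linux swap signature.

    `first_page` is the first `page_size` bytes read from the device.
    """
    if len(first_page) < page_size:
        return False
    start = page_size - len(SWAP_MAGIC)
    return first_page[start:page_size] == SWAP_MAGIC


if __name__ == "__main__":
    zero_page = bytes(4096)  # what an unmapped BAR1 region would read back as
    good_page = bytes(4096 - len(SWAP_MAGIC)) + SWAP_MAGIC
    print(has_swap_signature(zero_page), has_swap_signature(good_page))
```

In practice the input would come from reading the first page of the block device, which requires root.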
