Today, we're benchmarking and analyzing one of Nvidia's most interesting new technologies in development: RTX Neural Texture Compression (NTC), an AI-driven technology that uses Tensor Cores to compress and decompress data, thus reducing VRAM requirements by up to 80%.
When Nvidia unveiled the RTX 50-series graphics cards, the company also announced several neural rendering technologies alongside those GPUs. These technologies improve the representation of materials, provide more efficient compression of textures, and increase the quality of indirect light through inferred path-traced rays.
This is all part of a new paradigm for real-time graphics called neural shading, which makes part of the graphics pipeline trainable. Small neural networks run inside shaders, working together with the rest of the renderer, and are hardware-accelerated through Cooperative Vectors to enable efficient real-time performance.
Instead of having to write complex shader code, developers can train AI models to estimate the result that the shader code would have computed. This approach can tackle rendering challenges that are difficult to solve through traditional methods.
Today, we will focus on one of these technologies in particular – RTX Neural Texture Compression. We'll break down how NTC works, benchmark it on a number of GPUs, and share insights from Alexey Panteleev, a Distinguished DevTech Engineer at Nvidia and NTC developer.
What is RTX Neural Texture Compression?
RTX Neural Texture Compression (NTC) is a machine learning-based method for texture compression and decompression. It can run in three different modes in DirectX 12: Inference on Load, Inference on Sample, and Inference on Feedback. In Vulkan, Inference on Feedback is not supported, so the only two available modes are Inference on Load and Inference on Sample.
The compression phase consists of the original textures being transformed into a combination of weights for a small neural network and latent features. In Inference on Sample mode, the decompression phase consists of reading the latent data and then performing an inference operation by passing it through a small Multi-Layer Perceptron (MLP) network whose weights were determined during the compression phase. Each texel is decompressed when needed. NTC is deterministic – it is not generative.
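To make the decompression step more concrete, here is a minimal sketch of the idea in NumPy. It is purely illustrative, not Nvidia's implementation: the real MLP architecture, latent layout, and activation functions are not specified here, so the dimensions, function names, and randomly initialized weights below are all assumptions. The key points it demonstrates are that the weights are fixed per texture (produced by the compression phase) and that inference is deterministic, so the same latents always reconstruct the same texel.

```python
import numpy as np

def decompress_texel(latents, w1, b1, w2, b2):
    """Reconstruct one texel by passing its sampled latent features
    through a small MLP. The weights (w1, b1, w2, b2) are fixed per
    texture, determined during the compression phase; inference is
    deterministic, not generative."""
    h = np.maximum(latents @ w1 + b1, 0.0)  # hidden layer with ReLU
    return h @ w2 + b2                      # reconstructed channels

# Toy dimensions (hypothetical): 8 latent features -> 16 hidden -> 4 (RGBA)
rng = np.random.default_rng(0)
latents = rng.standard_normal(8)            # stands in for sampled latent data
w1, b1 = rng.standard_normal((8, 16)), np.zeros(16)
w2, b2 = rng.standard_normal((16, 4)), np.zeros(4)

rgba = decompress_texel(latents, w1, b1, w2, b2)
print(rgba.shape)  # one texel's worth of channels: (4,)
```

In the Inference on Sample mode described above, this per-texel evaluation happens on demand inside the shader, which is why hardware acceleration of small matrix-vector operations matters for real-time performance.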