Tech News
← Back to articles

AMD’s RDNA4 GPU architecture

read original related products more articles

RDNA4 is AMD’s latest graphics-focused architecture, and fills out their RX 9000 line of discrete GPUs. AMD noted that creating a good gaming GPU requires understanding both current workloads, as well as taking into account what workloads might look like five years in the future. Thus AMD has been trying to improve efficiency across rasterization, compute, and raytracing. Machine learning has gained importance including in games, so AMD’s new GPU architecture caters to ML workloads as well.

From AMD’s perspective, RDNA4 represents a large efficiency leap in raytracing and machine learning, while also improving on the rasterization front. Improved compression helps keep the graphics architecture fed. Outside of the GPU’s core graphics acceleration responsibility, RDNA4 brings improved media and display capabilities to round out the package.

Media Engine

The Media Engine provides hardware accelerated video encode and decode for a wide range of codecs. High end RDNA4 parts like the RX 9070XT have two media engines. RDNA4’s media engines feature faster decoding speed, helping save power during video playback by racing to idle. For video encoding, AMD targeted better quality in H.265, H.265, and AV1, especially in low latency encoding.

Low latency encoder modes are mostly beneficial for streaming, where delays caused by the media engine ultimately translate to a delayed stream. Reducing latency can make quality optimizations more challenging. Video codecs strive to encode differences between frames to economize storage. Buffering up more frames gives the encoder more opportunities to look for similar content across frames, and lets it allocate more bitrate budget for difficult sequences. But buffering up frames introduces latency. Another challenge is some popular streaming platforms mainly use H.264, an older codec that’s less efficient than AV1. Newer codecs are being tested, so the situation may start to change as the next few decades fly by. But for now, H.264 remains important due to its wide support.

Testing with an old gameplay clip from Elder Scrolls Online shows a clear advantage for RDNA4’s media engine when testing with the latency-constrained VBR mode and encoder tuned for low latency encoding (-usage lowlatency -rc vbr_latency). Netflix’s VMAF video quality metric gives higher scores for RDNA4 throughout the bitrate range. Closer inspection generally agrees with the VMAF metric.

RDNA4 does a better job preserving high contrast outlines. Differences are especially visible around text, which RDNA4 handles better than its predecessor while using a lower bitrate. Neither result looks great with such a close look, with blurred text on both examples and fine detail crushed in video encoding artifacts. But it’s worth remembering that the latency-constrained VBR mode uses a VBV buffer of up to three frames, while higher latency modes can use VBV buffer sizes covering multiple seconds of video. Encoding speed has improved slightly as well, jumping from ~190 to ~200 FPS from RDNA3.5 to RDNA4.

Display Engine

The display engine fetches on-screen frame data from memory, composites it into a final image, and drives it to the display outputs. It’s a basic task that most people take for granted, but the display engine is also a good place to perform various image enhancements. A traditional example is using a lookup table to apply color correction. Enhancements at the display engine are invisible to user software, and are typically carried out in hardware with minimal power cost. On RDNA4, AMD added a “Radeon Image Sharpening” filter, letting the display engine sharpen the final image. Using dedicated hardware at the display engine instead of the GPU’s programmable shaders means that the sharpening filter won’t impact performance and can be carried out with better power efficiency. And, AMD doesn’t need to rely on game developers to implement the effect. Sharpening can even apply to the desktop, though I’m not sure why anyone would want that.

Power consumption is another important optimization area for display engines. Traditionally that’s been more of a concern for mobile products, where maximizing battery life under low load is a top priority. But RDNA4 has taken aim at multi-monitor idle power with its newer display engine. AMD’s presentation stated that they took advantage of variable refresh rates on FreeSync displays. They didn’t go into more detail, but it’s easy to imagine what AMD might be doing. High resolution and high refresh rate displays translate to high pixel rates. That in turn drives higher memory bandwidth demands. Dynamically lowering refresh rates could let RDNA4’s memory subsystem enter a low power state while still meeting refresh deadlines.

... continue reading