
Racing karts on a Rust GPU kernel driver


A few months ago, we introduced Tyr, a Rust driver for Arm Mali GPUs that continues to see active development upstream and downstream. As the upstream code awaits broader ecosystem readiness, we have focused on a downstream prototype that will serve as a baseline for community benchmarking and help guide our upstreaming efforts.

Today, we are excited to share that the Tyr prototype has progressed from basic GPU job execution to running GNOME, Weston, and full-screen 3D games like SuperTuxKart, demonstrating a functional, high-performance Rust driver that matches C-driver performance and paves the way for eventual upstream integration!

Setting the stage

I previously discussed the relationship between user-mode drivers (UMDs) and kernel-mode drivers (KMDs) in one of my posts about how GPUs work. Here's a quick recap to help get you up to speed:

One thing to understand is that the majority of the complexity tends to reside at the UMD level. This component is in charge of translating higher-level API commands into lower-level commands that the GPU can understand. Nevertheless, the KMD is responsible for providing the key operations that make its user-mode driver implementable in the first place, and it must do so in a way that fairly shares the underlying GPU hardware among the multiple tasks in the system.

While the UMD will take care of translating from APIs like Vulkan or OpenGL into GPU-specific commands, the KMD must bring the GPU hardware to a state where it can accept requests before it can share the device fairly among the UMDs in the system. This covers power management, parsing and loading the firmware, as well as giving the UMD a way to allocate GPU memory while ensuring isolation between different GPU contexts for security.
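One piece of that puzzle, ensuring isolation between GPU contexts, boils down to each context having its own GPU virtual address space, so no context can resolve or touch another context's mappings. As a toy model of that idea (all names here are illustrative; a real KMD backs this with per-context page tables in hardware):

```rust
use std::collections::HashMap;

/// Toy model of a per-context GPU virtual address space.
struct GpuContext {
    /// GPU virtual address -> buffer object handle.
    mappings: HashMap<u64, u32>,
}

/// A bare-bones sketch of the KMD's bookkeeping.
struct Kmd {
    contexts: HashMap<u32, GpuContext>,
    next_ctx: u32,
}

impl Kmd {
    fn new() -> Self {
        Kmd { contexts: HashMap::new(), next_ctx: 0 }
    }

    /// Create a fresh, empty address space for a new UMD context.
    fn create_context(&mut self) -> u32 {
        let id = self.next_ctx;
        self.next_ctx += 1;
        self.contexts.insert(id, GpuContext { mappings: HashMap::new() });
        id
    }

    /// Map a buffer object at a GPU VA, visible only within `ctx`.
    fn map(&mut self, ctx: u32, gpu_va: u64, bo: u32) {
        self.contexts.get_mut(&ctx).unwrap().mappings.insert(gpu_va, bo);
    }

    /// Resolve a GPU VA within a context; addresses mapped by other
    /// contexts are simply not visible here.
    fn resolve(&self, ctx: u32, gpu_va: u64) -> Option<u32> {
        self.contexts.get(&ctx)?.mappings.get(&gpu_va).copied()
    }
}
```

The security property falls out directly: the same GPU address resolves differently (or not at all) depending on which context asks.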

This was our initial focus for quite a few months of work on Tyr, and testing was mainly done through the IGT framework. These tests mostly consisted of issuing simple ioctl() calls against the driver and checking whether the results made sense.
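Those ioctl() request numbers are not arbitrary: Linux packs the transfer direction, a type byte, a sequence number, and the argument size into each one. As a small, driver-agnostic illustration (bit layout from include/uapi/asm-generic/ioctl.h, shown here deriving the well-known DRM_IOCTL_VERSION number rather than anything Tyr-specific):

```rust
// Field widths and shifts of the standard Linux ioctl number encoding.
const IOC_NRBITS: u32 = 8;
const IOC_TYPEBITS: u32 = 8;
const IOC_SIZEBITS: u32 = 14;

const IOC_NRSHIFT: u32 = 0;
const IOC_TYPESHIFT: u32 = IOC_NRSHIFT + IOC_NRBITS;
const IOC_SIZESHIFT: u32 = IOC_TYPESHIFT + IOC_TYPEBITS;
const IOC_DIRSHIFT: u32 = IOC_SIZESHIFT + IOC_SIZEBITS;

const IOC_WRITE: u32 = 1;
const IOC_READ: u32 = 2;

/// Equivalent of the C _IOWR() macro: an ioctl whose argument is both
/// read and written by the kernel.
const fn iowr(ty: u32, nr: u32, size: u32) -> u32 {
    ((IOC_READ | IOC_WRITE) << IOC_DIRSHIFT)
        | (size << IOC_SIZESHIFT)
        | (ty << IOC_TYPESHIFT)
        | (nr << IOC_NRSHIFT)
}

// DRM ioctls all use the 'd' type byte. DRM_IOCTL_VERSION is nr 0x00
// and takes a 64-byte struct drm_version on 64-bit systems, yielding
// the 0xC0406400 value familiar from strace output.
const DRM_IOCTL_VERSION: u32 = iowr(b'd' as u32, 0x00, 0x40);
```

A driver's uAPI is essentially a set of such numbers plus the structs they carry, which is exactly the surface IGT-style tests exercise.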

By the way, anyone looking to dig deeper into the relationship between UMDs and KMDs on Linux should watch the talk my colleague Boris Brezillon gave on the topic at Kernel Recipes!

Submitting a single job

Once the GPU is ready to accept requests and userspace can allocate GPU memory as needed, the UMD can place all the resources required by a given workload into GPU buffers. These buffers can then be referenced by the command buffers containing the instructions to be executed, as we explain in the excerpt below:
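To make the buffer/command-buffer relationship concrete, here is a toy sketch of a job whose commands reference resource buffers by handle, together with the kind of sanity check a KMD might run before queuing it. All type and field names are hypothetical, not Tyr's actual uAPI:

```rust
/// Commands a GPU might execute; binds reference buffer objects (BOs)
/// that the UMD filled with vertex data, textures, and so on.
enum Cmd {
    /// Bind the vertex data stored in a BO.
    BindVertexBuffer { bo: u32 },
    /// Bind a texture stored in a BO.
    BindTexture { bo: u32 },
    /// Kick the actual draw.
    Draw { vertex_count: u32 },
}

/// A single submission: the BOs it needs resident, plus the command
/// stream that references them.
struct Job {
    referenced_bos: Vec<u32>,
    commands: Vec<Cmd>,
}

/// A KMD-side check: every BO a command references must have been
/// declared by the job, so the kernel knows to keep it mapped while
/// the job runs.
fn validate(job: &Job) -> bool {
    job.commands.iter().all(|cmd| match cmd {
        Cmd::BindVertexBuffer { bo } | Cmd::BindTexture { bo } => {
            job.referenced_bos.contains(bo)
        }
        Cmd::Draw { .. } => true,
    })
}
```

A job binding only declared BOs validates; one referencing a handle the job never declared does not.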

... continue reading