How to write Rust in the kernel: part 3
The interfaces between C and Rust in the kernel have grown over time; any non-trivial Rust driver will use a number of these. Tasks like allocating memory, dealing with immovable structures, and interacting with locks are necessary for handling most devices. There are also many subsystem-specific bindings, but the focus of this third item in our series on writing Rust in the kernel will be on an overview of the bindings that all kernel Rust code can be expected to use.
Rust code can call C using the foreign function interface (FFI); given that, one potential way to integrate Rust into the kernel would have been to let Rust code call kernel C functions directly. There are a few problems with that approach, however: __always_inline functions, non-idiomatic APIs, etc. In particular, C and Rust have different approaches to freeing memory and locking.
During the early planning phases, the project proposed adopting a rule that there should be a single, centralized set of Rust bindings for each subsystem, as explained in the kernel documentation. This has the disadvantage (compared to direct use of Rust's FFI) of creating some extra work for a Rust programmer who wishes to call into a new area of the kernel, but as more bindings are written that need should go away over time. The advantage of the approach is that there's a single set of standardized Rust interfaces to learn, with all of the documentation in one place, which should make building and understanding the bindings less work overall. The interfaces can also be reviewed by the Rust maintainers in one place for safety and quality.
Allocating memory
Like C, Rust puts local variables (including compound structures) on the stack by default. But most programs will eventually need the flexibility offered by heap allocation, and the limitations on kernel-stack size mean that even purely local data may require heap allocation. In user space, Rust programs use automatic heap allocations for some types — mainly Box (a smart pointer into the heap) and Vec (a growable, heap-allocated array). In the kernel, these interfaces would not provide nearly enough control. Instead, allocations are performed using the interfaces in the kernel::alloc module, which allow for specifying allocation flags and handling the possibility of failure.
The Rust interfaces support three ways to allocate kernel memory: Kmalloc, Vmalloc, and KVmalloc, corresponding to the memory-management API functions with similar names. The first two allocate physically contiguous memory or virtually contiguous memory, respectively. KVmalloc first tries to allocate physically contiguous memory, and then falls back to virtually contiguous memory. No matter which allocator is used, the pointers that are exposed to Rust are part of the virtual address space, as in C.
These three allocators all implement the Allocator interface, which is similar to the unstable user-space trait of the same name. While an allocator can be used to directly create a [u8] (a slice of bytes; conceptually similar to how malloc() returns a void * instead of a specific type), the more ergonomic and less error-prone use is to allocate Box or Vec structures. Since memory allocation is so common, the interfaces provide short aliases for boxes and vectors made with each allocator, such as KBox, KVBox, and VVec. Reference-counted allocations can be made with Arc.
The choice of allocator is far from the only thing that kernel programmers care about when allocating memory, however. Depending on the context, it may or may not be acceptable to block, to swap, or to receive memory from a particular zone. When allocating, the flags in kernel::alloc::flags can be used to specify more details about how the necessary memory should be obtained:
let boxed_integer: Result<KBox<u64>, AllocError> = KBox::new(42, GFP_KERNEL);
That example allocates an unsigned 64-bit integer, initialized to 42, with the usual set of allocation flags (GFP_KERNEL). For a small allocation like this, that likely means the memory will come from the kernel's slab allocator, possibly after triggering memory reclamation or blocking. This particular allocation cannot fail, but a larger one using the same API could, if there is no suitable memory available even after reclamation. Therefore, the KBox::new() function doesn't return the resulting heap allocation directly. Instead, it returns a Result that contains either the successful heap allocation, or an AllocError.
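The shape of that API can be sketched in user-space Rust; here try_new_box() and the AllocError type are stand-ins invented for illustration, not the kernel's actual definitions:

```rust
// Stand-in for the kernel's AllocError; not the real kernel type.
#[derive(Debug)]
struct AllocError;

// A hypothetical fallible constructor playing the role of KBox::new():
// it returns a Result rather than aborting on failure.
fn try_new_box(value: u64) -> Result<Box<u64>, AllocError> {
    // In the kernel, failure would come from the allocator; here the
    // allocation always succeeds, for illustration.
    Ok(Box::new(value))
}

// Callers must handle the Result explicitly, as kernel code does.
fn demo() -> u64 {
    match try_new_box(42) {
        Ok(boxed) => *boxed,  // use the allocation
        Err(AllocError) => 0, // or recover from the failure
    }
}
```

The point of the shape is that forgetting to handle the error case is a compile-time error, not a runtime surprise.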
Reading generic types

C doesn't really have an equivalent of Rust's generic types; the closest might be a macro that can be used to define a structure with different types substituted in for a field. In this case, the Result that KBox::new() returns has been given two type parameters. The first is the type of the data associated with a successful result, and the second is the type associated with an error result. Matching angle brackets in a Rust type always play this role of specifying a (possibly optional) type to include as a field nested somewhere inside the structure.
Boxes, as smart pointers, have a few nice properties compared to raw pointers. A KBox is always initialized — KBox::new() takes an initial value, as shown in the example above. Boxes are also automatically freed when they go out of scope, which is almost always what one wants from a heap allocation. When it isn't, the KBox::leak() or KBox::into_raw() methods can be used to override Rust's lifetime analysis and let the heap allocation live until the programmer takes care of it with KBox::from_raw().
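User-space Rust's Box has methods with the same names and semantics, so the escape hatch can be sketched in ordinary std Rust:

```rust
// Round-trip a heap allocation through a raw pointer, as kernel code does
// when handing a KBox to a C API that stores it for later.
fn round_trip(value: u64) -> u64 {
    let boxed = Box::new(value);

    // into_raw() releases the allocation from Rust's lifetime analysis;
    // from this point on, freeing it is the programmer's responsibility.
    let raw: *mut u64 = Box::into_raw(boxed);

    // ... the raw pointer could now be stored in a C structure ...

    // from_raw() reclaims ownership, so the allocation is freed normally
    // when `reclaimed` is dropped at the end of this function.
    let reclaimed = unsafe { Box::from_raw(raw) };
    *reclaimed
}
```

Calling from_raw() twice on the same pointer, or never calling it at all, would be a double-free or a leak — exactly the class of mistake the smart pointer otherwise rules out.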
Of course, there are also times when a programmer would like to allocate space on the heap, but not actually fill it with anything yet. For example, the Rust bindings for user-space memory allocate a buffer for user-space data to be copied into, without initializing it first. Rust indicates that a structure may be uninitialized by wrapping it in MaybeUninit; allocating a Box holding a MaybeUninit works just fine.
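A user-space sketch of the same pattern, using std's Box and MaybeUninit rather than the kernel types:

```rust
use std::mem::MaybeUninit;

// Allocate heap space without initializing it, fill it in later, and
// only then assert that it holds a valid value.
fn fill_later(value: u64) -> u64 {
    let mut buf: Box<MaybeUninit<u64>> = Box::new(MaybeUninit::uninit());

    // Initialize the buffer (in the kernel, this step might be a copy
    // from user space instead).
    buf.write(value);

    // Only after initialization is it sound to read the value; the
    // compiler cannot check this, hence the unsafe block.
    unsafe { (*buf).assume_init() }
}
```

Calling assume_init() before the write would be undefined behavior, which is why that step is the one that requires unsafe.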
Self-referential structures
The kernel features a number of self-referential structures, such as doubly linked lists. Sharing these structures with Rust code poses a problem: moving a value that refers to itself (including indirectly) could cause the invariants of this kind of structure to be violated. For example, if a doubly linked list node is moved, node->prev->next will no longer refer to the right address. In C, programmers are expected to just not do that.
But Rust tries to localize dangerous operations to areas of the code marked with unsafe. Moving values around is a common thing to do; it would be inconvenient if it were considered unsafe. To solve this, the Rust developers created an idea called "pinning", which is used to mark structures that cannot be safely relocated. The standard library is designed in such a way that these structures cannot be moved by accident. The Rust kernel developers imported the same idea into the kernel Rust APIs; when referencing a self-referential structure created in C, it must be wrapped in the Pin type on the Rust side. (Some other pointers in the kernel API, notably Arc, include an implicit Pin, so the wrapping may not always be visible.) It might not immediately cause problems if Pin were omitted in the Rust bindings for a self-referential structure, but it would still be unsound, since it could let ostensibly safe Rust driver code cause memory corruption.
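The marker type comes from user-space Rust's std::pin module; a minimal sketch of the API shape:

```rust
use std::pin::Pin;

// Box::pin() allocates on the heap and returns a Pin<Box<T>>, promising
// that the value will never be moved for the rest of its life.
fn pinned_value() -> Pin<Box<u64>> {
    Box::pin(5u64)
}

// The contents can still be read through the pin. What safe code cannot
// do, for a type that is not Unpin, is obtain a &mut T that would let
// the value be swapped or moved out. (u64 is Unpin, so this sketch only
// illustrates the API shape; a self-referential type would not implement
// Unpin and would be truly immovable.)
fn read_pinned(p: &Pin<Box<u64>>) -> u64 {
    **p
}
```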
To simplify the process of allocating a large structure with multiple pinned components, the Rust API includes the pin_init!() and try_pin_init!() macros. Prior to their inclusion in the kernel, creating a pinned allocation was a multi-step process using unsafe APIs. The macro works along with the #[pin_data] and #[pin] macros in a structure's definition to build a custom initializer. These PinInit initializers represent the process of constructing a pinned structure. They can be written by hand, but the process is tedious, so the macros are normally used instead. Language-level support is the subject of ongoing debate in the Rust community. PinInit structures can be passed around or reused to build an initializer for a larger partially-pinned structure, before finally being given to an allocator to be turned into a real value of the appropriate type. See below for an example.
Locks
User-space Rust code typically organizes locks by having structures that wrap the data covered by the lock. The kernel API makes lock implementations matching that convention available. For example, a Mutex actually contains the data that it protects, so that it can ensure all accesses to the data are made with the Mutex locked. Since C code doesn't tend to work like this, the kernel's existing locking mechanisms don't translate directly into Rust.
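The convention is the same one std::sync::Mutex uses in user space; kernel mutexes differ in how they are created (and their lock() cannot fail), but the data-inside-the-lock shape is identical:

```rust
use std::sync::Mutex;

// The mutex owns the data it protects; there is no way to reach the
// counter without going through lock().
struct Stats {
    counter: Mutex<u64>,
}

fn increment(stats: &Stats) {
    // lock() returns a guard; the data is only accessible through it.
    let mut guard = stats.counter.lock().unwrap();
    *guard += 1;
    // The guard is dropped at the end of the scope, releasing the lock.
}

fn read(stats: &Stats) -> u64 {
    *stats.counter.lock().unwrap()
}
```

Because every access goes through the guard, forgetting to take the lock is not merely a bug that a reviewer must catch; it is code that does not compile.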
In addition to traditional Rust-style locks, the kernel's Rust APIs include special types for dealing with locks separated from the data they protect: LockedBy and GlobalLockedBy. These use Rust's lifetime system to enforce that a specific lock is held when the data is accessed.
Currently, the Rust bindings in kernel::sync support spinlocks, mutexes, and read-side read-copy-update (RCU) locks. When asked to look over an early draft of this article, Benno Lossin warned that the current RCU support is "very barebones", but that the Rust developers plan to expand on it over time. The spinlocks and mutexes in these bindings require a lockdep class key to create, so all of the locks used in Rust are automatically covered by the kernel's internal locking validator. Internally, this involves creating some self-referential state, so both spinlocks and mutexes must be pinned in order to be used. In all, defining a lock in Rust ends up looking like this example lightly adapted from some of the Rust sample code:
// The `#[pin_data]` macro builds the custom initializer for this type.
#[pin_data]
struct Configuration {
    #[pin]
    data: Mutex<(KBox<[u8; PAGE_SIZE]>, usize)>,
}

impl Configuration {
    // The value returned can be used to build a larger structure, or it can
    // be allocated on the heap with `KBox::pin_init()`.
    fn new() -> impl PinInit<Self, Error> {
        try_pin_init!(Self {
            // The `new_mutex!()` macro creates a new lockdep class and
            // initializes the mutex with it.
            data <- new_mutex!((KBox::new([0; PAGE_SIZE], flags::GFP_KERNEL)?, 0)),
        })
    }
}

// Once created, references to the structure containing the lock can be
// passed around in the normal way.
fn show(container: &Configuration, page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
    // Calling the mutex's `lock()` function returns a smart pointer that
    // allows access only so long as the lock is held.
    let guard = container.data.lock();
    let data = guard.0.as_slice();
    let len = guard.1;
    page[0..len].copy_from_slice(&data[0..len]);
    Ok(len)
    // `guard` is automatically dropped at the end of its containing scope,
    // freeing the lock. Trying to return data from inside the lock past the
    // end of the function without copying it would be a compile-time error.
}
Using a lock defined in C works much like in show() above, except that there is an additional step to handle the fact that the data may not be directly contained in the lock structure:
// The C lock will still be released when guard goes out of scope.
let guard = c_lock.lock();

// Data that is marked as `LockedBy` in the Rust/C bindings takes a reference
// to the guard of the matching lock as evidence that the lock has been acquired.
let data = some_other_structure.access(&guard);
See the LockedBy examples for a complete demonstration. The interface is slightly more conceptually complicated than C's mutex_lock() and mutex_unlock(), but it does have the nice property of producing a compiler error instead of a run-time error for many kinds of mistakes. The mutex in this example cannot be double-locked or double-freed, nor can the data be accessed without the lock held. It can still be locked from a non-sleepable context or get involved in a deadlock, however, so some care is still required — at least until the custom tooling to track and enforce kernel locking rules at compile time is complete.
This kind of safer interface is, of course, the ultimate purpose behind introducing Rust bindings into the kernel — to make it possible to write drivers where more errors can be caught at compile time. No machine-checked set of rules can catch everything, however, so the next (and likely final) article in this series will focus on things to look for when reviewing Rust patches.