The phaseout of the mmap() file operation in Linux

The file_operations structure in the kernel is a set of function pointers implementing, as the name would suggest, operations on files. A subsystem that manages objects which can be represented by a file descriptor will provide a file_operations structure containing implementations of the various operations that a user of the file descriptor may want to carry out. The mmap() method, in particular, is invoked when user space calls the mmap() system call to map the object behind a file descriptor into its address space. That method, though, is currently on its way out in a multi-release process that started in 6.17.
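For readers who have not used the system call directly, the user-space side might look something like the following sketch. The helper name read_via_mmap() is made up for this illustration; only open(), fstat(), mmap(), munmap(), and close() are real interfaces. The mmap() call in the middle is the point at which the kernel ends up invoking the mmap() method of whatever lies behind the file descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the first `len` bytes of the file at `path`
 * read-only and copy them into `out`. Returns 0 on success, -1 on failure.
 */
int read_via_mmap(const char *path, char *out, size_t len)
{
    struct stat st;
    void *addr;
    int fd;

    assert(path != NULL && out != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Refuse empty requests and requests that run past the end of the file. */
    if (len == 0 || fstat(fd, &st) < 0 || st.st_size < 0 ||
        (size_t)st.st_size < len) {
        close(fd);
        return -1;
    }

    /* This system call is what leads the kernel to the file's mmap() method. */
    addr = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (addr == MAP_FAILED)
        return -1;

    memcpy(out, addr, len);
    munmap(addr, len);
    return 0;
}
```

For a regular file the method is supplied by the filesystem; for a device node it comes from the driver, which is where most of the in-kernel implementations discussed here live.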

The file_operations structure was introduced in the 0.95 release in March 1992; at that point it supported the basic read() and write() operations and not much else. Support for mmap() first appeared in 0.98.2 later that year, though it took a while before it actually worked as expected. The interface has evolved a bit over time, of course; in current kernels, its prototype is:

int (*mmap) (struct file *, struct vm_area_struct *);

The vm_area_struct structure (usually referred to as a VMA) describes a range of a process's address space; in this case, it provides mmap() with information about the offset within the file that is to be mapped, how much is to be mapped, the intended page protections, and the address range where the mapping will be. The driver implementing mmap() is expected to do whatever setup is necessary to make the right thing happen when user space accesses memory within that range. There are hundreds of mmap() implementations within the kernel, some of which are quite complex.
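The information a driver typically pulls out of the VMA can be sketched with a user-space model. Everything prefixed with mock_ or MOCK_ below is invented for this illustration and is far simpler than the kernel's real definitions; the field names vm_start, vm_end, vm_pgoff, and vm_page_prot do match fields of struct vm_area_struct, but the types here are simplified. The hypothetical device is assumed to be 16 pages long.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MOCK_PAGE_SHIFT   12    /* assumes 4KB pages */
#define MOCK_DEVICE_PAGES 16UL  /* hypothetical device size, in pages */

struct mock_file { int unused; };

/* A drastically reduced stand-in for struct vm_area_struct. */
struct mock_vma {
    unsigned long vm_start;     /* first address of the mapping */
    unsigned long vm_end;       /* first address past the mapping */
    unsigned long vm_pgoff;     /* offset into the file, in pages */
    unsigned long vm_page_prot; /* page protections (simplified type) */
};

/* Only the one method of interest is modeled. */
struct mock_file_operations {
    int (*mmap)(struct mock_file *, struct mock_vma *);
};

/* A driver-style mmap() method: validate the request against the device. */
static int mock_dev_mmap(struct mock_file *file, struct mock_vma *vma)
{
    unsigned long pages;

    (void)file;
    assert(vma->vm_end >= vma->vm_start);
    pages = (vma->vm_end - vma->vm_start) >> MOCK_PAGE_SHIFT;

    /* Reject mappings that would extend beyond the end of the device. */
    if (vma->vm_pgoff > MOCK_DEVICE_PAGES ||
        pages > MOCK_DEVICE_PAGES - vma->vm_pgoff)
        return -EINVAL;

    /*
     * A real driver would now arrange for accesses within the range to
     * work, for example by calling remap_pfn_range() or by installing
     * its own fault handler. This model stops at validation.
     */
    return 0;
}

static const struct mock_file_operations mock_dev_fops = {
    .mmap = mock_dev_mmap,
};
```

Note that the method receives a mutable pointer to the VMA; nothing in the prototype prevents it from writing to the structure as well as reading it.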

As described in this 6.17 commit by Lorenzo Stoakes, though, there are some significant problems with this API. The mmap() method is invoked after the memory-management layer has done much of its setup for the new mapping. If the operation fails at the driver layer, all of that setup must be unwound, which can be a complicated task. The real problem, though, is that mmap() gives the driver direct access to the VMA, which is one of the core memory-management data structures. The driver can make changes to the VMA, and many do with gusto. Those changes can force the memory-management layer to redo some of its setup; worse, they can introduce bugs or create other types of unpleasant surprises.
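The ordering problem can be made concrete with a toy model. This is not how the memory-management code is actually structured; all of the toy_ names are invented here, and the "setup" is reduced to a flag and two counters. The sketch only shows why calling the driver after the setup has been done, with a writable VMA, forces the caller to handle both a late failure and a VMA that changed behind its back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_vma { unsigned long vm_flags; };

/* Stand-in for the state the memory-management layer builds up. */
struct toy_mm_state {
    bool setup_done;
    int redo_count;   /* times setup had to be redone after the driver ran */
    int unwind_count; /* times setup had to be torn down after a failure */
};

typedef int (*toy_mmap_fn)(struct toy_vma *);

/* Toy version of the mapping path: setup first, driver callback second. */
int toy_mm_map(struct toy_mm_state *mm, struct toy_vma *vma, toy_mmap_fn drv)
{
    unsigned long flags_before;
    int ret;

    assert(mm != NULL && vma != NULL && drv != NULL);
    flags_before = vma->vm_flags;

    mm->setup_done = true;  /* the expensive work happens before the driver runs */

    ret = drv(vma);
    if (ret != 0) {
        /* Late failure: everything done above must be unwound. */
        mm->setup_done = false;
        mm->unwind_count++;
        return ret;
    }

    /* The driver had write access to the VMA; check whether it used it. */
    if (vma->vm_flags != flags_before)
        mm->redo_count++;
    return 0;
}

/* A driver that changes the VMA, as many real ones do. */
int toy_drv_changes_flags(struct toy_vma *vma)
{
    vma->vm_flags |= 0x1UL; /* arbitrary flag bit for the illustration */
    return 0;
}

/* A driver that fails after the setup has already been done. */
int toy_drv_fails(struct toy_vma *vma)
{
    (void)vma;
    return -1;
}
```

In this model, both the unwind path and the redo path exist only because of when the callback runs and what it is handed.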

Over the years, a number of important memory-management structures have been globally exposed in this way; more recently, developers have been working to make more of those structures private to the memory-management code. One step in that direction is to retire the mmap() method in favor of a new API that more clearly constrains what code outside of the memory-management layer can do.
