How to use storytelling to fit inline assembly into Rust
The Rust Abstract Machine is full of wonderful oddities that do not exist on the actual hardware. Inevitably, every time this is discussed, someone asks: “But, what if I use inline assembly? What happens with provenance and uninitialized memory and Tree Borrows and all these other fun things you made up that don’t actually exist?” This is a great question, but answering it properly requires some effort. In this post, I will lay down my current thinking on how inline assembly fits into the Rust Abstract Machine by giving a general principle that explains how anything we decide about the semantics of pure Rust impacts what inline assembly may or may not do.
Note that everything I discuss here applies to FFI calls just as much as it applies to inline assembly. Those mechanisms are fundamentally very similar: they allow Rust code to invoke code not written in Rust. I will not keep repeating “inline assembly or FFI” throughout the post, but every time I refer to inline assembly this is meant to also include FFI.
To get started, let me explain why there are things that even inline assembly is fundamentally not allowed to do.
Why can’t inline assembly do whatever it wants?
People like to think of inline assembly as freeing them from all the complicated requirements of the Abstract Machine. Unfortunately, that’s a pipe dream. Here is an example to demonstrate this:
    use std::arch::asm;

    #[inline(never)]
    fn innocent(x: &i32) {
        unsafe {
            // Store 0 at the address given by x.
            asm!(
                "mov dword ptr [{x}], 0",
                x = in(reg) x,
            );
        }
    }

    fn main() {
        let x = 1;
        innocent(&x);
        assert!(x == 1);
    }
When the compiler analyzes main, it realizes that only a shared reference is being passed to innocent. This means that whatever innocent does, it cannot change the value stored at *x. Therefore, the assertion can be optimized away.
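To make that reasoning concrete, here is a rough pure-Rust sketch of the transformation. The names opaque, as_written, and as_optimized are made up for illustration, and the "optimized" function is hand-written to show what the compiler is allowed to assume; it is not actual rustc output.

```rust
use std::hint::black_box;

// Hypothetical stand-in for any function that only receives `&i32`.
// It may read through the reference, but it may not write through it.
#[inline(never)]
fn opaque(x: &i32) {
    let _ = black_box(*x);
}

// The program as the programmer wrote it: call, then check.
fn as_written() -> bool {
    let x = 1;
    opaque(&x);
    x == 1
}

// What the compiler may turn it into: since `x` was only shared
// immutably, the comparison is assumed to be true and folded away.
fn as_optimized() -> bool {
    let x = 1;
    opaque(&x);
    true
}
```

As long as the callee respects the rules for shared references, both functions return the same result, which is exactly what justifies the rewrite.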
However, innocent actually does write to *x! So the optimization changed the behavior of the program. And indeed, this is exactly what happens with current versions of rustc: without optimizations, the assertion fails (the write really happens and the check observes it), but with optimizations, it passes (the check has been removed). Therefore, either the optimization was wrong, or the program had Undefined Behavior. And since this is an optimization that we really want to be able to perform, we can only pick the second option.
However, where does the UB come from? If the entire program were written in Rust, the answer would be “the aliasing model”. Both Stacked Borrows and Tree Borrows, and any other aliasing model worth considering for Rust, will make it UB to write through pointers derived from a shared reference. However, this time, parts of the program are not written in Rust, so things are not that simple. How can we say that the inline asm block violated Tree Borrows, when it is written in a language that does not have anything even remotely comparable to Tree Borrows? That’s what the rest of this post is about.
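For reference, here is a small sketch of what the pure-Rust side of that rule looks like. The function names are hypothetical, and the Cell variant is my own addition to show the usual way of opting out of shared-reference immutability; it is not something the example above uses. The UB variant appears only as a comment so that the block stays safe to run.

```rust
use std::cell::Cell;

// UB under the aliasing model (shown as a comment only):
//
//     fn zero_via_shared(x: &i32) {
//         let p = x as *const i32 as *mut i32;
//         unsafe { p.write(0) }; // write through a pointer derived from `&i32`
//     }

// Fine: the raw pointer is derived from a mutable reference,
// so writing through it is permitted.
fn zero_via_mut(x: &mut i32) {
    let p: *mut i32 = x;
    unsafe { p.write(0) };
}

// Also fine: `Cell` provides interior mutability, so a shared
// reference to it does not promise that the contents stay unchanged.
fn zero_via_cell(x: &Cell<i32>) {
    x.set(0);
}

// Exercises both permitted variants and returns the resulting values.
fn demo() -> (i32, i32) {
    let mut a = 1;
    zero_via_mut(&mut a);
    let b = Cell::new(1);
    zero_via_cell(&b);
    (a, b.get())
}
```

In both permitted variants the caller's type signals that the value may change, so the compiler cannot assume the value is still 1 after the call.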