I will often say that the so-called “C ABI” is a very bad one, and a relatively unimaginative one when it comes to passing complicated types effectively. A lot of people ask me “ok, what would you use instead”, and I just point them to the Go register ABI, but it seems most people have trouble filling in the gaps of what I mean. This article explains what I mean in detail.
I have discussed calling conventions in the past, but as a reminder: the calling convention is the part of the ABI that concerns itself with how arguments are passed to a function, how values are returned from it, and how to actually call a function. This includes which registers arguments go in, which registers values are returned out of, what function prologues/epilogues look like, how unwinding works, etc.
This particular post is primarily about x86, but I intend to be reasonably generic (so that what I’ve written applies just as well to ARM, RISC-V, etc). I will assume a general familiarity with x86 assembly, LLVM IR, and Rust (but not rustc’s internals).
Today, like many other natively compiled languages, Rust defines an unspecified calling convention that lets it call functions however it likes. In practice, Rust lowers to LLVM’s built-in C calling convention, which LLVM’s prologue/epilogue codegen generates calls for.
Rust is fairly conservative: it tries to generate LLVM function signatures that Clang could have plausibly generated. This has two significant benefits:
1. Good probability debuggers won’t choke on it. This is not a concern on Linux, though, because DWARF is very general and does not bake in the Linux C ABI. We will concern ourselves only with ELF-based systems and assume that debuggability is a nonissue.
2. It is less likely to tickle LLVM bugs due to using ABI codegen that Clang does not exercise. I think that if Rust tickles LLVM bugs, we should actually fix them (a very small number of rustc contributors do in fact do this).
However, we are too conservative. We get terrible codegen for simple functions:
Rust:

    fn extract(arr: [i32; 3]) -> i32 {
        arr[1]
    }

x86 Assembly:

    extract:
        mov eax, dword ptr [rdi + 4]
        ret
arr is 12 bytes wide, so you’d think it would be passed in registers, but no! It is passed by pointer! Rust is actually more conservative than what the Linux C ABI mandates: rustc does pass the [i32; 3] in registers when extern "C" is requested.
Rust:

    extern "C" fn extract(arr: [i32; 3]) -> i32 {
        arr[1]
    }

x86 Assembly:

    extract:
        mov rax, rdi
        shr rax, 32
        ret
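To see why a shift by 32 is all the extern "C" version needs, here is a rough model of the register contents, written in ordinary Rust. It assumes the System V x86-64 rule that a 12-byte aggregate of integers is split into eight-byte chunks, so arr[0] and arr[1] share rdi and arr[2] lands in rsi. The `Regs` struct and the `pack` and `extract_from_regs` helpers are hypothetical names made up for this sketch; they only simulate the bit layout and are not how rustc or LLVM implement argument passing.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the two integer argument registers
/// that carry the array under the assumed System V classification.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Regs {
    rdi: u64, // low 32 bits: arr[0], high 32 bits: arr[1]
    rsi: u64, // low 32 bits: arr[2], high 32 bits: unused
}

/// Models the caller's side: pack three i32s into two 64-bit registers.
/// Each element goes through u32 first so negative values are not sign-extended
/// into the neighbouring element's bits.
fn pack(arr: [i32; 3]) -> Regs {
    Regs {
        rdi: (arr[0] as u32 as u64) | ((arr[1] as u32 as u64) << 32),
        rsi: arr[2] as u32 as u64,
    }
}

/// Models the callee's side: `mov rax, rdi; shr rax, 32`, then read eax.
fn extract_from_regs(regs: Regs) -> i32 {
    (regs.rdi >> 32) as u32 as i32
}

fn main() {
    // The whole array fits in two 8-byte registers.
    assert_eq!(size_of::<[i32; 3]>(), 12);

    let arr = [10, -20, 30];
    let regs = pack(arr);

    // The simulated register read agrees with plain indexing.
    assert_eq!(extract_from_regs(regs), arr[1]);
    println!("arr[1] recovered from rdi: {}", extract_from_regs(regs));
}
```

The by-pointer version instead has to load from memory at offset 4 from rdi, which is the `dword ptr [rdi + 4]` in the first listing.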