There is no memory safety without thread safety
Memory safety is all the rage these days. But what does the term even mean? That turns out to be harder to nail down than you may think. Typically, people use this term to refer to languages that make sure that there are no use-after-free or out-of-bounds memory accesses in the program. This is then often seen as distinct from other notions of safety such as thread safety, which refers to programs that do not have certain kinds of concurrency bugs. However, in this post I will argue that this distinction isn’t all that useful, and that the actual property we want our programs to have is absence of Undefined Behavior.
Breaking memory safety with a data race
My main issue with the division of safety into fine-grained classes such as memory safety and thread safety is that there’s no meaningful sense in which a thread-unsafe language provides memory safety. To see what I mean by this, consider this program written in Go, which according to Wikipedia is memory-safe:
package main

// Just some arbitrary interface so we can later use an interface type.
type Thing interface {
	get() int
}

// Two types implementing the interface, with fields of very different types.
type Int struct {
	val int
}

func (s *Int) get() int {
	return s.val
}

type Ptr struct {
	val *int
}

func (s *Ptr) get() int {
	return *s.val
}

// A global variable of interface type, that we will swap back and
// forth between pointing to an `Int` and to a `Ptr`.
var globalVar Thing = &Int{val: 42}

// Repeatedly invoke the interface method on the global variable.
func repeat_get() {
	for {
		x := globalVar
		x.get()
	}
}

// Repeatedly change the dynamic type of the global variable.
func repeat_swap() {
	var myval = 0
	for {
		globalVar = &Ptr{val: &myval}
		globalVar = &Int{val: 42}
	}
}

func main() {
	go repeat_get()
	repeat_swap()
}
If you run this program (e.g. on the Go playground), it will crash very quickly:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x2a pc=0x468863]
Note that the address that caused the segfault is 0x2a, the hex representation of 42. What is happening here?
This example exploits the fact that Go stores values of interface types like Thing as pairs of a pointer to the data and a pointer to the vtable. Every time repeat_swap stores a new value in globalVar, it just does two separate stores to update those two pointers. In repeat_get, there’s thus a small chance that we read globalVar in between those two stores and get a mix of a pointer to an Int with the vtable for a Ptr. When that happens, we will run the Ptr version of get, which will dereference the Int’s val field as a pointer – and hence the program accesses address 42, and crashes.
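To make the unlucky interleaving concrete, here is a minimal single-threaded sketch that replays it step by step. It is a simplified model and not how the Go runtime actually represents interfaces: the iface struct, the string-valued vtable field, the function names, and the order of the two stores are all assumptions made for illustration.

```go
package main

import "fmt"

// iface is a hypothetical, simplified model of a two-word interface value.
type iface struct {
	vtable string  // stands in for the method-table pointer: "Int" or "Ptr"
	field  uintptr // stands in for the first word of the struct behind the data pointer
}

// get mimics dynamic dispatch: the vtable word alone decides how the
// data word is interpreted.
func get(v iface) string {
	if v.vtable == "Ptr" {
		return fmt.Sprintf("dereference address 0x%x", v.field)
	}
	return fmt.Sprintf("return integer %d", v.field)
}

// tornRead replays one unlucky interleaving of writer and reader.
// The store order is an assumption; the other order produces the same
// kind of mix when swapping back from Ptr to Int.
func tornRead() iface {
	global := iface{vtable: "Int", field: 42} // globalVar = &Int{val: 42}
	global.vtable = "Ptr"                      // writer: first of its two stores
	observed := global                         // reader: copies both words right now
	global.field = 0x1000                      // writer: second store (made-up address), too late
	return observed
}

func main() {
	// The reader sees the Ptr vtable paired with the Int payload.
	fmt.Println(get(tornRead())) // prints "dereference address 0x2a"
}
```

Neither half of the observed pair is corrupt on its own; the problem is only that the two halves belong to different values.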
One could construct a similar example using Go’s slices, where the data pointer, length, and capacity of the slice are stored in separate words, and reading a half-updated value can lead to an out-of-bounds access.
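The multi-word layout is easy to check. The sketch below assumes the standard Go toolchain and uses unsafe.Sizeof to count how many machine words a slice and an interface value occupy; the helper names words, sliceWords, and ifaceWords are made up for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// words converts a size in bytes into a number of machine words.
func words(size uintptr) uintptr {
	return size / unsafe.Sizeof(uintptr(0))
}

// sliceWords reports the size of a slice value: data pointer, length, capacity.
func sliceWords() uintptr {
	var s []int
	return words(unsafe.Sizeof(s))
}

// ifaceWords reports the size of an interface value: vtable pointer, data pointer.
func ifaceWords() uintptr {
	var i interface{ get() int }
	return words(unsafe.Sizeof(i))
}

func main() {
	fmt.Println("slice words:", sliceWords())     // 3 with the standard toolchain
	fmt.Println("interface words:", ifaceWords()) // 2 with the standard toolchain
}
```

Any value wider than one machine word cannot be updated with a single ordinary store, which is what leaves room for a concurrent reader to see a half-updated value.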