Previously on this blog I've written about how Nova JavaScript engine models garbage collection using the Rust borrow checker and how to work with it, I've rambled about how I came up with the model, and I've written about the philosophical underpinnings of garbage collection in general. Most importantly I have, together with a lot of contributors, written a JavaScript engine encompassing more than 100,000 lines of Rust using this model which is equal parts excellent and awful. It is excellent in that it manages to explain garbage collected handles in such a way that the borrow checker can check that unrooted handles are not kept on stack past garbage collection safepoints, but it is awful in how it achieves this, turning code into a soup of let handle = handle.bind(nogc) and handle.unbind() calls. A Norwegian university employee said of the system just last month: "That's worse than C++."
This entire time, I've been working with this model with the assumption that it is the correct way to model garbage collection, and that the manual aspects and some limitations of it are simply caused by limitations of the Rust borrow checker. Much of this changed last weekend because I was writing a safety comment to explain a very contrarian limitation of the system.
Working with a garbage collected heap
A garbage collected system always has some heap structure wherein it stores the garbage collected data. The heap will then contain garbage collected handles, ie. self-references. Let's consider a singular handle Handle<'_, T> stored on the heap and try to figure out what is the correct lifetime that we should ascribe to '_ .
Because this is a garbage collected system, as long as this Handle<'_, T> exists on the heap (and is itself reachable from some root) then the T is kept alive as well. It is incorrect for the Handle to be alive but the T to be dead, but once the Handle is dropped by the garbage collector it is also free to drop the T . This also applies to moving the T : conceptually we can say that if the data is moved, then it should first be copied into a new location, then a new Handle<'_, T> should replace the old handle, and only after that are we allowed to drop the original T . (Also note how this relates with eg. tombstones in concrete garbage collector implementations.) When the heap is dropped, all Handle s within are likewise dropped, but if the heap stays alive until the end of the program then so do the Handle s. It thus seems like the correct lifetime is some 'external that is determined by the heap's owner, but for convenience's sake we'll choose to use the 'static lifetime here.
Now, consider a singular handle Handle<'_, T> on the stack, and remember that these are unrooted handles and that our garbage collector does not do stack scanning. That means that the T is only guaranteed to exist until the next garbage collection run: the fact that we have a Handle<'_, T> in the first place means that the T should at least exist when we get the handle, but once garbage collection runs it might have dropped or moved the T such that our handle no longer points to a valid value. The lifetime we can ascribe to Handle is thus some 'local lifetime during which it is guaranteed that garbage collection does not happen. This 'local lifetime is obviously shorter than 'static . Now imagine we get a mutable reference to the handle on the heap and try to write a copy of our local handle into it, and watch what happens:
let local_handle: Handle<'local, T> = local; let heap_mut: &mut Handle<'static, T> = heap.get_mut(); *heap_mut = local_handle;
In garbage collection terms this is (basically) the act of "rooting" the local handle: we store the local handle on the heap where the garbage collector can see it, thus increasing its lifetime. This code is therefore completely fine from a logical standpoint. But! If we do this in the Nova JavaScript engine of today, it does not compile: our handles are covariant on their lifetime parameter, just like normal references, and using Rust references the above would look like this:
let local_handle: &'local T = &0; let heap_mut: &mut &'static T = heap.get_mut(); *heap_mut = local_handle;
This absolutely does not and should not compile: what the code here is saying is " heap_mut is a place that can store a reference to a T as long as that reference is valid until the end of the program", but we're trying to store a reference that is only valid until the end of this function call. Our reference's lifetime is too short, and allowing the code to compile would lead to use-after-free. So, obviously covariant lifetimes for garbage collected handles do not work. You can probably find many articles on the Internet decrying the borrow checker for not allowing this, but it is absolutely correct to stop us from doing use-after-free here. Yet, for garbage collected handles this is what we want to do and to do that we must turn to unsafe Rust. This is the kind of code that I was writing a safety comment on last weekend. Boiled down to its essentials, it looked much like this:
... continue reading