Abstract Rust is a non-Garbage Collected (GCed) language, but the lack of GC makes expressing data-structures that require shared ownership awkward, inefficient, or both. In this paper we explore a new design for, and implementation of, GC in Rust, called Alloy. Unlike previous approaches to GC in Rust, Alloy allows existing Rust destructors to be automatically used as GC finalizers: this makes Alloy integrate better with existing Rust code than previous solutions but introduces surprising soundness and performance problems. Alloy provides novel solutions for the core problems: finalizer safety analysis rejects unsound destructors from automatically being reused as finalizers; finalizer elision optimises away unnecessary finalizers; and premature finalizer prevention ensures that finalizers are only run when it is provably safe to do so.
1 Introduction
Amongst the ways one can classify programming languages are whether they are Garbage Collected (GCed) or not: GCed languages enable implicit memory management; non-GCed languages require explicit memory management (e.g C 's malloc / free functions). Rust's use of affine types [25, p. 5] and ownership does not fit within this classification: it is not GCed but it has implicit scope-based memory management. Most portions of Rust programs are as succinct as a GCed equivalent, but ownership is too inflexible to express shared ownership for data-structures that require multiple owners (e.g. doubly linked lists). Workarounds such as reference counting impose an extra burden on the programmer, make mistakes more likely, and often come with a performance penalty.
In an attempt to avoid such problems, there are now a number of GCs for Rust (e.g. [2, 11, 14, 32, 33]). Most introduce a user-visible type Gc which takes a value v of type T and moves v to the 'GC heap'. The Gc value itself is a wrapper around a pointer to v on the GC heap. Gc can be cloned (i.e. duplicated) and dereferenced to a value of type &T (i.e. a type-safe pointer) at will by the user. When no Gc wrappers pointing to v can be found, indirectly or directly, from the program's roots (e.g. variables on the stack), then the GC heap memory for v can be reclaimed.
It has proven hard to find a satisfying design and implementation for a GC for Rust, as perhaps suggested by the number of attempts to do so. We identify two fundamental challenges for GC for Rust: how to give Gc an idiomatic and complete API; and how to make finalizers (i.e. the code that is run just before a value is collected by the GC) safe, performant, and ergonomic.
struct GcNode { value: u8, nbr: Option>>} impl Drop for GcNode { fn drop(&mut self) { println!("drop {}", self.value); } } fn main() { let mut gc1 = Gc::new(RefCell::new(GcNode { value: 1, nbr: None } )); gc1.borrow_mut().nbr = Some(gc1); let gc2 = Gc::new(RefCell::new(GcNode { value: 2, nbr: None } )); gc2.borrow_mut().nbr = Some(gc2); gc1 = gc2; force_gc(); println!("{} {}", gc1.borrow().value, gc1.borrow().nbr.unwrap().borrow().value); }
In this paper we introduce Alloy, a new GC for Rust: an example of its use is shown in Listing 1. Alloy uses conservative garbage collection (i.e. treating each reachable machine word as a potential pointer), which naturally solves the API challenge. However, the finalization challenge is much more involved: the causes of this challenge, and our solutions to it, occupy the bulk of this paper.
Normal Rust code uses destructors (i.e. code which is run just before a value is reclaimed by Rust's implicit memory management) extensively. Although finalizers and destructors may seem to be synonyms, existing GCs for Rust cannot reuse destructors as finalizers: the latter must be manually implemented for each type that needs it. Unfortunately, even this is trickier than it appears: it is not possible to implement a finalizer for Gc if T is an external library; some parts of destructors are automatically created by the Rust compiler, but hand-written finalizers must duplicate those parts manually; and users can accidentally cause a type's finalizer to be run more than once. In short, finalization in existing GCs for Rust is unpalatable.
GCs for Rust are not alone in requiring manually written finalizers. In a close cousin to our work, a GC proposal for C++, the reuse of destructors as finalizers was ruled out due to seemingly insoluble problems [8, p. 32], which we divide into four categories: (1) some safe destructors are not safe finalizers; (2) finalizers can be run prematurely; (3) running finalizers on the same thread as a paused mutator can cause race conditions and deadlocks; (4) and finalizers are prohibitively slower than destructors. All are, at least to some degree, classical GC problems; all are exacerbated in some way by Rust; and none, with the partial exception of #2, has existing solutions.
We show that it is possible to reuse most Rust destructors as finalizers in a satisfying way. We introduce novel solutions to the long-standing problems this implies by making use of some of Rust's unusual static guarantees. We thus gain a better GC for Rust and solutions to open GC problems. Our solutions, in order, are: (1) finalizer safety analysis extends Rust's static analyses to reject programs whose destructors are not provably safe to be used as finalizers; (2) premature finalizer prevention automatically inserts fences to prevent the GC from being 'tricked' into collecting values before they are dead; (3) we run finalizers on a separate thread; and (4) and finalizer elision statically optimises away finalizers if the underlying destructor duplicates the GC's work.
Alloy as an implementation is necessarily tied to Rust, though most of the novel techniques in this paper rely on general properties of affine types and ownership. While we do not wish to claim generality without evidence, it seems likely that many of the techniques in this paper will generalise to other ownership-based languages, as and when such emerge.
Although Alloy is not production ready, its performance is already reasonable: when we control for the (admittedly somewhat slow) conservative GC (BDWGC) Alloy currently uses, the performance of Alloy varies from 0.74× to, in the worst case, 1.17× that of reference counting. Alloy is also sufficiently polished (e.g. good quality error messages) in other ways for it to: show a plausible path forwards for those who may wish to follow it; and to allow others to evaluate whether GC for Rust is a good idea or not.
This paper is divided into four main parts: GC and Rust background (Section 2); Alloy's basic design (Section 3); destructor and finalizer challenges and solutions (Sections 4 to 7); and evaluation (Section 8). The first three parts have the challenge that our work straddles two areas that can seem mutually exclusive: GC and Rust. We have tried to provide sufficient material for readers expert in one of these areas to gain adequate familiarity with the other, without boring either, but we encourage readers to skip material they are already comfortable with.
2 Background
2.1 The Challenges of Shared Ownership in Rust
Rust uses affine types and ownership to statically guarantee that: a value has a single owner (e.g. a variable); an owner can move (i.e. permanently transfer the ownership of) a value to another owner; and when a value's owner goes out of scope, the value's destructor is run and its backing memory reclaimed. An owner can pass references to a value to other code, subject to the following static restrictions: there can be multiple immutable references (' & ') to a value or a single mutable reference (' &mut '); and references cannot outlast the owner. These rules allow many Rust programs to be as succinct as their equivalents in GCed languages. This suggests that the search for a good GC for Rust may be intellectually stimulating but of little practical value.
However, there are many programs which need to express data structures which do not fit into the restrictions of affine types and ownership. These are often described as 'cyclic data-structures', but in this paper we use the more abstract term 'shared ownership', which includes, but is not limited to, cyclic data-structures.
A common way of expressing shared ownership is to use the reference counting type Rc from Rust's standard library. For many data-structures, this is a reasonable solution, but some forms of shared ownership require juggling strong and weak counts. This complicates programs (see Listing 2) and can cause problems when values live for shorter or longer than intended.
struct RcNode { value: u8, nbr: Option>> } impl Drop for RcNode { fn drop(&mut self) { println!("drop {}", self.value); } } fn main() { let mut rc1 = Rc::new(RefCell::new(RcNode{value: 1, nbr: None})); rc1.borrow_mut().nbr = Some(Rc::downgrade(&rc1)); let rc2 = Rc::new(RefCell::new(RcNode{value: 2, nbr: None})); rc2.borrow_mut().nbr = Some(Rc::downgrade(&rc2)); rc1 = Rc::clone(&rc2); println!("{} {}", rc1.borrow().value, rc1.borrow().nbr.as_ref().unwrap().upgrade().unwrap().borrow().value); }
A different solution is to store values in a vector and use integer indices into that vector. Such indices are morally closer to machine pointers than normal Rust references: the indices can become stale, dangle, or may never have been valid in the first place. The programmer must also manually deal with issues such as detecting unused values, compaction, and so on. In other words, the programmer ends up writing a partial GC themselves. A variant of this idea are arenas, which gradually accumulate multiple values but free all of them in one go: values can no longer be reclaimed too early, though arenas tend to unnecessarily increase the lifetime of values.
A type-based approach is GhostCell [35], which uses branding to statically ensure that at any given point only one part of a program can access a shared ownership data-structure. This necessarily excludes common use cases where multiple owners (e.g. in different threads) need to simultaneously access disjoint parts of a data-structure.
Although it is easily overlooked, some workarounds (e.g. Rc ) rely on using unsafe Rust (i.e. parts of the language, often involving pointers, that are not fully statically checked by the compiler). Pragmatically, we assume that widely used code, even if technically unsafe, has been pored over sufficiently that it is trustworthy in practise. However, 'new' solutions that a programmer implements using unsafe Rust are unlikely to immediately reach the same level of trustworthiness.
While we do not believe that every Rust program would be improved by GC, the variety of workarounds already present in Rust code, and the difficultly of creating new ones, suggests that there is a subset that would benefit from GC.
2.2 GC Terminology
GC is a venerable field and has accumulated terminology that can seem unfamiliar or unintuitive. We mostly use the same terminology as Jones et al [ 19 ], the major parts of which we define here.
A program which uses GC is split between the mutator (the user's program) and the collector (the GC itself). At any given point in time, a thread is either running as a mutator or a collector. In our context, all threads run as a collector at least sometimes (for reasons that will become apparent later, some threads always run as a collector). Tracing and reclamation is performed during a collection phase. Our collections always stop-the-world, where all threads running mutator code are paused while collection occurs.
A tracing GC is one that scans memory looking for reachable values from a program's roots: values, including cycles of values, that are not reachable from the roots can then be reclaimed. In contrast, a pure reference counting GC does not scan memory, and thus cannot free values that form a cycle. Increasingly, GC implementations make use of multiple techniques (see [3]) but, for simplicity's sake, we assume that implementations wholly use one technique or another except otherwise stated. For brevity, we use 'GC' as a short-hand for 'tracing GC'; when we deal with other kinds of GC (e.g. reference counting), we explicitly name them.
We refer to memory which is allocated via Gc as being on the GC heap. We use the term 'GC value' to refer both to the pointer wrapped in a Gc and the underlying value on the GC heap, even though multiple pointers / wrappers can refer to a single value on the GC heap, unless doing so would lead to ambiguity.
We use 'Alloy' to refer to the combination of: our extension to the Rust language; our modifications to the rustc compiler; and our integration of the Boehm-Demers-Weiser GC (BDWGC) into the runtime of programs compiled with our modified rustc .
3 Alloy : Design and Implementation
In this section we outline Alloy's basic design and implementation choices – the rest of the paper then goes into detail on the more advanced aspects.
3.1 Basic Design
Alloy provides a Gc type that exposes an API modelled on the Rc type from Rust's standard library, because Rc : is conceptually similar to Gc ; widely used in Rust code, and its API familiar; and that API reflects long-term experience about what Rust programmers need.
When a user calls Gc::new(v) , the value v is moved to the GC heap: the Gc value returned to the user is a simple wrapper around a pointer to v 's new address. The same underlying GCed value may thus have multiple, partly or wholly overlapping, references active at any point. To avoid undermining Rust's ownership system, dereferencing a Gc produces an immutable (i.e. ' & ') reference to the underlying value. If the user wishes to mutate the underlying value, they must use other Rust types that enable interior mutability (e.g. RefCell or Mutex ).
One feature that Alloy explicitly supports is the ability in Rust to cast references to raw pointers and back again. This can occur in two main ways. Gc can be dereferenced to &T which can then, as with any other reference, be converted to *const T (i.e. a C-esque pointer to T). Gc also supports the common Rust functions ( into_raw and from_raw ) which wrap the value-to-pointer conversion in a slightly higher-level API. The ability to convert references to raw pointers is used in many places (e.g. Rust's standard C Foreign Function Interface (FFI)). We believe that a viable GC for Rust must allow the same conversions, but doing so has a profound impact because Rust allows raw pointers to be converted to the integer type usize and back.
Having acknowledged that pointers can be 'disguised' as integers, it is then inevitable that Alloy must be a conservative GC: if a machine word's integer value, when considered as a pointer, falls within a GCed block of memory, then that block itself is considered reachable (and is transitively scanned). Since a conservative GC cannot know if a word is really a pointer, or is a random sequence of bits that happens to be the same as a valid pointer, this over-approximates the live set (i.e. the blocks that the GC will not reclaim). Typically the false detection rate is very low (see e.g. a Java study which measures it at under 0.01% of the live set [28]).
Conservative GC occupies a grey zone in programming language semantics: in most languages, and most compiler's internal semantics, conservative GC is, formally speaking, unsound; and furthermore some languages (including Rust) allow arbitrary 'bit fiddling' on pointers, temporarily obscuring the address they are referring to. Despite this, conservative GC is widely used, including in the two most widespread web browsers: Chrome uses it in its Blink rendering engine [1] and Safari uses it in its JavaScript VM JavaScriptCore [26]. Even in 2025, we lack good alternatives to conservative GC: there is no cross-platform API for precise GC; and while some compilers such as LLVM provide some support for GC features [23], we have found them incomplete and buggy. Despite the potential soundness worries, conservative GC thus remains a widely used technique.
Conservative GC enables Alloy to make a useful ergonomic improvement over most other GCs for Rust whose Gc is only cloneable. Such types can be duplicated, but doing so requires executing arbitrary user code. To make the possible run-time cost of this clear, Rust has no direct syntax for cloning: users must explicitly call Rc::clone(&v) to duplicate a value v . In contrast, since Alloy's Gc is just a wrapper around a pointer it is not just cloneable but also copyable: duplication only requires copying bytes (i.e. no arbitrary user code need be executed). Copying is implied by assignment (i.e. w = v ), reducing the need for explicit cloning. This is not just a syntactic convenience but also reflects an underlying semantic difference: duplicating a Gc in Alloy is is a cheaper and simpler operation than most other GCs for Rust which which tend to rely, at least in part, on reference counting.
There is one notable limitation of Gc 's API relative to Rc . The latter, by definition, knows how many references there are to the underlying data, allowing the value stored inside it to be mutably borrowed at run-time if there is only a single reference to it (via get_mut and make_mut ). In contrast, Gc cannot know how many references there are to the underlying data. As we shall see in Section 8, some Rust programs are built around the performance advantages of this API (e.g. turning 'copy on write' into just 'write' in some important cases).
3.2 Basic Implementation
The most visible aspect of Alloy is its fork, and extension of, the standard Rust compiler rustc . We forked rustc 1.79.0, adding or changing approximately 5,500 Lines of Code (LoC) in the core compiler, and adding approximately 2,250 LoC of tests.
Alloy uses BDWGC [9] as the underlying conservative GC, because it is the most widely ported conservative GC we know of. We use BDWGC's GC_set_finalize_on_demand(1) API, which causes finalizers to be run on their own thread.
We had to make some minor changes to BDWGC to suit our situation. First, we disabled BDWGC's parallel collector because it worsens Alloy's performance. It is unclear to us why this happens: we observe significant lock contention within BDWGC during GC collections, but have not correlated this with a cause. Second, BDWGC cannot scan pointers stored in thread locals because these are platform dependent. Fortunately, rustc uses LLVM's thread local storage implementation, which stores such pointers in the PT_TLS segment of the ELF binary: we modified BDWGC to scan this ELF segment during each collection. Third, BDWGC dynamically intercepts thread creation calls so that it can can scan their stacks, but (for bootstrapping reasons) is unable to do so in our context: we explicitly changed Alloy to register new threads with BDWGC.
4 Destructors and Finalizers
In many GCed languages, 'destructor' and 'finalizer' are used as synonyms, as both terms refer to code run when a value's lifetime has ended. In existing GCs for Rust, these two terms refer to completely different hierarchies of code (i.e. destructors and finalizers are fundamentally different). In Alloy, in contrast, a reasonable first approximation is that finalizers are a strict subset of destructors. In this section we pick apart these differences, before describing the challenges of using destructors as finalizers.
When a value in Rust is dropped (i.e. at the point its owner goes out of lexical scope) its destructor is automatically run. Rust's destructors enable a style of programming that originated in C++ called RAII (Resource Acquisition Is Initialization) [30, Section 14.4]: when a value is dropped, the underlying resources it possesses (e.g. file handles or heap memory) are released. Destructors are used frequently in Rust code (to give a rough idea: approximately 15% of source-level types in our benchmark suite have destructors).
Rust destructors are formed of two parts, run in the following order: a user-defined drop method; and automatically inserted drop glue. Drop methods are optional and users can provide one for a type by implementing the Drop trait's drop method. Drop glue recursively calls destructors of contained types (e.g. fields in a struct). Although it is common usage to conflate 'destructor' in Rust with drop methods, drop glue is an integral part of a Rust destructor: we therefore use 'destructor' as the umbrella term for both drop methods and drop glue.
When considering finalizers for a GC for Rust, there are several layers of design choices. We will shortly see that finalizers cause a number of challenges (Section 4.1) and one choice would be to forbid finalizers entirely. However, this would mean that one could not sensibly embed types that have destructors in a Gc . While Rust does not always call destructors, the situations where this occurs are best considered 'exceptional': not calling destructors from Gc would completely undermine reasonable programmer expectations. Because of this, Alloy, and indeed virtually all GCs for Rust, support finalizers in some form.
However, existing GCs force distinct notions of destructors and finalizers onto the programmer. Where the former have the Drop trait, the latter typically have a Finalize trait. If a user type needs to be finalized then the user must provide an implementation of the Finalize trait. However, doing so introduces a number of problems: (1) external libraries are unlikely to provide finalizers, so they must be manually implemented by each consumer; (2) Rust's orphan rule [27, Section 6.12] prevents one implementing traits for types defined in external libraries (i.e. unless a library's types were designed to support Gc , those types cannot be directly GCed); (3) one cannot automatically replicate drop glue for finalizers; and (4) one cannot replicate rustc 's refusal to allow calls to the equivalent of Drop::drop .
Programmers can work around problems #1 and #2 in various ways. For example, they can wrap external library types in newtypes (zero-cost wrappers) and implement finalizers on those instead [20, Section 19.3]. Doing so is tedious but not conceptually difficult.
Problem #3 has partial solutions: for example, [14] uses the Trace macro to generate finalizer glue (the finalizer equivalent of drop glue) for struct fields. This runs into an unsolvable variant of problem #2: types in external libraries will not implement this trait and cannot be recursively scanned for finalizer glue.
Problem #4 is impossible to solve in Rust as-is. One cannot define a function that can never be called — what use would such a function have? A possible partial solution might seem to be for the finalize method take ownership of the value, but Drop::drop does not do so because that would not allow drop glue to be run afterwards.
4.1 General Challenges When Using Destructors as Finalizers
We have stated as our aim that Alloy should use destructors as finalizers. Above we explained some Rust-specific challenges — but there are several non-Rust-specific challenges too! Fundamentally, finalizers and destructors have different, and sometimes incompatible, properties. The best guide to these differences, and the resulting problems, is Boehm [6], supplemented by later work on support for GC in the C++ specification [8].
An obvious difference between destructors and finalizers is when both are run. While C++ and Rust define precisely when a destructor will be run, finalizers run at an unspecified point in time. This typically happens at some point after the equivalent destructor would run, though a program may exit before any given finalizer is run. There are, however, two situations which invert this. First, if a thread exits due to an error, and the program is either not compiled with unwinding, or the thread has crossed a non-unwinding ABI boundary, then destructors might not be run at all, where a GC will naturally run the equivalent finalizers: we do not dwell on this, as both behaviours are reasonable in their different contexts. Second, and more surprisingly, it is possible for finalizers in non-error situations to run prematurely, that is before the equivalent destructor [6, section 3.4].
A less obvious difference relates to where destructors and finalizers are run. Destructors run in the same thread as the last owner of a value. However, running finalizers in the same thread as the last owner of the value can lead to race conditions [24] and deadlocks [6, section 3.3] if a finalizer tries to access a resource that the mutator expects to have exclusive access too. When such problems affect destructors in normal Rust code, it is the clear result of programmer error, since they should have taken into account the predictable execution point of destructors. However, since finalizers do not have a predictable execution point, there is no way to safely access shared resources if they are run on the same thread. The only way to avoid this is to run finalizers on a non-mutator thread — but not all Rust types / destructors are safe to run on another thread.
There are several additional differences such as: finalizers can reference other GCed values that are partly, or wholly, 'finalized' and may have had their backing memory reused; and finalizers can resurrect values by copying the reference passed to the finalizer and storing it somewhere.
Over time, finalizers have thus come to be viewed with increasing suspicion. Java, for example, has deprecated, and intends eventually removing, per-type finalizers: instead it has introduced deliberately less flexible per-object 'cleaners', whose API prevents problems such as object resurrection and per-class finalization [13].
4.2 The Challenge of Finalizers for Alloy
At this point we hope to have convinced the reader that: a viable GC for Rust needs to be able to use existing destructors as finalizers whenever possible; but that finalizers, even in existing GCs, cause various problems.
It is our belief that some problems with finalizers are fundamental. For example, finalizers inevitably introduce latency between the last use of a value and its finalization.
Some problems with finalizers are best considered the accidental artefacts of older designs. Java's cleaners, for example, can be thought of as a more restrictive version of finalizers that allow most common use-cases but forbid by design many dangerous use cases. For example, per-class/struct finalization can easily be replaced by per-object/value finalization; and object resurrection can be prevented if object access requires a level of indirection. Alloy benefits from our better shared understanding of such problems and the potential solutions: it trivially addresses per-object/value finalization ( Gc::new_unfinalizable function turns finalization off for specific values) and, as we shall see later, via only slightly more involved means, object resurrection.
However, that leaves many problems that are potentially in the middle: they are not obviously fundamental, but there are not obvious fixes for them either. In our context, where we wish to use destructors as finalizers, four problems have hitherto been thought insoluble [8, p. 32]: (1) finalizers are prohibitively slower than destructors; (2) finalizers can run prematurely; (3) running finalizers on the same thread as a paused mutator can cause race conditions and deadlocks; (4) some safe destructors are not safe finalizers.
Fortunately for us, Rust's unusual static guarantees, suitably expanded by Alloy, allow us to address each problem in novel, satisfying, ways. In the following section, we tackle these problems in the order above, noting that we tackle problems #1 and #2 separately, and #3 and #4 together.
5 Finalizer Elision
As we shall see in Section 8, there is a correlation between the number of finalizers that are run and overhead from GC (with a worst case, albeit a definite outlier, in our experiment of 3.35× slowdown). In this section we show how to reduce the number of finalizers that are run, which helps reduce this overhead.
A variety of factors contribute to the finalizer performance overhead, including: a queue of finalizers must be maintained, whereas destructors can be run immediately; finalizers run some time after the last access of a value, making cache misses more likely; and finalizers can cause values (including values they own) to live for longer (e.g. leading to increased memory usage and marking overhead). Most of these factors are inherent to any GC and our experience of using and working on BDWGC– a mature, widely used GC – does not suggest that it is missing optimisations which would overcome all of this overhead.
Instead, whenever possible, Alloy elides finalizers so that they do not need to be run at all. We are able to do this because: (1) BDWGC is responsible for all allocations and will, if necessary GC allocations even if they are not directly wrapped in a Gc ; and (2) many Rust destructors only free memory which BDWGC would, albeit with some latency, do anyway.
Consider the standard Rust type Box which heap allocates space for a value; when a Box value is dropped, the heap allocation will be freed. We can then make two observations. First, Box 's drop method solely consists of a deallocate call. Second, while we informally say that Box allocates on the 'heap' and Gc allocates on the 'GC heap', all allocations in Alloy are made through BDWGC and stored in the same heap.
When used as a finalizer, Box 's drop method is thus unneeded, as the underlying memory will be freed by BDWGC anyway. This means that there is no need to run a finalizer for a type such as Gc> at all, and the finalizer can be statically elided. However, we cannot elide a finalizer for a type such as Gc>> because Rc 's drop method must be run for the reference count to be decremented. As this shows, we must consider the complete destructor, and not just the top-level drop method, when deciding whether a corresponding finalizer can be elided.
function NeedsFinalizer(T): if Impls(T, Drop) and not Impls(T, DropMethodFinalizerElidable) then return true; for each field ∈ T do if NeedsFinalizer(field) then return true; return false;
impl Gc { pub fn new(value: T) -> Self { if needs_finalizer::() { Gc::new_with_finalizer(value) } else { Gc::new_without_finalizer(value) } ... } }
5.1 Implementing Finalizer Elision
Finalizer elision statically determines which type's destructors do not require corresponding finalizers and elides them. It does so conservatively, and deals correctly with drop glue.
As shown in Algorithm 1, any type which implements the Drop trait requires finalization unless it also implements the new DropMethodFinalizerElidable marker trait (i.e. a trait without methods). This trait can be used by a programmer to signify that a type's drop method need not be called if the type is placed inside a Gc . The 'Drop' part of the trait name is deliberate (i.e. it is not a simplification of 'destructor') as it allows the programmer to reason about a type locally (i.e. without considering drop glue or concrete type paramaters). If the type has a transitively reachable field whose type implements the Drop trait but not the DropMethodFinalizerElidable trait, then then the top-level type still requires finalization.
Even though neither normal Rust destructors or Alloy finalizers are guaranteed to run, a program whose destructors or finalizers never run would probably not be usable (leaking resources such as memory, deadlocking, and so on). We therefore make DropMethodFinalizerElidable an unsafe trait, because implementing it inappropriately is likely to lead to undesired – though not incorrect! – behaviour at run-time.
Alloy modifies the standard Rust library to implement DropMethodFinalizerElidable on the following types: Box , Vec , RawVec , VecDeque , LinkedList , BTreeMap , BTreeSet , HashMap , HashSet , RawTable , and BinaryHeap . Fortunately, not only are these types' drop methods compatible with DropMethodFinalizerElidable , but they are extensively used in real Rust code: they enable significant numbers of finalizers to be elided.
Listing 3 shows the new const compiler intrinsic needs_finalizer we added to implement Algorithm 1. The intrinsic is evaluated at compile-time: its result can be inlined into Gc::new , allowing the associated conditional to be removed too. In other words – compiler optimisations allowing – the 'does this specific type require a finalizer?' check has no run-time overhead.
6 Premature Finalizer Prevention
Most of us assume that finalizers are always run later than the equivalent destructor would have run, but they can sometimes run before [6, section 3.4], undermining soundness. Such premature finalization is also possible in Alloy as described thus far (see Listing 4). In this section we show how to prevent premature finalization.
There are two aspects to premature finalization. First, language specifications often do not define, or do not precisely define, when the earliest point that a value can be finalized is. While this means that, formally, there is no 'premature' finalization, it seems unlikely that language designers anticipated some of the resulting implementation surprises (see e.g. this example in Java [29]). Second, compiler optimisations – at least in LLVM – are 'GC unaware', so optimisations such as scalar replacement can change the point in a program when GCed values appear to be finalizable.
struct S { value: u8 } impl Drop for S { fn drop(&mut self) { self.value = 0; } } fn main() { let root = Gc::new(Box::new(S{ value: 1 })); let inner: &u8 = &**root.value; force_gc(); println!("{}", *inner); }
In our context, it is natural to define premature finalization as a (dynamic) finalizer for a Gc value running before the (static) Gc owner has gone out of scope. Similar to the high-level proposal mooted in [7, Solution 1], we must ensure that the dynamic lifetime of a reference derived from a Gc matches or exceeds the lifetime of the Gc itself.
Our solution relies on adjusting Gc 's drop method to keep alive a GCed value for at least the static lifetime of the Gc itself. In other words, we ensure that the conservative GC will always see a pointer to a GCed value while the corresponding Gc is in-scope.
However, there is a major problem to overcome: copyable types such as Gc are forbidden from having destructors. The fundamental challenge we have to solve is that each copied value will have a destructor called on it, which has the potential for any shared underlying value to be destructed more than once. Alloy explicitly allows Gc – but no other copyable type – to have a destructor, but to ensure it doesn't cause surprises in the face of arbitrary numbers of copies, the destructor must be idempotent. Our task is made easier because Gc naturally has no drop glue from Rust's perspective: Gc consists of a field with a pointer type, and such types are opaque from a destruction perspective.
We therefore only need to make sure that Gc 's drop method is idempotent. Fortunately, this is sufficient for our purposes: we want the drop method to inhibit finalization but that does not require run-time side effects. To achieve this, we use a fence. These come in various flavours. What we need is a fence that prevents both: the compiler from reordering computations around a particular syntactic point; and the CPU from reordering computations around a particular address. We copy the platform specific code from the BDWGC GC_reachable_here macro into Gc 's drop method, which achieves the effect we require.
6.1 Optimising Premature Finalizer Prevention
The drop method we add to Gc fully prevents premature finalization. It also naturally solves a performance problem with the suggested solution for C++ in [7, Solution 1], which requires keeping alive all pointers, no matter their type, for their full scope. By definition, our solution only keeps alive Gc values: the compiler is free to optimise values of other types as it so wishes. However, on an artificial microbenchmark we observed a noticeable overhead from our fence insertion.
We thus implemented a simple optimisation: we only insert fences for a Gc if it has a finalizer. Intuitively, it seems that we should not generate drop methods in such cases, but this is difficult to do directly inside rustc . Instead, we suppress calls to the drop methods of such types: the two approaches are functionally equivalent, though ours does put an extra burden on dead-code elimination in the compiler tool-chain.
Alloy adds a new pass RemoveElidableDrops to rustc 's Mid-Level Intermediate Representation (MIR) processing. MIR is best thought of as the main IR inside rustc : it contains the complete set of functions in the program, where each function consists of a sequence of basic blocks. Simplifying somewhat, function and drop method calls are represented as different kinds of terminators on basic blocks. Terminators reference both a callee and a successor basic block.
The remove_elidable_drops pass iterates over a program's MIR, identifies drop method terminators which reference elidable finalizers, and turns them into 'goto' terminators to the successor basic basic block. Algorithm 4 in the Appendix presents a more formal version of this algorithm.
7 Finalizer Safety Analysis
In this section we address two high-level problems: running finalizers on the same thread as a paused mutator can cause race conditions and deadlocks; and some safe destructors are not safe finalizers. Addressing the former problem is conceptually simple – finalizers must be run on a separate thread – but we must ensure that doing so is sound. We therefore consider this a specific instance of the latter problem, treating both equally in this section.
We therefore introduce Finalizer Safety Analysis (FSA), which prevents unsafe (in the sense of 'not safe Rust') destructors being used as finalizers. As a first approximation, FSA guarantees that finalizers are memory safe, cycle safe (i.e. do not access already finalized objects), and thread safe. We present the three main components of FSA individually before bringing them together.
7.1 Rust References
Gc can store, directly or indirectly, normal Rust references (i.e. & and &mut ), at which point it is subject to Rust's normal borrow checker rules and cannot outlive the reference. However, finalizers implicitly extend the lifetime of a GCed value, including any stored references: accessing a reference in a finalizer could undermine Rust's borrow checking rules.
A simple way of avoiding this problem is to forbid Gc from storing, directly or indirectly, references. This might seem to be no great loss: storing references in a Gc largely nullifies the 'GCness' of Gc . However, we found the result hard to use, as it can make simple tasks such as gradually migrating existing code to use Gc painful.
A moderate, but in our experience insufficient, relaxation is to recognise that only types that need a finalizer can possibly have problems with references, and to forbid such types from storing references in Gc . For example, if there is no drop method for struct S{x: &u8} then its destructor is safe to use as a finalizer, since its non-drop aspects will not use the &u8 reference.
The eventual rule we alighted upon for FSA is that a destructor for a type T can be used as a finalizer provided the destructor's drop methods do not obtain references derived from T 's fields (including fields reachable from its attributes). Using Rust's terminology, we forbid projections (which include a struct's fields, indexes into a vector, and so on) in destructors from generating references. Any non-projection references that are used in a destructor are by definition safe to use, as they either exist only for the duration of the drop method (references to variables on the stack) or will exist for the remainder of the program (references to global variables).
This rule over-approximates the safe set of destructors. For example, a drop method that creates a new value and tries to obtain a reference to a field in it (i.e. a projection) cannot be a destructor under FSA, even though the reference cannot outlast the drop method. We found that attempting to relax our rule further to deal with such cases rapidly complicates exposition and implementation.
struct GcNode { value: u8, nbr: Option >> } impl Drop for GcNode { fn drop(&mut self) { self.value = 0; println!("{}", self.nbr.unwrap().borrow().value); } } fn main() { let mut gc1 = Gc::new(RefCell::new(GcNode{value: 1, nbr: None})); let gc2 = Gc::new(RefCell::new(GcNode{value: 2, nbr: None})); gc1.borrow_mut().nbr = Some(gc2); gc2.borrow_mut().nbr = Some(gc1); }
7.2 Cycles and Finalization
One of the main motivations for GCs is that they solve problems with cyclic data structures. However, finalizers can be unsound if they access state shared within members of a cycle. Listing 5 shows an example of undefined behaviour when two GCed values create a cycle and both their finalizers reference the other GCed value. Whichever order the finalizers are run in, at least one of the finalizers will see the other GCed value as partly or wholly 'finalized'.
Most languages and systems we are aware of assume that users either don't run into this problem (finalization cycles are considered rare in GCed languages [19, p. 229]) or know how to deal with it when they do (e.g. refactoring the types into parts that do and don't require finalization [6, p. 11]). There is no fully automatic solution to this problem. Some GCs offer weak references, which allow users to detect when finalization cycles have been broken, though they still have to deal with the consequences manually.
We wanted to provide users with static guarantees that their destructors will not behave unexpectedly when used as finalizers in a cycle. A first attempt at enforcing such a property might seem to be that a Gc cannot have, directly or indirectly, fields of type Gc . This would indeed prevent the mistakes we want to catch but also disallow shared ownership! We therefore check only that a type's destructor does not, directly or indirectly, access a Gc . This allows GCed types to express shared ownership so long as their destructor(s) do not access other GC types.
To make this check easier to implement, we introduce an auto trait [27, Section. 11], a kind of marker trait that the compiler propagates automatically. An auto trait A will be automatically implemented for a type T unless one of the following is true: there is an explicit negative implementation of A for T ; or T contains a field that is not itself A . Informally, we say that a negative implementation of an auto-trait pollutes containing types.
Our new auto trait is called FinalizerSafe , and we provide a single negative implementation impl !FinalizerSafe for Gc . This naturally handles transitively reachable code, allowing FSA itself to only check that a destructor's direct field accesses are FinalizerSafe .
7.3 Destructors Need to be Runnable on a Finalizer Thread
impl Drop for GcNode { fn drop(&mut self) { println!("drop {}", self.value.lock().unwrap()); } } fn main() { let counter = Rc::new(Mutex::new(0)); { let _ = Gc::new(GcNode { value: Rc::clone(&counter), nbr: None }); } let r1 = counter.lock().unwrap(); force_gc(); assert_eq!(*r1, 0); }
Running finalizers on the same thread as a mutator can cause problems when the finalizer accesses state shared with the mutator (see Section 4.1 for a general description and Listing 6 for a concrete example). The most general solution to this problem is to run finalizers on a separate finalizer thread that never runs mutator code.
We must therefore ensure that it is safe to run a type's destructor on the finalizer thread. A conservative definition is that Gc is safe to use if T implements both of Rust's existing Send (denoting a type that can be permanently moved from one thread to another) and Sync (denoting a type that can be safely accessed simultaneously by multiple threads) auto traits. However, requiring that finalization be restricted to types that implement both Send and Sync can be frustrating, particularly because more types implement Send than Sync .
It may seem sufficient for T to implement Send alone so that the value can be safely sent to the finalizer thread. However, this would not prevent a finalizer indirectly accessing state shared with a non-GCed value via a mechanism such as Arc , causing the very problems we are trying to avoid.
FSA thus ignores whether a type implements Send or Sync (or not) and instead examines the destructor directly. To pass FSA: the destructor must not access thread locals; and any types the destructor accesses via projections must implement both Send and Sync . Intuitively, this allows a non- Send -or- Sync type T to have a safe finalizer provided that T 's destructor only access the Send and Sync 'subset' of T .
This rule shows clearly that FSA is a form of abstract interpretation rather than a mere extension of the type system. After careful examination we believe this is compatible with Rust's semantics (and rustc and LLVM's implementations) at the time of writing, but it is worth knowing that this rule would be unsafe in other languages and implementations (for example our assumption would be unsafe in Java due to synchronisation removal [31]). We leave it as an open question to others as to whether Rust should deliberately permit or forbid such checks in its semantics.
The implementation of the finalization thread is fairly simple. For example, we do not need to explicitly synchronise memory between the mutator and finalization threads because BDWGC's stop-the-world collection phase already synchronises all memory between threads.
7.4 Putting it All Together
function FinalizerSafetyAnalysis(func): for each basic_block ∈ func do t ← basic_block.terminator; if not IsGcConstructorCall(t) then continue; ty ← GetTyOfGcValue(t); if isFinalizerUnchecked(ty) or not NeedsFinalizer(ty) then continue; for each drop_method ∈ GetDropGlue(ty) do if not IsMIRAvailable(drop_method) then EmitFinalizerUnsafeError(); CheckFunctionSafety(drop_method); function CheckFunctionSafety(drop): for each basic_block ∈ drop do for each statement ∈ basic_block do for each projection ∈ statement do if not IsFinalizerSafe(projection.element) then EmitFinalizerUnsafeError(); if IsFunctionCall(basic_block.terminator) then CheckFunctionSafety(basic_block.terminator); function IsFinalizerSafe(ty): return Impls(ty, Send) and Impls(ty, Sync) and Impls(ty, FinalizerSafe);
FSA integrates the seemingly separate components presented above into one. It iterates over every function in a Rust program analysing destructors of types that are used in Gc . Algorithm 2 shows the essence of FSA (for example eliding details of caching which Alloy uses to speed up compile times).
Because FSA is a form of abstract interpretation, we need to determine when to run FSA on a program. In essence, whenever a previously unchecked type T is used to create a new Gc , FSA is run. As well as the Gc::new constructor, Gc instances can be created with conversion traits such as From . We annotated each such entry point with a new rustc -only attribute rustc_fsa_entry_point : calls to functions with this attribute lead to FSA checks.
A naive implementation of FSA would be a notable cost, so Alloy uses several optimisations. As alluded to above, FSA caches the results of various checks to avoid pointlessly repeating work. We also extend FinalizerSafe with negative implementations for &T , and &mut T . If a type T implements all of FinalizerSafe , Send , and Sync , we know that there can be no unsafe projections used in a destructor, and can bypass most FSA checks entirely (though we still need to check for thread local accesses). Across our benchmark suite, FSA increases compilation time in release mode by a modest 0.8–1.6%.
Algorithm 2 also captures Alloy's approach to error messages. Rather than just inform a user that 'your drop method has not passed FSA', Alloy pinpoints which field or line in a drop method caused FSA to fail: EmitReferenceError informs the user when a reference in a type is used in a way that violates FSA (see Section 7.1); and EmitFinalizerUnsafeError when a drop method contains code which is unsafe (e.g. references a Gc type, an opaque function, etc.). Listing 7 shows an example of the errors reported by Alloy: note that it pinpoints the line within a drop method that caused an FSA error.
error: `RefCell::new(GcNode{value: 2, nbr: None})` cannot be safely finalized. --> finalization_cycle.rs:11:21 | 7 | fn drop(&mut self) { self.value = 0; println!("{}", self.nbr.unwrap().borrow().value); } | ^^^^^^^^ | a finalizer cannot safely dereference this | because it might have already been finalized. ... 11 | let gc2 = Gc::new(RefCell::new(GcNode{value: 2, nbr: None})); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | has a drop method which cannot be safely finalized.
7.4.1 Awkward Kinds of Functions
FSA can encounter two kinds of 'awkward' functions.
First, some functions (e.g. due to use of trait objects, or FFIs) do not have a body available when FSA runs: using such a function necessarily causes an FSA check to fail. One common class of functions which causes this are Rust intrinsics (e.g. min_align_of etc.): we audited the most frequently used of these and annotated those which are FSA-safe with a new rustc_fsa_safe_fn attribute. Other functions whose bodies are unknown cause FSA to fail.
Second, in most cases, FSA runs on Rust functions whose generic types have been replaced with concrete types (in Rust terminology, functions have been 'monomorphised'). Sometimes, however, FSA encounters functions (e.g. intrinsics or functions with certain annotations) whose generic types have not yet been replaced. FSA can still run on such functions, but will reject them unless all generic types imply the FinalizerSafe , Send , and Sync traits. Note that calling a method on a generically typed value will lead to FSA finding a method without a body: as in the first case above, this will cause FSA to fail.
The common theme to both is that we wish FSA to be sound, at which point we forego completeness. This can cause users frustration when FSA raises an error on code they know is FSA safe. As is common in Rust, we therefore provide an unsafe escape hatch which allows users to silence FSA errors when they can prove to their satisfaction that doing so does undermine correctness. We experimented with a per-type approach, but found that unduly restrictive: we therefore provide a per-value escape hatch with the unsafe FinalizerUnchecked type. Values wrapped in this type are considered safe to use at all points in FSA. Our aim is that users should rarely need to resort to this escape hatch, but, as is not uncommon in Rust, there are valid idioms of use where we found it necessary.
8 Evaluation
Version Description #benchmarks Binary Trees Debian CLBG Rust#2 Heap allocation microbenchmark 1 Regex-Redux Debian CLBG Rust#1 Regular expression matching 1 Alacritty v0.15.0-dev Terminal emulator 10 fd v9.0.0 Unix find replacement 7 grmtools v0.13.4 Lexer / parser library 4 Ripgrep v14.1.1 Fast grep replacement 13 som-rs-ast git #35b780 SOM AST VM 26 som-rs-bc git #35b780 SOM bytecode VM 26
In this section we explain our methodology and our experimental results.
8.1 Methodology
8.1.1 The Benchmark Suite
There is no existing benchmark suite for GCs for Rust. Even if such a suite did exist, it may not have been suitable for our purposes because in experiment E GCvs we want to compare programs using existing shared ownership approaches. We searched through roughly the 100 most popular Rust libraries on crates.io (the de facto standard Rust package system) looking for suitable candidates. In practise this meant we looked for crates using reference counting. In the interests of brevity, for the rest of this section we use ' Rc ' to cover both Rc and its thread-safe cousin Arc .
Gc Rc Weak Alacritty 107 9,450 1,970 Binary Trees 2 0 0 fd 7 421 1 grmtools 299 1,825 23 Regex-Redux 108 109 0 Ripgrep 104 249 4 som-rs-ast 206 35 0 som-rs-bc 464 39 0
Table 1 shows the resulting suite: note that, except for Binary Trees and Regex-Redux, the 'benchmarks' are themselves benchmark suites. Collectively, our suite contains – depending on whether you count the SOM implementations' (identical) benchmark suites collectively or separately – 62 or 88 benchmarks. Table 2 shows how often relevant types are used after porting. Table 3 shows the distribution of heap data at run-time. This shows that our suite contains benchmarks with a variety of memory patterns.
Binary Trees is allocation intensive and sufficiently simple that it can be easily and meaningfully ported to additional shared ownership strategies: Rust-GC, a user library for GC for Rust [14]; and Arena , a non-GC memory arena [10]. Alacritty, fd, and Ripgrep are well known Rust programs, all of which have their own benchmark suites. grmtools is a parsing library which uses Rc extensively in error recovery: we benchmarked it using 28KLoC of real Java source code, which we mutated with syntax errors.
SOM is a small, but complete, language in the Smalltalk mould, which has a wide variety of implementations. Our suite includes two of these: som-rs-ast (which represents programs as ASTs); and som-rs-bc (which represents programs as bytecode). Both are existing ports of a Java SOM VM into Rust and use Rc . We use the same SOM core-lib benchmarks for both, derived from git commit #afd5a6.
We were not able to port all parts of all programs. In particular, some programs make extensive use of the make_mut and get_mut functions in Rc , which allow a programmer to mutate their contents if, at run-time, they only have a single owner. There is not, and cannot be, equivalent functionality with a copyable Gc type. In some cases we were able to successfully use alternative mechanisms. In others we judged the usages to either be rare at run-time (i.e. not worth porting), or too difficult to port (i.e. too much of the program is built around the resulting assumptions). In a small number of cases we ended up introducing bugs. Alacritty's UTF-8 support is an example, resulting in deadlocks. Whenever we encountered a bug in our porting, we reverted back to Rc for that portion of the port.
Allocated (#) GC owned (%) Rc Box Gc Alacritty 125 8,770 2 2.70 Binary Trees 0 3,222,201 3,222,190 100.00 fd 17,821 306,902 61 1.23 grmtools 2,283 19,859,431 4,038,605 44.19 Regex-Redux 45 3,132 78 15.39 Ripgrep 12,786 521,366 26,069 17.97 som-rs-ast 15 8,586,976 1,533,728 76.95 som-rs-bc 15 2,397,931 1,530,325 99.71
8.1.2 What We Couldn't Include in the Benchmark Suite
We tried porting 10 other programs that are not included in our benchmark suite. To avoid readers wondering if we have 'cherry-picked' our eventual benchmark suite, we briefly report why those other programs have been excluded. All excluded benchmarks are shown in Table 4 in the Appendix.
Several programs (e.g. numbat, mini-moka, and salsa), once ported, turned out to be uninteresting from a GC benchmarking perspective. Irrespective of the number of source locations that reference memory allocation types, the benchmarks we could run from them allocated sufficiently little memory that there are no worthwhile differences between different allocation strategies. Put another way: these programs are in a sense 'the same' from our evaluation perspective.
Two programs (bevy and rust-analyzer) did not run correctly after porting. Both extensively use the make_mut or get_mut functions in Rc and reverting those changes made the benchmarks uninteresting.
We also ported RustPython, but were unable to adjust it to faithfully implement Python-level destructors. In essence, in RustPython's default configuration, its representation of objects is not compatible with FSA. This means that we can not run Python __del__ methods in the finalizer thread. Although technically this is still compatible with Python's semantics, we felt this would be a misleading comparison, as our port of RustPython would be doing less work.
8.1.3 Running the Experiment
Our experiment can be seen as a comparison of Alloy against 'normal' Rust. Fortunately, Alloy is a strict superset of 'normal' Rust: only if users explicitly opt into GC does Alloy really become a 'GC for Rust'. This allows us to use the same compiler, standard library, and so on, removing several potential confounding factor in our results. We compile two binaries: one without logging features compiled and one with. We only use the latter when reporting collector related metrics.
A challenge in our experiment is that different allocation strategies can use different underlying allocators. In particular, Alloy has to use BDWGC, but, for example, Rc can use a modern allocator such as jemalloc. Much has changed in the performance of allocators since BDWGC's 1980s roots: in Rc -only benchmarks, we observe an inherent overhead from BDWGC of 2–26% relative to jemalloc (see Table 6 in the Appendix), which is a significant, and variable, confounding factor. Fortunately, BDWGC can be used as a 'traditional' allocator that allocates and frees on demand (i.e. no conservative GC occurs): in the main experiment, we thus use BDWGC as the allocator for all benchmarks.
We want to understand the memory usage of different allocation strategies over the lifetime of a benchmark. However, there is no single metric which captures 'memory usage', nor even an agreed set of metrics [5]. We use two metrics to capture different facets: (1) heap footprint, the amount of live heap memory recorded by by Heaptrack [34] at every allocation and deallocation; and (2) Resident Set Size (RSS), the total physical memory in RAM used by the process (including memory-mapped files, stack, and code/text segments), sampled at 10Hz. The overhead of recording heap footprint is much greater than RSS, but it provides a more detailed view of memory usage.
Another pair of confounding factors are the initial and maximum sizes of the GC heap: too-small values can lead to frequent resizing and/or 'thrashing'; large values to unrealistically few collections. What 'small' and 'large' are varies by benchmark, and 'careful' (or thoughtless) choices can significantly distort one's view of performance. BDWGC uses an adaptive strategy by default, growing the heap size as it detects that it would benefit from doing so. To give some sense of whether a different strategy and/or heap size would make a difference, we ran our benchmarks with three different fixed heap sizes. Doing so either has little effect or speeds benchmarks up; when it does so, the impact is generally under 10% and is at most 28% (the detailed results are presented in Table 9 in the Appendix). Broadly speaking, this suggests that BDWGC's default heap sizing approach, at least in our context, is not significantly distorting our view of performance.
We ran each benchmark in our suite 30 times. We report wall-clock times as returned by the standard Unix time utility. The SOM benchmarks are run using its conventional rebench tool; we adjusted rebench to use time for consistency with our other benchmarks. We ran all benchmarks on an AMD EPYC 7773X 64-Core 3.5GHz CPU with 128GiB RAM, running Debian 12 ('bookworm'). We turned off turbo boost and hyper-threading, as both can colour results.
8.1.4 Data Presentation
Except where otherwise stated, we report means and 99% confidence intervals for all metrics. We use the arithmetic mean for individual benchmarks and the geometric mean for benchmark suites.
When plotting time-series (i.e. sampled) memory metrics, we face the challenge that different configurations of the same benchmark can execute at different speeds. We thus resample each benchmark's data to 1000 evenly spaced points using linear interpolation. We chose 1000 samples because it is considerably above the visual resolution of our plots. After normalization, we calculate the arithmetic mean of the memory footprint measurement at each grid point (and not the raw underlying data) across all runs of the same benchmark. We record 99% confidence intervals at each point and show the result as shaded regions around the mean.
8.2 Results
The main results for E GCvs can be seen in Fig. 1. Though there is variation, Alloy has an overhead on wall-clock time of 5% on our benchmark suite. The effect on memory is more variable though, unsurprisingly, Alloy typically has a larger average heap footprint (i.e. allocated memory lives for longer). This metric needs to treated with slight caution: benchmarks which allocate relatively small amounts of memory (see Table 3) can make the relative effect of average heap footprint seem much worse than it is in absolute terms.
Binary Trees is sufficiently simple that we also used it to compare against Arena and Rust-GC. The time-series data in Fig. 2 is particularly illuminating (for completeness, Table 5 in the Appendix has the raw timings). Alloy is around 3.5× slower than Arena . The time-series data for the latter shows it going through distinct phases: a (relatively long) allocation phase, a (relatively moderate) 'work' phase, and a (relatively short) deallocation phase. Put another way: these clear phases make Binary Trees a perfect match for an arena. In the other approaches, the 'work' phase occupies a much greater proportion of their execution, because it also incorporates allocator work. Alloy is around 1.3× faster than Rc , but both have similar memory profiles. Alloy is around 3× faster than Rust-GC and has an average heap footprint around 4× smaller, reflecting Alloy's advantage in not being a user-land library that relies in part on Rc . Although we caution against over-interpreting a single benchmark, this does give us at least some idea of the performance ceiling and floor for different approaches.
The time-series data in Fig. 2 helps explain other factors. For example, it shows that som-rs-bc leaks memory on the JSON Small benchmark (we suspect it also leaks in some other benchmarks, though rarely as visibly). This is because Rc keeps alive values in cycles; Alloy does not leak memory on som-rs-bc, as it naturally deals correctly with cycles.
We can see from the time-series data that Ripgrep has a complex heap footprint pattern. This may suggest a memory leak, but in fact it is a consequence of the inevitable delay in freeing memory in a GC. In general, GC notices that memory is unused later than reference counting, but this is exacerbated further by finalizers. Surprisingly, finalizers can lengthen or shorten an allocation's lifetime. GCed values with finalizers tend to have longer lifetimes, because they have to wait in the finalizer queue. However, when a finalizer calls free on indirectly owned values, those are immediately marked as not live, rather than having to wait until the next collection to be discovered as such. This, albeit indirectly, explains the seemingly random peaks and troughs in memory usage one can observe in Ripgrep's time-series data.
The results of E Elision are shown in Fig. 3. In general, there is a fairly clear correlation: the more finalizers are removed, and the greater the proportion of the overall heap the memory owned by Gc is, the better the metrics become. However, there are several caveats. First, when all finalizers are removed, BDWGC does not start a finalizer thread or invoke locking related to it, unduly flattering the time-based metrics. Second, the quantity of finalizers is only a partial proxy for cost: some finalizers free up large graphs of indirectly owned values, which can take some time to run. Third, some benchmarks change the work they do: grmtools speeds up so much that its error recovery algorithm has time to do more work, so while finalizer hugely benefits its GC pause time, its wall-clock time changes much less. Finally, since finalizers can cause indirectly owned allocations to be freed earlier than the GC itself does naturally, removing them can cause indirectly owned values to live for longer: Ripgrep's average heap footprint highlights this issue.
The results for E PremOpt are shown in Fig. 4. We created three configurations of Alloy. None has no fences, and thus is unsound, but allows us to approximate (allowing for possible vagaries from running unsound code!) the best possible outcome. Naive inserts all possible fences. Optimised inserts only necessary fences. Once confidence intervals are taken into account, there are no statistically significant results for this experiment. Although it is possible that benchmarking 'noise' is hiding a meaningful result, our data suggests that any such differences are likely to be minimal. To make up for this disappointment, the fact that there is no difference between any of these suggests that, on non-artificial benchmarks, premature finalizer prevention is not a noticeable cost.
Any performance judgements we make are necessarily contingent on our methodology the benchmark suite we chose, including the proportion of benchmarks that we ported, and the way we process and present data. For example, we did not port external libraries to use Gc so many benchmarks use a variety of allocation strategies. Even had we ported everything, we would not be able to say, for example, that finalizer elision will always improve performance by exactly the factor we see in our experiment: there undoubtedly exist reasonable, non-pathological, programs which will see performance changes outside the ranges that our results suggest.
Using BDWGC as the allocator for all benchmarks has the advantage of removing 'pure' allocator performance as a confounding factor, but does mean that some of the performance characteristics of benchmarks will be changed (e.g due to the portion of time we spend in the allocator; or BDWGC's adaptive heap sizing strategy). A generic, modern conservative GC, using the insights of recent non-GC allocators, would almost certainly give different – though we suspect not profoundly different – results. To the best of our knowledge there is currently no production-quality modern, generic conservative, GC we could use instead, though we are aware of at least one attempt to create such an alternative: it will be interesting to rerun our experiments if and when that arrives.
The RSS memory metric we collect is at Linux's whim: if it does not update as frequently as we expect, we will see artificially 'smoothed' data that may miss out peaks and troughs. Similar, our interpolation of time-series data onto a normalised grid can also smooth data. We manually checked a large quantity of data to ensure this was not a significant effect; by running benchmarks 30 times means it is also less more likely that peaks and troughs are caught at least sometimes.
10 Related Work
In this paper we hope to have given sufficient background on GC and the use of destructors and finalizers in general. In this section we mostly survey the major parts of the GC for Rust landscape more widely. Our survey is inevitably incomplete, in part because this is a rapidly evolving field (a number of changes have occurred since the most recent equivalent survey we are aware of [16]). We also cover some relevant non-Rust GC work not mentioned elsewhere.
Early versions of Rust had 'managed pointers' (using the @T syntax) which were intended to represent GC types [16]. The core implementation used reference counting though there were several, sometimes short-lived, cycle detectors [17]. Managed pointer support was removed around a year before the first stable release of Rust. This was not the end of the story for 'GC as a core part of Rust', with core Rust developers exploring the problem space in more detail [15, 21, 22]. Over time these efforts dwindled, and those interested in GC for Rust largely moved from anticipating rustc support to expecting to have to do everything in user-level libraries.
One of the earliest user-level GC for Rust libraries is Bacon-Rajan-CC [12]. This provides a type Cc which is similar in intention to Alloy's Gc . The mechanism by which objects are collected is rather different: they have a naive reference count, which causes objects outside a cycle to have deterministic destruction; and users can manually invoke a cycle detector, which uses trial deletion in the style of Bacon and Rajan [4] to identify objects in unused cycles. Cycle detection requires users manually implementing a Trace trait which traverses a type's fields. Destructors are used as finalizers: to avoid the problems with Rust references we solved in Section 7.1, Bacon-Rajan-CC imposes a T:'static lifetime bound on the type parameter passed to Cc . Simplifying slightly, this means that any references in such a type must be valid for the remaining lifetime of the program, a severe restriction. Unlike our approach to the access of already-finalized values (Section 7.2), it can only detect such accesses at runtime, leading to a (safe) Rust panic .
Probably the best known GC for Rust is Rust-GC [14] (partly covered in Section 4). Rust-GC's Gc provides a similar API to Alloy, with the notable exception that its Gc is not, and cannot be, copyable, thus always requiring calls to Gc::clone . Although, like Alloy, Rust-GC allows Gc values to be converted into pointers, its lack of conservative GC means that users must ensure that a Gc wrapper is kept alive for the entire lifetime of pointers derived from it. Similarly to Bacon-Rajan-CC, GCed values are reference counted, with occasional tracing sweeps to identify cycles, though Rust-GC performs cycle detection automatically (i.e. it doesn't require manual calls to a function such as collect_cycles ). Drop methods are not used as finalizers: if a finalizer is required, a manual implementation of the Finalize trait must be provided; finalizer glue can be largely, though not fully (see Section 4), automatically created by the provided Trace macro. Rust-GC detects accesses to already-finalized values dynamically at run-time, panicking if they occur. Unlike Bacon-Rajan-CC, these accesses are detected by recording what the collector's state is in: if the collector is in a 'sweep' phase, any access of a Gc leads to a panic. We have not yet verified whether cross-thread collection / sweeping can evade this check.
An example of moving beyond reference counting in a GC for Rust is Shifgrethor [2]. It requires Gc values to be created by a Root<'root> : the resulting Gc<'root, T> is then tied to the lifetime of the Root<'root> . This allows roots to be precisely identified, but requires explicitly having access to a Root<'root> whenever a Gc<'root, T> is used. As with Rust-GC, Shifgrethor requires users to manually implement a Finalize trait, though Shifgrethor's is more restrictive: not only can other GCed values not be accessed (implicitly solving the same problem as Section 7.2) but any other type without the same 'root lifetime as the GCed value is forbidden. This means that many seemingly safe finalizers require implementing the unsafe UnsafeFinalize trait. We view Shifgrethor as proof that accurately tracking GC roots in normal Rust without reference counting is possible, though it cannot deal with references being converted into pointers and usize s.
A different means of tackling the root-finding problem is GcArena [32], which uses branding in a similar way to GhostCell s (see Section 2). In essence, users provide a special 'root' type which is the only place where roots can be stored. Mutating the heap can only be done in the context of functions that are passed a branded reference to the GCed heap. Once such a function has completed, GcArena is in full control of the GC heap, and knows that only the root type needs to be scanned for roots. This leads to a precise guarantee about GC reference lifetimes. However, if code executes in an arena for too long, the system can find itself starved of resources, with no way of recovering, even if much of the arena is no longer used. GcArena was originally part of the Piccolo VM (which was itself previously called Luster), a Lua VM written in Rust. Such VMs have a frequently executed main loop which is a natural point for a program to relinquish references to the GCed heap, but this is not true of many other GCed programs.
One attempt to improve upon Rust-GC is Bronze [11], though it shows how challenging it can be to meaningfully improve GC for Rust: both of its main advances have subsequently been disabled because they are not just unsound but actively lead to crashes. First, Bronze tried to solve the root-finding problem by using LLVM's gc.root intrinsic at function entries to generate stack-maps (a run-time mechanism for accurately tracking active pointers). This rules out the false positives that are inevitable in conservative GC. However, Bronze could not track nested references: if a Gc was used as a field in a struct, it was not tracked by the GC. Second, Bronze tried to give GC in Rust similar semantics to non-ownership languages such as Java. It did this by allowing shared mutation, undermining Rust's borrow checker.
Chrome's rendering engine Blink uses the conservative GC Oilpan. It has the interesting property that it has two classes of finalizers. 'Full finalizers' are similar to finalizers in Alloy, running on a finalizer thread at an indeterminate future point, but with the difference that they can only reference parts of a GCed value. To mitigate this, 'pre-finalizers' are run by the collector on the same thread as mutator as soon as an object as recognised as unused, and can access all of a GCed value. Pre-finalizers are necessary, but not encouraged, because they implicitly pause the stop-the-world phase of the collector. This reflects the fact that latency is a fundamental concern for a rendering engine: Alloy currently makes no pretences to being low latency.
11 Conclusions
We introduced a novel design for GC in Rust that solves a number of outstanding challenges in GC for Rust, as well as – by taking advantage of Rust's unusual static guarantees – some classical GC finalizer problems. By making integration with existing Rust code easier than previous GCs for Rust, we hope to have shown a pragmatic route for partial or wholesale migration of Rust code that would benefit from GC.
Challenges and future opportunities remain. For example, Alloy is an 'all or nothing' cost: if you want to use Gc in a single location, you must pay the costs of the GC runtime and so on. Alloy's absolute speed is, we believe, limited by BDWGC: it is probable that using a semi-precise GC and/or a faster conservative GC could change our view of the absolute performance speed
In summary, we do not claim that Alloy is the ultimate design for GC in Rust – reasonable people may, for example, disagree on whether the costs of conservative GC are worth the gains – but it does show what can be achieved if one is willing to alter the language's design and rustc .
Data Availability Statement
The accompanying artefact [18] contains: the source code necessary to run this paper's experiment (including generating figures etc.) from scratch; and data from a run of the experiment that we used in this paper.
Acknowledgments
This work was funded by an EPSRC PhD studentship and the Shopify / Royal Academy of Engineering Research Chair in Language Engineering. We thank Steve Klabnik and Andy Wingo for comments.
References
[1] Mads Ager, Erik Corry, Vyacheslav Egorov, Kentaro Hara, Gustav Wibling, and Ian Zerny. 2013. Oilpan: tracing garbage collection for Blink. https://docs.google.com/document/d/1y7_0ni0E_kxvrah-QtnreMlzCDKN3QP4BN1Aw7eSLfY. Accessed on 2024-10-15. [2] Saoirse Aronson. 2018. shifgrethor. https://github.com/withoutboats/shifgrethor/. Accessed: 2024-10-15. [3] David F Bacon, Perry Cheng, and VT Rajan. 2004. A unified theory of garbage collection. In OOPSLA. 50–68. [4] David F Bacon and Vadakkedathu T Rajan. 2001. Concurrent cycle collection in reference counted systems. In ECOOP. 207–235. doi:10.1007/3-540-45337-7_12 [5] Stephen M. Blackburn, Zixian Cai, Rui Chen, Xi Yang, John Zhang, and John Zigman. 2025. Rethinking Java Performance Analysis. In ASPLOS. 940–954. doi:10.1145/3669940.3707217 [6] Hans-J Boehm. 2003. Destructors, finalizers, and synchronization. In POPL. doi:10.1145/604131.604153 [7] Hans-J Boehm and Mike Spertus. 2007. Optimization-robust finalization. https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2261.html. Accessed: 2024-08-08. [8] Hans-J Boehm and Mike Spertus. 2009. Garbage collection in the next C++ standard. In ISMM. 30–38. doi:10.1145/1542431.1542437 [9] Hans-Juergen Boehm and Mark Weiser. 1988. Garbage collection in an uncooperative environment. SPE 18, 9 (Sept. 1988), 807–820. doi:10.1002/spe.4380180902 [10] Thom Chiovoloni. 2015. Typed Arena. https://github.com/thomcc/rust-typed-arena/. Accessed: 2025-03-25. [11] Michael Coblenz, Michelle Mazurek, and Michael Hicks. 2022. Does the Bronze Garbage Collector Make Rust Easier to Use? A Controlled Experiment. In ICSE. doi:10.1145/3510003.3510107 [12] Nick Fitzgerald. 2015. bacon-rajan-cc: A reference counted type with cycle collection for Rust. https://github.com/fitzgen/bacon-rajan-cc. Accessed: 2024-10-15. [13] Brian Goetz and Mikael Vidstedt. 2021. JEP 421: Deprecate Finalization for Removal. https://openjdk.org/jeps/421. Accessed: 2024-10-15. [14] Manish Goregaokar. 2015. rust-gc. https://github.com/Manishearth/rust-gc/. Accessed: 2024-10-15. [15] Manish Goregaokar. 2016. GC support in Rust: API design. https://manishearth.github.io/blog/2016/08/18/gc-support-in-rust-api-design/. Accessed: 2024-10-15. [16] Manish Goregaokar. 2021. A tour of safe tracing GC designs in Rust. https://manishearth.github.io/blog/2021/03/15/arenas-in-rust/. Accessed: 2024-10-15. [17] Graydon Hoare. 2022. Reply to 'What should be included in a history of the Rust language?'. https://www.reddit.com/r/rust/comments/za5lh5/comment/iyp0ptm/. [18] Jacob Hughes and Laurence Tratt. 2025. Reproduction Package for Article 'Garbage Collection for Rust: The Finalizer Frontier'. Zenodo. doi:10.5281/zenodo.17013382 [19] Richard Jones, Antony Hosking, and Eliot Moss. 2023. The Garbage Collection Handbook: the Art of Automatic Memory Management (second ed.). Chapman and Hall/CRC. doi:10.1201/9781003276142 [20] Steve Klabnik and Carol Nichols. 2018. The Rust Programming Language. No Starch Press. doi:10.5555/3271463 [21] Felix S. Klock. 2015. GC and Rust: specifying the problem. http://blog.pnkfx.org/blog/2015/11/10/gc-and-rust-part-1-specing-the-problem/. Accessed: 2024-10-15. [22] Felix S. Klock. 2016. GC and Rust: The roots of the problem. http://blog.pnkfx.org/blog/2016/01/01/gc-and-rust-part-2-roots-of-the-problem/. Accessed: 2024-10-15. [23] LLVM. 2014. Garbage collection with LLVM. https://llvm.org/docs/GarbageCollection.html. Accessed: 2024-10-15. [24] Niko Matsakis. 2013. Destructors and finalizers in Rust. http://smallcultfollowing.com/babysteps/blog/2013/01/17/destructors-and-finalizers-in-rust/. Accessed: 2024-10-15. [25] Benjamin C Pierce. 2004. Advanced topics in Types and Programming Languages. MIT press. doi:10.7551/mitpress/1104.001.0001 [26] Filip Pizlo. 2017. Introducing Riptide: WebKit's retreating wavefront concurrent garbage collector. https://webkit.org/blog/7122/introducing-riptide-webkits-retreating-wavefront-concurrent-garbage-collector/. Accessed: 2024-10-15. [27] Rust. 2014. The Rust language reference. https://doc.rust-lang.org/reference/. Accessed: 2024-10-01. [28] Rifat Shahriyar, Stephen M. Blackburn, and Kathryn S. McKinley. 2014. Fast conservative garbage collection. In OOPSLA. 121–139. doi:10.1145/2660193.2660198 [29] Alekesy Shpilëv. 2020. JVM Anatomy Quark #8: Local Variable Reachability. https://shipilev.net/jvm/anatomy-quarks/8-local-var-reachability/. Accessed: 2024-10-08. [30] Bjarne Stroustrup. 1997. The C++ Programming Language (third ed.). Addison-Wesley. [31] Lei Wang and Xikun Sun. 2006. Escape analysis for synchronization removal. In SAC. 1419–1423. doi:10.1145/1141277.1141607 [32] Catherine West. 2019. gc-arena: An experimental system for Rust garbage collection. https://github.com/kyren/gc-arena/. Accessed: 2024-10-15. [33] Jason Williams. 2018. Boa: an experimental JavaScript lexer, parser and interpreter written in Rust. https://github.com/boa-dev/boa. Accessed: 2024-10-15. [34] Milian Wolff. 2014. Heaptrack - A Heap Memory Profiler for Linux. https://milianw.de/blog/heaptrack-a-heap-memory-profiler-for-linux.html. Accessed: 2025-03-25. [35] Joshua Yanovski, Hoang-Hai Dang, Ralf Jung, and Derek Dreyer. 2021. GhostCell: separating permissions from data in Rust. 5 (Aug. 2021), 1–30. doi:10.1145/3473597
Appendix
function RemoveElidableDrops(func): for each basic_block ∈ func do if IsDropTerminator(basic_block.terminator.kind) then ty ← GetTypeOfDroppedValue(block.terminator); if IsGcType(ty) then if not RequiresFinalizer(ty) then ReplaceTerminator(basic_block); function ReplaceTerminator(basic_block): drop_func ← GetDropFunc(basic_block.terminator); last_block ← GetLastBasicBlock(drop_func); block.terminator ← last_block.terminator;
Additional Experimental Data
Benchmark Description Reason for exclusion bevy ECS game engine in Rust Unable to port successfully (see Section 8.1.2) dyon Scripting language in Rust Unable to port successfully (see Section 8.1.2) jiff A datetime library for Rust Too few allocations to measure mini-moka Concurrent in-memory cache library Too few allocations to measure numbat Math search engine Too few allocations to measure rkyv Zero-copy deserialization framework Insufficient Gc coverage in benchmarks RustPython Python interpreter written in Rust Difficulty retro-fitting __del__ semantics (see Section 8.1.2) rust-analyzer Language server for Rust Unable to port successfully (see Section 8.1.2) salsa Incremental recomputation library Too few allocations to measure WLambda Scripting language written in Rust Insufficient Gc coverage in benchmarks
Wall-clock time (s) Gc Rc Gc ( Rust-GC ) Arena Alacritty 0.41 [0.39, 0.45] 0.40 [0.38, 0.44] – – Binary Trees 0.11 [0.11, 0.11] 0.15 [0.14, 0.15] 0.33 [0.32, 0.33] 0.03 [0.03, 0.04] fd 0.33 [0.29, 0.38] 0.31 [0.26, 0.37] – – grmtools 3.06 [3.00, 3.14] 3.24 [3.17, 3.31] – – Regex-Redux 0.47 [0.47, 0.47] 0.45 [0.45, 0.46] – – Ripgrep 1.61 [1.55, 1.69] 1.52 [1.45, 1.59] – – som-rs-ast 0.92 [0.88, 0.95] 0.79 [0.76, 0.82] – – som-rs-bc 0.28 [0.27, 0.29] 0.29 [0.28, 0.30] – –
Wall-clock time (s) Ratio jemalloc BDWGC Alacritty 0.36 [0.33, 0.40] 0.40 [0.38, 0.44] 1.11 Binary Trees 0.12 [0.12, 0.12] 0.15 [0.14, 0.15] 1.26 fd 0.30 [0.25, 0.36] 0.31 [0.26, 0.37] 1.02 grmtools 3.09 [3.01, 3.17] 3.24 [3.17, 3.31] 1.05 Regex-Redux 0.45 [0.44, 0.45] 0.45 [0.45, 0.46] 1.01 Ripgrep 1.46 [1.40, 1.53] 1.52 [1.45, 1.59] 1.04 som-rs-ast 0.77 [0.74, 0.80] 0.79 [0.76, 0.82] 1.02 som-rs-bc 0.28 [0.27, 0.29] 0.29 [0.28, 0.30] 1.02
Wall-clock time (s) jemalloc BDWGC Rc Gc Rc Alacritty ▶ Cur. Motion 0.65 ±0.02 0.66 ±0.01 0.66 ±0.01 Dense Cells 2.16 ±0.03 2.16 ±0.02 2.14 ±0.04 Light Cells 0.37 ±0.01 0.37 ±0.01 0.38 ±0.01 Scroll 0.25 ±0.01 0.26 ±0.02 0.26 ±0.01 Scroll (fullscreen) 0.38 ±0.01 0.39 ±0.01 0.38 ±0.01 Scroll Btm 0.32 ±0.00 0.33 ±0.00 0.32 ±0.01 Scroll Btm (small) 0.32 ±0.01 0.33 ±0.00 0.32 ±0.01 Scroll Top 0.25 ±0.02 0.26 ±0.02 0.24 ±0.02 Scroll Top (small) 0.32 ±0.01 0.32 ±0.01 0.33 ±0.00 Unicode 0.23 ±0.02 0.27 ±0.01 0.27 ±0.02 fd ▶ Cmd Exec. 1.29 ±0.01 1.28 ±0.01 1.29 ±0.02 Cmd Exec. (large) 1.24 ±0.01 1.25 ±0.03 1.26 ±0.01 File Extension 0.13 ±0.00 0.15 ±0.01 0.13 ±0.00 File Type 0.12 ±0.00 0.13 ±0.01 0.13 ±0.00 No Pattern 0.18 ±0.00 0.29 ±0.02 0.22 ±0.01 Simple 0.27 ±0.01 0.33 ±0.01 0.32 ±0.00 Simple (-HI) 0.12 ±0.00 0.15 ±0.00 0.13 ±0.00 som-rs-ast ▶ Bounce 0.62 ±0.00 0.96 ±0.02 0.80 ±0.00 BubbleSort 0.56 ±0.01 0.84 ±0.01 0.71 ±0.00 DeltaBlue 0.77 ±0.00 1.09 ±0.01 0.98 ±0.00 Dispatch 0.59 ±0.01 0.97 ±0.00 0.77 ±0.01 Fannkuch 0.65 ±0.00 1.01 ±0.06 0.81 ±0.00 Fibonacci 0.83 ±0.00 1.41 ±0.01 1.08 ±0.00 FieldLoop 0.94 ±0.00 1.20 ±0.01 1.07 ±0.01 GraphSearch 0.31 ±0.00 0.43 ±0.01 0.38 ±0.01 IntegerLoop 0.52 ±0.01 0.84 ±0.00 0.67 ±0.01 JsonSmall 0.86 ±0.00 1.22 ±0.01 1.10 ±0.00 List 0.38 ±0.00 0.53 ±0.01 0.48 ±0.00 Loop 0.53 ±0.01 0.86 ±0.01 0.70 ±0.00 Mandelbrot 0.46 ±0.00 0.69 ±0.01 0.57 ±0.00 NBody 0.31 ±0.00 0.38 ±0.01 0.34 ±0.00 PageRank 0.31 ±0.00 0.44 ±0.00 0.39 ±0.00 Permute 0.67 ±0.00 0.90 ±0.02 0.82 ±0.00 Queens 0.77 ±0.01 1.18 ±0.02 0.98 ±0.00 QuickSort 0.98 ±0.00 1.27 ±0.02 1.26 ±0.01 Recurse 0.60 ±0.00 0.92 ±0.00 0.79 ±0.00 Richards 2.48 ±0.02 3.72 ±0.11 3.13 ±0.01 Sieve 0.67 ±0.00 0.94 ±0.01 0.87 ±0.00 Storage 0.56 ±0.00 0.70 ±0.01 0.69 ±0.00 Sum 0.52 ±0.01 0.83 ±0.01 0.67 ±0.00 Towers 0.31 ±0.00 0.46 ±0.01 0.38 ±0.01 TreeSort 0.32 ±0.00 0.41 ±0.01 0.40 ±0.00 WhileLoop 0.48 ±0.00 0.72 ±0.00 0.60 ±0.00 grmtools ▶ Ripgrep ▶ Alternates 1.18 ±0.01 1.29 ±0.02 1.21 ±0.01 Alternates (-i) 1.31 ±0.01 1.42 ±0.01 1.33 ±0.01 Literal 1.12 ±0.01 1.23 ±0.01 1.15 ±0.01 Literal (-i) 1.19 ±0.00 1.27 ±0.01 1.20 ±0.01 Literal (default) 1.08 ±0.01 1.19 ±0.01 1.10 ±0.01 Literal (mmap) 1.64 ±0.01 1.75 ±0.02 1.67 ±0.01 Literal (mmap, -i) 1.67 ±0.01 1.80 ±0.01 1.72 ±0.01 Literal (regex) 1.13 ±0.01 1.25 ±0.01 1.15 ±0.01 UTF Greek 3.29 ±0.01 3.41 ±0.01 3.33 ±0.01 UTF Greek (-i) 3.30 ±0.01 3.42 ±0.03 3.33 ±0.01 UTF Word 1.09 ±0.01 1.21 ±0.01 1.14 ±0.00 UTF Word (alt.) 1.11 ±0.01 1.21 ±0.01 1.13 ±0.00 Word 1.12 ±0.00 1.22 ±0.02 1.14 ±0.01 som-rs-bc ▶ Bounce 0.25 ±0.00 0.28 ±0.00 0.30 ±0.00 BubbleSort 0.23 ±0.00 0.27 ±0.00 0.27 ±0.00 DeltaBlue 0.30 ±0.00 0.34 ±0.00 0.36 ±0.00 Dispatch 0.25 ±0.00 0.28 ±0.00 0.29 ±0.01 Fannkuch 0.25 ±0.00 0.29 ±0.00 0.29 ±0.00 Fibonacci 0.30 ±0.01 0.37 ±0.00 0.36 ±0.00 FieldLoop 0.49 ±0.00 0.43 ±0.00 0.54 ±0.01 GraphSearch 0.12 ±0.00 0.12 ±0.00 0.15 ±0.00 IntegerLoop 0.22 ±0.00 0.24 ±0.00 0.26 ±0.00 JsonSmall 0.37 ±0.00 0.39 ±0.01 0.44 ±0.00 List 0.18 ±0.00 0.22 ±0.00 0.21 ±0.01 Loop 0.23 ±0.00 0.26 ±0.00 0.27 ±0.00 Mandelbrot 0.20 ±0.00 0.22 ±0.01 0.23 ±0.00 NBody 0.11 ±0.00 0.12 ±0.00 0.13 ±0.00 PageRank 0.13 ±0.00 0.13 ±0.00 0.14 ±0.00 Permute 0.24 ±0.00 0.27 ±0.00 0.27 ±0.00 Queens 0.28 ±0.01 0.33 ±0.00 0.33 ±0.00 QuickSort 0.36 ±0.00 0.38 ±0.00 0.43 ±0.00 Recurse 0.25 ±0.00 0.31 ±0.00 0.30 ±0.01 Richards 0.92 ±0.00 1.20 ±0.01 1.10 ±0.01 Sieve 0.25 ±0.00 0.26 ±0.00 0.30 ±0.01 Storage 0.22 ±0.00 0.21 ±0.00 0.25 ±0.00 Sum 0.22 ±0.00 0.25 ±0.00 0.27 ±0.00 Towers 0.13 ±0.00 0.15 ±0.00 0.15 ±0.00 TreeSort 0.11 ±0.00 0.12 ±0.00 0.13 ±0.00 WhileLoop 0.22 ±0.00 0.23 ±0.00 0.25 ±0.00
User time (s) jemalloc BDWGC Rc Gc Rc Alacritty ▶ Cur. Motion 1.08 ±0.05 1.09 ±0.04 1.09 ±0.04 Dense Cells 6.66 ±0.13 6.62 ±0.12 6.58 ±0.15 Light Cells 0.34 ±0.03 0.34 ±0.03 0.34 ±0.03 Scroll 0.11 ±0.01 0.11 ±0.01 0.12 ±0.02 Scroll (fullscreen) 0.24 ±0.02 0.28 ±0.03 0.22 ±0.02 Scroll Btm 0.14 ±0.01 0.14 ±0.01 0.14 ±0.01 Scroll Btm (small) 0.14 ±0.01 0.15 ±0.01 0.14 ±0.01 Scroll Top 0.12 ±0.01 0.12 ±0.02 0.11 ±0.01 Scroll Top (small) 0.14 ±0.01 0.14 ±0.01 0.15 ±0.01 Unicode 0.10 ±0.01 0.13 ±0.01 0.12 ±0.01 fd ▶ Cmd Exec. 1.05 ±0.01 1.12 ±0.02 1.12 ±0.02 Cmd Exec. (large) 1.02 ±0.01 1.10 ±0.03 1.11 ±0.02 File Extension 0.04 ±0.00 0.04 ±0.01 0.04 ±0.01 File Type 0.03 ±0.01 0.04 ±0.01 0.04 ±0.01 No Pattern 0.16 ±0.01 0.29 ±0.02 0.23 ±0.01 Simple 0.14 ±0.01 0.18 ±0.01 0.16 ±0.01 Simple (-HI) 0.03 ±0.01 0.04 ±0.01 0.04 ±0.01 som-rs-ast ▶ Bounce 0.62 ±0.00 0.96 ±0.02 0.79 ±0.00 BubbleSort 0.56 ±0.00 0.83 ±0.01 0.71 ±0.00 DeltaBlue 0.77 ±0.00 1.07 ±0.01 0.97 ±0.00 Dispatch 0.59 ±0.01 0.96 ±0.01 0.76 ±0.01 Fannkuch 0.64 ±0.00 0.98 ±0.05 0.80 ±0.00 Fibonacci 0.82 ±0.00 1.40 ±0.01 1.07 ±0.00 FieldLoop 0.94 ±0.00 1.20 ±0.01 1.07 ±0.01 GraphSearch 0.30 ±0.00 0.41 ±0.01 0.37 ±0.00 IntegerLoop 0.52 ±0.01 0.83 ±0.00 0.66 ±0.01 JsonSmall 0.85 ±0.00 1.21 ±0.01 1.10 ±0.00 List 0.38 ±0.00 0.52 ±0.01 0.47 ±0.00 Loop 0.52 ±0.01 0.85 ±0.01 0.70 ±0.00 Mandelbrot 0.45 ±0.00 0.69 ±0.01 0.57 ±0.00 NBody 0.31 ±0.00 0.38 ±0.01 0.34 ±0.00 PageRank 0.30 ±0.00 0.43 ±0.01 0.38 ±0.00 Permute 0.66 ±0.00 0.89 ±0.02 0.82 ±0.00 Queens 0.77 ±0.00 1.18 ±0.02 0.97 ±0.00 QuickSort 0.98 ±0.00 1.26 ±0.02 1.26 ±0.00 Recurse 0.60 ±0.00 0.91 ±0.00 0.79 ±0.00 Richards 2.47 ±0.02 3.72 ±0.11 3.13 ±0.01 Sieve 0.66 ±0.00 0.94 ±0.01 0.87 ±0.00 Storage 0.55 ±0.00 0.69 ±0.01 0.68 ±0.00 Sum 0.52 ±0.00 0.82 ±0.01 0.67 ±0.00 Towers 0.31 ±0.00 0.46 ±0.01 0.38 ±0.00 TreeSort 0.31 ±0.00 0.40 ±0.01 0.40 ±0.00 WhileLoop 0.47 ±0.00 0.72 ±0.00 0.59 ±0.00 grmtools ▶ Ripgrep ▶ Alternates 0.42 ±0.02 0.51 ±0.02 0.43 ±0.02 Alternates (-i) 0.54 ±0.02 0.65 ±0.02 0.57 ±0.02 Literal 0.34 ±0.02 0.44 ±0.01 0.36 ±0.01 Literal (-i) 0.41 ±0.02 0.50 ±0.02 0.44 ±0.02 Literal (default) 0.30 ±0.02 0.40 ±0.01 0.32 ±0.01 Literal (mmap) 0.43 ±0.02 0.52 ±0.02 0.46 ±0.02 Literal (mmap, -i) 0.48 ±0.02 0.58 ±0.02 0.50 ±0.02 Literal (regex) 0.35 ±0.02 0.44 ±0.02 0.36 ±0.01 UTF Greek 2.55 ±0.02 2.64 ±0.02 2.58 ±0.03 UTF Greek (-i) 2.56 ±0.03 2.66 ±0.03 2.56 ±0.02 UTF Word 0.35 ±0.02 0.42 ±0.02 0.37 ±0.02 UTF Word (alt.) 0.33 ±0.02 0.43 ±0.02 0.35 ±0.01 Word 0.35 ±0.02 0.43 ±0.02 0.36 ±0.01 som-rs-bc ▶ Bounce 0.25 ±0.00 0.28 ±0.00 0.29 ±0.00 BubbleSort 0.22 ±0.00 0.27 ±0.00 0.27 ±0.00 DeltaBlue 0.30 ±0.00 0.34 ±0.00 0.35 ±0.01 Dispatch 0.24 ±0.00 0.28 ±0.00 0.29 ±0.00 Fannkuch 0.24 ±0.00 0.28 ±0.00 0.28 ±0.00 Fibonacci 0.30 ±0.00 0.36 ±0.00 0.36 ±0.00 FieldLoop 0.48 ±0.00 0.42 ±0.01 0.53 ±0.01 GraphSearch 0.12 ±0.00 0.11 ±0.00 0.14 ±0.00 IntegerLoop 0.22 ±0.00 0.24 ±0.00 0.26 ±0.00 JsonSmall 0.36 ±0.00 0.39 ±0.01 0.44 ±0.00 List 0.17 ±0.00 0.22 ±0.00 0.21 ±0.01 Loop 0.22 ±0.00 0.25 ±0.00 0.27 ±0.00 Mandelbrot 0.19 ±0.00 0.21 ±0.01 0.23 ±0.00 NBody 0.11 ±0.00 0.12 ±0.00 0.13 ±0.00 PageRank 0.12 ±0.00 0.12 ±0.00 0.14 ±0.00 Permute 0.23 ±0.00 0.26 ±0.00 0.27 ±0.00 Queens 0.27 ±0.01 0.33 ±0.00 0.33 ±0.00 QuickSort 0.35 ±0.00 0.38 ±0.00 0.42 ±0.00 Recurse 0.25 ±0.00 0.31 ±0.00 0.30 ±0.01 Richards 0.92 ±0.00 1.20 ±0.01 1.10 ±0.01 Sieve 0.25 ±0.00 0.26 ±0.00 0.30 ±0.01 Storage 0.22 ±0.00 0.20 ±0.00 0.25 ±0.00 Sum 0.22 ±0.00 0.24 ±0.00 0.26 ±0.00 Towers 0.12 ±0.00 0.15 ±0.00 0.14 ±0.00 TreeSort 0.11 ±0.00 0.11 ±0.00 0.12 ±0.00 WhileLoop 0.21 ±0.00 0.22 ±0.00 0.25 ±0.00
Heap Size (MiB) Relative wall-clock time Benchmarks failed Alacritty 16 0.96 [0.91, 0.99] 32 0.98 [0.95, 1.02] 64 0.94 [0.89, 0.98] Binary Trees 4 0.88 [0.82, 1.02] 8 0.90 [0.80, 1.01] 16 0.87 [0.82, 0.94] fd 16 0.94 [0.90, 0.99] 32 0.94 [0.88, 0.98] 64 0.94 [0.89, 1.00] grmtools 1024 1.01 [1.00, 1.02] 2/4 (Eclipse, Jenkins) 2048 1.00 [1.00, 1.01] 4096 1.01 [1.00, 1.02] Regex-Redux 256 0.94 [0.92, 0.95] 512 0.93 [0.90, 0.94] 1024 0.96 [0.92, 1.07] Ripgrep 32 0.96 [0.95, 0.96] 64 0.95 [0.94, 0.95] 128 0.94 [0.93, 0.95] som-rs-ast 64 0.72 [0.71, 0.74] 2/4 (Fannkuch, TreeSort) 96 0.74 [0.73, 0.75] 2/4 (Fannkuch, TreeSort) 128 0.75 [0.74, 0.76] som-rs-bc 32 0.79 [0.78, 0.80] 64 0.79 [0.77, 0.80] 128 0.84 [0.83, 0.86]
Fin. elided (%) Alacritty ▶ Cur. Motion 0.00 Dense Cells 0.00 Light Cells 0.00 Scroll 0.00 Scroll Btm 0.00 Scroll Btm (small) 0.00 Scroll (fullscreen) 0.00 Scroll Top 0.00 Scroll Top (small) 0.00 Unicode 0.00 fd ▶ Cmd Exec. 8.93 Cmd Exec. (large) 9.09 File Extension 72.73 File Type 24.54 No Pattern 0.04 Simple 61.54 Simple (-HI) 61.54 som-rs-ast ▶ Bounce 100.00 BubbleSort 100.00 DeltaBlue 100.00 Dispatch 100.00 Fannkuch 100.00 Fibonacci 100.00 FieldLoop 100.00 GraphSearch 99.99 IntegerLoop 100.00 JsonSmall 100.00 List 100.00 Loop 100.00 Mandelbrot 100.00 NBody 99.99 PageRank 99.99 Permute 100.00 Queens 100.00 QuickSort 100.00 Recurse 100.00 Richards 100.00 Sieve 100.00 Storage 100.00 Sum 100.00 Towers 99.99 TreeSort 99.99 WhileLoop 100.00 grmtools ▶ Ripgrep ▶ Alternates 79.04 Alternates (-i) 79.04 Literal 79.04 Literal (-i) 79.04 Literal (mmap, -i) 79.04 Literal (default) 79.04 Literal (mmap) 79.04 Literal (regex) 79.04 UTF Greek 79.04 UTF Greek (-i) 79.04 UTF Word 79.04 UTF Word (alt.) 79.04 Word 79.04 som-rs-bc ▶ Bounce 72.50 BubbleSort 76.34 DeltaBlue 73.86 Dispatch 77.78 Fannkuch 72.74 Fibonacci 70.37 FieldLoop 83.33 GraphSearch 76.56 IntegerLoop 75.00 JsonSmall 71.85 List 79.41 Loop 74.82 Mandelbrot 70.93 NBody 86.17 PageRank 70.95 Permute 79.23 Queens 79.04 QuickSort 69.33 Recurse 82.61 Richards 71.17 Sieve 73.15 Storage 73.84 Sum 75.00 Towers 78.29 TreeSort 65.19 WhileLoop 74.98
Wall-clock time (s) Before elision After elision Alacritty ▶ Cur. Motion 0.66 ±0.01 0.66 ±0.01 Dense Cells 2.17 ±0.03 2.17 ±0.02 Light Cells 0.39 ±0.01 0.38 ±0.01 Scroll 0.27 ±0.01 0.27 ±0.01 Scroll (fullscreen) 0.39 ±0.01 0.39 ±0.01 Scroll Btm 0.33 ±0.00 0.33 ±0.00 Scroll Btm (small) 0.33 ±0.00 0.33 ±0.00 Scroll Top 0.25 ±0.01 0.24 ±0.04 Scroll Top (small) 0.33 ±0.00 0.33 ±0.01 Unicode 0.26 ±0.02 0.26 ±0.02 fd ▶ Cmd Exec. 1.29 ±0.02 1.29 ±0.02 Cmd Exec. (large) 1.25 ±0.01 1.25 ±0.02 File Extension 0.15 ±0.00 0.14 ±0.01 File Type 0.13 ±0.01 0.13 ±0.01 No Pattern 0.28 ±0.02 0.29 ±0.02 Simple 0.34 ±0.01 0.35 ±0.00 Simple (-HI) 0.14 ±0.01 0.14 ±0.01 som-rs-ast ▶ Bounce 7.68 ±0.40 1.03 ±0.01 BubbleSort 5.27 ±0.29 0.92 ±0.01 DeltaBlue 7.84 ±0.19 1.17 ±0.01 Dispatch 6.52 ±0.46 1.04 ±0.01 Fannkuch 6.76 ±0.09 1.15 ±0.03 Fibonacci 22.16 ±1.87 1.54 ±0.01 FieldLoop 5.14 ±0.16 1.25 ±0.01 GraphSearch 2.82 ±0.06 0.47 ±0.00 IntegerLoop 5.88 ±0.11 0.90 ±0.01 JsonSmall 11.37 ±0.77 1.33 ±0.01 List 4.06 ±0.06 0.58 ±0.00 Loop 6.02 ±0.14 0.90 ±0.01 Mandelbrot 4.67 ±0.11 0.74 ±0.01 NBody 2.91 ±0.16 0.40 ±0.00 PageRank 2.44 ±0.05 0.47 ±0.00 Permute 6.69 ±0.35 0.97 ±0.02 Queens 9.07 ±0.34 1.28 ±0.02 QuickSort 10.86 ±0.66 1.39 ±0.02 Recurse 7.66 ±0.24 0.98 ±0.00 Richards 97.40 ±10.46 3.95 ±0.08 Sieve 7.06 ±0.13 1.01 ±0.01 Storage 5.06 ±0.10 0.75 ±0.01 Sum 5.91 ±0.12 0.89 ±0.01 Towers 3.20 ±0.14 0.50 ±0.01 TreeSort 4.65 ±0.32 0.45 ±0.00 WhileLoop 5.03 ±0.21 0.76 ±0.01 grmtools ▶ Ripgrep ▶ Alternates 1.44 ±0.03 1.39 ±0.01 Alternates (-i) 1.55 ±0.02 1.50 ±0.01 Literal 1.38 ±0.03 1.32 ±0.01 Literal (-i) 1.45 ±0.02 1.38 ±0.01 Literal (default) 1.38 ±0.02 1.30 ±0.01 Literal (mmap) 1.90 ±0.02 1.81 ±0.01 Literal (mmap, -i) 1.90 ±0.03 1.84 ±0.01 Literal (regex) 1.36 ±0.03 1.33 ±0.01 UTF Greek 3.53 ±0.01 3.50 ±0.02 UTF Greek (-i) 3.51 ±0.02 3.52 ±0.02 UTF Word 1.39 ±0.02 1.30 ±0.01 UTF Word (alt.) 1.36 ±0.02 1.31 ±0.02 Word 1.36 ±0.02 1.31 ±0.01 som-rs-bc ▶ Bounce 3.03 ±0.04 0.31 ±0.00 BubbleSort 2.64 ±0.04 0.29 ±0.00 DeltaBlue 4.19 ±0.08 0.37 ±0.00 Dispatch 3.80 ±0.36 0.30 ±0.00 Fannkuch 2.86 ±0.23 0.31 ±0.00 Fibonacci 4.34 ±0.20 0.40 ±0.00 FieldLoop 2.45 ±0.16 0.45 ±0.01 GraphSearch 1.15 ±0.03 0.13 ±0.00 IntegerLoop 3.00 ±0.16 0.26 ±0.00 JsonSmall 4.96 ±0.80 0.42 ±0.01 List 2.69 ±0.09 0.24 ±0.01 Loop 3.07 ±0.18 0.28 ±0.00 Mandelbrot 2.21 ±0.15 0.23 ±0.00 NBody 1.23 ±0.06 0.13 ±0.00 PageRank 1.06 ±0.01 0.14 ±0.00 Permute 2.72 ±0.09 0.29 ±0.00 Queens 3.19 ±0.07 0.35 ±0.00 QuickSort 3.62 ±0.05 0.41 ±0.00 Recurse 4.27 ±1.91 0.33 ±0.00 Richards 15.51 ±0.89 1.31 ±0.01 Sieve 2.57 ±0.07 0.28 ±0.00 Storage 2.45 ±0.05 0.22 ±0.00 Sum 3.06 ±0.37 0.27 ±0.00 Towers 1.59 ±0.05 0.16 ±0.00 TreeSort 1.19 ±0.02 0.13 WhileLoop 3.26 ±0.08 0.24 ±0.01
User time (s) Before elision After elision Alacritty ▶ Cur. Motion 1.08 ±0.06 1.12 ±0.05 Dense Cells 6.70 ±0.12 6.68 ±0.13 Light Cells 0.34 ±0.04 0.33 ±0.04 Scroll 0.12 ±0.01 0.12 ±0.01 Scroll (fullscreen) 0.27 ±0.03 0.26 ±0.03 Scroll Btm 0.15 ±0.01 0.15 ±0.01 Scroll Btm (small) 0.15 ±0.01 0.14 ±0.01 Scroll Top 0.12 ±0.01 0.12 ±0.06 Scroll Top (small) 0.15 ±0.01 0.14 ±0.01 Unicode 0.13 ±0.02 0.13 ±0.02 fd ▶ Cmd Exec. 1.13 ±0.01 1.13 ±0.02 Cmd Exec. (large) 1.10 ±0.02 1.11 ±0.02 File Extension 0.05 ±0.01 0.04 ±0.00 File Type 0.04 ±0.00 0.04 ±0.01 No Pattern 0.27 ±0.02 0.30 ±0.02 Simple 0.19 ±0.01 0.20 ±0.01 Simple (-HI) 0.04 ±0.01 0.04 ±0.01 som-rs-ast ▶ Bounce 11.15 ±0.41 1.02 ±0.01 BubbleSort 7.88 ±0.40 0.90 ±0.01 DeltaBlue 11.72 ±0.21 1.15 ±0.01 Dispatch 9.15 ±0.63 1.03 ±0.01 Fannkuch 9.60 ±0.12 1.12 ±0.03 Fibonacci 26.41 ±1.82 1.53 ±0.01 FieldLoop 6.42 ±0.17 1.25 ±0.01 GraphSearch 3.92 ±0.07 0.45 ±0.01 IntegerLoop 8.23 ±0.13 0.90 ±0.01 JsonSmall 16.08 ±0.75 1.32 ±0.01 List 5.79 ±0.09 0.57 ±0.00 Loop 8.51 ±0.17 0.89 ±0.01 Mandelbrot 6.62 ±0.12 0.73 ±0.01 NBody 3.98 ±0.15 0.40 ±0.00 PageRank 3.73 ±0.07 0.47 ±0.00 Permute 9.96 ±0.34 0.96 ±0.02 Queens 13.04 ±0.33 1.27 ±0.02 QuickSort 16.43 ±0.65 1.38 ±0.02 Recurse 11.18 ±0.28 0.98 ±0.00 Richards 109.49 ±10.34 3.95 ±0.08 Sieve 10.49 ±0.18 1.00 ±0.01 Storage 7.70 ±0.12 0.74 ±0.01 Sum 8.27 ±0.14 0.89 ±0.01 Towers 4.61 ±0.14 0.49 ±0.01 TreeSort 5.90 ±0.35 0.44 ±0.00 WhileLoop 6.74 ±0.20 0.76 ±0.01 grmtools ▶ Ripgrep ▶ Alternates 0.58 ±0.02 0.50 ±0.02 Alternates (-i) 0.72 ±0.02 0.64 ±0.02 Literal 0.53 ±0.02 0.44 ±0.02 Literal (-i) 0.59 ±0.02 0.51 ±0.01 Literal (default) 0.48 ±0.02 0.40 ±0.02 Literal (mmap) 0.63 ±0.02 0.53 ±0.02 Literal (mmap, -i) 0.65 ±0.02 0.57 ±0.02 Literal (regex) 0.52 ±0.02 0.43 ±0.01 UTF Greek 2.70 ±0.02 2.66 ±0.03 UTF Greek (-i) 2.71 ±0.03 2.67 ±0.02 UTF Word 0.53 ±0.02 0.42 ±0.02 UTF Word (alt.) 0.52 ±0.02 0.43 ±0.02 Word 0.51 ±0.02 0.44 ±0.02 som-rs-bc ▶ Bounce 4.34 ±0.04 0.30 ±0.00 BubbleSort 3.76 ±0.05 0.29 ±0.00 DeltaBlue 5.76 ±0.11 0.36 ±0.00 Dispatch 4.91 ±0.35 0.29 ±0.00 Fannkuch 3.85 ±0.29 0.30 ±0.00 Fibonacci 6.15 ±0.17 0.40 ±0.00 FieldLoop 3.14 ±0.15 0.45 ±0.01 GraphSearch 1.54 ±0.03 0.12 ±0.00 IntegerLoop 3.94 ±0.15 0.26 ±0.00 JsonSmall 6.70 ±0.73 0.41 ±0.01 List 3.72 ±0.09 0.24 ±0.01 Loop 4.11 ±0.17 0.28 ±0.00 Mandelbrot 2.91 ±0.13 0.23 ±0.00 NBody 1.72 ±0.06 0.12 ±0.00 PageRank 1.51 ±0.02 0.13 ±0.00 Permute 3.84 ±0.09 0.28 ±0.00 Queens 4.53 ±0.07 0.35 ±0.00 QuickSort 5.28 ±0.09 0.40 ±0.00 Recurse 5.84 ±1.85 0.33 ±0.00 Richards 20.23 ±0.81 1.30 ±0.01 Sieve 3.76 ±0.12 0.27 ±0.01 Storage 3.23 ±0.05 0.22 ±0.00 Sum 4.03 ±0.36 0.26 ±0.00 Towers 2.20 ±0.05 0.16 ±0.00 TreeSort 1.71 ±0.03 0.13 ±0.00 WhileLoop 3.90 ±0.09 0.24 ±0.01