Tech News
← Back to articles

Engineering a fixed-width bit-packed integer vector in Rust

read original related products more articles

Posted on: September 24, 2025

If you’ve ever worked with massive datasets, you know that memory usage can quickly become a bottleneck. While developing succinct data structures, I found myself needing to store large arrays of integers—values with no monotonicity or other exploitable patterns, that I knew came from a universe much smaller than their type’s theoretical capacity.

In this post, we will explore the engineering challenges involved in implementing an efficient vector-like data structure in Rust that stores integers in a compressed, bit-packed format. We will focus on achieving O(1) random access performance while minimizing memory usage. We will try to mimic the ergonomics of Rust’s standard Vec as closely as possible, including support for mutable access and zero-copy slicing.

All the code can be found on github: compressed-intvec

This is also published as a crate on crates.io: compressed-intvec

Memory Waste in Standard Vectors

In Rust, the contract of a Vec (where T is a primitive integer type like u64 or i32 ) is simple: O(1) random access in exchange for a memory layout that is tied to the static size of T . This is a good trade-off, until it isn’t. When the dynamic range of the stored values is significantly smaller than the type’s capacity, this memory layout leads to substantial waste.

Consider storing the value 5 within a Vec . Its 8-byte in-memory representation is:

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000101

Only 3 bits are necessary to represent the value, leaving 61 bits as zero-padding. The same principle applies, albeit less dramatically, when storing 5 in a u32 or u16 . At scale, this overhead becomes prohibitive. A vector of one billion u64 elements consumes 10^9 * std::mem::size_of::() , or approximately 8 GB of memory, even if every element could fit within a single byte.

... continue reading