Posted on: September 24, 2025
If you’ve ever worked with massive datasets, you know that memory usage can quickly become a bottleneck. While developing succinct data structures, I found myself needing to store large arrays of integers—values with no monotonicity or other exploitable patterns, that I knew came from a universe much smaller than their type’s theoretical capacity.
In this post, we will explore the engineering challenges involved in implementing an efficient vector-like data structure in Rust that stores integers in a compressed, bit-packed format. We will focus on achieving O(1) random access performance while minimizing memory usage. We will try to mimic the ergonomics of Rust’s standard Vec
All the code can be found on github: compressed-intvec
This is also published as a crate on crates.io: compressed-intvec
Memory Waste in Standard Vectors
In Rust, the contract of a Vec
Consider storing the value 5 within a Vec
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000101
Only 3 bits are necessary to represent the value, leaving 61 bits as zero-padding. The same principle applies, albeit less dramatically, when storing 5 in a u32 or u16 . At scale, this overhead becomes prohibitive. A vector of one billion u64 elements consumes 10^9 * std::mem::size_of::
... continue reading