Libwce: The entropy layer of a wavelet codec, on its own

Most image codecs you know about such as JPEG, JPEG 2000, JPEG XS, WebP are like layer cakes. You have transform sitting on top, entropy coding at the bottom, and rate control floats somewhere in the middle. And then there's a metadata layer wrapping it all up. The interesting bits are hidden under tons of framing code, profile parsers, and standards plumbing. If you just want to see how wavelet coefficients become bits, you have to dig deep into the guts of the codec.

I wrote libwce as a bare-bones implementation, consisting of only a single lib.rs file, weighing in at 500 lines. It just implements a patent-clean Bit-Plane Count (BPC)-style entropy layer in the spirit of JPEG XS, and nothing else. There is no boilerplate or dependencies with the library relying solely on stdlib.

A two-minute primer on BPC coding

A raw video stream is basically a grid of pixels, most of which share very similar color and brightness values with their immediate neighbors. Storing every pixel individually wastes a ton of bandwidth since there is a lot of repeated data in the stream. Codecs are used to compress this information by transforming the image from the spatial domain into the frequency domain. Rather than tracking individual pixels, a codec uses mathematical frequencies to describe color changes across the image. Older formats like standard JPEG end up chopping the image into squares and applying a discrete cosine transform, leading to the blocky artifacts we all know and love.

A wavelet is a newer approach that solves the problem by applying the transform process to the whole image at once, splitting the signal into low-frequency structural data and high-frequency detail data across multiple scales. After the wavelet transform, you end up with a 2D array of signed integer coefficients, most of which are near zero, with a long Laplacian tail. The purpose of the entropy layer is to compress this array down to a small number of significant bits.

BPC coding is done using groups of four coefficients at a time. For each group, you have to determine the smallest bpc such that every coefficient can be held. This is the bit-plane count representing the index above which all coefficient bits in the group are zero. In libwce, all the bpc values are written first into a single bitstream, then for each group the four coefficients are emitted coeff-major. These are the magnitude bits of each coefficient followed immediately by a single sign bit when that coefficient is nonzero. That takes care of all the data processing you need to do. Then, you get to the actual compression when you go to encode these bpc values. Neighboring groups tend to have similar sizes, so instead of writing each bpc as a raw 6-bit number, you can estimate it from its neighbors and, instead, write a small residual which tends to be tiny.

Here, libwce uses RUNNING (DPCM delta vs the previous group's bpc , zigzag-mapped and Rice-coded) and ZERO (unsigned residual against lossy_bits ) predictors which can be optionally combined with a 1-bit-per-8-group sparse-block flag that short-circuits all-deadzone blocks. That leaves you with four predictor × flag combinations, and the encoder sweeps Rice-k across seven values inside each, picking the best per band via a single-pass cost search. All combinations give the same decoded result, but they produce different types of bitstreams. Each one works best for different pixel distribution such as textured regions, flat parts, or sub-bands which are mostly zeros.

What it looks like to use

Here's a complete decoder for one sub-band:

let mut coeffs = vec![0i32; N]; let lossy_bits = decode(buf, &mut coeffs).unwrap(); dequantize_optimal(&mut coeffs, lossy_bits, scale_b);

... continue reading