
HyAB k-means for color quantization


Color quantization in CIELAB space, visualized. The input is converted to CIELAB space and a special “HyAB” distance formula is used when clustering. In theory, this should result in better image quality.

I’ve been obsessing over color quantization algorithms lately, and when I learned that an image conversion app called Composite did its so-called pixel mapping step in CIELAB space, I instantly thought of the “HyAB” color distance formula I’d seen in the FLIP error metric paper from 2020.

By “pixel mapping” I mean choosing, for each pixel, the closest color in a fixed palette. This results in a compressed version of the original image. The closest color is usually taken to be the one with the shortest Euclidean distance, computed in 3D on the RGB coordinates. Unfortunately RGB space is not perceptually uniform, so two colors with a small computed color difference might actually look very different to the human eye.
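As a rough illustration of pixel mapping with plain RGB distances, here is a minimal Python sketch. The function names `nearest_palette_color` and `map_pixels` and the three-color palette are made up for this example; they are not taken from Composite or any other tool.

```python
def nearest_palette_color(pixel, palette):
    """Return the palette color with the smallest Euclidean distance to `pixel`.

    Colors are (r, g, b) tuples. Squared distances are compared, which gives
    the same winner as true distances and avoids the square root.
    """
    def squared_distance(color):
        return sum((p - c) ** 2 for p, c in zip(pixel, color))

    return min(palette, key=squared_distance)


def map_pixels(pixels, palette):
    """Replace every pixel with its nearest palette color."""
    return [nearest_palette_color(p, palette) for p in pixels]


# Hypothetical palette and pixels, for illustration only.
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
pixels = [(10, 10, 10), (250, 240, 245), (200, 30, 40)]
mapped = map_pixels(pixels, palette)
```

A real implementation would work on whole image arrays rather than Python lists, but the per-pixel logic stays the same.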

Computing color differences in sRGB space works surprisingly well, but there are better options. The CIELAB (aka L*a*b*) space represents a color as a perceived brightness component L*, the “lightness”, and two components a* and b* that encode a “chrominance” coordinate in a 2D (red-green, blue-yellow) plane. Notably, the L* component is still gamma corrected here, but with a gamma of roughly 3 (a cube root) instead of the roughly 2.2 of sRGB.
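For reference, the conversion from sRGB to CIELAB can be sketched as follows. This assumes 8-bit sRGB input and the D65 white point with the commonly used sRGB-to-XYZ matrix; the function name `srgb_to_lab` is just a label for this example, and a library implementation may use slightly different constants.

```python
def srgb_to_lab(r, g, b):
    """Convert an 8-bit sRGB color to CIELAB (L*, a*, b*), assuming D65."""

    def to_linear(channel):
        # Undo the sRGB transfer function ("gamma") to get linear light.
        c = channel / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)

    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # D65 reference white.
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        # Cube root (the "gamma = 3" part) with a linear segment near zero.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)
```

With these constants, white lands at approximately L* = 100 with a* and b* near zero, and black at L* = 0.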

Given two colors (L^*_1, a^*_1, b^*_1) and (L^*_2, a^*_2, b^*_2) in the CIELAB color space, it’s simple to compute their difference, again as a Euclidean distance:

\text{Euclidean} = \sqrt{(L^*_1 - L^*_2)^2 + (a^*_1 - a^*_2)^2 + (b^*_1 - b^*_2)^2}
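In code, this Euclidean distance is a one-liner. The sketch below assumes colors are already given as (L*, a*, b*) tuples and uses `math.dist`, which requires Python 3.8 or newer; the name `euclidean_lab` is chosen for this example.

```python
import math


def euclidean_lab(lab1, lab2):
    """Euclidean distance between two (L*, a*, b*) colors."""
    return math.dist(lab1, lab2)


# Same lightness, chroma differs by (3, 4): the distance is 5.
d = euclidean_lab((50.0, 0.0, 0.0), (50.0, 3.0, 4.0))
```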

This is also known as the “CIE 1976” formula and is apparently the reason for CIELAB’s existence. Unfortunately it breaks down for large differences and also fails for some shades of blue. Wikipedia lists three later, increasingly complex, supposedly better-behaved formulas from the years 1984, 1994, and 2001. But what if there’s a simpler fix?

Now we get to the point. In the 2019 paper “Distance metrics for very large color differences”, Saeedeh Abasi and colleagues suggest the following “CD1” formula for large color distances:

\text{CD1} = |L^*_1 - L^*_2| + \sqrt{(a^*_1 - a^*_2)^2 + (b^*_1 - b^*_2)^2}
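The CD1 formula translates directly into code: an absolute difference in lightness plus a Euclidean distance in the (a*, b*) plane. This is a minimal sketch assuming colors are (L*, a*, b*) tuples; the function name `hyab` is chosen here to match the “HyAB” name used in the title.

```python
import math


def hyab(lab1, lab2):
    """CD1 distance: |dL*| plus the Euclidean distance in the (a*, b*) plane."""
    d_lightness = abs(lab1[0] - lab2[0])
    d_chroma = math.hypot(lab1[1] - lab2[1], lab1[2] - lab2[2])
    return d_lightness + d_chroma


# Lightness differs by 10 and chroma by (3, 4): 10 + 5 = 15.
c1 = (50.0, 0.0, 0.0)
c2 = (60.0, 3.0, 4.0)
d_hyab = hyab(c1, c2)
d_euclid = math.dist(c1, c2)  # plain Euclidean distance, about 11.18
```

Because the lightness term is added rather than folded into the square root, a difference in L* counts for more here than in the plain Euclidean distance whenever both lightness and chroma differ.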
