Schubfach: The smallest floating point double-to-string impleme

Converting floating-point numbers to their shortest decimal representations has long been a surprisingly challenging problem. For decades, developers relied on algorithms such as Dragon4 that used multi-precision arithmetic. More recently, a new generation of provably correct and high-performance algorithms emerged, including Ryu, Schubfach and Dragonbox.

In my earlier post, double to string conversion in 150 lines of code, I explored how compact a basic implementation can be while still maintaining correctness. The Schubfach algorithm pushes this idea much further: it enables fully proven, high-performance dtoa with an amount of code comparable to the naïve algorithm. This post gives a brief overview of Schubfach and then walks through the minimal C++ implementation in https://github.com/vitaut/schubfach.

A brief overview of the Schubfach algorithm

The Schubfach algorithm, introduced by Raffaello Giulietti, is a method to convert binary floating-point numbers into the shortest, correctly rounded decimal string representation with a round-trip guarantee. It is based on two key insights:

1. The non-iterative search guided by the pigeonhole principle

The algorithm is founded on the Schubfachprinzip (the pigeonhole principle), which gives the algorithm its name, to efficiently determine the correct decimal exponent ($k$) for the optimal shortest decimal representation ($d_v$).

Instead of iteratively searching for the shortest decimal representation, Schubfach determines a unique integer exponent $k$ based on comparing the distance between adjacent decimal values ($10^k$) in a given scale ($10_k$) against the width of the floating-point number’s rounding interval ($\Vert R_v \Vert$).

The exponent $k$ is calculated such that $10^k \leq \Vert R_v \Vert < 10^{k+1}$. This comparison guarantees that the set of decimals rounding back to the original floating-point value $v$ ($R_k$) contains at least one element, and the set at the next finest resolution ($R_{k+1}$) contains at most one element.

This approach makes the algorithm non-iterative.

2. Guaranteed accuracy through fixed-precision integer estimates

... continue reading