Schanuel's Conjecture and the Semantics of Triton's FPSan

I’ve been spending some of my time recently developing a tool called FPSan in collaboration with Pawel Szczerbuk. It’s implemented as a Triton compiler pass, but has none of the desirable properties expected of a compiler pass: in particular, it doesn’t preserve functionality, it makes things slower, and it’s hitherto completely undocumented. (On the last point, Pawel has an open PR adding documentation.)

Its purpose is to make it easier to verify algebraic equivalence of programs written in Triton that involve floating-point arithmetic. The key problem is that, in floating-point arithmetic, algebraic laws such as associativity do not hold exactly: in general, (a + b) + c need not equal a + (b + c). As such, if you rewrite a program in a way that relies on such a law, e.g. to replace a sequential summation loop with a parallel tree-shaped reduction, the rewritten program will no longer produce bit-identical results.
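To make the non-associativity concrete, here is a small sketch in plain Python (double precision rather than Triton's single precision, and with hypothetical helper names sequential_sum and tree_sum chosen for illustration). It sums the same four values with a left-to-right loop and with a pairwise tree reduction:

```python
def sequential_sum(values):
    """Left-to-right accumulation, as in a simple summation loop."""
    total = 0.0
    for v in values:
        total += v
    return total


def tree_sum(values):
    """Pairwise (tree-shaped) reduction over a non-empty list."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return tree_sum(values[:mid]) + tree_sum(values[mid:])


# The exact real-number sum of these values is 2.0.
values = [1e16, 1.0, -1e16, 1.0]

seq = sequential_sum(values)   # ((1e16 + 1) - 1e16) + 1
tree = tree_sum(values)        # (1e16 + 1) + (-1e16 + 1)

print(seq, tree)  # prints: 1.0 0.0
```

Neither ordering recovers the exact answer, and the two orderings disagree with each other, so a bitwise comparison of outputs cannot tell a correct rewrite from an incorrect one.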

FPSan can be viewed as an idempotent function on the space of programs that replaces all floating-point operations with (completely different!) integer operations, such that if f and g are algebraically equivalent programs then FPSan(f) and FPSan(g) produce identical results when given identical inputs.

More formally, conditional on the real version of Schanuel’s conjecture, this holds provided that the programs f and g have the following properties:

each program implements an arithmetic circuit on its floating-point inputs, and the control flow is independent of those floating-point inputs;

the arithmetic circuit only consists of inputs, outputs, the constants {-1.0, 0.0, +1.0}, the ring operations {−, +, ×}, and the exponential function exp.

These restrictions may seem somewhat severe, but they already admit a vast range of the more common GPU kernels involved in machine learning: matrix multiplications and [the bulk of] self-attention are covered by FPSan’s guarantees.
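The reason a move to integers helps can be sketched in a few lines of Python. The ring laws hold exactly in the integers modulo 2^32, so reassociating a sum cannot change the result. Note that toy_embed below is only a placeholder bijection (it reinterprets the float32 bit pattern as an unsigned integer) and is not FPSan's actual embedding; ring_add, ring_mul and the two summation helpers are likewise hypothetical names for illustration.

```python
import random
import struct

MASK = 0xFFFFFFFF  # reduce modulo 2**32


def toy_embed(x):
    """Placeholder bijection float32 -> Z/2^32 (bit reinterpretation).

    This is an assumption for illustration, NOT FPSan's embedding.
    """
    return struct.unpack("<I", struct.pack("<f", x))[0]


def ring_add(a, b):
    return (a + b) & MASK


def ring_mul(a, b):
    return (a * b) & MASK


def sequential_sum(values):
    """Left-to-right accumulation in Z/2^32."""
    total = 0
    for v in values:
        total = ring_add(total, v)
    return total


def tree_sum(values):
    """Pairwise reduction in Z/2^32 over a non-empty list."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return ring_add(tree_sum(values[:mid]), tree_sum(values[mid:]))


random.seed(0)
floats = [random.uniform(-1e3, 1e3) for _ in range(1000)]
embedded = [toy_embed(x) for x in floats]

# Associativity holds exactly, so both reduction orders agree.
assert sequential_sum(embedded) == tree_sum(embedded)

# Distributivity also holds exactly: a * (b + c) == a * b + a * c.
a, b, c = embedded[:3]
assert ring_mul(a, ring_add(b, c)) == ring_add(ring_mul(a, b), ring_mul(a, c))
```

The integer results bear no numerical relation to the floating-point sum; the only point is that algebraically equivalent computations agree bit for bit.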

The proof is deferred to the end of this article to avoid derailing the discussion. This is quite possibly the only compiler sanitiser whose correctness depends on an extremely difficult unsolved problem in transcendental number theory.
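For reference, the real version of Schanuel's conjecture is usually stated as below. This is the standard textbook formulation rather than a quotation from the proof later in the article, and the symbols x_1, ..., x_n are just the conventional names:

```latex
\textbf{Conjecture (Schanuel, real version).}
Let $x_1, \dots, x_n \in \mathbb{R}$ be linearly independent over $\mathbb{Q}$.
Then
\[
  \operatorname{trdeg}_{\mathbb{Q}}
  \mathbb{Q}\bigl(x_1, \dots, x_n, e^{x_1}, \dots, e^{x_n}\bigr) \;\ge\; n .
\]
```

Informally, it says that exp introduces no algebraic relations among real numbers beyond those forced by the identity exp(x + y) = exp(x) exp(y).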

Implementation

Specifically, FPSan constructs a bijective ‘embedding function’ φ from the set of IEEE-754 single-precision floats (there are 2^32 of them) to the ring of integers modulo 2^32. The function φ is implemented as follows:
