The curious case of adjacent difference

If you have ever tried using the std::adjacent_difference algorithm in c++, I'm sure it left you puzzled. As the name suggests, this algorithm computes differences between adjacent elements of the input sequence, but it does one more thing: it copies the first element of the input sequence into the output sequence unmodified. The following example demonstrates how to apply the algorithm to delta-compress a posting list of document identifiers that contain a search term (the example is contrived since Google developed much more sophisticated posting list compression techniques).

Delta-compressing a posting list using the std::adjacent_difference algorithm. The compressed version might require less memory when encoded using variable-length integers.
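A minimal sketch of such a program follows; the concrete document identifiers are made up for illustration.

#include <cstdint>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
    // A sorted posting list: identifiers of documents containing a search term.
    std::vector<std::uint64_t> postings = {1000, 1003, 1007, 1012, 1030};

    // Delta-compress the list: each output element is the difference with its
    // predecessor; the first element is copied through unmodified.
    std::vector<std::uint64_t> deltas(postings.size());
    std::adjacent_difference(postings.begin(), postings.end(), deltas.begin());

    for (auto d : deltas) std::cout << d << ' ';  // prints: 1000 3 4 5 18
    std::cout << '\n';
}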
Three central problems of calculus

According to Steven Strogatz (Infinite Powers, p. 144), calculus has three central problems:

given a curve, find its slope everywhere (the forward problem),
given a curve's slope everywhere, find the curve (the backward problem),
given a curve, find the area under it (the area problem).

The fundamental theorem of calculus connects all three problems and states that the area under the slope of a curve on an interval is the difference of the curve's heights evaluated at the ends of that interval:

\int_a^b f'(x) \, dx = f(b) - f(a)

Calculus deals with continuous curves, but we will tackle these problems through the lens of discrete value sequences. We'll view a sequence f as a function over the natural numbers and denote the discrete derivative of f as Df and its antiderivative as Sf, where

Df(i) = f(i) - f(i - 1)
Sf(i) = \sum_{k=0}^{i} f(k)

Df corresponds to differences between adjacent elements of the source sequence, and Sf corresponds to partial sums. The discrete variant of the fundamental theorem becomes:

\sum_{i=k}^{n} Df(i) = f(n) - f(k - 1)

These formulas have an off-by-one issue: at i = 0 or k = 0, we have to access the minus first element, so, for convenience, we define f(-1) = 0.

I like how Kyne Santos explained this result:

Imagine you're hiking up a mountain, going from point A to point B, and you want to find out the overall change in elevation. So let's say that point A, the starting point, is 500 meters above sea level. In the first hour, you ascend 100 meters. In the second hour, you descend 50 meters, and in the third and final hour you ascend 200 meters to arrive at point B, which is 750 meters above sea level. The question is, what's the overall change in elevation? Well, there's two ways to go about it. You can find the final elevation, which is 750 meters, and just subtract the starting elevation, which was 500, and the difference between 750 and 500 is 250 meters. Or you can add up the little changes along the way. So in the first hour, we climbed 100 meters, and then we descended 50, and then we climbed another 200, so 100 minus 50 plus 200 is 250 meters. And these two approaches represent the two sides of the equation in the fundamental theorem of calculus.

Kyne Santos, episode 95 of the My Favorite Theorem podcast

Finding slopes with adjacent differences

The slope between two discrete values is their difference, so taking the difference between neighboring points of a sequence solves the forward problem, yielding N - 1 slopes for a sequence of N elements.

sequence  1 3 5 7 9
slopes      2 2 2 2

This operation loses information: we cannot recover the original sequence from its slopes. In calculus, taking a derivative is also a lossy operation since functions f(x) and g(x) = f(x) + c (where c is an arbitrary constant) have the same derivative f'(x).
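Here is one way to compute just the N - 1 slopes with the standard library; a small sketch using the values from the table above.

#include <algorithm>
#include <functional>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> seq = {1, 3, 5, 7, 9};

    // The lossy forward problem: N - 1 slopes for N elements,
    // slopes[i] = seq[i + 1] - seq[i].
    std::vector<int> slopes(seq.size() - 1);
    std::transform(seq.begin() + 1, seq.end(), seq.begin(),
                   slopes.begin(), std::minus<>{});

    for (int s : slopes) std::cout << s << ' ';  // prints: 2 2 2 2
    std::cout << '\n';
}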
Recovering the original with partial sums

The backward problem is the most interesting of the three. Since the sequence of slopes has one less item than the source sequence, there is not enough information to recover the latter. We know the changes between points, but not the first value to apply the changes to. Once we know it, we can compute the rest using partial sums.

slopes        2   2   2   2
sequence  C  C+2  C+4  C+6  C+8

The lossiness of differentiation is the reason for introducing an arbitrary constant C when computing an indefinite integral in calculus:

\int f'(x) \, dx = f(x) + C

In the discrete case, this constant corresponds to the first value in the original sequence; that's why std::adjacent_difference preserves the first element in its output.

Finding areas with partial sums

The area under a sequence is the sum of its elements. The sequence of partial sums solves this problem because it allows us to sum the items of any subinterval of the original sequence in a single step:

\sum_{i=k}^{n} f(i) = Sf(n) - Sf(k - 1)

(as I previously mentioned, we define Sf(-1) to be zero).

sequence      1 3 5  7  9
partial sums  1 4 9 16 25

Partial sums don't introduce or lose any information. We can always recover the original sequence using the stl definition of adjacent differences:

sequence       1 3 5  7  9
partial sums   1 4 9 16 25
adjacent diff  1 3 5  7  9
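A small sketch of this round trip with the standard library, using the values from the tables above.

#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> seq = {1, 3, 5, 7, 9};

    // Area problem: partial sums let us compute any interval sum in one step.
    std::vector<int> sums(seq.size());
    std::partial_sum(seq.begin(), seq.end(), sums.begin());  // 1 4 9 16 25

    // Sum of seq[1..3] = Sf(3) - Sf(0) = 16 - 1 = 15.
    std::cout << sums[3] - sums[0] << '\n';

    // Backward problem: adjacent differences undo the partial sums exactly,
    // thanks to the copied first element.
    std::vector<int> back(sums.size());
    std::adjacent_difference(sums.begin(), sums.end(), back.begin());  // 1 3 5 7 9
    std::cout << std::boolalpha << (back == seq) << '\n';  // prints: true
}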
Symmetry vs pragmatism

The connection between std::partial_sum and std::adjacent_difference is aesthetically pleasing, but I find the design of the latter algorithm unfortunate: std::adjacent_difference is significantly less generic than its lossy version (i.e., a version that doesn't copy the first element) would be, as it forces the output element type to match the input element type. The few times I needed to compute pairwise differences, the semantics of std::adjacent_difference stood in the way, and I ended up writing a custom loop. I'm not alone. Luckily, the pairwise_transform adapter from c++23 doesn't make the extra copy (a short sketch follows at the end of this section).

Derivatives in calculus are lossy, so the definition of std::adjacent_difference doesn't exactly correspond to a derivative. Forcing the symmetry between discrete algorithms breaks the symmetry with their continuous counterparts. I disagree with Stepanov's design choice, but I'm glad it made me question his intentions and find these hidden connections. api design is hard, especially when you try to express novel ideas, such as efficient generic programming, in a new programming language, as c++ was at the time. Just like Einstein's cosmological constant, Stepanov's extra copy turned out to be useful after all.
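A minimal sketch of that c++23 route, assuming a standard library that ships std::views::pairwise_transform; the posting list values are illustrative.

#include <iostream>
#include <ranges>
#include <vector>

int main() {
    std::vector<int> postings = {1000, 1003, 1007, 1012, 1030};

    // Lossy pairwise differences: yields N - 1 values and allows the result
    // type to differ from the input type.
    auto deltas = postings | std::views::pairwise_transform(
                                 [](int a, int b) { return b - a; });

    for (int d : deltas) std::cout << d << ' ';  // prints: 3 4 5 18
    std::cout << '\n';
}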
Appendix: deltas in q

Just like std::adjacent_difference, the deltas function from the q programming language preserves the first item of its input:

deltas 1 4 9 16
1 3 5 7

However, the deltas function operates slightly differently from its c++ cousin: instead of copying the first item verbatim, it prepends a seed value of zero to the sequence before computing pairwise differences.

deltas 1 4 9 16
== (1 - 0) (4 - 1) (9 - 4) (16 - 9)
== 1 3 5 7

q defines deltas in terms of a more general Each Prior operator as -':. Applied without a left argument, the derived form picks the seed based on the operation (zero for subtraction, one for multiplication, etc.); the caller can override the default through the left argument.

(*':) 2 3 4               / 1 is the identity for *.
2 6 12

1950 -': 1952 1954 1960   / Use 1950 as the seed instead of zero.
2 2 6

This design preserves symmetry with partial sums but avoids the type mismatch between the input and output sequences, achieving both elegance and pragmatism.
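For comparison, here is a rough c++ sketch of the same idea; the name deltas_with_seed and the signed output type are my illustrative choices, not part of any standard library.

#include <cstdint>
#include <iostream>
#include <vector>

// A q-style deltas: subtracts each element's predecessor, starting from an
// explicit seed instead of copying the first element, so the output type
// may differ from the input type.
std::vector<std::int64_t> deltas_with_seed(const std::vector<std::uint64_t>& in,
                                           std::uint64_t seed = 0) {
    std::vector<std::int64_t> out;
    out.reserve(in.size());
    std::uint64_t prev = seed;
    for (std::uint64_t x : in) {
        out.push_back(static_cast<std::int64_t>(x) - static_cast<std::int64_t>(prev));
        prev = x;
    }
    return out;
}

int main() {
    for (auto d : deltas_with_seed({1, 4, 9, 16}))
        std::cout << d << ' ';  // prints: 1 3 5 7
    std::cout << '\n';
}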