Extending That XOR Trick to Billions of Rows
Can we extend the XOR trick for finding one or two missing numbers in a list to finding thousands of missing IDs in a billion-row table? Yes, we can! This is possible using a data structure called an Invertible Bloom Filter (IBF) that compares two sets with space complexity based only on the size of the difference. Using a generalization of the XOR trick [1], all the values that are identical cancel out, so the size of this data structure depends only on the size of the difference. Most explan