Previously, I wrote some sketchy ideas for what I call a p-fast trie, which is basically a wide fan-out variant of an x-fast trie. It allows you to find the longest matching prefix or nearest predecessor or successor of a query string in a set of names in O(log k) time, where k is the key length.
My initial sketch was more complicated and greedy for space than necessary, so here’s a simplified revision.
(“p” now stands for prefix.)
A p-fast trie stores a lexicographically ordered set of names.
A name is a sequence of characters from some small-ish character set. For example, DNS names can be represented as a set of about 50 letters, digits, punctuation and escape characters, usually one per byte of name. Names that are arbitrary bit strings can be split into chunks of 6 bits to make a set of 64 characters.
Every unique prefix of every name is added to a hash table.
An entry in the hash table contains:
A shared reference to the closest name lexicographically greater than or equal to the prefix. Multiple hash table entries will refer to the same name. A reference to a name might instead be a reference to a leaf object containing the name.
The length of the prefix. To save space, each prefix is not stored separately, but implied by the combination of the closest name and prefix length.
A bitmap with one bit per possible character, corresponding to the next character after this prefix. For every other prefix that matches this prefix and is one character longer than this prefix, a bit is set in the bitmap corresponding to the last character of the longer prefix.
The basic algorithm is a longest-prefix match.
Look up the query string in the hash table. If there’s a match, great, done.
Otherwise proceed by binary chop on the length of the query string.
If the prefix isn’t in the hash table, reduce the prefix length and search again. (If the empty prefix isn’t in the hash table then there are no names to find.)
If the prefix is in the hash table, check the next character of the query string in the bitmap. If its bit is set, increase the prefix length and search again.
Otherwise, this prefix is the answer.
Instead of putting leaf objects in a linked list, we can use a more complicated search algorithm to find names lexicographically closest to the query string. It’s tricky because a longest-prefix match can land in the wrong branch of the implicit trie. Here’s an outline of a predecessor search; successor requires more thought.
During the binary chop, when we find a prefix in the hash table, compare the complete query string against the complete name that the hash table entry refers to (the closest name greater than or equal to the common prefix).
If the name is greater than the query string we’re in the wrong branch of the trie, so reduce the length of the prefix and search again.
Otherwise search the set bits in the bitmap for one corresponding to the greatest character less than the query string’s next character; if there is one remember it and the prefix length. This will be the top of the sub-trie containing the predecessor, unless we find a longer match.
If the next character’s bit is set in the bitmap, continue searching with a longer prefix, else stop.
When the binary chop has finished, we need to walk down the predecessor sub-trie to find its greatest leaf. This must be done one character at a time – there’s no shortcut.
In my previous note I wondered how the number of search steps in a p-fast trie compares to a qp-trie.
I have some old numbers measuring the average depth of binary, 4-bit, 5-bit, 6-bit and 4-bit, 5-bit, dns qp-trie variants. A DNS-trie varies between 7 and 15 deep on average, depending on the data set. The number of steps for a search matches the depth for exact-match lookups, and is up to twice the depth for predecessor searches.
A p-fast trie is at most 9 hash table probes for DNS names, and unlikely to be more than 7. I didn’t record the average length of names in my benchmark data sets, but I guess they would be 8–32 characters, meaning 3–5 probes. Which is far fewer than a qp-trie, though I suspect a hash table probe takes more time than chasing a qp-trie pointer. (But this kind of guesstimate is notoriously likely to be wrong!)
However, a predecessor search might need 30 probes to walk down the p-fast trie, which I think suggests a linked list of leaf objects is a better option.