Faster substring search with SIMD in Zig
I’ve been learning a lot about low-level programming languages lately, and one thing has interested me for a long time: SIMD (or ‘single instruction, multiple data’) code. I’ve seen plenty of articles describing massive performance gains from SIMD and wanted to learn how to achieve them myself.
This article is a journey into implementing ~60% faster substring searching compared to Zig’s std.mem.indexOf using a SIMD-friendly algorithm.
This is the baseline function that we will be comparing against:
fn find_substr(needle: []const u8, haystack: []const u8) ?usize {
    return std.mem.indexOf(u8, haystack, needle);
}
It’s the closest thing to a substring search function in Zig’s standard library. It returns the index of the first occurrence of the needle – or null if it isn’t found.
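As a quick sanity check (my own test snippet, not from the article’s repository), this is how the baseline behaves on the example sentence used in the next section:

const std = @import("std");

test "find_substr returns the index of the first match" {
    const haystack = "It was a beautiful, bounteous, blue day";
    // "blue" starts at byte offset 31; a needle that never occurs gives null.
    try std.testing.expectEqual(@as(?usize, 31), find_substr("blue", haystack));
    try std.testing.expectEqual(@as(?usize, null), find_substr("green", haystack));
}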
SIMD algorithm #
This algorithm is taken directly from Wojciech Muła’s fantastic article SIMD-friendly algorithms for substring searching, which seems to have the best algorithms for finding substrings in a large body of text.
Here’s how the algorithm works: say that we want to find the index of the word “blue” (the needle ) in “It was a beautiful, bounteous, blue day” (the haystack ). First, we extract the first and last character of the needle (‘b’ and ’e’) and store them in a variable.
Then we will loop through all of the characters in haystack , loading the next 32 characters (bytes) from memory into a SIMD register and comparing each character (byte) in the register with ‘b’. This will result in a mask containing 32 bytes, 1 if the character is ‘b’ and 0 in all other cases.
We will do the same with the last character, but load the characters with an offset ( needle.len - 1 ).
The offset lines the two masks up: a bit in Last is set when the character needle.len - 1 positions further along matches the last character of the needle. Without it, any match that starts in one 32‑byte chunk and ends in the next would be missed. This method also handles needles that are longer than 32 characters.
The result will be two bit masks, First and Last , where we can use bit-wise AND ( Result = First & Last ) to figure out potential substring occurrences.
Result will be 1 only when there is a ‘b’ at index i followed by an ’e’ at index i+3 . We still need to check if those positions actually contain the value “blue”, but this still dramatically reduces the number of checks (= individual memory accesses) that are necessary. We’ll see how this works in practice in the next section.
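Before diving into code, here is the example worked out by hand (my own trace; offsets count bytes from the start of the haystack):

- ‘b’ occurs at offsets 9, 20 and 31, so First has bits 9, 20 and 31 set.
- ‘e’ occurs at offsets 10, 25 and 34; since Last is loaded with an offset of 3, it has bits 7, 22 and 31 set.
- First & Last leaves only bit 31 – the actual match – so only one candidate has to be verified byte by byte.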
Implementation in Zig #
First, to properly use SIMD, let’s assume that the CPU supports AVX2 (Advanced Vector Extensions 2) and has 256-bit wide registers.
Most x86-64 desktop processors from the last ten years support AVX2, and some newer ones (for example AMD Zen 4) also support AVX-512 with 512-bit wide registers.
This allows us to use Zig’s @Vector function to make a type:
const Block = @Vector(32, u8); // number of elements, element type (32 * 8 = 256 bits)
By using Block , we are telling the compiler that the operations on this datatype should use SIMD instructions where possible.
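@Vector code compiles for any target – the compiler falls back to narrower registers or scalar code when wide SIMD isn’t available – so if you want a hard guarantee that the build really uses AVX2, one option (my own addition, not part of the article’s code) is a compile-time feature check:

const std = @import("std");
const builtin = @import("builtin");

// Refuse to build unless the target is x86-64 with AVX2 enabled.
comptime {
    if (builtin.cpu.arch != .x86_64 or
        !std.Target.x86.featureSetHas(builtin.cpu.features, .avx2))
    {
        @compileError("this code assumes x86-64 with AVX2 (e.g. build with -mcpu=x86_64_v3)");
    }
}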
Next, we take the first and last letters of the search word (the ’needle’) and load them into two SIMD registers, so that every byte of each register is filled with the corresponding character. This is handled by another built-in function, @splat:
const first_letter: Block = @splat(needle[0]);
const last_letter: Block = @splat(needle[needle.len - 1]);
In the main loop, we check that there are enough characters left in haystack to read the next 32 + needle.len characters. Inside the loop body, we load the two blocks that we’re going to compare against first_letter and last_letter:
const n = haystack.len;
const k = needle.len;

var i: usize = 0;
while (i + k + 32 <= n) : (i += 32) {
    const first_block: Block = haystack[i..][0..32].*;
    const last_block: Block = haystack[i + k - 1 ..][0..32].*;
Now we can make the comparisons and combine them into a mask:
    // ...
    const eq_first = first_letter == first_block;
    const eq_last = last_letter == last_block;

    var mask: std.bit_set.IntegerBitSet(32) = .{ .mask = @bitCast(eq_first & eq_last) };
Here we can use an IntegerBitSet from Zig’s standard library. We construct it by casting the result of eq_first & eq_last into a 32-bit integer. If the resulting mask is non-zero, there are candidates in the current block.
    // ...
    while (mask.findFirstSet()) |bitpos| {
        if (std.mem.eql(u8, haystack[i + bitpos + 1 ..][0 .. k - 1], needle[1..])) {
            return i + bitpos;
        }
        _ = mask.toggleFirstSet();
    }
The first and last characters are already known to match, so the verification skips the first one and only compares the rest of the needle.
Finally, if there are leftover characters at the end of the haystack, we fall back to std.mem.indexOf.
    // ...

    // Fallback to scalar search for the tail
    if (i < n) {
        if (std.mem.indexOf(u8, haystack[i..], needle)) |rel_idx| {
            return i + rel_idx;
        }
    }
    return null; // no substring found
}
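For readability, here is the whole function assembled from the snippets above (a sketch; the guard for empty or over-long needles is my own addition, and names may differ slightly from the repository):

const std = @import("std");

const Block = @Vector(32, u8);

fn find_substr_simd(needle: []const u8, haystack: []const u8) ?usize {
    // Degenerate needles: let the standard library handle them.
    if (needle.len == 0 or needle.len > haystack.len) {
        return std.mem.indexOf(u8, haystack, needle);
    }

    const n = haystack.len;
    const k = needle.len;

    const first_letter: Block = @splat(needle[0]);
    const last_letter: Block = @splat(needle[k - 1]);

    var i: usize = 0;
    while (i + k + 32 <= n) : (i += 32) {
        const first_block: Block = haystack[i..][0..32].*;
        const last_block: Block = haystack[i + k - 1 ..][0..32].*;

        const eq_first = first_letter == first_block;
        const eq_last = last_letter == last_block;

        var mask: std.bit_set.IntegerBitSet(32) = .{ .mask = @bitCast(eq_first & eq_last) };
        while (mask.findFirstSet()) |bitpos| {
            if (std.mem.eql(u8, haystack[i + bitpos + 1 ..][0 .. k - 1], needle[1..])) {
                return i + bitpos;
            }
            _ = mask.toggleFirstSet();
        }
    }

    // Scalar fallback for the tail that didn't fill a whole block.
    if (i < n) {
        if (std.mem.indexOf(u8, haystack[i..], needle)) |rel_idx| {
            return i + rel_idx;
        }
    }
    return null;
}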
To properly show the effects of our SIMD algorithm, we’re going to need a large haystack. For this, I’ve chosen the entirety of Moby Dick in plain text, with the search word ’newsletter’, which appears at the very end of the text.
The code is available on GitHub
To compile the code, I ran zig build with -Doptimize=ReleaseFast :
> zig build -Doptimize=ReleaseFast
Support for bitwise operations on boolean vectors was added in Zig 0.15, which is unreleased as of now (August 2025). If you want to run the code on your system, you need to build Zig from the master branch.
To measure performance and compare against baseline, I’ll use one of my favorite CLI tools, poop:
> poop -d 10000 "./zig-out/bin/substr" "./zig-out/bin/substr --simd"
Benchmark 1 (6361 runs): ./zig-out/bin/substr
  measurement          mean ± σ            min … max            outliers     delta
  wall_time            1.22 ms ± 185 us    903 us … 5.33 ms     242 ( 4%)    0%
  peak_rss             1.20 MB ± 290       1.18 MB … 1.20 MB      2 ( 0%)    0%
  cpu_cycles           2.15 M ± 40.5 K     2.10 M … 2.71 M      312 ( 5%)    0%
  instructions         1.85 M ± 0.75       1.85 M … 1.85 M       56 ( 1%)    0%
  cache_references     43.8 K ± 620        38.3 K … 44.9 K        9 ( 0%)    0%
  cache_misses         19.0 K ± 10.3 K     4.08 K … 33.6 K        0 ( 0%)    0%
  branch_misses        48.1 ± 17.4         20 … 104              97 ( 2%)    0%

Benchmark 2 (10000 runs): ./zig-out/bin/substr --simd
  measurement          mean ± σ            min … max            outliers     delta
  wall_time            500 us ± 96.9 us    397 us … 4.23 ms     840 ( 8%)    ⚡ -58.9% ± 0.4%
  peak_rss             1.20 MB ± 164       1.18 MB … 1.20 MB      1 ( 0%)    +0.0% ± 0.0%
  cpu_cycles           369 K ± 36.1 K      340 K … 1.10 M      1167 (12%)    ⚡ -82.8% ± 0.1%
  instructions         578 K ± 0.53        578 K … 578 K          6 ( 0%)    ⚡ -68.8% ± 0.0%
  cache_references     38.8 K ± 545        34.1 K … 40.5 K        6 ( 0%)    ⚡ -11.4% ± 0.0%
  cache_misses         5.62 K ± 4.97 K     2.11 K … 27.9 K     1529 (15%)    ⚡ -70.3% ± 1.2%
  branch_misses        2.88 K ± 23.4       2.81 K … 3.09 K      453 ( 5%)    💩 +5879.8% ± 1.4%
As you can see, for a large body of text the speedup is noticeable: 59% faster with over 80% fewer CPU cycles!
The SIMD version only took 500 microseconds to complete on average, including the overhead of loading the program into memory and printing the result. 500 microseconds is half a millisecond. That’s how fast my laptop searches through a whole book, 200 000 words, cover to cover. This is why computers are so powerful! How long would it take for a human to do that?
This is quite a large improvement, and proves that the SIMD code is actually working (otherwise the reduction in CPU cycles wouldn’t be so massive). Can we do even better though?
Character selection #
You may notice from the output of poop that the number of branch misses has absolutely blown up – from 48 on average to 2.88k !
Why does this happen? To find out, let’s count how many times the body of the inner while loop runs – that is, how many candidate positions the combined mask flags before the real match is found:
var i: usize = 0;
var count: usize = 0;
while (i + k + 32 <= n) : (i += 32) {
    const block_first: Block = haystack[i..][0..32].*;
    const block_last: Block = haystack[i + k - 1 ..][0..32].*;

    const eq_first = first == block_first;
    const eq_last = last == block_last;

    var mask: std.bit_set.IntegerBitSet(32) = .{ .mask = @bitCast(eq_first & eq_last) };
    while (mask.findFirstSet()) |bitpos| {
        count += 1;
        if (std.mem.eql(u8, haystack[i + bitpos + 1 ..][0 .. k - 1], needle[1..])) {
            std.debug.print("found match with count: {}\n", .{count});
            return i + bitpos;
        }
        _ = mask.toggleFirstSet();
    }
}
found match with count: 2792
The fact that count (2792) is so close to the average number of branch misses (2.88 K) suggests that nearly every candidate flagged by the mask costs us a branch misprediction.
Unfortunately, there is no obvious way to prevent this with the current algorithm. The state of the art seems to be to choose the two bytes of the needle that occur least frequently in typical text, according to a pre-calculated frequency distribution. This is what the memchr crate in Rust does, as explained by the author in this comment on Hacker News.
For example, the needle newsletter has the rarest characters w at index 2 and l at index 4 .
The function in memchr can be found here. I’ve ported it into Zig, and you can see it here.
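The exact port is in the repo, but the core idea fits in a few lines. Here is a rough sketch (my own simplification, not the actual ported function), assuming a precomputed 256-entry rank table where a lower value means a rarer byte:

const std = @import("std");

// Returns the indices of the two rarest bytes of the needle, rarest first.
// `rank` is a hypothetical precomputed table (lower value = rarer byte).
fn find_rarest_sketch(needle: []const u8, rank: *const [256]u8) [2]usize {
    std.debug.assert(needle.len >= 2);

    var rare1: usize = 0; // index of the rarest byte seen so far
    var rare2: usize = 1; // index of the second-rarest byte
    if (rank[needle[rare2]] < rank[needle[rare1]]) {
        const tmp = rare1;
        rare1 = rare2;
        rare2 = tmp;
    }

    for (needle[2..], 2..) |byte, i| {
        if (rank[byte] < rank[needle[rare1]]) {
            rare2 = rare1;
            rare1 = i;
        } else if (rank[byte] < rank[needle[rare2]]) {
            rare2 = i;
        }
    }
    return .{ rare1, rare2 };
}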
const needle_index_pair = find_rarest(needle);

const first_letter: Block = @splat(needle[needle_index_pair[0]]);
const first_offset = needle_index_pair[0];

const second_letter: Block = @splat(needle[needle_index_pair[1]]);
const second_offset = needle_index_pair[1];
The algorithm is exactly the same, but the offsets used for first_letter and second_letter now come from the pre-calculated frequency distribution instead of always being 0 and needle.len - 1.
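The adjusted main loop then looks roughly like this (my sketch of the change, not the exact code from the repo). Note that the candidate now has to be verified against the whole needle, since the two checked bytes no longer sit at its ends:

while (i + k + 32 <= n) : (i += 32) {
    const block_a: Block = haystack[i + first_offset ..][0..32].*;
    const block_b: Block = haystack[i + second_offset ..][0..32].*;

    const eq_a = first_letter == block_a;
    const eq_b = second_letter == block_b;

    var mask: std.bit_set.IntegerBitSet(32) = .{ .mask = @bitCast(eq_a & eq_b) };
    while (mask.findFirstSet()) |bitpos| {
        // Bit `bitpos` means both rare bytes matched for a candidate
        // starting at i + bitpos, so compare the full needle there.
        if (std.mem.eql(u8, haystack[i + bitpos ..][0..k], needle)) {
            return i + bitpos;
        }
        _ = mask.toggleFirstSet();
    }
}

Benchmarking this variant (--simdv2) against the previous SIMD version gives: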
Benchmark 1 (10000 runs): ./zig-out/bin/substr --simd
  measurement          mean ± σ            min … max            outliers     delta
  wall_time            472 us ± 62.9 us    400 us … 1.62 ms     735 ( 7%)    0%
  peak_rss             1.20 MB ± 0         1.20 MB … 1.20 MB      0 ( 0%)    0%
  cpu_cycles           376 K ± 44.7 K      347 K … 1.46 M      1213 (12%)    0%
  instructions         578 K ± 0.54        578 K … 578 K         10 ( 0%)    0%
  cache_references     38.7 K ± 715        28.3 K … 40.6 K       96 ( 1%)    0%
  cache_misses         7.37 K ± 5.83 K     2.78 K … 27.7 K     1608 (16%)    0%
  branch_misses        2.88 K ± 23.4       2.82 K … 3.08 K      415 ( 4%)    0%

Benchmark 2 (10000 runs): ./zig-out/bin/substr --simdv2
  measurement          mean ± σ            min … max            outliers     delta
  wall_time            429 us ± 75.5 us    369 us … 3.85 ms     393 ( 4%)    ⚡ -9.1% ± 0.4%
  peak_rss             1.20 MB ± 0         1.20 MB … 1.20 MB      0 ( 0%)    -0.0% ± 0.0%
  cpu_cycles           304 K ± 28.4 K      282 K … 1.07 M      1140 (11%)    ⚡ -19.2% ± 0.3%
  instructions         561 K ± 0.52        561 K … 561 K          5 ( 0%)    ⚡ -2.9% ± 0.0%
  cache_references     38.7 K ± 610        29.9 K … 40.3 K       25 ( 0%)    -0.1% ± 0.0%
  cache_misses         5.21 K ± 3.53 K     2.57 K … 27.3 K     1306 (13%)    ⚡ -29.3% ± 1.8%
  branch_misses        1.07 K ± 14.0       1.02 K … 1.17 K      275 ( 3%)    ⚡ -62.8% ± 0.0%
Compared to the previous SIMD version, the number of branch misses has dropped by over 60%, and it’s 9% faster too. Nice!
The lower number of branch misses can explain some of the speedup, but I suspect a much bigger factor is that there are fewer false positives, which means fewer byte-by-byte memory accesses and comparisons.
Starting with Intel Cannon Lake and AMD Zen 4, some CPUs also support a newer SIMD instruction set, AVX-512, with 512-bit wide registers – double the width of AVX2. I don’t have a computer with AVX-512 right now, but I suspect that changing the Zig code to process 64 characters at once would lead to even better results.
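The change would be mostly mechanical – something like the sketch below (untested, since I have no AVX-512 machine to try it on; first_letter and last_letter would be splatted into the wider block in the same way):

const Block = @Vector(64, u8); // 64 * 8 = 512 bits

// The loop is identical apart from the wider block and the 64-byte stride.
while (i + k + 64 <= n) : (i += 64) {
    const first_block: Block = haystack[i..][0..64].*;
    const last_block: Block = haystack[i + k - 1 ..][0..64].*;

    const eq_first = first_letter == first_block;
    const eq_last = last_letter == last_block;

    var mask: std.bit_set.IntegerBitSet(64) = .{ .mask = @bitCast(eq_first & eq_last) };
    while (mask.findFirstSet()) |bitpos| {
        if (std.mem.eql(u8, haystack[i + bitpos + 1 ..][0 .. k - 1], needle[1..])) {
            return i + bitpos;
        }
        _ = mask.toggleFirstSet();
    }
}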
A smaller haystack #
It’s clear that with a very large haystack, the SIMD version is much faster. But what about a tiny input, say fewer than a hundred characters?
I did a bit of benchmarking with poop, but I couldn’t measure the speed accurately, since both versions finish extremely quickly. Instead, I used zBench to run a microbenchmark, with a short snippet from Moby Dick (as seen here) as the haystack.
benchmark              runs      total time    time/run (avg ± σ)    (min ... max)               p75        p99        p995
----------------------------------------------------------------------------------------------------------------------------
find_substr            100000    424.368ms     4.243us ± 740ns       (3.964us ... 107.923us)     4.187us    7.075us    7.245us
find_substr_simd_v2    100000    147.883ms     1.478us ± 186ns       (1.417us ... 21.354us)      1.483us    1.539us    1.548us
I was surprised to see that even when processing less than a hundred characters, the SIMD algorithm is still faster! In absolute terms the gap between roughly 4 μs and 1.5 μs is tiny, but the SIMD version still comes out ahead.
As you can see, SIMD can be used to make substring searching dramatically faster, for both very large and very small strings.
But if it’s so much better, then why haven’t I made a pull request to change std.mem.indexOf to use SIMD? Well, there are a couple of reasons:

1. std.mem.indexOf is generic over the element size, and element sizes larger than u8 make this algorithm much slower.
2. The algorithm used in std.mem.indexOf is cross-platform, while the SIMD code wouldn’t be (not all platforms have SIMD registers at all, and Arm only has 128-bit ones).
Substring searching is rarely the bottleneck in programs, especially ones written in a fast language like Zig. That’s why I don’t personally think it would be worth it to add it to the standard library.
Still, it was great to learn about this advanced optimization technique and see some concrete performance measurements from it!
The full code is available on GitHub here.
Further reading #