When Compilers Surprise You

Written by me, proof-read by an LLM.

Details at end.

Every now and then a compiler will surprise me with a really smart trick. When I first saw this optimisation I could hardly believe it. I was looking at loop optimisation, and wrote something like this simple function that sums all the numbers up to a given value:
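The listing itself is not reproduced here, so what follows is only a minimal sketch of a function matching the description and the assembly discussed below. The name `sum`, the use of `int`, and the exclusive upper bound are assumptions inferred from the text (the loop exits when `x` equals `value`), not necessarily the exact source that was compiled.

```cpp
// Hypothetical reconstruction: sums 0, 1, ..., value - 1.
// The signature and types are assumptions, not the original listing.
int sum(int value) {
    int result = 0;
    for (int x = 0; x < value; ++x) {
        result += x;  // add each number below value in turn
    }
    return result;
}
```

With this definition, `sum(5)` adds 0 + 1 + 2 + 3 + 4.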

So far so decent: GCC has done some preliminary checks, then fallen into a loop that efficiently sums numbers using lea (we’ve seen this before). But taking a closer look at the loop we see something unusual:

    .L3:
        lea edx, [rdx + 1 + rax*2]   ; result = result + 1 + x*2
        add eax, 2                   ; x += 2
        cmp edi, eax                 ; x != value
        jne .L3                      ; keep looping

The compiler has cleverly realised it can process two numbers per iteration: it can see we’re going to add x and then x + 1, and x + (x + 1) is the same as x*2 + 1. Very cunning, I think you’ll agree!
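To make the trick concrete, here is a hand-written C++ sketch of roughly what that loop is doing. It is an illustration, not GCC's actual output: the names `sum_pairs` and `value` are hypothetical, and the handling of non-positive and odd inputs is my guess at what the "preliminary checks" need to cover so that the `x != value` exit test is always reached.

```cpp
// Illustrative sketch only: sums 0 .. value - 1 two numbers at a time.
int sum_pairs(int value) {
    if (value <= 0) return 0;  // assumed guard: nothing to sum

    int result = 0;
    int x = 0;
    if (value & 1) {
        // Odd count: peel off the first iteration (which adds x == 0)
        // so that an even number of terms remains.
        result += x;
        x = 1;
    }
    // Each pass adds x and x + 1 in one step: x + (x + 1) == 2*x + 1.
    for (; x != value; x += 2) {
        result += 2 * x + 1;
    }
    return result;
}
```

For `value` of 5, the peeled step leaves x at 1, and the loop then adds 1 + 2 and 3 + 4 before x reaches 5 and the loop exits.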

If you turn the optimiser up to -O3 you’ll see the compiler works even harder to vectorise the loop using parallel adds. All very clever.

This is all for GCC. Let’s see what clang does with our code:

This is where I nearly fell off my chair: there is no loop! Clang checks whether value is positive, and if so it does:

... continue reading