I never understood how traceroute discovers each hop. Turns out it's a clever TTL trick, and about 80 lines of Rust.
Previously, I wrote about setting up a Tailscale exit node and appreciated how traffic gets routed through my home network. That made me want to understand traceroute. I’ve never thought about how it works, and now feels like as good a time as any. Which is to say, now’s the time to rewrite it in Rust.
What does traceroute do?
So far I’ve only used traceroute to see how a packet travels from my computer to my router, out to the internet, and finally to the destination server.
```
$ traceroute -m 15 -w 2 8.8.8.8
traceroute to 8.8.8.8 (8.8.8.8), 15 hops max, 40 byte packets
 1  <tailscale-gw> (<tailscale-ip>)  6.553 ms  5.323 ms  5.384 ms
 2  <home-router> (<router-ip>)  7.183 ms  6.271 ms  4.607 ms
 3  * <isp-gateway> (<isp-gateway-ip>)  7.189 ms *
 4  * * *
 5  * * *
 6  * * *
 7  <isp-hop-1> (<isp-hop-1-ip>)  284.000 ms  229.201 ms  257.805 ms
 8  72.14.223.26 (72.14.223.26)  11.642 ms  12.643 ms  12.868 ms
 9  * * *
10  dns.google (8.8.8.8)  12.268 ms  11.907 ms  11.766 ms
```
At a glance, it looks like it’s asking “where is this IP?” at each hop, and I wasn’t sure how it does that.
Traceroute doesn’t actually ask “where is this IP?” at all. It uses a TTL trick.
But to understand it, let’s write some code.
Every IP packet has a TTL (Time To Live) field - a counter that starts at some value (usually 64). Every router that forwards the packet decrements the TTL by 1. When a router decrements the TTL to 0, it drops the packet and sends an ICMP “Time Exceeded” message back to the sender. That ICMP message comes from the router itself, so its source address tells us the router’s IP.
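To make the TTL rule concrete, here is a minimal simulation of one packet being forwarded along a path. Nothing touches the network: `Outcome`, `forward`, and the router addresses (taken from the reserved documentation ranges) are hypothetical names made up for this sketch, not part of any real traceroute.

```rust
/// What happens to one packet on a simulated path.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// A router decremented the TTL to 0, dropped the packet,
    /// and reported back with ICMP "Time Exceeded".
    TimeExceeded { router: String },
    /// The packet survived every router on the path.
    Delivered,
}

/// Forward a packet with the given starting TTL through `routers`,
/// listed in path order. Each router decrements the TTL by 1.
fn forward(routers: &[&str], mut ttl: u8) -> Outcome {
    for router in routers {
        ttl = ttl.saturating_sub(1);
        if ttl == 0 {
            return Outcome::TimeExceeded {
                router: router.to_string(),
            };
        }
    }
    Outcome::Delivered
}

fn main() {
    // Hypothetical three-router path (documentation-only addresses).
    let routers = ["192.0.2.1", "198.51.100.1", "203.0.113.1"];
    for ttl in [1, 2, 3, 64] {
        println!("TTL={}: {:?}", ttl, forward(&routers, ttl));
    }
}
```

With a small TTL the packet dies at a specific router, and that router identifies itself in the reply. With the usual starting value of 64 the packet simply arrives.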
So if we send packets with TTL=1, the first router replies. TTL=2, the second router replies. And so on, until we reach the destination. That’s traceroute.
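The sending half of that loop can be sketched with nothing but the standard library. This is a rough sketch under a few assumptions: it uses UDP probes aimed at port 33434 (the classic Unix traceroute convention, not necessarily what the final Rust version does), the 32-byte payload is arbitrary, and `probe_socket` is a helper name invented here. It only sends. Reading the ICMP “Time Exceeded” replies needs a raw socket and usually elevated privileges, so that part is left out.

```rust
use std::io;
use std::net::UdpSocket;

/// Assumption: classic traceroute aims UDP probes at high,
/// normally unused ports starting at 33434.
const BASE_PORT: u16 = 33434;

/// Open a UDP socket whose outgoing packets carry the given TTL.
fn probe_socket(ttl: u32) -> io::Result<UdpSocket> {
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    socket.set_ttl(ttl)?;
    Ok(socket)
}

fn main() {
    let destination = "8.8.8.8";
    let max_hops = 15;

    // One probe per TTL value: TTL=1 dies at the first router,
    // TTL=2 at the second, and so on.
    for ttl in 1..=max_hops {
        let sent = probe_socket(ttl)
            .and_then(|socket| socket.send_to(&[0u8; 32], (destination, BASE_PORT)));
        match sent {
            Ok(_) => println!("sent probe with TTL={}", ttl),
            Err(e) => println!("TTL={}: send failed: {}", ttl, e),
        }
    }
}
```

Each probe is expected to expire one hop further along the path than the last. Without the receiving side this prints nothing about the routers themselves; it only shows how the TTL is varied per probe.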