Learn how to optimize Go data structures for modern CPU architectures. We'll explore cache lines, false sharing, and data-oriented design to achieve significant performance improvements in real-world applications.
Key Takeaways

- Cache misses can slow down your code by 60x compared to L1 cache hits
- False sharing occurs when multiple cores update different variables in the same cache line
- Proper data structure padding can improve performance by 5-10x in specific scenarios
- Data-oriented design beats object-oriented design for high-performance systems
- Always measure with benchmarks; cache effects are hardware-specific
The Numbers That Matter
- L1 cache: 4 cycles (~1ns), 32KB
- L2 cache: 12 cycles (~3ns), 256KB
- L3 cache: 40 cycles (~10ns), 8MB
- RAM: 200+ cycles (~60ns), 32GB
- Cache line size: 64 bytes (on x86_64)
Reading from RAM is approximately 60x slower than reading from L1 cache: one cache miss costs as much as roughly 60 cache hits. This is why cache-friendly code can run significantly faster, often 5-10x in specific scenarios.
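You can observe the cost of cache misses directly by comparing sequential traversal (one miss per 64-byte line, helped by hardware prefetching) against strided traversal that touches a different cache line on every access. This is a rough sketch, not from the article; the function name and sizes are illustrative, and the exact speedup depends on your cache hierarchy.

```go
package main

import (
	"fmt"
	"time"
)

// sum adds every element of data, visiting indices with the given
// stride. stride=1 walks memory sequentially (cache-friendly);
// a stride of 16 int64s (128 bytes) skips past each cache line,
// forcing roughly one miss per element on large slices.
func sum(data []int64, stride int) (int64, time.Duration) {
	start := time.Now()
	var total int64
	for s := 0; s < stride; s++ {
		for i := s; i < len(data); i += stride {
			total += data[i]
		}
	}
	return total, time.Since(start)
}

func main() {
	data := make([]int64, 1<<24) // 128MB, far larger than any L3 cache
	for i := range data {
		data[i] = 1
	}
	seq, tSeq := sum(data, 1)
	str, tStr := sum(data, 16)
	// Both traversals do the same work on the same data.
	fmt.Println("totals equal:", seq == str)
	fmt.Println("sequential:", tSeq, "strided:", tStr)
}
```

Both loops perform an identical number of additions; only the memory access pattern differs, so any timing gap is attributable to the cache.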
False Sharing: The Silent Killer