
CPU Counters on Apple Silicon: article + tool


Last time I wrote about profiling in Zig on Apple Silicon, I touched on PMU counter profiling. This time I decided to go further and create my own tool to fetch all available counters for Apple Silicon processors (M1, M2, and later).

Brief explanation of PMU counters

PMU (Performance Monitoring Unit) counters are hardware counters that track microarchitectural events inside the CPU, e.g. executed instructions, retired operations, branches, cache misses, and more.

CPUs usually expose a mix of fixed and programmable counters. Fixed counters represent predefined events (often things like cycles and instructions), while programmable counters can be configured to track a selected set of events.

Using PMU counters, developers can better understand the performance characteristics of their applications, e.g. the number of cache misses, branch mispredictions, instruction mix, and other low-level metrics.
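As a rough illustration of how raw counter values turn into such metrics, here is a minimal sketch in Python. The `CounterSample` type, its field names, and the numbers in the usage example are hypothetical and chosen only for illustration; they are not the event names or the output of any real tool.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Raw counter values for one measured run (hypothetical field names)."""
    cycles: int
    instructions: int
    branches: int
    branch_mispredicts: int
    l1d_misses: int


def ratio(numerator: float, denominator: float) -> float:
    """Divide safely, returning 0.0 when the denominator counter is zero."""
    return numerator / denominator if denominator else 0.0


def derived_metrics(s: CounterSample) -> dict:
    """Turn raw counts into commonly quoted derived metrics."""
    return {
        # Instructions per cycle: higher usually means better CPU utilization.
        "ipc": ratio(s.instructions, s.cycles),
        # Fraction of branches that were mispredicted.
        "branch_mispredict_rate": ratio(s.branch_mispredicts, s.branches),
        # L1 data cache misses per 1000 instructions (MPKI).
        "l1d_mpki": ratio(s.l1d_misses * 1000, s.instructions),
    }


if __name__ == "__main__":
    # Made-up numbers, only to show the arithmetic.
    sample = CounterSample(
        cycles=1_000,
        instructions=2_000,
        branches=400,
        branch_mispredicts=20,
        l1d_misses=10,
    )
    for name, value in derived_metrics(sample).items():
        print(f"{name}: {value:.3f}")
```

Ratios like these are usually more informative than the raw counts, because they stay comparable across runs of different lengths.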

One way to fetch these counters is the poop tool written by Andrew Kelley, combined with a PR to his repository by tensorush that adds the ability to fetch CPU counters on Apple Silicon.

The main problem with this PR is that Andrew graciously declined it, and I fully understand that decision: it’s hard to support an additional implementation, especially one you don’t use yourself.

I’ve created a fork that you can use. It’s actually a good solution if you need to fetch several predefined PMU counters.

But I decided to go a bit further and implement another tool for Apple Silicon Macs, one that can fetch all counters the hardware supports. Since that required understanding how counter access works under the hood, the implementation quickly turned into a research project on Apple’s private kperf API.

Basically, this article is a journey through how the research went.
