Tech News

Running local LLMs offline on a ten-hour flight

Why This Matters

This article highlights the feasibility of running advanced local language models on a high-end MacBook during a long flight, demonstrating how modern hardware and open-source models can empower developers to perform complex tasks offline. It underscores the growing potential for consumers and industry professionals to leverage powerful AI locally, reducing reliance on cloud services and enhancing privacy and flexibility.

Key Takeaways

I flew from London to Google Cloud Next 2026 in Las Vegas. Ten hours with no in-flight wifi. I used the time to test how far a modern MacBook can carry engineering work on local LLMs alone.

Setup

A week-old MacBook Pro M5 Max with 128GB of unified memory and a 40-core GPU.

Gemma 4 31B and Qwen 4.6 36B via LM Studio.

The 100 most common Docker images and the top programming languages, alongside enough dependencies to build functional sites with rich visualisations.

Countless CLIs, with opencode, rtk, instantgrep, and duckdb being the most used.
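With models loaded in LM Studio, local tooling can talk to them over LM Studio's OpenAI-compatible local server (by default at http://localhost:1234/v1), so nothing leaves the laptop. A minimal sketch follows; the base URL reflects LM Studio's default, while the model identifier and prompt are placeholders for whatever is actually loaded.

```python
import json
import urllib.request

# LM Studio's default local server address; adjust if you changed the port.
BASE_URL = "http://localhost:1234/v1"


def build_chat_request(model: str, prompt: str, temperature: float = 0.2) -> dict:
    """Build an OpenAI-style chat-completion payload for a local model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def ask_local(model: str, prompt: str) -> str:
    """Send the request to the local server; no internet connection required."""
    payload = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the server speaks the OpenAI wire format, any existing OpenAI-compatible client or CLI can be pointed at the same endpoint by overriding its base URL.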

What I built

A billing analytics tool covering two years of loveholidays cloud spend. DuckDB underneath, with a custom UI for slicing the data along dimensions the standard dashboards don’t expose. It surfaced patterns and cross-service correlations that had been hard to uncover.
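The kind of slicing involved can be sketched as a single DuckDB query. The schema here is hypothetical (a flat billing export with `usage_date`, `service`, and `cost` columns, read from a CSV), but it shows the shape of the analysis: month-over-month deltas and per-service share of spend, cuts that standard dashboards rarely expose directly.

```sql
-- Hypothetical schema: billing export rows of (usage_date, service, cost).
WITH monthly AS (
    SELECT date_trunc('month', usage_date) AS month,
           service,
           sum(cost) AS spend
    FROM read_csv_auto('billing_export.csv')
    GROUP BY 1, 2
)
SELECT month,
       service,
       spend,
       -- change versus the same service's previous month
       spend - lag(spend) OVER (PARTITION BY service ORDER BY month) AS mom_delta,
       -- this service's share of that month's total spend
       spend / sum(spend) OVER (PARTITION BY month) AS share_of_month
FROM monthly
ORDER BY month, spend DESC;
```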

I had been interested in exploring this area for a while, but I could never prioritise it against the whirlwind of my other responsibilities. With ten hours to spare, top-of-the-range hardware, and open-source models, I decided to give it a go.

Alongside that, I processed roughly 4M tokens on smaller tasks: refactors, CLI scaffolding, documentation. For tight-scope work, Gemma and Qwen produced output comparable to the frontier models I normally use.
