Running Local LLMs Offline on a Ten-Hour Flight

Why This Matters

This article highlights the feasibility of running local large language models (LLMs) on a high-end MacBook during long flights, demonstrating how consumers and developers can leverage powerful hardware for offline AI tasks. It underscores the growing accessibility of advanced AI tools outside cloud environments, which can enhance productivity and data privacy for users. However, it also reveals hardware limitations like power consumption, heat, and processing constraints that need addressing as local LLM adoption grows.

Key Takeaways

I flew from London to Google Cloud Next 2026 in Las Vegas. Ten hours with no in-flight Wi-Fi. I used the time to test how far a modern MacBook can carry engineering work on local LLMs alone.

Setup

A week-old MacBook Pro M5 Max, 128GB unified memory, 40-core GPU.

Gemma 4 31B and Qwen 4.6 36B via LM Studio (a minimal client sketch follows this list).

The top 100 most common Docker images and the top programming languages, along with enough dependencies to build functional sites with rich visualisations.

Countless CLIs, with opencode, rtk, instantgrep, and duckdb the most used.
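
Everything here runs against LM Studio's local OpenAI-compatible server, so the tooling needs no cloud connection. A minimal client sketch, assuming LM Studio's default localhost:1234 endpoint and a hypothetical model identifier (the article doesn't specify either):

```python
# Minimal sketch: a chat completion against LM Studio's local,
# OpenAI-compatible server. The default port is 1234; the model name
# below is hypothetical -- use whatever identifier LM Studio lists
# for the model you have loaded.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # LM Studio local server
    api_key="lm-studio",  # any non-empty string works; no auth offline
)

response = client.chat.completions.create(
    model="gemma-4-31b",  # hypothetical identifier for the loaded model
    messages=[{"role": "user", "content": "Scaffold a Python CLI that parses a CSV."}],
    temperature=0.2,
)
print(response.choices[0].message.content)
```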

What I built

A billing analytics tool covering two years of loveholidays cloud spend. DuckDB underneath, with a custom UI for slicing the data along dimensions the standard dashboards don’t expose. It surfaced patterns and cross-service correlations that had been hard to uncover.
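
The article doesn't show the tool's internals, but the DuckDB layer it describes is easy to sketch. A minimal example, assuming billing exports land as Parquet files with hypothetical columns usage_date, service, and cost_usd:

```python
# A minimal sketch of a DuckDB-backed billing query layer.
# The schema is assumed: Parquet exports with usage_date, service, cost_usd.
import duckdb

con = duckdb.connect("billing.duckdb")

# Expose two years of exports as a single view; DuckDB scans the
# Parquet files directly, so this stays fast on a laptop with no network.
con.execute("""
    CREATE OR REPLACE VIEW spend AS
    SELECT * FROM read_parquet('exports/*.parquet')
""")

# Month-over-month spend per service: the kind of slice a stock
# billing dashboard may not expose directly.
monthly = con.execute("""
    SELECT date_trunc('month', usage_date) AS month,
           service,
           sum(cost_usd) AS total
    FROM spend
    GROUP BY 1, 2
    ORDER BY 1, 3 DESC
""").df()
print(monthly.head())

# A toy cross-service correlation, echoing the correlations mentioned
# above; 'compute' and 'egress' are placeholder service names.
rho = con.execute("""
    WITH daily AS (
        SELECT usage_date, service, sum(cost_usd) AS cost
        FROM spend GROUP BY 1, 2
    )
    SELECT corr(a.cost, b.cost)
    FROM daily a JOIN daily b USING (usage_date)
    WHERE a.service = 'compute' AND b.service = 'egress'
""").fetchone()[0]
print(f"daily spend correlation: {rho:.2f}")
```

An embedded database is part of what makes this workable offline: no warehouse connection, just SQL over local files.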

I had been interested in exploring this area for a while, but could never prioritise it against the whirlwind of my other responsibilities. With ten hours to spare, top-of-the-range hardware, and OSS models, I decided to give it a go.

Alongside that, I processed roughly 4M tokens on smaller tasks: refactors, CLI scaffolding, documentation. For tight-scope work, Gemma and Qwen produced output comparable to the frontier models I normally use.
