Building a Mac app with Claude code

I recently shipped Context, a native macOS app for debugging MCP servers. The goal was to build a useful developer tool that feels at home on the platform, powered by Apple's SwiftUI framework. I've been building software for the Mac since 2008, but this time was different: Context was almost 100% built by Claude Code1. There is still skill and iteration involved in helping Claude build software, but of the 20,000 lines of code in this project, I estimate that I wrote less than 1,000 lines by hand2.

This is a long post explaining my journey, how I chose my tools, what those tools are good at and bad at (for now), and how you can leverage them to maximize the quality of your generated code output, especially if you're building a native app like I am.

1. Copilot to Claude Code

My first experience with AI coding tools was when I tried GitHub Copilot, built into VS Code. This was the first tool of its kind, and I was pretty amazed: at the time, it was just autocomplete, but it was surprisingly effective—instead of only autocompleting symbol names or function signatures like a typical editor, it could autocomplete entire function implementations based on the context around it. This was a great productivity boost but it still felt like you were doing most of the work.

Then things started to move fast: Cursor took off, they added Agent Mode, and new competitors like Windsurf entered the space. All of the products were leaning into the "agentic" mode of development, where instead of using one-shot LLM responses for autocomplete, an LLM calls various tools in a loop to accomplish more complex tasks: gathering context on your code base, reading web pages and documentation, compiling your program, running tests, iterating on build/test failures, etc.

I had not tried any of these new tools extensively because I wasn't actively working on a side project at the time, but in February 2025, an interesting contender emerged out of nowhere: Claude Code was not a VS Code fork like the others, but was an IDE that was designed to be used entirely in the terminal. It had no traditional code editing capabilities or an overwhelming UI with lots of features: it put the agentic loop front and center. A text box to enter a prompt and not much else. Instead of augmenting your IDE with AI, it replaced your IDE. I wasn't entirely convinced that this was the ideal UX, but the idea was refreshing enough compared to what already existed that I decided I had to give it a try.

2. Starting Yet Another Side Project

Like many engineers who have demanding day jobs, I have a large graveyard of side projects that never shipped. Building working prototypes is doable, but the last 20% takes so much time and effort that I had not been able to ship a side project for 6 years.

At this point, I was starting to play around with Claude Code and its support for MCP (Model Context Protocol) servers. Anthropic designed MCP as an open standard to allow agents to access tools and other context to accomplish specific tasks. For example, the Sentry MCP server exposes tools that allow an agent to fetch issues containing stack traces and other useful debugging context, and even invoke Sentry's own bug fixing agent.

However, the experience of building and testing MCP servers was cumbersome: MCP servers communicate with clients over standard input/output streams, or over HTTP with Server-Sent Events (SSE) to give servers the ability to stream responses to clients. It wasn't as simple as invoking a CLI or using curl to send requests to a service. There is a first-party tool called MCP Inspector that allows developers to test server functionality, but as a long-time macOS & iOS developer, I wanted to try building a native app to solve this problem. I figured it would be a great learning experience to push the boundaries of AI agents, and hoped to come out of it with a useful product.

... continue reading