6 Weeks of Claude Code
Jul 30, 2025 - Orta Therox

It is wild to think that it has been only a handful of weeks. Claude Code has considerably changed my relationship to writing and maintaining code at scale. I still write code at the same level of quality, but I feel like I have a new freedom of expression which is hard to fully articulate.

Claude Code has decoupled me from writing every line of code. I still consider myself fully responsible for everything I ship to Puzzmo, but the ability to instantly create a whole scene instead of going line by line, word by word is incredibly powerful.

I believe with Claude Code, we are at the “introduction of photography” period of programming. Painting by hand just doesn’t have the same appeal anymore when a single concept can just appear and you shape it into the thing you want with your code review and editing skills.

If this feels like an intimidating line of thought then welcome to the mid-2020s, nothing is stable anymore and change is the only constant. Sorry. I didn’t make it so nearly all of culture’s changes are bad, and I think that LLMs are already doing social damage and will do much worse in the future - but this genie is fully out of the bottle and it is substantially going to change what we think of as programming.

A Retrospective on the last 6 Weeks#

This article builds on “On Coding with Claude” which I wrote after using Claude for a week. If you think that I am AI-pilled, you can get my nuanced take on LLMs at the start of that post.

That said, this is transformative and I want to try to give you some perspective from the last 6 weeks of activity in the Puzzmo engineering space to show you what I’ve been seeing.
Maintenance is Significantly Cheaper#

I have been on many projects where people have taken weeks, full-time, to perform some sort of mundane task: “converting this JS codebase to TypeScript”, “Update to Swift X”, “Switch to a monorepo”. They’re the kind of delicate migrations which require a gazillion rebases.

Here is a list of things which I have completed, solo, since getting access to Claude Code:

- Converted hundreds of React Native components to just React
- Replaced 3 non-trivial RedwoodJS systems with home-grown or mature, supported replacements
- Built complex REPLs for multiple internal and external projects
- Switched almost every db model to have a consistent ‘flags’ system for booleans
- Converted from Jest to Vitest
- Created our front-end testing strategies for React
- Moved many things defined in code to run via the CMS
- Made significant headway on server-side rendering
- Re-wrote the iOS app’s launch system due to deprecations
- Built a suite of LLM created (and framed as such, then hand annotated) documentation for systems like leaderboards, dailies etc
- Converted a significant amount of our design system primitives to use base-ui
- Migrated significant code from inline styles to stylex
- Converted all animations in puzzmo.com to use the same techniques as games
- Fixed multiple bugs which have been around since the start of Puzzmo
- Updated all Vite integrations
- Migrated all Puzzmo production projects to node 22
- Converted the games repo to a real monorepo
- Built iPad support for the Puzzmo app

None of these projects are the “actual work” which I need to do on a day to day basis as the ‘bizdev’ guy on Puzzmo for this year. These are literally side-projects which I did on my own while working on something else.

To be clear for those in the back, because this is shocking to me: while I was still working on the existing roadmap I had prior to Claude Code over the last 6 weeks, I accomplished all of these things on my own.
Mostly in the background (and then with a polish pass day for some of the larger ones). I didn’t go from working ~10 hour days to working ~16 hours or anything like that either.

This was years of “tech debt” / “tech innovation” backlog for me! Done in just over a month and a half.

If you understand what you are doing, the breadth of tasks which typically live within the remit of “technical debt” does not need to be treated as debt; you can just do them as you are working on other things.

‘Carving out some time on the schedule’ is now so incredibly cheap that getting started and making a serious dent is something you can prime before going into a meeting, then decide afterwards whether it was the right thing. Mind-blowing.

Write First, Decide Later#

A habit I have been trying to form is to give an idea a shot before I fully shoot it down. For example, since day 1 on Puzzmo I had been waiting on figuring out a testing strategy for the front-end because I wanted to be able to hire someone to fully own “puzzmo.com”, and a part of that is figuring out how to not ship as many regressions as we do.

Figuring out a testing strategy for the front-end isn’t pretty, and I have seen a lot of really bad test suites which over-test and become brittle things that engineers don’t like to work with. The mix of networking, react, the scope of contexts, the dom and flakiness in tooling just leads to answers where you are looking for the least bad solution which you’ve used yourself and feel comfortable maintaining.

I wondered if I needed to wait for someone else, so instead of just “adding a test suite” - I opted to have Claude Code write tests for every pull request I made to the front end over the course of two weeks. Then, after seeing the tests, I deleted them. It added an extra 5m to my process, but gave me an insight each time into different ways in which other projects deal with the problem.
After weeks of this, I was ready to start looking at that problem systemically. By hand, writing tests for every pull request and then deleting them would take so much time that there is no way I’d be OK with doing it.

Or a recent example from slack where I just vibed for half a day in the background on trying to make an abstraction for CRUD resources in our CMS tools:

Did it work? Nope. Was it worth an exploration? Sure.

Living the Two Clones lifestyle#

Anthropic have information about how to use worktrees - I would like to argue for a simpler approach. Two clones, different VS Code profiles. This means you can work in each independently and still visually recognize the differences in your workspaces by having a different theme:

My best argument is simply that each clone represents a single pull request that you can work on at a time. If you are writing pull requests and collaborating with others then that is still pretty important.

I made it so that our dev servers close any processes using the ports you want, and it’s trivial to jump between the two clones as Claude Code is working stuff out before you are looking to build.

Game design collaboration#

Since Puzzmo was created, this has been the process of creating a game:

- We create some prototypes using all sorts of technologies
- Collectively we work through the prototypes with feedback
- We decide if this game is worth shipping
- The game team re-writes the game from scratch in our tech stack, and with puzzmo’s system integrations

This process takes weeks before any production code is written, if it is written at all. At our current throughput, we roughly release a game a quarter at the level of polish we want to achieve.

In a post-Claude Code world, this model can be simplified greatly and it is a space we are exploring.
I created a new Puzzmo monorepo (that’s three now, “app”, “games” and this new one “prototypes”) which emulates the infrastructure of the games repo but has significantly different expectations on the type of code being shipped.

With this repo, a game designer can go from an idea to something running on puzzmo.com for admins in a couple of hours: you write the code, then go into our admin CMS and click a few buttons and it’s done. To go from “this is good for the team” to “we should make this public” takes a bit of hands-on work from me and Saman, but it’s a different ballpark of effort compared to our current production pipeline.

We released Missing Link using this technique, which seems to be a hit.

This… actually is a bit of a problem for us. I am happy for us to have a game designer’s code running on Puzzmo for a time-gated experiment, but I am not OK with this turning into Puzzmo canon with the rest of the games. The flexibility which allows a game designer to make a prototype is the part that makes it unsuitable for writing long-term production code.

This leaves us with a few options:

- Finish the experiment and stop having the game on the site
- Re-write the game as production code
- Declare some games as not quite having every Puzzmo integration feature
- Explore making it more possible to write ‘production worthy’ code in prototypes
- Extend the experiment to give ourselves time to figure out another option

All of these have trade-offs, and it isn’t obvious what the right idea is. The problem is novel because prior to Claude Code it wasn’t worth the effort of integrating prototype code with Puzzmo’s systems; now it’s trivial and accomplishable by anyone on the team.

We can really deliver on the idea of ‘experimental’ games that we launched with, which means we have to be much more thoughtful about the risk of launching too many games that people want us to keep around.
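As a rough illustration of the “time-gated experiment” idea above, here is a minimal TypeScript sketch. Every name in it (`PrototypeExperiment`, `canPlay`, the example slug and dates) is hypothetical and not taken from Puzzmo’s codebase. It also assumes that admins can always open a prototype, while everyone else only sees it once it has been made public and the experiment window is open.

```typescript
// Hypothetical model of a time-gated prototype game.
// None of these names or rules come from Puzzmo's actual code.
type Audience = "admins" | "public";

interface PrototypeExperiment {
  slug: string;
  audience: Audience; // who the prototype has been released to
  startsAt: Date; // start of the experiment window (inclusive)
  endsAt: Date; // end of the experiment window (exclusive)
}

interface Viewer {
  isAdmin: boolean;
}

function canPlay(exp: PrototypeExperiment, viewer: Viewer, now: Date): boolean {
  // Assumption: admins can review a prototype at any time.
  if (viewer.isAdmin) return true;
  // Non-admins never see prototypes that have not been made public.
  if (exp.audience !== "public") return false;
  // Public players only see the game while the window is open.
  const t = now.getTime();
  return t >= exp.startsAt.getTime() && t < exp.endsAt.getTime();
}

// Example data, invented for illustration only.
const experiment: PrototypeExperiment = {
  slug: "example-prototype",
  audience: "public",
  startsAt: new Date("2025-07-01T00:00:00Z"),
  endsAt: new Date("2025-08-01T00:00:00Z"),
};
```

In a model like this, “extending the experiment” is just a later `endsAt`, while the other options in the list above (re-writing as production code, or relaxing which integrations a game needs) are decisions that no date field can make for you.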
Taking a Shot During Triage#

One thing I have been experimenting with during our weekly triage of all raised GitHub issues is asking the Claude Code GitHub action to take a stab at a pull request while we, as a group of engineers, are talking through what we think:

Or one where I was the one providing enough context myself in the issue:

As I am the one responsible for getting that Pull Request into production, that’s the first few steps ready, and for smaller things I’ve found it to be a solid one-shot now that the repo is very well set up.

Who has been successful using it internally?#

I think it’s worth noting here that we offered Claude Code to everyone on the team from the moment it became obvious how powerful a tool it was for me personally.

I would say that on our team, the people who have used it and found value the most are people with product skills, technical skills and the agency to feel like they can try things.

One of them said that Claude Code freed them from the constant anxiety of the first step in programming.

Justin Searls did an interesting write-up where he described an idea of full-breadth developers, wherein he argues that:

“Up until a few months ago, the best developers played the violin. Today, they play the orchestra.”

Which I think is correct. Within the Puzzmo team, the people who are self-driven, run their own verticals and feel like they have the freedom to explore and push those boundaries are doing really cool work. It bursts out of any explicit job role boundaries, and it becomes a pleasure to collab at a larger/faster scale on ideas than before.

So I will double-down on saying that everything in Justin’s post echoes what is happening inside the Puzzmo engineering team, and his post is really worth musing over.

What Do I Think Makes It Successful in our Codebases#

We use monorepos. I was lucky to have spent the time a year ago to take every project and move it into two main environments.
This was originally done to reflect the working processes of the engineering teams. My goal was to make it possible to go from db schema change to front-end components in a single pull request.

A monorepo is perfect for working with an LLM, because it can read the file which represents our schema, it can read the sdl files defining the public GraphQL API, read the per-screen requests and figure out what you’re trying to do. Having a single place with so much context means that I, as a user of Claude Code, do not need to tell it that sort of stuff, and a vague message like “Add a xyz field to the user model in the db and make it show in this screen” is something that Claude Code can do.

My tech choices were made a decade ago. This video of a conference talk I gave from 2018 is still the way I introduce people to the Puzzmo codebase and the mentality behind these tech choices. React, Relay, GraphQL, TypeScript and (now) StyleX are boring and very explicit technologies. There are compilation steps in all of these systems, which means everything has to be available locally and correct to run. This makes it a bit of a curve to learn, but often when you have got it right - you know you have got it right. For our admin tools, it’s even more boring/mature, I’m still using Bootstrap!

For an LLM, these technologies are very well baked into its training set, and Claude Code knows to do things like “run the Relay compiler” (when I saw Claude Code first do that, I knew I was in for a wild ride), which gives it incremental ways to validate that the changes it has made are working.

This isn’t novel work. Most of the stuff we’re doing on a day to day basis is pretty normal down-to-earth CRUD style apps.

These codebases aren’t that big, nor that old. Nothing is older than 2021 and while I keep things up-to-date, I try to have a long-tail of support / backwards compatibility.

Our business is literally the test suite / benchmark for these models.
For example, on the 28th of July, two days before posting this, GLM-4.5 came out, offering a way to run a model ~80% as good as Claude Code on your computer locally. How do they measure that 80%? Here is the table from their benchmarks of what they use:

Puzzmo’s day-to-day work is represented in ~75% (39/52) of their testing infrastructure!

Quantifying the Change is Hard#

I thought I would see a pretty drastic change in terms of Pull Requests, Commits and Lines of Code merged in the last 6 weeks. I don’t think that holds water though: