Learning by doing
Ben O’Mahony is Principal AI Engineer at Thoughtworks. He is a results-driven AI/Engineering leader with a track record of building high-performing teams and shipping business-critical AI, ML and data products and platforms at scale. He has deep expertise across the full Engineering and Data lifecycle from research to production deployment. Ben is adept at defining technical strategy, driving execution and partnering cross-functionally to deliver measurable impact. Recently Ben has been intensely focused on building Generative AI platforms, models and agents.
CLI coding agents are a fundamentally different kind of tool from chatbots or autocomplete: they're agents that can read code, run tests, and update a codebase. While commercial tools are impressive, they don't understand the particular context of our environment or the eccentricities of our specific project. Instead, we can build our own coding agent by assembling open source tools around our specific development standards for testing, documentation production, code reasoning, and file system operations.
The wave of CLI Coding Agents

If you have tried Claude Code, Gemini Code, Open Code or Simon Willison’s LLM CLI, you’ve experienced something fundamentally different from ChatGPT or GitHub Copilot. These aren’t just chatbots or autocomplete tools - they’re agents that can read your code, run your tests, search docs and make changes to your codebase asynchronously. But how do they work?

For me the best way to understand how any tool works is to try to build it myself. So that’s exactly what we did, and in this article I’ll take you through how we built our own CLI coding agent using the Pydantic-AI framework and the Model Context Protocol (MCP). You’ll see not just how to assemble the pieces but why each capability matters and how it changes the way you can work with code. Our implementation leverages AWS Bedrock, but with Pydantic-AI you could easily swap in any other mainstream provider or even a fully local LLM.
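To give a taste of that flexibility before we dig in, here is a rough sketch of how Pydantic-AI lets you target different backends just by changing the model reference. The model names below are illustrative assumptions, not a prescription, and the exact import paths may vary slightly between Pydantic-AI versions.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

# Hosted models are referenced with a 'provider:model' string.
bedrock_agent = Agent('bedrock:anthropic.claude-3-5-sonnet-20241022-v2:0')
openai_agent = Agent('openai:gpt-4o')

# A fully local model served through an OpenAI-compatible endpoint,
# e.g. Ollama on its default port (assumed setup).
local_agent = Agent(OpenAIModel(
    'llama3.2',
    provider=OpenAIProvider(base_url='http://localhost:11434/v1'),
))
```

The agent code itself stays identical whichever backend you choose; only the model construction changes.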
Why Build When You Can Buy?

Before diving into the technical implementation, let's examine why we chose to build our own solution. The answer became clear very quickly once we started using our custom agent: while commercial tools are impressive, they’re built for general use cases. Our agent was fully customised to our internal context and all the little eccentricities of our specific project. More importantly, building it gave us insight into how these systems work and into the quality of our own GenAI platform and dev tooling.

Think of it like learning to cook. You can eat at restaurants forever, but understanding how flavours combine and how techniques work makes you appreciate food differently - and lets you create exactly what you want.
The Architecture of Our Development Agent

At a high level, our coding assistant consists of several key components:

- Core AI Model: Claude from Anthropic, accessed through AWS Bedrock
- Pydantic-AI Framework: provides the agent framework and many helpful utilities that make our agent more useful immediately
- MCP Servers: independent processes that give the agent specialised tools; MCP is a common standard for defining the servers that contain these tools
- CLI Interface: how users interact with the assistant

The magic happens through the Model Context Protocol (MCP), which allows the AI model to use various tools through a standardized interface. This architecture makes our assistant highly extensible - we can easily add new capabilities by implementing additional MCP servers - but we’re getting ahead of ourselves.
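To make that extensibility concrete, here is a minimal sketch of attaching an MCP server to a Pydantic-AI agent. The filesystem server and model id are assumptions for illustration (and parameter names can differ between Pydantic-AI releases); the pattern is what matters: each server runs as its own process, and the agent discovers its tools over the protocol.

```python
import asyncio

from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio

# An MCP server is just a subprocess speaking the protocol over stdio.
# Here we assume the reference filesystem server from the MCP project.
fs_server = MCPServerStdio(
    'npx', args=['-y', '@modelcontextprotocol/server-filesystem', '.']
)

agent = Agent(
    'bedrock:anthropic.claude-3-5-sonnet-20241022-v2:0',
    mcp_servers=[fs_server],
)

async def main() -> None:
    # Starts the server processes for the duration of the block.
    async with agent.run_mcp_servers():
        result = await agent.run('List the test files in this project.')
        print(result.output)

asyncio.run(main())
```

Adding a new capability is then a matter of appending another server to the list - the agent picks up its tools automatically.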
Starting Simple: The Foundation

We started by creating a basic project structure and installing the necessary dependencies:

```bash
uv init
uv add pydantic_ai
uv add boto3
```

Our primary dependencies include:

- pydantic-ai: Framework for building AI agents
- boto3: AWS SDK for Python, which we use to reach Bedrock
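With the dependencies in place, a minimal agent can be wired up in a few lines to sanity-check the setup. This is a sketch rather than our full implementation: the model identifier and prompts are assumptions, it presumes your AWS credentials are already configured for Bedrock, and depending on your Pydantic-AI version the result attribute may be `output` or `data`.

```python
from pydantic_ai import Agent

# Assumed Bedrock model id -- substitute one your AWS account can access.
agent = Agent(
    'bedrock:anthropic.claude-3-5-sonnet-20241022-v2:0',
    system_prompt='You are a coding assistant for our internal codebase.',
)

# run_sync is the simplest entry point; async variants exist too.
result = agent.run_sync('Summarise what this project does.')
print(result.output)
```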