In my last post I talked about how I spent a week heads-down using AI to build a greenfield engineering metrics tool. As I built it, I'd often navigate the web app and spot things that needed to be fleshed out. Sometimes it was a small typo; other times it was a bigger feature that was still TODO. At one point I had Claude Code redesign the homepage to make it more lively. In doing so, it added some new functionality that didn't fully exist yet: a "View All Insights" link that would show you all the AI-generated analyses about a given pull request or piece of work. Since I hadn't actually built that page yet, the link led to a 404.

Traditionally, fixing this would kick off a whole sequence of events. I'd have to scope out the feature, think about the UI, define the API needs, and write a detailed ticket. Then I'd need to build the backend endpoint, create the UI components, and wire everything together. It's a linear, manual process.

Instead, I took a different approach. I ran a single custom command to generate a ticket for the new page. This command invoked several specialist sub-agents (you can find their .md definitions in the appendix)—a `product-manager`, a `ux-designer`, and a `senior-software-engineer`—who worked in parallel to flesh out the requirements. The result was a fully-formed ticket, created in minutes.

*A quick preview of the actual ticket these agents generated in Linear.*

With the plan defined, I could then feed that ticket into another command that kicks off the implementation agents (`senior-software-engineer`, `code-reviewer`, etc.).

This workflow changes the dynamic. What would normally take hours of planning, spec'ing, and building was done asynchronously while I focused elsewhere. If the agents get it wrong, I don't really care—I'll just fire off another run. The cost of failure is so low that optimizing for speed and taking more "shots on goal" is the right call.

This entire process—from planning to implementation—ran in the background across multiple terminals while I moved on to the next task. This is what true parallelization looks like; the agents were so active they even started hitting API rate limits.

## The Core Principles of an Agentic Workflow

My workflow is built on three core principles. Understanding them will help you apply this approach to your own tasks.

### 1. Parallel Execution for Speed

The most direct benefit is the ability to perform independent tasks concurrently instead of sequentially. A common task like scaffolding a new feature can be broken down into its constituent parts, with a specialist agent assigned to each.

#### Example: Scaffolding a New API Integration in Parallel

Let's say you need to add a new third-party API integration, like processing payments with Stripe. Typically, you'd work sequentially: build the server-side route, then the client-side form, then the tests, and finally the documentation. With sub-agents, you can parallelize this work. An orchestrating agent, given the Stripe API documentation, can spin up multiple specialists at once:

- **`backend-specialist`**: Reads the docs and writes the Node.js API endpoint to handle the charge creation.
- **`frontend-specialist`**: Reads the same docs and builds the React component for the payment form that communicates with the backend endpoint.
- **`qa-specialist`**: Generates a corresponding integration test suite using Vitest to verify the backend logic.
- **`docs-specialist`**: Drafts a README.md section explaining the new feature, the required environment variables, and how to get API keys.

You receive a complete starting point in the time it takes to complete the longest single task.

```mermaid
graph TD
    A["Primary Agent: Integrate Stripe Payments"] --> B{Dispatch};
    B --> C[backend_agent];
    B --> D[frontend_agent];
    B --> E[qa_agent];
    B --> F[docs_agent];
    C --> G[API Route Code];
    D --> H[React Component];
    E --> I[Test Suite];
    F --> J[README.md Draft];
```

### 2. Sequential Handoffs for Automation

While some tasks are parallel, many complex processes are sequential. Here, agents act like an automated assembly line, with the output of one agent becoming the input for the next. This automates the entire lifecycle of a task, from planning to review.

#### Example: The Automated Engineering Lifecycle

The workflow from the introduction is a perfect example of this. The `product-manager` and `ux-designer` agents first produce a ticket. That ticket is then handed off to the `senior-software-engineer` to build the feature. Finally, the resulting code is handed off to the `code-reviewer` for approval.

1. **Planning & Implementation**: The main orchestrator assigns a ticket, like the one from our intro, to the `senior-software-engineer` agent (running on Opus). The agent follows its defined "concise working loop" to plan and implement the code.
2. **Code Review**: Once done, the orchestrator triggers the `code-reviewer`. This agent is relentless. If rules are broken, it fails the process.
3. **Iterative Refinement**: A loop in the main context window feeds the reviewer's feedback back to the engineer agent until the reviewer is satisfied.

This process yields structured, parsable artifacts. The `code-reviewer` returns a report with a verdict (NEEDS REVISION or APPROVED WITH SUGGESTIONS) and counts of blocker, high-priority, and medium-priority issues; its full output format is defined in the appendix. The orchestrator can parse this structured output and automatically manage the feedback loop.

### 3. Context Isolation for Quality

This is the most critical principle. If you asked a single AI agent to perform a complex, multi-stage task, it would exhaust its context window and start losing crucial details. By using sub-agents, you give each specialist its own dedicated context window, ensuring the quality of each step is preserved.

#### Example: Planning the "AI Insights" Page

In our intro story, the `product-manager` was able to use its entire 200k context to focus only on user needs and business logic. The `senior-software-engineer` then received the final ticket and could use its own fresh 200k context to focus only on implementation, without needing to remember the nuances of the initial product discussion. This prevents quality degradation.

- The **`product-manager`** can use its entire context to focus solely on user needs, acceptance criteria, and business logic.
- The **`ux-designer`** can use its full context to analyze existing design patterns and user flows, without needing to hold database schemas in memory.
- The **`senior-software-engineer`** then receives the concise output from the planners (the ticket) and can dedicate its entire 200k context to what matters for implementation: the codebase, technical constraints, and writing clean code.
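To make the handoff mechanics concrete, here's a minimal sketch of the orchestration shape. The `runAgent` helper is a hypothetical stand-in for whatever actually spawns a sub-agent with a fresh context window (in Claude Code, the built-in Task tool); the prompts and file name are illustrative:

```typescript
import { writeFile, readFile } from "node:fs/promises";

// Stand-in for the real dispatch mechanism. The key property it models:
// every call starts a sub-agent from a clean, empty context.
async function runAgent(name: string, prompt: string): Promise<string> {
  return `[${name} output for: ${prompt.slice(0, 40)}...]`; // placeholder
}

async function planAndBuild(goal: string): Promise<string> {
  // Phase 1: planners run in parallel, each with its own full context.
  const [prd, uxBrief] = await Promise.all([
    runAgent("product-manager", `Define user stories and acceptance criteria for: ${goal}`),
    runAgent("ux-designer", `Propose a simple UX, covering all states, for: ${goal}`),
  ]);

  // Handoff artifact: a concise ticket is the ONLY thing carried forward.
  await writeFile("ticket.md", `${prd}\n\n${uxBrief}`);

  // Phase 2: the engineer gets a fresh context plus the ticket, not the
  // planners' entire working history.
  const ticket = await readFile("ticket.md", "utf8");
  return runAgent("senior-software-engineer", `Implement this ticket:\n\n${ticket}`);
}

planAndBuild("AI Insights page").then(console.log);
```

The shape is what matters: a wide parallel fan-out into isolated contexts, then a narrow, file-based handoff into the next fresh context.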
The quality of each step is preserved because no single agent has to sacrifice its specialized knowledge to stay within the limit.

```mermaid
graph TD
    subgraph "Phase 1: Planning (Parallel Contexts)"
        A["Initial Goal: Build AI Insights Page"] --> B["product_manager_agent<br>200k Context"];
        A --> C["ux_designer_agent<br>200k Context"];
    end
    subgraph "Handoff Artifact"
        B --> D{Ticket.md};
        C --> D;
    end
    subgraph "Phase 2: Implementation & Review (Iterative Contexts)"
        D --> E["senior_engineer_agent<br>200k Context"];
        E --> F[Code Draft];
        F --> G["code_reviewer_agent<br>200k Context"];
        G -- Feedback / Revisions --> E;
        G -- Approved --> H[Final Code];
    end
```

## Putting It Into Practice: More Examples

These core patterns can be applied all over the software lifecycle.

- **Generating Codebase Documentation**: For a large, undocumented module, a primary agent can list all functions, classes, or files. It then spins up a sub-agent for each one, tasked with analyzing its code and writing comprehensive comments or diagrams. A final agent can then assemble these into a coherent README.md file.
- **Large-Scale Automated Refactoring**: To deprecate a function used in 75 files, have a primary agent grep for all instances, then spin up a dedicated sub-agent for each file to perform the replacement in a small, safe context. Even better if the refactor can be explained via an SOP: define the SOP as a command or a sub-agent and iteratively kick them off to complete the work (a rough sketch of this pattern follows this list).
- **Incident Response Analysis**: To understand an outage across three microservices, use three sub-agents to analyze each service's logs in parallel. Each one extracts a timeline of critical events. The main agent's job is much simpler: synthesize the three pre-processed timelines into a single report.
- **For a Product Manager – Synthesizing User Feedback**: A PM can take a CSV of 500 survey responses. A primary agent defines key themes (e.g., UI/UX, Performance, Pricing, Feature Requests). It then spins up multiple sub-agents to process chunks of 50 responses each, tagging them and pulling out representative quotes. The main agent receives the structured output and generates a final summary report with key insights.
- **For a Security Engineer**: To audit a new open-source library, a primary agent can coordinate sub-agents to scan for CVEs, scour GitHub issues for security reports, and analyze the code for common anti-patterns, assembling a multi-faceted security brief much faster than a manual review.
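As a sketch of the refactoring pattern above, here's roughly what "a dedicated sub-agent per file" can look like from the outside, driven through Claude Code's non-interactive print mode (`claude -p`). The SOP prompt and function names are hypothetical:

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Hypothetical SOP; the deprecated/replacement identifiers are made up.
const SOP =
  "Replace every call to legacyFetch() with apiClient.get() per our SOP; keep behavior identical.";

async function refactorAll(files: string[], concurrency = 5): Promise<void> {
  // Process files in small batches: each sub-agent gets one file and a tiny,
  // safe context, rather than one agent holding all 75 files at once.
  for (let i = 0; i < files.length; i += concurrency) {
    const batch = files.slice(i, i + concurrency);
    await Promise.all(
      batch.map((file) =>
        run("claude", ["-p", `${SOP}\n\nOnly touch this file: ${file}`])
      )
    );
  }
}
```

The batch size is a throttle, not a correctness knob; it mostly exists so you don't slam into API rate limits the way my planning agents did.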
## Practical Considerations and Workflow Trade-offs

This approach is powerful, but it's not magic. It's a workflow for a developer, and with it come practical trade-offs.

- **Managing Cost and Usage Limits**: Chaining agents, especially in a loop, will increase your token usage significantly. This means you'll hit the usage caps on plans like Claude Pro/Max much faster. You need to be cognizant of this and decide if the trade-off—dramatically increased output and velocity at the cost of higher usage—is worth it.
- **The Art of Non-Determinism**: The non-deterministic nature of LLMs means changing one part of your workflow—a sub-agent's prompt, a command, the orchestrator's instructions—can have a ripple effect. This makes debugging a challenge, but it's also where the creative aspect of this engineering comes in. Your approach to handling these issues is part of the craft.
- **The Synthesis Challenge**: The "reduce" step, where a final agent synthesizes the work of others, is often the most difficult part. To mitigate this, it's crucial to have each sub-agent save its output to a distinct file. This creates a clear audit trail, allowing you to debug why the final synthesis went wrong.
- **Prompts as Fragile Dependencies**: The agent definitions, while clear, should be treated like code. They need to be version-controlled, tested, and monitored. A model update from the provider can cause subtle behavioral drifts that can only be caught with a rigorous evaluation suite.

## Final Thoughts

You have to get creative, and the specific application will depend on your situation. But once you adopt the mindset of breaking down problems for specialist agents running in parallel, you'll start to find the patterns that work for you. It's a more robust and scalable way to solve complex problems.

## Appendix: Commands & Agent Definitions

For those who want to implement this workflow, here are the definitions for the custom command and agents used.

To add a new command in Claude Code, create a markdown file for your command (e.g., `add-linear-ticket.md`) in `~/.claude/commands` or `./.claude/commands`. Similarly, you can define sub-agents by manually adding them to `~/.claude/agents/` or your project's `./.claude/agents/` folder, or by using the `/agents` command in a Claude Code session.

### `add-linear-ticket` command (assumes you have the Linear MCP configured)

````md
---
description: Create a comprehensive Linear ticket from high-level input, automatically generating detailed context, acceptance criteria, and technical specifications using a core team of three specialist agents.
argument-hint: ""
---

## Mission

Transform high-level user input into a well-structured Linear ticket with comprehensive details. This command uses a core team of three agents (`product-manager`, `ux-designer`, `senior-software-engineer`) to handle all feature planning and specification in parallel. It focuses on **pragmatic startup estimation** to ensure tickets are scoped for rapid, iterative delivery.
**Pragmatic Startup Philosophy**:
- 🚀 **Ship Fast**: Focus on working solutions over perfect implementations.
- 💡 **80/20 Rule**: Deliver 80% of the value with 20% of the effort.
- 🎯 **MVP First**: Define the simplest thing that could possibly work.

**Smart Ticket Scoping**: Automatically breaks down large work into smaller, shippable tickets if the estimated effort exceeds 2 days.

**Important**: This command ONLY creates the ticket(s). It does not start implementation or modify any code.

## Core Agent Workflow

For any feature request that isn't trivial (i.e., not LIGHT), this command follows a strict parallel execution rule using the core agent trio.

### The Core Trio (Always Run in Parallel)

- **`product-manager`**: Defines the "Why" and "What." Focuses on user stories, business context, and acceptance criteria.
- **`ux-designer`**: Defines the "How" for the user. Focuses on user flow, states, accessibility, and consistency.
- **`senior-software-engineer`**: Defines the "How" for the system. Focuses on technical approach, risks, dependencies, and effort estimation.

### Parallel Execution Pattern

```yaml
# CORRECT (Parallel and efficient):
- Task(product-manager, "Define user stories and business value for [feature]")
- Task(ux-designer, "Propose a simple UX, covering all states and accessibility")
- Task(senior-software-engineer, "Outline technical approach, risks, and estimate effort")
```

---

## Ticket Generation Process

### 1) Smart Research Depth Analysis

The command first analyzes the request to determine if agents are needed at all.

**LIGHT Complexity → NO AGENTS**
- For typos, simple copy changes, minor style tweaks.
- Create the ticket immediately.
- Estimate: <2 hours.

**STANDARD / DEEP Complexity → CORE TRIO OF AGENTS**
- For new features, bug fixes, and architectural work.
- The Core Trio is dispatched in parallel.
- The depth (Standard vs. Deep) determines the scope of their investigation.

**Override Flags (optional)**:
- `--light`: Force minimal research (no agents).
- `--standard` / `--deep`: Force investigation using the Core Trio.
- `--single` / `--multi`: Control ticket splitting.

### 2) Scaled Investigation Strategy

#### LIGHT Research Pattern (Trivial Tickets)

NO AGENTS NEEDED.
1. Generate ticket title and description directly from the request.
2. Set pragmatic estimate (e.g., 1 hour).
3. Create ticket and finish.

#### STANDARD Research Pattern (Default for Features)

The Core Trio is dispatched with a standard scope:
- **`product-manager`**: Define user stories and success criteria for the MVP.
- **`ux-designer`**: Propose a user flow and wireframe description, reusing existing components.
- **`senior-software-engineer`**: Outline a technical plan and provide a pragmatic effort estimate.

#### DEEP Spike Pattern (Complex or Vague Tickets)

The Core Trio is dispatched with a deeper scope:
- **`product-manager`**: Develop comprehensive user stories, business impact, and success metrics.
- **`ux-designer`**: Create a detailed design brief, including edge cases and state machines.
- **`senior-software-engineer`**: Analyze architectural trade-offs, identify key risks, and create a phased implementation roadmap.

### 3) Generate Ticket Content

Findings from the three agents are synthesized into a comprehensive ticket.

#### Description Structure

```markdown
## 🎯 Business Context & Purpose
- What problem are we solving and for whom?
- What is the expected impact on business metrics?

## 📋 Expected Behavior/Outcome
- A clear, concise description of the new user-facing behavior.
- Definition of all relevant states (loading, empty, error, success).

## 🔬 Research Summary
**Investigation Depth**:
**Confidence Level**:

### Key Findings
- **Product & User Story**:
- **Design & UX Approach**:
- **Technical Plan & Risks**:
- **Pragmatic Effort Estimate**:

## ✅ Acceptance Criteria
- [ ] Functional Criterion (from PM): User can click X and see Y.
- [ ] UX Criterion (from UX): The page is responsive and includes a loading state.
- [ ] Technical Criterion (from Eng): The API endpoint returns a `201` on success.
- [ ] All new code paths are covered by tests.

## 🔗 Dependencies & Constraints
- **Dependencies**: Relies on existing Pagination component.
- **Technical Constraints**: Must handle >10K records efficiently.

## 💡 Implementation Notes
- **Recommended Approach**: Extend the existing `/api/insights` endpoint...
- **Potential Gotchas**: Query performance will be critical; ensure database indexes are added.
```

### 4) Smart Ticket Creation

- **If total estimated effort is ≤ 2 days**: A single, comprehensive ticket is created.
- **If total estimated effort is > 2 days**: The work is automatically broken down into 2-3 smaller, interconnected tickets (e.g., "Part 1: Backend API," "Part 2: Frontend UI"), each with its own scope and estimate.

### 5) Output & Confirmation

The command finishes by returning the URL(s) of the newly created ticket(s) in Linear.
````

### `product-manager` agent

```md
---
name: product-manager
description: Pragmatic PM that turns a high-level ask into a crisp PRD. Use PROACTIVELY for any feature or platform initiative. Writes to a specified path.
model: opus
---

You are a seasoned product manager. Deliver a single-file PRD that is exec-ready and decision-friendly.

Rules:
- Open with "Context & why now," then "Users & JTBD," then "Business goals & success metrics (leading/lagging)."
- Number functional requirements; each has explicit acceptance criteria.
- Include non-functional requirements: performance, scale, SLOs/SLAs, privacy, security, observability.
- Scope in/out; rollout plan with guardrails and kill-switch; risks & open questions.
- Keep to bullets where possible. Cite research as short "Source — one-line evidence."

On invocation the orchestrator will pass:
- The feature request
- Depth level and which supplemental docs to include
- Paths to write (prd.md, and optionally research.md, competitive.md, opportunity-map.md)
- If research requested: do focused WebSearch/WebFetch; keep it brief and source-backed.
```

### `ux-designer` agent

```md
---
name: ux-designer
description: A product-minded UX designer focused on creating clear, accessible, and user-centric designs. Balances user needs with business goals and technical feasibility.
model: opus
color: purple
---

# Agent Behavior

## operating principles
- **Clarity First**: Reduce user effort through clear layouts, smart defaults, and progressive disclosure.
- **User-Centric**: Design for real-world usage patterns, not just the happy path. Address empty, loading, and error states.
- **Accessibility is Core**: Ensure designs are usable by everyone, including those using screen readers or keyboard-only navigation.
- **Consistency is Key**: Reuse existing design patterns and components from the system before inventing new ones.

## triggers to escalate
- **`senior-software-engineer`**: For feedback on technical feasibility, performance, or implementation constraints.
- **`product-manager`**: To clarify business goals, scope, or success metrics.

## concise working loop
1. **Understand**: Clarify the user problem, business objective, and any technical constraints.
2. **Design**: Create a simple, responsive layout for the core user flow. Define all necessary states (loading, empty, error, success).
3. **Specify**: Provide clear annotations for layout, key interactions, and accessibility requirements.
4. **Deliver**: Output a concise design brief with user stories and acceptance criteria.

## design quality charter
- **Layout & Hierarchy**:
  - Design is mobile-first and responsive.
  - A clear visual hierarchy guides the user's attention to the primary action.
  - Uses a consistent spacing and typography scale.
- **Interaction & States**:
  - All interactive elements provide immediate feedback.
  - Every possible state is accounted for: loading, empty (with a call-to-action), error (with a recovery path), and success.
- **Accessibility**:
  - Content is navigable with a keyboard.
  - All images have alt text, and interactive elements have proper labels.
  - Sufficient color contrast is used for readability.
- **Content**:
  - Uses plain, scannable language.
  - Error messages are helpful and explain *how* to fix the problem.

## anti-patterns to avoid
- Designing without considering all user states (especially error and empty states).
- Creating custom components when a standard one already exists.
- Ignoring accessibility or treating it as an afterthought.
- Using "dark patterns" that trick or mislead the user.

## core deliverables
- User stories with clear acceptance criteria.
- A simple wireframe or layout description with annotations.
- A list of required states and their appearances.
- Accessibility notes (e.g., keyboard navigation flow, screen reader labels).
```

### `senior-software-engineer` agent

```md
---
name: senior-software-engineer
description: Proactively use when writing code. Pragmatic IC who can take a lightly specified ticket, discover context, plan sanely, ship code with tests, and open a review-ready PR. Defaults to reuse over invention, keeps changes small and reversible, and adds observability and docs as part of Done.
model: opus
---

# Agent Behavior

## operating principles
- autonomy first; deepen only when signals warrant it.
- adopt > adapt > invent; custom infra requires a brief written exception with TCO.
- milestones, not timelines; ship in vertical slices behind flags when possible.
- keep changes reversible (small PRs, thin adapters, safe migrations, kill-switches).
- design for observability, security, and operability from the start.

## concise working loop
1) clarify ask (2 sentences) + acceptance criteria; quick "does this already exist?" check.
2) plan briefly (milestones + any new packages).
3) implement TDD-first; small commits; keep boundaries clean.
4) verify (tests + targeted manual via playwright); add metrics/logs/traces if warranted.
5) deliver (PR with rationale, trade-offs, and rollout/rollback notes).
```

### `code-reviewer` agent

```md
---
name: code-reviewer
description: Meticulous and pragmatic principal engineer who reviews code for correctness, clarity, security, and adherence to established software design principles.
---

You are a meticulous, pragmatic principal engineer acting as a code reviewer. Your goal is not simply to find errors, but to foster a culture of high-quality, maintainable, and secure code. You prioritize your feedback based on impact and provide clear, actionable suggestions.

## Core Review Principles

1. **Correctness First**: The code must work as intended and fulfill the requirements.
2. **Clarity is Paramount**: The code must be easy for a future developer to understand. Readability outweighs cleverness. Unambiguous naming and clear control flow are non-negotiable.
3. **Question Intent, Then Critique**: Before flagging a potential issue, first try to understand the author's intent. Frame feedback constructively (e.g., "This function appears to handle both data fetching and transformation. Was this intentional? Separating these concerns might improve testability.").
4. **Provide Actionable Suggestions**: Never just point out a problem. Always propose a concrete solution, a code example, or a direction for improvement.
5. **Automate the Trivial**: For purely stylistic or linting issues that can be auto-fixed, apply them directly and note them in the report.

## Review Checklist & Severity

You will evaluate code and categorize feedback into the following severity levels.

### 🚨 Level 1: Blockers (Must Fix Before Merge)
- **Security Vulnerabilities**:
  - Any potential for SQL injection, XSS, CSRF, or other common vulnerabilities.
  - Improper handling of secrets, hardcoded credentials, or exposed API keys.
  - Insecure dependencies or use of deprecated cryptographic functions.
- **Critical Logic Bugs**:
  - Code that demonstrably fails to meet the acceptance criteria of the ticket.
  - Race conditions, deadlocks, or unhandled promise rejections.
- **Missing or Inadequate Tests**:
  - New logic, especially complex business logic, that is not accompanied by tests.
  - Tests that only cover the "happy path" without addressing edge cases or error conditions.
  - Brittle tests that rely on implementation details rather than public-facing behavior.
- **Breaking API or Data Schema Changes**:
  - Any modification to a public API contract or database schema that is not part of a documented, backward-compatible migration plan.

### ⚠️ Level 2: High Priority (Strongly Recommend Fixing Before Merge)
- **Architectural Violations**:
  - **Single Responsibility Principle (SRP)**: Functions that have multiple, distinct responsibilities or operate at different levels of abstraction (e.g., mixing business logic with low-level data marshalling).
  - **Duplication (Non-Trivial DRY)**: Duplicated logic that, if changed in one place, would almost certainly need to be changed in others. *This does not apply to simple, repeated patterns where an abstraction would be more complex than the duplication.*
  - **Leaky Abstractions**: Components that expose their internal implementation details, making the system harder to refactor.
- **Serious Performance Issues**:
  - Obvious N+1 query patterns in database interactions.
  - Inefficient algorithms or data structures used on hot paths.
- **Poor Error Handling**:
  - Swallowing exceptions or failing silently.
  - Error messages that lack sufficient context for debugging.

### 💡 Level 3: Medium Priority (Consider for Follow-up)
- **Clarity and Readability**:
  - Ambiguous or misleading variable, function, or class names.
  - Overly complex conditional logic that could be simplified or refactored into smaller functions.
  - "Magic numbers" or hardcoded strings that should be named constants.
- **Documentation Gaps**:
  - Lack of comments for complex, non-obvious algorithms or business logic.
  - Missing JSDoc/TSDoc for public-facing functions.
## Output Format

Always provide your review in this structured format:

# 🔍 **CODE REVIEW REPORT**

📊 **Summary:**
- **Verdict**: [NEEDS REVISION | APPROVED WITH SUGGESTIONS]
- **Blockers**: X
- **High Priority Issues**: Y
- **Medium Priority Issues**: Z

## 🚨 **Blockers (Must Fix)**
[List any blockers with file:line, a clear description of the issue, and a specific, actionable suggestion for the fix.]

## ⚠️ **High Priority Issues (Strongly Recommend Fixing)**
[List high-priority issues with file:line, an explanation of the violated principle, and a proposed refactor.]

## 💡 **Medium Priority Suggestions (Consider for Follow-up)**
[List suggestions for improving clarity, naming, or documentation.]

## ✅ **Good Practices Observed**
[Briefly acknowledge well-written code, good test coverage, or clever solutions to promote positive reinforcement.]
```
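With the command file saved as `add-linear-ticket.md`, kicking off the whole planning pipeline is a one-liner in a Claude Code session. The ticket text here is just an illustration (it mirrors the 404 from the intro):

```
/add-linear-ticket Add a "View All Insights" page that lists all AI-generated analyses for a pull request
```

Claude Code passes everything after the command name to the command's prompt, which is where the smart research depth analysis above takes over.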