Grounding AI in Reality: How Vector Search on Our Codebase Transformed Our SDLC Automation
By: Antony Brahin
In software development, the journey from a user story to detailed documentation and a set of well-defined, actionable tasks is critical to success. It is also one of the most time-consuming and repetitive parts of the workflow; that administrative grind isn’t just tedious, it’s where inconsistency creeps in and valuable time is lost. I was convinced we could streamline and elevate it, so I set out to automate it.
In this post, I’ll walk you through how I built a complete, end-to-end automation that takes a user story in Azure DevOps (ADO) and, using a sophisticated chain of AI prompts with Google’s Gemini and a vector search of our codebase, outputs a full requirements document, a technical specification, a test plan, and a complete set of ready-to-work tasks.
Why Build This When Commercial Tools Exist?
I know this is a hot space. Big players like GitHub and Atlassian are building integrated AI, and startups are offering specialized platforms. My goal wasn’t to compete with them, but to see what was possible by building a custom, “glass box” solution using the best tools for each part of the job, without being locked into a single ecosystem.
What makes this approach different is the flexibility and full control. Instead of a pre-packaged product, this is a resilient workflow built on Power Automate, which acts as the orchestrator for a sequence of API calls to multiple platforms. This allowed me to fine-tune every step of the process to our exact needs.
The Architecture: A High-Level View
The entire solution is a Power Automate cloud flow that orchestrates a series of API calls. It’s triggered by an ADO user story update and uses a combination of Gemini AI for generation, Retrieval-Augmented Generation (RAG) for code context, and direct ADO API calls for execution.
Here’s the complete architecture of the flow:
1. A User Story in Azure DevOps triggers the flow.
2. AI generates Concise Requirements.
3. A Vector Search (RAG) of our codebase retrieves relevant technical context.
4. AI generates the Technical Specification, incorporating the code context.
5. AI generates a comprehensive Testing Strategy based on the requirements and spec.
6. AI breaks down the spec and code context into Structured Tasks.
7. Finally, Power Automate saves the requirements, tech spec, and test strategy to an ADO Wiki and creates the individual ADO Tasks.
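If it helps to see the chain in code, here is a minimal Python sketch of the same orchestration. This is purely illustrative; the real flow lives in Power Automate, and every name here (the model ID, the helper functions) is an assumption rather than the flow’s actual configuration. The `vector_search` and ADO helpers are sketched in the challenge sections below.

```python
# Hypothetical Python rendering of the Power Automate orchestration.
# All identifiers are illustrative assumptions, not the real flow's config.
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro")  # assumed model name

def generate(prompt: str) -> str:
    """A single content-generation call to the Gemini API."""
    return model.generate_content(prompt).text

def vector_search(query: str) -> str: ...          # RAG step (Challenges 1-2)
def save_to_wiki(title: str, content: str): ...    # ADO REST call (Challenge 5)
def create_tasks(tasks_json: str): ...             # ADO work item API calls

def run_flow(user_story: str) -> None:
    requirements = generate(f"Summarize concise requirements for:\n{user_story}")
    code_context = vector_search(requirements)
    spec = generate("Write a technical specification.\n"
                    f"Requirements:\n{requirements}\nRelevant code:\n{code_context}")
    test_plan = generate(f"Write a testing strategy for:\n{requirements}\n\n{spec}")
    tasks = generate(f"Break this spec into structured JSON tasks:\n{spec}\n{code_context}")
    for title, doc in [("Requirements", requirements),
                       ("Technical Spec", spec),
                       ("Test Strategy", test_plan)]:
        save_to_wiki(title, doc)
    create_tasks(tasks)
```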
The Battlefield: Tackling Specific Challenges and Finding Solutions
Building this wasn’t a straight line; it was a series of fascinating debugging sessions and prompt engineering refinements. Here are some of the key battles I fought:
Challenge 1: AI Generating Generic Solutions Without Code Context
The Problem: Initially, my AI-generated technical specs and tasks were generic, often suggesting new implementations for features that already partially existed.
The Solution: I integrated a Retrieval-Augmented Generation (RAG) step using Azure AI Search. By performing a vector search on our codebase and injecting relevant code snippets directly into the prompts for the technical specification and task generation, I grounded the AI in our actual application. This dramatically improved the relevance and accuracy of the generated solutions, steering it toward modifications rather than reinventions.
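To make the retrieval step concrete, here is a minimal sketch of it in Python. The endpoint, key, index name, deployment name, and field names are all placeholder assumptions, not our actual configuration:

```python
# Sketch of the RAG step: embed the query, search the code index, and
# return the top snippets for injection into the generation prompt.
from openai import AzureOpenAI
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

embed_client = AzureOpenAI(
    azure_endpoint="https://YOUR-AOAI.openai.azure.com",
    api_key="YOUR_AOAI_KEY",
    api_version="2024-02-01",
)
search_client = SearchClient(
    endpoint="https://YOUR-SEARCH.search.windows.net",
    index_name="codebase-index",                      # assumed index name
    credential=AzureKeyCredential("YOUR_SEARCH_KEY"),
)

def vector_search(query: str, top_k: int = 5) -> str:
    """Return the top-k code chunks most similar to the query text."""
    embedding = embed_client.embeddings.create(
        model="text-embedding-ada-002",               # assumed deployment name
        input=query,
    ).data[0].embedding
    results = search_client.search(
        search_text=None,
        vector_queries=[VectorizedQuery(
            vector=embedding,
            k_nearest_neighbors=top_k,
            fields="contentVector",                   # assumed vector field
        )],
    )
    return "\n\n".join(doc["content"] for doc in results)
```

The returned snippets are concatenated straight into the spec and task prompts, which is what steers the model toward modifying existing code instead of reinventing it.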
Challenge 2: Finding the Right Approach for Providing Code Context
The Problem: Before I could even perform a vector search, I had to figure out the best way to make our entire codebase “readable” to the AI. My initial ideas were naive and quickly hit roadblocks. I tried combining all source files into a single massive text file to be stored in SharePoint for the AI to read.
The Solution: I quickly realized this approach was not ideal due to token limits and the lack of structure. This led me down the path of true vectorization. The solution involved a multi-step engineering process:
1. Identify the right tools: I settled on Azure AI Search for its robust indexing and vector search capabilities.
2. Chunk the data: I broke down the source code into smaller, logical chunks (e.g., by class or function).
3. Vectorize and Index: I then processed each chunk, using an Azure OpenAI model to convert it into a vector embedding, and stored it in a searchable index in Azure AI Search. This created a rich, queryable knowledge base of our application.
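A simplified sketch of that indexing pipeline, with a deliberately naive line-window chunker standing in for the real class/function splitting (all names and the source layout are my assumptions):

```python
# Sketch of the one-time indexing pipeline: chunk source files, embed each
# chunk with Azure OpenAI, and upload the results to Azure AI Search.
import pathlib
from openai import AzureOpenAI
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

embed_client = AzureOpenAI(
    azure_endpoint="https://YOUR-AOAI.openai.azure.com",
    api_key="YOUR_AOAI_KEY",
    api_version="2024-02-01",
)
search_client = SearchClient(
    endpoint="https://YOUR-SEARCH.search.windows.net",
    index_name="codebase-index",
    credential=AzureKeyCredential("YOUR_SEARCH_KEY"),
)

def chunk_file(path: pathlib.Path, max_lines: int = 60):
    """Naive chunker: fixed-size line windows. The real pipeline splits
    by class or function instead."""
    lines = path.read_text(errors="ignore").splitlines()
    for start in range(0, len(lines), max_lines):
        yield "\n".join(lines[start:start + max_lines])

docs = []
for i, path in enumerate(pathlib.Path("src").rglob("*.cs")):  # assumed layout
    for j, chunk in enumerate(chunk_file(path)):
        vector = embed_client.embeddings.create(
            model="text-embedding-ada-002", input=chunk       # assumed deployment
        ).data[0].embedding
        docs.append({
            "id": f"{i}-{j}",
            "filePath": str(path),
            "content": chunk,
            "contentVector": vector,
        })

search_client.upload_documents(documents=docs)
```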
Challenge 3: The Hidden Challenge of Iterative Prompt Engineering
The Problem: My first prompts were simple and direct, but the AI’s output was often unpredictable, verbose, or in a format that was difficult for the automation to handle. Getting reliable, structured output was a significant challenge.
The Solution: I treated prompt creation as a true engineering discipline, not just a matter of asking a question. The process involved several key iterations:
1. Assigning Personas: I discovered that giving the AI a role (e.g., “You are an expert Tech Lead”) dramatically improved the tone, quality, and focus of its responses.
2. Enforcing Strict Structure: The biggest breakthrough was shifting from asking for text to demanding a specific output format. This evolved from structured markdown to, finally, a rigid JSON schema.
3. Providing Examples: I learned to include a concrete example of the desired output (like a sample JSON object) directly in the prompt. This “few-shot” learning technique was the key to achieving consistent formatting.
4. Using Negative Constraints: I refined the prompts to explicitly tell the AI what not to do (e.g., “Do not add any commentary,” “Omit this key for non-coding tasks”), which was crucial for getting clean, machine-readable data.
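To make that concrete, here is the shape of a prompt that combines all four techniques. The wording, schema, and field names are illustrative, not the exact prompt from my flow:

```python
# Illustrative task-generation prompt combining a persona, a strict JSON
# schema, a few-shot example, and negative constraints. Schema and wording
# are assumptions, not the flow's exact prompt.
import json

TASK_PROMPT = (
    # 1. Persona
    "You are an expert Tech Lead.\n"
    "Break the technical specification below into development tasks.\n\n"
    # 2. Strict structure: demand a rigid JSON schema
    "Return ONLY a JSON array matching this schema:\n"
    '[{"title": "...", "description": "...", "estimateHours": 0}]\n\n'
    # 3. Few-shot: one concrete example of a valid element
    'Example element: {"title": "Add null check to OrderService.Validate", '
    '"description": "Guard against missing customer IDs.", "estimateHours": 2}\n\n'
    # 4. Negative constraints
    "Do not add any commentary before or after the JSON. "
    "Omit the estimateHours key for non-coding tasks.\n\n"
    "Specification:\n<SPEC>"
)

def build_prompt(spec: str) -> str:
    # str.replace avoids escaping the JSON braces that str.format would choke on
    return TASK_PROMPT.replace("<SPEC>", spec)

def parse_tasks(raw_response: str) -> list[dict]:
    """With the format locked down, the response parses directly."""
    return json.loads(raw_response)
```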
Challenge 4: Orchestrating a High-Volume, Multi-Platform API Workflow
The Problem: This isn’t a single AI call; it’s a symphony of carefully sequenced API interactions. The final workflow involves five distinct calls to the Gemini API for content generation, one call to Azure OpenAI for embeddings, one call to Azure AI Search to retrieve context, and numerous calls to the Azure DevOps REST API for wiki pages and work items.
The Solution: The challenge was one of pure orchestration. I had to architect the Power Automate flow to manage this complex chain, ensuring that the output of one call was correctly formatted and passed as input to the next. This involved robust error handling for each API call and managing authentication for multiple services (including a PAT for ADO). It transformed the project from a series of prompts into a true systems integration solution.
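In Power Automate that resilience is configured per action, but the pattern itself is easy to show in code. A minimal sketch, with illustrative URLs and credential names (the commented calls show how each service authenticates differently):

```python
# Sketch of the per-call resilience pattern: retry transient failures with
# exponential backoff, fail fast on everything else. URLs and credential
# names are illustrative assumptions.
import time
import requests

TRANSIENT = {429, 500, 502, 503, 504}

def call_with_retries(method: str, url: str, *, retries: int = 3,
                      **kwargs) -> requests.Response:
    for attempt in range(retries):
        try:
            resp = requests.request(method, url, timeout=60, **kwargs)
        except (requests.ConnectionError, requests.Timeout):
            if attempt == retries - 1:
                raise
        else:
            if resp.status_code not in TRANSIENT:
                resp.raise_for_status()   # non-transient errors fail immediately
                return resp
            if attempt == retries - 1:
                resp.raise_for_status()   # out of retries
        time.sleep(2 ** attempt)          # 1s, 2s, 4s back-off

# Each downstream service authenticates differently:
# call_with_retries("POST", GEMINI_URL, params={"key": GEMINI_KEY}, json=body)
# call_with_retries("POST", SEARCH_URL, headers={"api-key": SEARCH_KEY}, json=q)
# call_with_retries("PUT", ADO_URL, auth=("", ADO_PAT), json=page)
```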
Challenge 5: Overcoming Platform Limitations with the Azure DevOps Wiki
The Problem: A key requirement was to save the generated documents as a single source of truth in our ADO Wiki. However, I discovered that the standard Azure DevOps connectors in Power Automate were problematic and lacked the functionality needed to reliably create and update pages.
The Solution: Instead of giving up, I bypassed the standard connectors and used the generic HTTP connector in Power Automate to call the Azure DevOps REST API directly. This required creating a Personal Access Token (PAT) for secure authentication and carefully constructing the API requests. This approach gave me the full power and flexibility of the ADO API, allowing me to overcome the connector’s limitations.
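Here is roughly what that direct REST call looks like (organization, project, and wiki names are placeholders). The subtlety worth knowing: updating an existing page requires sending the page’s current ETag in an If-Match header.

```python
# Sketch of the create-or-update wiki page call the flow makes through the
# HTTP connector, using Basic auth with a PAT. Names are placeholders.
import requests

ORG, PROJECT, WIKI = "your-org", "YourProject", "YourProject.wiki"
PAT = "your-personal-access-token"
BASE = f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/wiki/wikis/{WIKI}/pages"

def upsert_wiki_page(path: str, markdown: str) -> None:
    """Create the page; if it already exists, retry the PUT as an update."""
    params = {"path": path, "api-version": "7.0"}
    auth = ("", PAT)  # the PAT goes in the password slot of Basic auth
    resp = requests.put(BASE, params=params, auth=auth, json={"content": markdown})
    if not resp.ok:
        # The page likely exists already: fetch its ETag and retry with If-Match.
        etag = requests.get(BASE, params=params, auth=auth).headers["ETag"]
        resp = requests.put(BASE, params=params, auth=auth,
                            headers={"If-Match": etag},
                            json={"content": markdown})
    resp.raise_for_status()

upsert_wiki_page("/Specs/My-Feature/Technical-Spec",
                 "# Technical Specification\n...")
```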
The Success: What I Achieved
By tackling these challenges head-on, I’ve transformed a bottleneck into a streamlined accelerator. The system produces incredibly consistent and context-aware documents and tasks in minutes.
The generated Technical Specification is a complete document, automatically saved to our wiki, with a fully rendered Mermaid diagram for the architecture.
The final Azure DevOps Tasks are clean, detailed, and ready for our developers to begin work immediately.
This project has been a journey into the practical application of AI, proving that with meticulous prompt engineering and smart orchestration, we can build powerful tools that genuinely enhance developer productivity. It’s not just about what AI can do, but what you make it do through careful design and persistent problem-solving.