
Six Principles for Production AI Agents


Every now and then, people ask me:

“I am new to agentic development. I’m building something, but I feel like I’m missing some tribal knowledge. Help me catch up!”

I’m tempted to suggest something serious like a multi-week course (e.g., by HuggingFace or Berkeley), but not everyone wants to dive that deep.

So I decided to gather six simple empirical lessons that have helped me a lot during app.build development. This post is somewhat inspired by Design Decisions Behind app.build, but it is more general and aims to be a quick guide for newcomers to agentic engineering.

Principle 1: Invest in your system prompt

I was skeptical about prompt engineering for a long time; it seemed more like shamanic ritual than anything close to engineering. All those tricks, like “I will tip you $100”, “My grandmother is dying and needs this”, or “Be 100% accurate or else”, might produce a local improvement by exploiting a particular model’s quirks, but they never worked in the longer run.

I changed my mind about prompt and context engineering when I realized a simple thing: modern LLMs just need direct, detailed context. No tricks, only clarity and a lack of contradictions. Models are good at following instructions; the problem is often just the ambiguity of the instructions themselves.

All LLM providers publish educational resources on best practices for prompting their models (e.g., one by Anthropic and one by Google). Just follow them and make your instructions direct and detailed; no smarty-pants tricks required. For example, here is the system prompt we use to make Claude generate rules for ast-grep: nothing tricky, just details on how to use a tool the agent barely knows.
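To make this concrete, here is a minimal sketch of the shape such a prompt tends to take: role, tool background, explicit requirements, no persuasion tricks. The section names and wording below are illustrative, not our actual ast-grep prompt (which is linked above):

```python
# Hypothetical skeleton of a direct, detailed system prompt.
# The content is illustrative, not the actual app.build prompt.
SYSTEM_PROMPT = """\
You are an expert at writing ast-grep rules.

## Tool background
ast-grep matches code by AST patterns, not plain text. A rule is a YAML
document with `id`, `language`, and `rule` keys; patterns use
metavariables like $NAME and $$$ARGS to match AST nodes.

## Requirements
- Output exactly one YAML rule and nothing else.
- Prefer `pattern` clauses over text-based matching.
- If the request is ambiguous, pick the most common reading and say so
  in a YAML comment at the top of the rule.
"""
```

Note what is absent: no flattery, no threats, no tipping. Just the context the model is missing.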

One trick we like is to bootstrap the initial system prompt with a draft produced by a Deep Research-style LLM tool. The result typically needs human refinement, but it makes a solid baseline.

Keeping a shared, static part of the context also pays off through prompt caching. Technically, one can cache user messages too, but structuring the context so that the system part is large and static while the user part is small and dynamic works great.
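As an illustration, here is a minimal sketch of that split using the Anthropic Python SDK's `cache_control` marker; the model name and prompt contents are placeholders, not a recommendation:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

LARGE_STATIC_PROMPT = "..."  # detailed instructions, tool docs, examples

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder; any cache-capable model
    max_tokens=1024,
    # Large and static: cached and reused across requests sharing this prefix.
    system=[
        {
            "type": "text",
            "text": LARGE_STATIC_PROMPT,
            "cache_control": {"type": "ephemeral"},
        }
    ],
    # Small and dynamic: only this part changes from request to request.
    messages=[{"role": "user", "content": "Generate a rule for console.log"}],
)
print(response.content[0].text)
```

The bigger the static prefix relative to the dynamic tail, the more of each request is served from cache, which cuts both latency and cost.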
