Tech News

Show HN: Kelet – Root Cause Analysis agent for your LLM apps

Why This Matters

Kelet is an automated root cause analysis tool for developers building AI agents and LLM applications. It continuously analyzes production traces, identifies failure patterns, and surfaces root causes, so teams can find and fix failures faster instead of scrolling through traces by hand. Integration takes minutes, and the always-on analysis helps improve the reliability of complex AI workflows for both developers and end-users.

FAQ

What does Kelet actually do? Kelet reads your production AI agent traces and signals, clusters failure patterns across thousands of sessions, and surfaces root causes with evidence — so you ship fixes instead of hypotheses. Think of it as a detective that investigates every failure automatically.

What kinds of AI agents and LLM applications does Kelet work with? Any agent or LLM application where you own the code — agentic loops, multi-step workflows, RAG pipelines, chatbots, autonomous agents. If you built it and you ship it, Kelet can help you improve it. That includes agents built with LangChain, LangGraph, PydanticAI, Mastra, CrewAI, AutoGen, LlamaIndex, Haystack, Semantic Kernel, or directly on the OpenAI, Anthropic, or Gemini APIs. There are two situations where Kelet is not the right fit. If you use AI tools built by others (e.g., Cursor, Claude Code, or Copilot), you're a user, not a builder — Kelet isn't designed for that use case. Similarly, if you're building a skill or plugin inside an existing agentic platform, you're extending infrastructure you don't control, and Kelet can't instrument it. But if you're building your own agent on any LLM SDK or framework, you own that agent — and Kelet is exactly for you.

How long does integration take? Five minutes. Install via the Kelet installer skill — or `pip install kelet` / `npm install kelet` if you prefer to do it manually — add two lines to your agent code, and traces start flowing. Kelet is fully OpenTelemetry-compliant — any OTEL-instrumented agent works out of the box, no infrastructure changes needed.
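To make the "two lines" pattern concrete, here is a rough sketch of what SDK-style trace instrumentation does under the hood: a decorator records the name, status, and duration of each agent step into a trace buffer that a real SDK would flush to a collector. The names here are illustrative only, not Kelet's actual API — any OTEL-compliant tracer exposes an equivalent hook.

```python
import functools
import time

# Hypothetical sketch of trace instrumentation (not Kelet's real SDK).
# In a real SDK this buffer is flushed to a collector over OTLP.
TRACE = []

def traced(step_name):
    """Record name, status, and duration of one agent step."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                result = fn(*args, **kwargs)
                status = "ok"
                return result
            except Exception:
                status = "error"
                raise
            finally:
                TRACE.append({
                    "step": step_name,
                    "status": status,
                    "duration_s": time.monotonic() - start,
                })
        return wrapper
    return decorator

# A toy two-step RAG agent wearing the instrumentation:
@traced("retrieve")
def retrieve(query):
    return ["doc-1", "doc-2"]

@traced("generate")
def generate(query, docs):
    return f"Answer to {query!r} using {len(docs)} docs"

docs = retrieve("refund policy")
answer = generate("refund policy", docs)
```

After the two calls, `TRACE` holds one span per step — the raw material a root cause analyzer reads.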

Where does Kelet actually run? On Kelet's servers. Once you install Kelet — via the SDK or the installer skill — traces and signals start flowing to our infrastructure automatically. It's SOC 2 certified and runs 24/7, continuously ingesting your traces, finding failure patterns, building hypotheses, and proposing targeted fixes. The LLM tokens powering that analysis don't touch your model API bill — Kelet covers them. You pay Kelet based on usage. See kelet.ai/pricing.

Is Kelet a skill or a service? A service. Kelet is an agent that runs on Kelet's servers around the clock — not a plugin you invoke, not something you run manually. The installer skill is just how you connect it. Once connected, Kelet works continuously: reading your traces, clustering failure patterns across thousands of sessions, building root cause hypotheses, and proposing targeted fixes. You don't run it. It runs for you.

What are "signals" and why do they matter? Signals are probabilistic hints that something went wrong in a session: a thumbs-down rating, a user editing AI output, an abandoned conversation, or a synthetic LLM-as-judge check you configure. They tell Kelet where to look in your traces — not verdicts, but clues that guide the investigation.
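Since signals are probabilistic hints rather than verdicts, a natural way to use them is to score sessions and investigate the most suspicious first. This is an illustrative sketch of that idea — the weights and data model are assumptions for the example, not Kelet's actual implementation:

```python
from dataclasses import dataclass, field

# Assumed weights, for illustration only: each signal is a weighted
# hint that the session went wrong.
SIGNAL_WEIGHTS = {
    "thumbs_down": 0.9,
    "edited_output": 0.6,
    "abandoned": 0.4,
    "judge_flag": 0.7,   # synthetic LLM-as-judge check
}

@dataclass
class Session:
    session_id: str
    signals: list = field(default_factory=list)

    def suspicion(self):
        """Combine hints as independent evidence: 1 - prod(1 - w)."""
        p_clean = 1.0
        for s in self.signals:
            p_clean *= 1.0 - SIGNAL_WEIGHTS.get(s, 0.0)
        return 1.0 - p_clean

sessions = [
    Session("a", ["thumbs_down", "abandoned"]),
    Session("b", []),
    Session("c", ["edited_output"]),
]
# Investigate the highest-suspicion sessions first.
ranked = sorted(sessions, key=lambda s: s.suspicion(), reverse=True)
```

Session "a" (thumbs-down plus abandonment) ranks above "c" (one edit), and the clean session "b" scores zero — clues guiding where to look, not verdicts.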

How is Kelet different from Langfuse, Arize, Logfire, or other observability tools? Those tools show you traces. Kelet reads them for you. Observability platforms are thermometers — they report symptoms. Kelet is the doctor that diagnoses root causes and generates targeted prompt patches. You no longer need to scroll thousands of traces manually.

How does Kelet actually find root causes? Kelet works like a detective. Every session leaves a trail — LLM calls, tool invocations, retrieval steps, every agent hop. Kelet uses signals as clues: a thumbs-down, an edited AI response, an abandoned conversation, a synthetic LLM-judge flag. It follows each thread through your traces, cross-references patterns across thousands of sessions, and builds a root cause hypothesis backed by evidence. Same process a senior engineer would run manually — automated, at scale, on every failure at once.
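One simple technique behind "cross-referencing patterns across thousands of sessions" is signature-based clustering: normalize each failure message so volatile details collide, then group sessions by the resulting signature. This is a minimal sketch of that general technique, not Kelet's actual algorithm:

```python
import re
from collections import defaultdict

def signature(error_message):
    """Strip volatile details (numbers, quoted values) so similar
    errors map to the same signature."""
    sig = re.sub(r"\d+", "<n>", error_message)
    sig = re.sub(r"'[^']*'", "'<val>'", sig)
    return sig

def cluster_failures(failed_sessions):
    """Group (session_id, error) pairs by normalized signature,
    largest cluster first: the most widespread pattern is the
    strongest root-cause candidate."""
    clusters = defaultdict(list)
    for session_id, error in failed_sessions:
        clusters[signature(error)].append(session_id)
    return sorted(clusters.items(), key=lambda kv: -len(kv[1]))

# Hypothetical failing sessions from an agent's traces:
failures = [
    ("s1", "Tool 'search' timed out after 30s"),
    ("s2", "Tool 'search' timed out after 31s"),
    ("s3", "KeyError: 'user_name' in step 4"),
    ("s4", "Tool 'search' timed out after 29s"),
]
clusters = cluster_failures(failures)
```

Here the three timeout variants collapse into one cluster of three sessions, separating a systemic tool problem from a one-off bug — evidence a hypothesis can be built on.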

Do I need a lot of traffic to get value? No. Teams typically see their first real failure patterns with roughly 200 sessions and three or more signals configured. Not sure which signals to set up? Kelet's AI walks you through it — no guesswork, no manual configuration. And if you're starting from zero, synthetic signal presets (LLM-as-judge evaluators) generate signals from day one, before real user feedback accumulates.
