Launch HN: Confident AI (YC W25) – Open-source evaluation framework for LLM apps

Published on: 2025-07-12 14:23:56

Hi HN - we're Jeffrey and Kritin, and we're building Confident AI (https://confident-ai.com). It's the cloud platform for DeepEval (https://github.com/confident-ai/deepeval), our open-source package that helps engineers evaluate and unit-test LLM applications. Think Pytest for LLMs.

We spent the past year building DeepEval with the goal of providing the best LLM evaluation developer experience, growing it to run over 600K evaluations daily in the CI/CD pipelines of enterprises like BCG, AstraZeneca, AXA, and Capgemini. But the fact that DeepEval simply runs, and does nothing with the data afterward, isn't the best experience. If you want to inspect failing test cases, identify regressions, or pick the best model/prompt combination, you need more than just DeepEval. That's why we built a platform around it.

Here's a quick demo video of how everything works: https://youtu.be/PB3ngq7x4ko

Confident AI is great for RAG pipelines, agents, and chatbots. Typical use cases involve al ...
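
For those who haven't tried DeepEval, here's a minimal sketch of what a unit test looks like. It's based on the package's documented Pytest-style workflow; names like LLMTestCase, AnswerRelevancyMetric, and assert_test come from DeepEval's public docs, so check the repo for the current API:

    # test_chatbot.py -- run with: deepeval test run test_chatbot.py
    from deepeval import assert_test
    from deepeval.test_case import LLMTestCase
    from deepeval.metrics import AnswerRelevancyMetric

    def test_answer_relevancy():
        # Scores how relevant the output is to the input; fails below the threshold
        metric = AnswerRelevancyMetric(threshold=0.7)

        # In a real test, actual_output would come from your LLM app
        test_case = LLMTestCase(
            input="What if these shoes don't fit?",
            actual_output="We offer a 30-day full refund at no extra cost.",
        )

        # Behaves like a Pytest assertion: the test fails if the metric fails
        assert_test(test_case, [metric])

Confident AI picks up where a test run like this ends: instead of the results disappearing after CI finishes, they're stored so you can inspect failing cases and compare runs over time.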