Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more


Using Claude Code SDK to reduce E2E test time

End-to-end (E2E) tests sit at the top of the test pyramid because they're slow, fragile, and expensive. But they're also the only tests that verify complete user workflows actually work across systems. Due to time constraints, most teams run E2E tests nightly to avoid CI bottlenecks. However, this means bugs can slip through to production and be harder to fix, because so many changes have accumulated by then that isolating the root cause is difficult. But what if we could run only the relevant E2E tests for specific code

Evaluating Agents

“Models constantly change and improve but evals persist”

Look at the data

No amount of evals will replace the need to look at the data. Once you have good eval coverage you’ll be able to spend less time on it, but you will always need to look at the agent traces to identify possible issues or things to improve.

Starting: end-to-end evals

You must create evals for your agents; stop relying solely on manual testing. Not sure where to start? Add e2e evals, define success criteria (
