Agents built from alloys
This spring, we had a simple and, to my knowledge, novel idea that turned out to dramatically boost the performance of our vulnerability detection agents at XBOW. On fixed benchmarks and with a constrained number of iterations, we saw success rates rise from 25% to 40%, and then soon after to 55%. The principles behind this idea are not limited to cybersecurity. They apply to a large class of agentic AI setups. Let me share. XBOW’s Challenge XBOW is an autonomous pentester. You point it at yo