MCP-Universe benchmark shows GPT-5 fails more than half of real-world orchestration tasks
The adoption of interoperability standards, such as the Model Context Protocol (MCP), can provide enterprises with insights into how agents and models function outside their walled confines. However, many benchmarks fail to capture real-life interactions with MCP. Salesforce AI Research developed a new open-source benchmark it calls MCP-Un