Claude Code just got updated with one of the most-requested user features

Anthropic's open source standard, the Model Context Protocol (MCP), released in late 2024, allows users to connect AI models and the agents built atop them to external tools in a structured, reliable format. It is the engine behind Anthropic's hit agentic programming harness, Claude Code, allowing it to access numerous functions like web browsing and file creation immediately when asked.

But there was one problem: Claude Code typically had to "read" the instruction manual for every single tool available, regardless of whether it was needed for the immediate task, using up context that could otherwise be filled with more information from the user's prompts or the agent's responses.

At least until last night. The Claude Code team released an update that fundamentally alters this equation. Dubbed MCP Tool Search, the feature introduces "lazy loading" for AI tools, allowing agents to dynamically fetch tool definitions only when necessary. It is a shift that moves AI agents from a brute-force architecture to something resembling modern software engineering and, according to early data, it effectively solves the "bloat" problem that was threatening to stifle the ecosystem.

The 'Startup Tax' on Agents

To understand the significance of Tool Search, one must understand the friction of the previous system. MCP was designed to be a universal standard for connecting AI models to data sources and tools, everything from GitHub repositories to local file systems.

However, as the ecosystem grew, so did the "startup tax."

Thariq Shihipar, a member of the technical staff at Anthropic, highlighted the scale of the problem in the announcement.

"We've found that MCP servers may have up to 50+ tools," Shihipar wrote. "Users were documenting setups with 7+ servers consuming 67k+ tokens."

In practical terms, this meant a developer using a robust set of tools might sacrifice 33% or more of the 200,000-token context window before typing a single character of a prompt, as AI newsletter author Aakash Gupta pointed out in a post on X.

The model was effectively "reading" hundreds of pages of technical documentation for tools it might never use during that session.

Community analysis provided even starker examples. Gupta further noted that a single Docker MCP server could consume 125,000 tokens just to define its 135 tools.

"The old constraint forced a brutal tradeoff," he wrote. "Either limit your MCP servers to 2-3 core tools, or accept that half your context budget disappears before you start working."

How Tool Search Works

The solution Anthropic rolled out, which Shihipar called "one of our most-requested features on GitHub," is elegant in its restraint. Instead of preloading every definition, Claude Code now monitors context usage.

According to the release notes, the system automatically detects when tool descriptions would consume more than 10% of the available context. When that threshold is crossed, the system switches strategies. Instead of dumping raw documentation into the prompt, it loads a lightweight search index.

When the user asks for a specific action (say, "deploy this container"), Claude Code doesn't scan a massive, pre-loaded list of 200 commands. Instead, it queries the index, finds the relevant tool definition, and pulls only that specific tool into the context.

"Tool Search flips the architecture," Gupta wrote. "The token savings are dramatic: from ~134k to ~5k in Anthropic's internal testing. That's an 85% reduction while maintaining full tool access."

For developers maintaining MCP servers, this shifts the optimization strategy.
Shihipar noted that the `server instructions` field in the MCP definition, previously a "nice to have," is now critical. It acts as the metadata that helps Claude "know when to search for your tools, similar to skills."

'Lazy Loading' and Accuracy Gains

While the token savings are the headline metric (saving money and memory is always popular), the secondary effect of this update might be more important: focus.

LLMs are notoriously sensitive to "distraction." When a model's context window is stuffed with thousands of lines of irrelevant tool definitions, its ability to reason decreases. The result is a "needle in a haystack" problem in which the model struggles to differentiate between similar commands, such as `notification-send-user` versus `notification-send-channel`.

Boris Cherny, Head of Claude Code, emphasized this in his reaction to the launch on X: "Every Claude Code user just got way more context, better instruction following, and the ability to plug in even more tools."

The data backs this up. Internal benchmarks shared by the community indicate that enabling Tool Search improved the accuracy of the Opus 4 model on MCP evaluations from 49% to 74%. For the newer Opus 4.5, accuracy jumped from 79.5% to 88.1%.

By removing the noise of hundreds of unused tools, the model can dedicate its "attention" mechanisms to the user's actual query and the relevant active tools.

Maturing the Stack

This update signals a maturation in how we treat AI infrastructure. In the early days of any software paradigm, brute force is common. But as systems scale, efficiency becomes the primary engineering challenge.

Aakash Gupta drew a parallel to the evolution of Integrated Development Environments (IDEs) like VS Code or those from JetBrains. "The bottleneck wasn't 'too many tools.' It was loading tool definitions like 2020-era static imports instead of 2024-era lazy loading," he wrote. "VSCode doesn't load every extension at startup.
JetBrains doesn't inject every plugin's docs into memory."

By adopting "lazy loading," a standard best practice in web and software development, Anthropic is acknowledging that AI agents are no longer just novelties; they are complex software platforms that require architectural discipline.

Implications for the Ecosystem

For the end user, this update is seamless: Claude Code simply feels "smarter" and retains more memory of the conversation. But for the developer ecosystem, it opens the floodgates.

Previously, there was a "soft cap" on how capable an agent could be. Developers had to curate their toolsets carefully to avoid lobotomizing the model with excessive context. With Tool Search, that ceiling is effectively removed. An agent can theoretically have access to thousands of tools (database connectors, cloud deployment scripts, API wrappers, local file manipulators) without paying a penalty until those tools are actually touched.

It turns the "context economy" from a scarcity model into an access model. As Gupta summarized, "They're not just optimizing context usage. They're changing what 'tool-rich agents' can mean."

The update is rolling out immediately for Claude Code users. For developers building MCP clients, Anthropic recommends implementing the `ToolSearchTool` to support this dynamic loading, ensuring that as the agentic future arrives, it doesn't run out of memory before it even says hello.
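
For readers who want to see the mechanics described under "How Tool Search Works" in concrete terms, here is a minimal Python sketch. It is not Anthropic's implementation: the `Tool` class, `should_defer`, and `search_tools` are hypothetical names invented for illustration, the token costs are made up, and a naive keyword match stands in for whatever search Claude Code actually performs. Only the 200,000-token window and the 10% threshold come from the figures reported above.

```python
from dataclasses import dataclass

CONTEXT_WINDOW = 200_000  # tokens; the context limit cited in the article
THRESHOLD = 0.10          # the 10% cutoff reported from the release notes


@dataclass
class Tool:
    """Hypothetical stand-in for an MCP tool definition."""
    name: str
    description: str
    tokens: int  # assumed, pre-computed cost of the full definition


def should_defer(tools, context_window=CONTEXT_WINDOW, threshold=THRESHOLD):
    """Return True when loading every definition would exceed the threshold."""
    return sum(t.tokens for t in tools) > threshold * context_window


def search_tools(tools, query, limit=3):
    """Naive keyword search: rank tools by how many query words they share."""
    words = set(query.lower().split())
    scored = []
    for tool in tools:
        text = f"{tool.name} {tool.description}".lower()
        text = text.replace("-", " ").replace("_", " ")
        overlap = len(words & set(text.split()))
        if overlap:
            scored.append((overlap, tool))
    # Sort on the score only, so Tool objects are never compared directly.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]


# Illustrative catalog; the token counts are invented for the example.
tools = [
    Tool("container-deploy", "Deploy a container image to a cluster", 9_000),
    Tool("notification-send-user", "Send a notification to one user", 7_000),
    Tool("notification-send-channel", "Send a notification to a channel", 7_000),
]

if should_defer(tools):
    # Only the matching definitions would be pulled into context.
    loaded = search_tools(tools, "deploy this container")
else:
    # Small catalogs stay under the threshold and load in full.
    loaded = list(tools)

print([t.name for t in loaded])
```

In this sketch the three definitions total 23,000 tokens, which is above the 20,000-token threshold (10% of 200,000), so only the tool matching the request is loaded. A real client would presumably use richer metadata, such as the server instructions Shihipar mentions, rather than bare keyword overlap.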