
Salesforce Agentforce Observability lets you watch your AI agents think in real time


Salesforce launched a suite of monitoring tools on Thursday designed to solve what has become one of the thorniest problems in corporate artificial intelligence: once companies deploy AI agents to handle real customer interactions, they often have no idea how those agents are making decisions.

The new capabilities, built into Salesforce's Agentforce 360 Platform, give organizations granular visibility into every action their AI agents take, every reasoning step they follow, and every guardrail they trigger. The move comes as businesses grapple with a fundamental tension in AI adoption: the technology promises massive efficiency gains, but executives remain wary of autonomous systems they can't fully understand or control.

"You can't scale what you can't see," said Adam Evans, executive vice president and general manager of Salesforce AI, in a statement announcing the release. The company says businesses have increased AI implementation by 282% recently, creating an urgent need for monitoring systems that can track fleets of AI agents making real-world business decisions.

The challenge Salesforce aims to address is deceptively simple: AI agents work, but no one knows why. A customer service bot might successfully resolve a tax question or schedule an appointment, but the business deploying it can't trace the reasoning path that led to that outcome. When something goes wrong, or when the agent encounters an edge case, companies lack the diagnostic tools to understand what happened.

"Agentforce Observability acts as a mission control system to not just monitor, but also analyze and optimize agent performance," said Gary Lerhaupt, vice president of Salesforce AI who leads the company's observability work, in an exclusive interview with VentureBeat. He emphasized that the system delivers business-specific metrics that traditional monitoring tools miss. "In service, this could be engagement or deflection rate. In sales, it could be leads assigned, converted, or reply rates."

How AI monitoring tools helped 1-800Accountant and Reddit track autonomous agent decision-making

The stakes become clear in early customer deployments. Ryan Teeples, chief technology officer at 1-800Accountant, said his company deployed Agentforce agents to serve as a 24/7 digital workforce handling complex tax inquiries and appointment scheduling. The AI draws on integrated data from audit logs, customer support history, and sources like IRS publications to provide instant responses, without human intervention.

For a financial services firm handling sensitive tax information during peak season, the inability to see how the AI was making decisions would be a dealbreaker. "With this level of sensitive information and the fast pace in which we move during tax season in particular, Observability allows us to have full trust and transparency with every agent interaction in one unified view," Teeples said.

The observability tools revealed insights Teeples didn't expect. "The optimization feature has been the most eye opening for us — giving full observability into agent reasoning, identifying performance gaps and revealing how our agents are making decisions," he said. "This has helped us quickly diagnose issues that would've otherwise gone undetected and configure guardrails in response."

The business impact proved substantial. Agentforce resolved more than 1,000 client engagements in its first 24 hours at 1-800Accountant. The company now projects it can support 40% client growth this year without recruiting and training seasonal staff, while freeing up 50% more time for CPAs to focus on complex advisory work rather than administrative tasks.

Reddit has seen similar results since deploying the technology. John Thompson, vice president of sales strategy and operations at the social media platform, said the company has deflected 46% of support cases since launching Agentforce for advertiser support.
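Deflection rate, the metric both Salesforce and Reddit cite, is simply the share of support sessions an agent resolves without handing off to a human. A minimal sketch of the calculation, using hypothetical field names rather than Salesforce's actual schema:

```python
# Hypothetical session records; the "escalated_to_human" field name is
# illustrative, not part of any real Salesforce API.
sessions = [
    {"id": "s1", "escalated_to_human": False},
    {"id": "s2", "escalated_to_human": True},
    {"id": "s3", "escalated_to_human": False},
    {"id": "s4", "escalated_to_human": False},
]

def deflection_rate(sessions):
    """Fraction of sessions the agent resolved without human handoff."""
    if not sessions:
        return 0.0
    deflected = sum(1 for s in sessions if not s["escalated_to_human"])
    return deflected / len(sessions)

print(f"{deflection_rate(sessions):.0%}")  # prints 75%
```

The same ratio can be computed per topic or per time window to spot where an agent leans on human escalation most.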
"By observing every Agentforce interaction, we can understand exactly how our AI navigates advertisers through even the most complex tools," Thompson said. "This insight helps us understand not just whether issues are resolved, but how decisions are made along the way."

Inside Salesforce's session tracing technology: Logging every AI agent interaction and reasoning step

Salesforce built the observability system on two foundational components. The Session Tracing Data Model logs every interaction (user inputs, agent responses, reasoning steps, language model calls, and guardrail checks) and stores it securely in Data 360, Salesforce's data platform. This creates what the company calls "unified visibility" into agent behavior at the session level.

The second component, MuleSoft Agent Fabric, addresses a problem that will become more acute as companies build more AI systems: agent sprawl. The tool provides what Lerhaupt describes as "a single pane of glass across every agent," including those built outside the Salesforce ecosystem. Agent Fabric's Agent Visualizer creates a visual map of a company's entire agent network, giving visibility into all agent interactions from a single dashboard.

The observability tools break down into three functional areas. Agent Analytics tracks performance metrics, surfaces KPI trends over time, and highlights ineffective topics or actions. Agent Optimization provides end-to-end visibility into every interaction, groups similar requests to uncover patterns, and identifies configuration issues. Agent Health Monitoring, which will become generally available in Spring 2026, tracks key health metrics in near real time and sends alerts on critical errors and latency spikes.

Pierre Matuchet, senior vice president of IT and digital transformation at Adecco, said the visibility helped his team build confidence even before full deployment.
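The session-level logging described above — inputs, responses, reasoning steps, model calls, and guardrail checks captured per session — can be pictured as an append-only event trace. The following is an illustrative sketch under assumed names, not the actual Session Tracing Data Model:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Event kinds mirroring what the article says gets logged per session.
# These names are hypothetical, chosen for illustration only.
EVENT_KINDS = {"user_input", "agent_response", "reasoning_step",
               "llm_call", "guardrail_check"}

@dataclass
class TraceEvent:
    kind: str      # one of EVENT_KINDS
    payload: str   # text of the input, response, step, or check
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

@dataclass
class SessionTrace:
    session_id: str
    events: list = field(default_factory=list)

    def log(self, kind: str, payload: str) -> None:
        """Append one event to the session's trace."""
        if kind not in EVENT_KINDS:
            raise ValueError(f"unknown event kind: {kind}")
        self.events.append(TraceEvent(kind, payload))

    def reasoning_path(self) -> list:
        """Replay only the reasoning steps: the 'why' behind an outcome."""
        return [e.payload for e in self.events if e.kind == "reasoning_step"]

trace = SessionTrace("session-001")
trace.log("user_input", "How do I file a tax extension?")
trace.log("reasoning_step", "Classified intent: tax-deadline question")
trace.log("guardrail_check", "No sensitive data in response: pass")
trace.log("agent_response", "You can request an extension by filing ...")
print(trace.reasoning_path())
```

The point of such a trace is that every outcome can be replayed step by step after the fact, which is what lets a tool group similar sessions, score their quality, and surface the reasoning behind a wrong answer.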
"Even during early notebook testing, we saw the agent handle unexpected scenarios, like when candidates didn't want to answer questions already covered in their CVs, appropriately and as designed," Matuchet said. "Agentforce Observability helped us identify unanticipated user behavior and gave us confidence, even before the agent went live, that it could act responsibly and reliably."

Why Salesforce says its AI observability tools beat Microsoft, Google, and AWS monitoring

The announcement puts Salesforce in direct competition with Microsoft, Google, and Amazon Web Services, all of which offer monitoring capabilities built into their AI agent platforms. Lerhaupt argued that enterprises need more than the basic monitoring those providers offer.

"Observability comes out-of-the-box standard with Agentforce at no extra cost," Lerhaupt said, positioning the offering as comprehensive rather than supplementary. He emphasized that the tools provide "deeper insight than ever before" by capturing "the full telemetry and reasoning behind every agentic interaction" through the Session Tracing Data Model, then using that data to "provide key analysis and session quality scoring to help customers optimize and improve their agents."

The competitive positioning matters because enterprises face a choice: build their AI infrastructure on a cloud provider's platform and use its native monitoring tools, or adopt a specialized observability layer like Salesforce's. Lerhaupt framed the decision as one of depth versus breadth. "Enterprises need more than basic monitoring to measure the success of their AI deployments," he said. "They need full visibility into every agent interaction and decision."

The 1.2 billion workflow question: Are AI agent deployments moving from pilot projects to production?

The broader question is whether Salesforce is solving a problem most enterprises will face imminently or building for a future that remains years away.
The company's 282% surge in AI implementation sounds dramatic, but that figure doesn't distinguish between production deployments and pilot projects.

When asked about this directly, Lerhaupt pointed to customer examples rather than offering a breakdown. He described a three-phase journey from experimentation to scale. "On Day 0, trust is the foundation," he said, citing 1-800Accountant's 70% autonomous resolution of chat engagements. Day 1, he said, is when designs become real, usable AI, with Williams Sonoma delivering more than 150,000 AI experiences monthly. "On Day 2, once trust and design are built, it becomes about scaling early wins into enterprise-wide outcomes," he added, pointing to Falabella's 600,000 AI workflows per month, a figure that has grown fourfold in three months.

Lerhaupt said Salesforce has more than 12,000 customers across 39 countries running Agentforce, powering 1.2 billion agentic workflows. Those numbers suggest the shift from pilot to production is already underway at scale, though the company didn't break down how many customers are running production workloads versus experimental deployments.

The economics of AI deployment may accelerate adoption regardless of readiness. Companies face mounting pressure to reduce headcount costs while maintaining or improving service levels. AI agents promise to resolve that tension, but only if businesses can trust them to work reliably. Observability tools like Salesforce's represent the trust layer that makes scaled deployment possible.

What happens after AI agent deployment: Why continuous monitoring matters more than initial testing

The deeper story is about a shift in how enterprises think about AI deployment. The official announcement framed this clearly: "The agent development lifecycle begins with three foundational steps: build, test, and deploy. While many organizations have already moved past the initial hurdle of creating their first agents, the real enterprise challenge starts immediately after deployment."

That framing reflects a maturing understanding of AI in production environments. Early AI deployments often treated the technology as a one-time implementation: build it, test it, ship it. But AI agents behave differently than traditional software. They learn, adapt, and make decisions based on probabilistic models rather than deterministic code. That means their behavior can drift over time, or they can develop unexpected failure modes that emerge only under real-world conditions.

"Building an agent is just the beginning," Lerhaupt said. "Once the trust is built for agents to begin handling real work, companies may start by seeing the results, but may not understand the 'why' behind them or see areas to optimize. Customers interact with products — including agents — in unexpected ways, and to optimize the customer experience, transparency around agent behavior and outcomes is critical."

Teeples made the same point more bluntly when asked what would be different without observability tools. "This level of visibility has given full trust in continuing to expand our agent deployment," he said. The implication is clear: without visibility, deployment would slow or stop. 1-800Accountant plans to expand Slack integrations for internal workflows, deploy Service Cloud Voice for case deflection, and leverage Tableau for conversational analytics, all dependent on the confidence that observability provides.

How enterprise AI trust issues became the biggest barrier to scaling autonomous agents

The recurring theme in customer interviews is trust, or rather the lack of it. AI agents work, sometimes spectacularly well, but executives don't trust them enough to deploy them widely.
Observability tools aim to convert black-box systems into transparent ones, replacing faith with evidence.

This matters because trust, not technological capability, is the bottleneck constraining AI adoption. The models are powerful enough, the infrastructure is mature enough, and the business case is compelling enough. What's missing is executive confidence that AI agents will behave predictably and that problems can be diagnosed and fixed quickly when they arise.

Salesforce is betting that observability tools can remove that bottleneck. The company positions Agentforce Observability not as a monitoring tool but as a management layer: "just like managers work with their human employees to ensure they are working towards the right objectives and optimizing performance," Lerhaupt said.

The analogy is telling. If AI agents are becoming digital employees, they need the same kind of ongoing supervision, feedback, and optimization that human employees receive. The difference is that AI agents can be monitored with far more granularity than any human worker. Every decision, every reasoning step, every data point consulted can be logged, analyzed, and scored.

That creates both opportunity and obligation. The opportunity is continuous improvement at a pace impossible with human workers. The obligation is to actually use that data to optimize agent performance, not just collect it. Whether enterprises can build the organizational processes to turn observability data into systematic improvement remains an open question.

But one thing has become increasingly clear in the race to deploy AI at scale: companies that can see what their agents are doing will move faster than those flying blind. In the emerging era of autonomous AI, observability isn't just a nice-to-have feature. It's the difference between cautious experimentation and confident deployment, between treating AI as a risky bet and managing it as a trusted workforce. The question is no longer whether AI agents can work.
It's whether businesses can see well enough to let them.