ZDNET's key takeaways
McKinsey implemented and observed 50+ agentic AI builds for one year.
Digital employees require a lot of work to get up to speed.
AI agents aren't the best answer to all business needs.
By many accounts, AI agents are considered digital co-workers in today's workforce. So, as with human workers, they should be subjected to an annual performance review, right?
The folks at McKinsey did just that, releasing the results of a one-year performance review of AI agents that the consulting firm had been implementing and observing. How did these digital employees do in their first year on the job? The McKinsey team's conclusions: They require a lot of work to get up to speed; they aren't always the best answer to every business need; and their human counterparts aren't always impressed with the agents' work.
Also: Microsoft will compete with AWS to offer a marketplace of AI apps and agents
The progress report, written by McKinsey's Lareina Yee, Michael Chui, and Roger Roberts, reviewed more than 50 agentic AI builds that the authors and their colleagues led at the firm. After a year with the AI agents, they arrived at six lessons learned.
1. Agents perform better within workflows
Implementing AI agents for the sake of having AI agents won't cut it, Yee and her colleagues advise. The value comes from embedding agents where they can improve entire workflows.
"Agentic AI efforts that focus on fundamentally reimagining entire workflows -- that is, the steps that involve people, processes, and technology -- are more likely to deliver a positive outcome," according to the review. Start with addressing key user pain points, the co-authors suggest. Organizations with document-intensive workflows, such as insurance companies or legal firms, for example, benefit from having agents handle tedious steps.
2. Agents aren't always the answer
"To help avoid wasted investments or unwanted complexity, [companies should] approach the role of agents much like they do when evaluating people for a high-performing team," Yee and her co-authors advise. "The key question to ask is, 'What is the work to be done and what are the relative talents of each potential team member -- or agent -- to work together to achieve those goals?'"
Also: Got AI FOMO? 3 bold but realistic bets your business can try today
If agentic AI is overkill for a problem, or if the problem calls for standardized, repetitive approaches with low variability, stick with simpler options such as rules-based automation, predictive analytics, or large language model (LLM) prompting.
3. AI 'slop' has been a recurring issue
One of the most common issues observed by the McKinsey team is "agentic systems that seem impressive in demos but frustrate users who are actually responsible for the work" -- with "AI slop or low-quality outputs." As a result, users lose trust in the agents and stop using them.
"Companies should invest heavily in agent development, just like they do for employee development," the co-authors recommend. As with human employees, "agents should be given clear job descriptions, onboarded, and given continual feedback so they become more effective and improve regularly."
4. It's difficult to track large numbers of agents
"When working with only a few AI agents, reviewing their work and spotting errors can be mostly straightforward," Yee and her team stated. "But as companies roll out hundreds, or even thousands, of agents, the task becomes challenging. When there's a mistake -- and there will always be mistakes as companies scale agents -- it's hard to figure out precisely what went wrong."
Also: 6 insights service leaders need to know about agentic AI
The McKinsey team addressed this by verifying agent performance at each step of the workflow with observability tools. "Building monitoring and evaluation into the workflow can enable teams to catch mistakes early, refine the logic, and continually improve performance, even after the agents are deployed."
5. Agents show the best value when shared across functions
Agents can get expensive and redundant if their designers reinvent the wheel for every task that comes up. "Companies often create a unique agent for each identified task," the McKinsey team pointed out. "This can lead to significant redundancy and waste because the same agent can often accomplish different tasks that share many of the same actions -- such as ingesting, extracting, searching, and analyzing."
Also: How AI agents can generate $450 billion by 2028 - and what stands in the way
Investing in reusable agents calls for first identifying recurring tasks, they advised. "Develop agents and agent components that can easily be reused across different workflows, and make it simple for developers to access them."
6. Agents will never work completely on their own
There will always be a need for human workers to "oversee model accuracy, ensure compliance, use judgment, and handle edge cases," the co-authors emphasized. Redesign work "so that people and agents can collaborate well together. Without that focus, even the most advanced agentic programs risk silent failures, compounding errors, and user rejection."
Otherwise, next year's agent performance appraisal may also end up less than stellar.