How did Anthropic measure AI's "theoretical capabilities" in the job market?

If you follow the ongoing debate over AI’s growing economic impact, you may have seen the graphic below floating around this month. It comes from an Anthropic report on the labor market impacts of AI and is meant to compare the current “observed exposure” of occupations to LLMs (in red) to the “theoretical capability” of those same LLMs (in blue) across 22 job categories.

While the current “observed exposure” area is interesting in its own right, it’s the blue “theoretical capability” that jumps out. At a glance, the graph implies that LLM-based systems could, in theory, perform at least 80 percent of the individual “job tasks” across a shockingly wide range of human occupations. It looks like Anthropic is predicting that LLMs will eventually be able to do the vast majority of jobs in broad categories ranging from “Arts & Media” and “Office & Admin” to “Legal, Business & Finance,” and even “Management.”

That “theoretical AI coverage” area seems like it’s destined to eat a huge swath of the US job market! Credit: Anthropic

Digging into the basis for those “theoretical capability” numbers, though, paints a much less chilling picture of AI’s future occupational impacts. When you drill down into the specifics, that blue field represents outdated and heavily speculative educated guesses about where AI is likely to improve human productivity, not necessarily where it will take over for humans altogether.

The best AI 2023 can buy

The LLM “theoretical capability” baseline Anthropic is citing here isn’t based on the company’s own empirical testing of its current models or quantifiable projections of performance increases over time. Instead, Anthropic cites an August 2023 report titled “GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models” co-authored by researchers at OpenAI, OpenResearch, and the University of Pennsylvania.