
AI Isn't Replacing Radiologists



CheXNet, an AI model released in 2017 and trained on more than 100,000 chest X-rays, can detect pneumonia with greater accuracy than a panel of board-certified radiologists. It is fast, free, and runs on a single consumer-grade GPU; a hospital can use it to classify a new scan in under a second.
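For a sense of what such a model looks like in code, here is a minimal sketch of a CheXNet-style classifier: the original paper used a DenseNet-121 backbone with one sigmoid output per finding in the ChestX-ray14 dataset. The weights, file path, and output-index mapping below are placeholders for illustration, not the released model.

```python
import torch
import torchvision.models as models
from torchvision import transforms
from PIL import Image

# CheXNet-style architecture: DenseNet-121 with a 14-way head, one
# sigmoid output per ChestX-ray14 finding (pneumonia among them).
# weights=None gives random placeholder weights, not the trained model.
net = models.densenet121(weights=None)
net.classifier = torch.nn.Linear(net.classifier.in_features, 14)
net.eval()

# Standard ImageNet-style preprocessing for a single chest X-ray.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("chest_xray.png").convert("RGB")  # placeholder path
batch = preprocess(image).unsqueeze(0)  # shape: (1, 3, 224, 224)

with torch.no_grad():
    probs = torch.sigmoid(net(batch))[0]  # 14 independent probabilities

# Index 0 stands in for pneumonia here; the real label order is dataset-specific.
print(f"P(pneumonia): {probs[0]:.2f}")
```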

Since then, companies like Annalise.ai, Lunit, Aidoc, and Qure.ai have released models that can detect hundreds of diseases across multiple types of scans with greater accuracy and speed than human radiologists in benchmark tests. Some products can reorder radiologist worklists to prioritize critical cases, suggest next steps for care teams, or generate structured draft reports that fit into hospital record systems. A few, like IDx-DR, are even cleared to operate without a physician reading the image at all. In total, there are over 700 FDA-cleared radiology models, which account for roughly three-quarters of all medical AI devices.

Radiology looks like a field optimized for replacing humans: digital inputs, pattern-recognition tasks, and clear benchmarks predominate. In 2016, Geoffrey Hinton – computer scientist and Turing Award winner – declared that ‘people should stop training radiologists now’. If the most extreme predictions about the effect of AI on employment and wages were true, then radiology should be the canary in the coal mine.

But demand for human labor is higher than ever. In 2025, American radiology residency programs offered a record 1,208 positions across all radiology specialties, a four percent increase from 2024, and the field’s vacancy rates are at all-time highs. That same year, radiology was the second-highest-paid medical specialty in the country, with an average income of $520,000, more than 48 percent above the average salary in 2015.

Three things explain this. First, while models beat humans on benchmarks, the standardized tests designed to measure AI performance, they struggle to replicate that performance in hospital conditions: most tools can only diagnose abnormalities that are common in their training data, and accuracy often degrades outside test conditions. Second, attempts to give models more tasks have run into legal hurdles: regulators and medical insurers have so far been reluctant to approve or cover fully autonomous radiology models. Third, even when models do diagnose accurately, they replace only a small share of a radiologist’s job. Radiologists spend a minority of their time on diagnostics and the majority on other activities, like talking to patients and fellow clinicians.

Artificial intelligence is rapidly spreading across the economy and society. But radiology shows that it will not necessarily dominate every field in its first years of diffusion, at least until hurdles like these are overcome. Exploiting its full benefits will mean adapting it to society, and society’s rules to it.

Islands of automation

All AIs are functions or algorithms, called models, that take in inputs and spit out outputs. Radiology models are trained to detect a finding: a measurable piece of evidence that helps identify or rule out a disease or condition. Most radiology models detect a single finding or condition in one type of image. For example, a model might look at a chest CT and report whether it shows lung nodules or rib fractures, or what the coronary artery calcium score is.

For every individual question, a new model is required. To cover even a modest slice of what they see in a day, radiologists would need to switch between dozens of models and ask the right question of each one. Several platforms manage, run, and interpret outputs from dozens or even hundreds of separate AI models across vendors, but each model operates independently, analyzing for one finding or disease at a time. The final output is a list of separate answers to specific questions, rather than a single description of an image, as the sketch below illustrates.
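Here is a minimal, hypothetical Python sketch of that architecture: each ‘model’ is a stub function answering exactly one question, and a thin platform layer runs them all on the same scan and returns their answers as a list. None of these names correspond to a real vendor API; real models would be separately trained networks.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Finding:
    question: str     # the one question this model answers
    answer: str       # its answer for this scan
    confidence: float

# Stub single-finding "models"; each real one would be a separately
# trained network, often from a different vendor.
def detect_lung_nodules(image) -> Finding:
    return Finding("lung nodules present?", "no", 0.97)

def detect_rib_fractures(image) -> Finding:
    return Finding("rib fractures present?", "yes", 0.88)

def score_coronary_calcium(image) -> Finding:
    return Finding("coronary artery calcium score?", "412 (high)", 0.91)

# The platform layer: every registered model runs independently on the
# same image, so the result is a list of separate answers, not one
# unified description of the scan.
MODEL_REGISTRY: list[Callable] = [
    detect_lung_nodules,
    detect_rib_fractures,
    score_coronary_calcium,
]

def read_scan(image) -> list[Finding]:
    return [model(image) for model in MODEL_REGISTRY]

for finding in read_scan(image=None):  # placeholder image object
    print(f"{finding.question} -> {finding.answer} ({finding.confidence:.0%})")
```

The point of the sketch is the shape of the output: a list of disconnected findings, which is why a radiologist, not the platform, still has to assemble them into a coherent read of the patient.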
