
These medical X-rays are all deepfakes — and they fool even radiologists

Why This Matters

The rise of AI-generated X-rays that can fool even experienced radiologists poses significant challenges for medical accuracy, research integrity, and legal processes. As synthetic images become more indistinguishable from real scans, the risk of misinformation and diagnostic errors increases, highlighting the need for improved detection methods and awareness within the healthcare industry.


AI-generated X-rays, such as these examples, could distort AI tools used to analyse medical data if included in their training data sets. Credit: Radiological Society of North America (RSNA)

Most radiologists struggle to identify X‑ray scans that are generated by artificial intelligence, with fewer than half spotting synthetic images hidden in real medical data, according to research published today in Radiology¹. Large language models (LLMs) also struggled to distinguish real from synthetic medical images.

The study offers training to help radiologists improve their skills in detecting AI-generated X-rays. The researchers also warn that synthetic images could creep into the scientific literature and into medical litigation, where imaging serves as evidence.

“The results from this study are both disturbing and not very surprising to me,” says Elisabeth Bik, a microbiologist and image-integrity specialist based in San Francisco, California. “This raises concerns not only for research integrity, but also for clinical workflows, insurance claims and legal contexts where imaging evidence is used.”

Overly smooth bones

In the study, 17 radiologists from 12 research centres were shown X-ray scans, half of them real and half generated by AI. Without being told the purpose of the study, participants were asked about the technical quality of the images and whether they noticed anything unusual; only 41% raised concerns that AI-generated scans might have infiltrated the data set.


The radiologists were then informed that some of the images were AI-generated and asked to discern real scans from those created by ChatGPT. The participants correctly identified the AI and real scans 75% of the time, on average.

Importantly, “there was no difference based on the experience of the radiologists”, who had between zero and 40 years of professional experience, says study co-author Mickael Tordjman, a radiologist at the Icahn School of Medicine at Mount Sinai in New York.

The research team also investigated whether AI models such as ChatGPT and Gemini might have a more discerning eye than the radiologists, but the models were only 57–85% accurate at telling apart the real and ChatGPT-generated images.