
Google’s healthcare AI made up a body part — what happens when doctors don’t notice?




Scenario: A radiologist is looking at your brain scan and flags an abnormality in the basal ganglia. It’s an area of the brain that helps you with motor control, learning, and emotional processing. The name sounds a bit like another part of the brain, the basilar artery, which supplies blood to your brainstem — but the radiologist knows not to confuse them. A stroke or abnormality in one is typically treated in a very different way than in the other.

Now imagine your doctor is using an AI model to do the reading. The model says you have a problem with your “basilar ganglia,” conflating the two names into an area of the brain that does not exist. You’d hope your doctor would catch the mistake and double-check the scan. But there’s a chance they don’t.

The “basilar ganglia” error didn’t happen in a hospital setting, but it is real: it was served up by Google’s healthcare AI model, Med-Gemini. A 2024 research paper introducing Med-Gemini included the hallucination in a section on head CT scans, and nobody at Google caught it, in either that paper or a blog post announcing it. When Bryan Moore, a board-certified neurologist and researcher with expertise in AI, flagged the mistake, he tells The Verge, the company quietly edited the blog post to fix the error with no public acknowledgement — and the paper remained unchanged. Google calls the incident a simple misspelling of “basal ganglia.” Some medical professionals say it’s a dangerous error and an example of the limitations of healthcare AI.

Med-Gemini is a collection of AI models that can summarize health data, create radiology reports, analyze electronic health records, and more. The pre-print research paper, meant to demonstrate its value to doctors, highlighted a series of abnormalities in scans that radiologists “missed” but AI caught. One of its examples was that Med-Gemini diagnosed an “old left basilar ganglia infarct.” But as established, there’s no such thing.

Fast-forward about a year, and Med-Gemini’s trusted tester program is no longer accepting new entrants — likely meaning that the model is being tested in real-life medical scenarios on a pilot basis. It’s still an early trial, but the stakes of AI errors are getting higher. Med-Gemini isn’t the only model making them. And it’s not clear how doctors should respond.

“What you’re talking about is super dangerous,” Maulin Shah, chief medical information officer at Providence, a healthcare system serving 51 hospitals and more than 1,000 clinics, tells The Verge. He added, “Two letters, but it’s a big deal.”

In a statement, Google spokesperson Jason Freidenfelds told The Verge that the company partners with the medical community to test its models and that Google is transparent about their limitations.

“Though the system did spot a missed pathology, it used an incorrect term to describe it (basilar instead of basal). That’s why we clarified in the blog post,” Freidenfelds said. He added, “We’re continually working to improve our models, rigorously examining an extensive range of performance attributes -- see our training and deployment practices for a detailed view into our process.”
