
Your doctor’s AI notetaker may be making things up, Ontario audit finds

Why This Matters

The Ontario audit raises serious concerns about the reliability of AI medical scribes, finding that they can produce inaccurate or outright hallucinated information that puts patient safety at risk. The findings underscore the need for rigorous validation and ongoing oversight of AI tools in healthcare, and, for the tech industry, they highlight how much accuracy and transparency still matter in sensitive applications like medicine.

Key Takeaways

In recent years, many overworked doctors have turned to so-called AI medical scribes, which automatically summarize patient conversations, diagnoses, and care decisions into structured notes for health records. But a recent audit by the auditor general of Ontario found that AI scribes recommended by the provincial government regularly generated incorrect, incomplete, and hallucinated information that could “potentially result in inadequate or harmful treatment plans that may potentially impact patient health outcomes.”

In a recent report on the Use of Artificial Intelligence in the Ontario Government, the auditor general reviewed transcription tests of two simulated patient-doctor conversations performed across 20 AI scribe vendors that the provincial government had approved and pre-qualified for purchase by healthcare providers. All 20 vendors showed accuracy or completeness problems in at least one of these simple tests: nine hallucinated patient information, 12 recorded information incorrectly, and 17 missed key details about the mental health issues discussed.

In the report, the auditor general highlights multiple examples of mistakes in those summaries that could directly and negatively affect a patient’s subsequent care, including cases where an AI scribe hallucinated nonexistent referrals for blood tests or therapy, incorrectly transcribed the names of prescription medications, and missed “key details” of the mental health issues discussed in the simulated conversations.