Clinical AI scribes in primary care: Accuracy, error severity, and implications for clinical practice

Evaluating the Performance and Safety of Clinical Artificial Intelligence Scribes

As healthcare increasingly adopts technological solutions, Clinical Artificial Intelligence Scribes (CAISs) are being explored for their potential to improve clinical documentation. However, the efficacy and safety of these tools are under scrutiny amid growing concerns over errors. This article summarizes a recent study that assessed the accuracy of CAIS output, the potential clinical impact of its errors, and the quality of the documentation produced.

Methods and Analysis

The study evaluated seven commercially available CAIS products using eight standardized clinical consultation scenarios, which were audio recorded. The CAIS-generated summaries were compared against a human-validated transcript to identify three types of error: omissions, factual inaccuracies, and hallucinations. Physicians assessed the severity of these errors, and a novel severity-weighted impact score, in both linear and exponential variants, was developed to measure the potential clinical impact. Additionally, the Physician Documentation Quality Instrument (PDQI-10), a validated tool for assessing the quality of clinical notes, was employed to corroborate the findings.
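To make the idea of a severity-weighted impact score more concrete, the following Python sketch shows one way linear and exponential variants could be computed. It is an illustration only: this article does not give the study's actual formula, so the three-level severity scale, the numeric weights, the base of 10, and all names (`ScribeError`, `linear_impact`, `exponential_impact`) are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical three-level severity scale. The study's real scale and
# weights are not stated in this article; these values are assumptions.
SEVERITY_LEVELS = {"minor": 1, "moderate": 2, "major": 3}


@dataclass(frozen=True)
class ScribeError:
    kind: str      # "omission", "factual_inaccuracy", or "hallucination"
    severity: str  # key into SEVERITY_LEVELS, as rated by a physician


def linear_impact(errors):
    """Sum of severity levels: each step up in severity adds a fixed amount."""
    return sum(SEVERITY_LEVELS[e.severity] for e in errors)


def exponential_impact(errors, base=10.0):
    """Sum of base**(level - 1): each step up in severity multiplies the weight."""
    return sum(base ** (SEVERITY_LEVELS[e.severity] - 1) for e in errors)


# Two invented notes: one with many minor omissions, one with a single
# major hallucination.
note_a = [ScribeError("omission", "minor") for _ in range(5)]
note_b = [ScribeError("hallucination", "major")]

for name, note in (("note_a", note_a), ("note_b", note_b)):
    print(name, linear_impact(note), exponential_impact(note))
```

Under these assumed weights, the linear score ranks the note with five minor omissions as worse (5 versus 3), while the exponential score ranks the single major hallucination as far worse (100 versus 5). This is the sense in which an exponential variant gives more weight to rare but serious errors.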

Results

Omissions were the most prevalent error, constituting 83.8% of all errors (p<<0.001). The frequency and severity of errors varied significantly across the CAIS products, with the median number of omissions per consultation ranging from 1 to 6 depending on the product. Although less frequent, hallucinations and factual inaccuracies tended to be more clinically significant. None of the tested CAISs produced error-free summaries. The impact score reflected the clinical severity of errors and highlighted the importance of rare but serious ones. The PDQI-10 analysis indicated that the summaries scored well on consistency and clinical utility but were weak in conciseness and organization.

Conclusions

While CAISs exhibit a commendable level of summary accuracy, performance differs substantially among the available products. Some perform well, yet none is error-free. Physicians are therefore advised to review CAIS output carefully, checking in particular for omitted psychosocial details and medications and scrutinizing plausible-sounding inclusions. Buyers and regulators should take these performance differences into account and evaluate CAIS products thoroughly before selecting them.

