The High Stakes of Diagnostic AI#
When artificial intelligence gets a diagnosis wrong, the consequences can be catastrophic. Missed cancers, delayed stroke treatment, sepsis alerts that fail to fire: diagnostic AI failures are increasingly documented, yet lawsuits directly challenging these systems remain rare. This tracker compiles the evidence: validated failures, performance gaps, bias documentation, FDA recalls, and the emerging litigation that will shape AI medical liability for decades.
- 67% of sepsis cases missed by Epic Sepsis Model despite generating alerts on 18% of all patients
- 182 recall events involving 60 FDA-cleared AI devices (through Nov 2024)
- 109 recalls specifically for diagnostic or measurement errors
- 43% of AI device recalls occur within one year of FDA authorization
- Only 10.2% of AI-generated dermatology images show dark skin tones
Radiology AI Failures#
Documented Performance Gaps#
Radiology AI comprises 78% of FDA-cleared AI medical devices, making it the highest-risk category for misdiagnosis. Despite marketing claims of high accuracy, real-world performance varies dramatically.
Critical Incidents:
| Year | Incident | Consequence |
|---|---|---|
| 2024 | FDA-cleared AI misidentified ischemic stroke as intracranial hemorrhage | Conditions requiring opposite treatments |
| 2023 | AI mammography accounted for 69% of MAUDE adverse event reports involving AI devices | Primarily near-miss events |
| Ongoing | AI fails to detect early-stage tumors visible to experienced radiologists | Delayed cancer diagnosis |
Radiology AI Misdiagnosis Case Examples#
The following cases illustrate the emerging liability landscape for radiology AI failures. While many involve traditional radiology malpractice, they establish the damages framework AI systems will face as adoption increases.
Texas AI Healthcare Vendor Settlement
Texas AG Paxton secured the first-ever settlement with an AI healthcare vendor, Dallas-based Pieces Technologies, for making false claims about the accuracy and safety of its generative AI products deployed at major Texas hospitals. The AI 'summarized' patient conditions in real time, but the investigation found its advertised accuracy metrics were likely inaccurate and deceptive. Settlement terms were not disclosed.
FDA AI Stroke Misclassification
FDA-cleared AI algorithm misdiagnosed a patient's ischemic stroke as intracranial hemorrhage, conditions requiring opposite treatments. The case highlighted critical failure modes of AI diagnostic tools and the importance of human-machine interaction in urgent clinical decisions.
NY Basilar Artery Occlusion Miss
Largest radiology malpractice verdict: patient's basilar artery occlusion was not recognized on CT study, initially misinterpreted by radiology resident. While not AI-specific, this verdict establishes the damages framework for missed stroke diagnoses that AI systems increasingly handle.
Georgia AVM Misdiagnosis
High school senior suffered devastating injury after radiologist failed to identify arteriovenous malformation (AVM) on routine emergency scan. This case demonstrates the liability exposure for AI systems tasked with detecting vascular abnormalities in emergency imaging.
Pennsylvania CT Blood Clot Miss
27-year-old woman left legally blind after radiologist failed to diagnose brain blood clots on CT scan at Saint Vincent Hospital (November 2020). Case highlights liability for missed findings that AI CAD systems are marketed to detect.
Unlicensed Offshore AI Interpretation
The Radiology Group (Atlanta) settled federal lawsuit for using unlicensed labor from India to interpret patient radiology scans. Evidence showed radiologists approving results from unlicensed workers in as little as 30 seconds, a practice analogous to rubber-stamping AI outputs without clinical review.
See also: AI Medical Device Adverse Events, Comprehensive device-level analysis and MAUDE database trends
Pathology AI: Digital Diagnosis Failures#
Emerging Risk Category#
While pathology AI has shown promise (Paige Prostate became the first FDA-approved AI application in pathology in 2021), significant validation gaps remain. Unlike radiology AI, which accounts for most of the 182 documented AI device recalls, pathology AI adverse events are less publicly documented, partly because the field is newer and adoption remains limited.
Documented Performance Concerns#
| Issue | Finding | Source |
|---|---|---|
| Demographic Bias | AI models trained predominantly on lighter skin tissue samples show degraded performance on darker-pigmented specimens | Journal of Pathology Informatics 2024 |
| Edge Case Failures | AI struggles with rare tumor variants and unusual presentations that pathologists recognize from experience | CAP Digital Pathology Committee 2024 |
| Scanner Variability | Same slide scanned on different digital pathology scanners produces different AI outputs | FDA 510(k) review documents |
| Pre-analytical Variables | Tissue processing, staining intensity, and sectioning quality significantly impact AI accuracy | ASCP Position Statement 2024 |
Liability Framework for Pathology AI#
Who Bears Responsibility:
| Party | Potential Liability |
|---|---|
| Pathologist | Professional duty to exercise independent judgment; cannot defer entirely to AI |
| Laboratory | CLIA/CAP accreditation requires validation of new diagnostic tools before clinical use |
| AI Vendor | Product liability for design defects; failure to warn about demographic limitations |
| Scanner Manufacturer | Component liability if hardware affects AI performance |
The College of American Pathologists (CAP) position: AI should be used as an adjunct tool, not a replacement for pathologist interpretation. Labs must:
- Validate AI tools on their own patient populations before clinical deployment
- Document both AI recommendations and final pathologist diagnosis
- Monitor discordance rates between AI and pathologist calls (see the sketch after this list)
- Report significant failures to the vendor and FDA MAUDE database
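One way a laboratory might operationalize the discordance monitoring above is sketched below; a minimal example assuming paired AI and pathologist calls have already been extracted from the laboratory information system (the tuple layout, case IDs, and 5% threshold are illustrative assumptions, not a CAP requirement):

```python
# Minimal discordance monitor: compare AI calls against final pathologist
# diagnoses and flag when the disagreement rate drifts above an agreed threshold.
from collections import Counter

def discordance_report(cases, threshold=0.05):
    """cases: iterable of (case_id, ai_call, pathologist_call) tuples."""
    total = 0
    discordant = []
    for case_id, ai_call, path_call in cases:
        total += 1
        if ai_call != path_call:
            discordant.append((case_id, ai_call, path_call))
    rate = len(discordant) / total if total else 0.0
    return {
        "cases_reviewed": total,
        "discordant": len(discordant),
        "discordance_rate": rate,
        "exceeds_threshold": rate > threshold,  # trigger internal review and vendor/MAUDE reporting
        "by_pattern": Counter((a, p) for _, a, p in discordant),
    }

# Example with hypothetical paired calls
print(discordance_report([
    ("S24-001", "benign", "benign"),
    ("S24-002", "malignant", "malignant"),
    ("S24-003", "benign", "malignant"),  # an AI under-call: the pattern that matters most
]))
```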
Cardiology AI: ECG and Cardiac Imaging Failures#
The Atrial Fibrillation False Positive Crisis#
AI-powered ECG analysis has exploded in adoption, with CMS including AI-ECG technology in its 2025 Hospital Outpatient Prospective Payment System (OPPS) final rule. But population-scale screening creates massive false positive risks.
The False Positive Cascade#
When AI-ECG systems screen large populations, even high specificity creates massive downstream harm:
The Math (from AAFP 2024; worked through in the sketch after this list):
- Assume 10 million people screened via smartwatch/wearable
- 90% specificity (considered high)
- 2% actual AF prevalence
- Result: 980,000 false positive diagnoses
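The arithmetic above can be reproduced in a few lines. A minimal sketch of the false positive cascade; the population size, specificity, and prevalence come from the AAFP illustration, while the 95% sensitivity is an added assumption used only to compute predictive value:

```python
# Population-scale AF screening: even "high" specificity produces
# hundreds of thousands of false positives and a low predictive value.

screened    = 10_000_000   # people screened via smartwatch/wearable
specificity = 0.90         # 90% specificity, considered high
prevalence  = 0.02         # 2% actual AF prevalence
sensitivity = 0.95         # assumed; the AAFP example focuses on specificity

with_af    = screened * prevalence            # 200,000 true AF cases
without_af = screened - with_af               # 9,800,000 people without AF

true_pos  = with_af * sensitivity             # 190,000 correctly flagged
false_pos = without_af * (1 - specificity)    # 980,000 incorrectly flagged

ppv = true_pos / (true_pos + false_pos)       # chance that a positive flag is real AF

print(f"False positives: {false_pos:,.0f}")      # 980,000
print(f"Positive predictive value: {ppv:.1%}")   # ~16%: most alerts are wrong
```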
Consequences of False Positives:
- Iatrogenic harm from unnecessary diagnostic testing (echocardiograms, cardiac catheterization)
- Bleeding complications from unnecessary anticoagulation
- Psychological anxiety from cardiac diagnosis
- Healthcare system burden and costs
Documented Performance Issues#
| Area | Issue | Impact |
|---|---|---|
| Wearable ECG | Single-lead recordings limit diagnostic accuracy | Narrow clinical applicability |
| STEMI Detection | Standard care: 41.8% false-positive cath lab activations | Unnecessary invasive procedures |
| Population Bias | AI trained on predominantly white populations | Degraded performance on diverse patients |
| Operator Dependence | Over-reliance on AI reduces clinical vigilance | Automation complacency |
Positive Developments#
Not all cardiology AI news is concerning. A TCT 2025 study showed AI-ECG reduced false-positive cath lab activations for STEMI detection from 41.8% to 7.9%, a more than fivefold reduction. This demonstrates that properly validated AI can improve outcomes when targeted at specific, well-defined clinical questions.
AI-ECG STEMI Detection Study
AI-ECG analysis reduced false-positive cath lab activations from 41.8% (standard care) to 7.9%, a more than fivefold reduction. Demonstrates that AI can reduce harm when properly validated for specific clinical questions, unlike broad population screening applications.
See also: Cardiology AI Standard of Care, Full analysis of cardiac AI liability
Emergency Medicine AI: Triage and Diagnostic Failures#
The High-Stakes Environment#
Emergency departments represent perhaps the highest-risk environment for AI diagnostic tools. Chaotic, high-pressure settings with limited information and cognitive overload create fertile ground for both AI benefits and catastrophic failures.
Traditional Triage Failures AI Must Address#
| Error Type | Description | Patient Impact |
|---|---|---|
| Under-triage | Severe conditions missed or deprioritized | Delayed treatment, preventable death |
| Over-triage | Less severe conditions overly prioritized | Resource waste, morbidity from unnecessary intervention |
| Cognitive Overload | Rapid decisions with incomplete information | Diagnostic errors |
AI Triage Risks#
While AI offers potential solutions, it introduces new failure modes:
“Overconfident Answers”: AI systems may present diagnoses with inappropriate certainty, leading clinicians to accept incorrect recommendations without adequate scrutiny.
Limited Real-World Validation: Most AI emergency medicine research remains retrospective and proof-of-concept. As one systematic review noted: “The potential for AI applications in routine clinical care settings is yet to be achieved.”
Liability Ambiguity: Jurisdictions worldwide are grappling with accountability questions:
- Should providers be held accountable for following AI advice?
- Can liabilities extend to AI developers or institutions?
- The EU AI Act and FDA guidance represent initial steps, but specific guidelines for LLM-driven decision support remain limited.
Emerging Standard of Care Questions#
The fundamental question for emergency medicine AI: When an AI triage system under-triages a patient who then dies, who is liable?
Current framework suggests:
- The institution for deploying inadequately validated AI
- The clinician who accepted the AI recommendation without independent assessment
- The AI vendor potentially under product liability theories
Courts have not yet definitively ruled on emergency AI triage liability.
Mammography AI: Cancer Detection Performance#
AI-STREAM Trial Results (2025)#
The AI-STREAM prospective multicenter cohort study (24,543 women, 140 screen-detected cancers) provides the most rigorous evaluation of AI mammography performance:
Key Finding: While AI-CAD detected some cancers that radiologists missed, it missed more than twice as many cancers that radiologists caught, challenging vendor marketing claims of AI superiority.
Dense Breast Tissue Challenge#
The most common reason for AI-missed cancers was lesions obscured by overlapping dense breast tissue:
- Overall mammographic sensitivity: 75-85% (drops to 30-50% in dense breasts)
- Women with dense breasts (≥75% density) face higher cancer risk AND lower detection rates
- AI systems trained primarily on non-dense breasts perform poorly on this high-risk population
FDA-Cleared Mammography AI#
More than 20 FDA-approved AI applications exist for breast imaging, but adoption remains “widely variable and low overall.” Historical CAD performance issues persist:
“In 2015, researchers demonstrated that although FDA had long cleared CAD for clinical use, CAD didn’t improve radiologists’ interpretations of mammograms in routine practice. In fact, CAD decreased sensitivity in the subset of radiologists who interpreted studies with and without it.”
AI Malpractice Claims: 2024-2025 Trends#
Emerging Litigation Statistics#
Primary AI Malpractice Sources: The majority of AI-related malpractice claims stem from diagnostic AI in:
- Radiology (imaging interpretation)
- Cardiology (ECG analysis)
- Oncology (treatment recommendations)
Insurance Industry Response#
Malpractice insurers are adapting to AI risks:
- Some insurers have introduced AI-specific exclusions
- Others require physicians to complete AI training to maintain coverage
- Premium adjustments for facilities deploying unvalidated AI
- New policy language addressing “algorithm error” vs “physician error”
Regulatory Developments#
Federation of State Medical Boards (April 2024): Suggested that member boards hold clinicians, not AI makers, liable when AI makes medical errors, placing the documentation and validation burden on physicians.
Georgia (2024): First state to pass legislation specifically governing AI in healthcare.
Texas AG (September 2024): First enforcement action against AI healthcare vendor for deceptive accuracy claims.
The Radiologist Liability Trap#
AI creates a novel double-bind for radiologists:
If AI flags something the radiologist misses:
“If AI flags a lung nodule on a chest radiograph that the radiologist doesn’t see and therefore doesn’t mention in the report, and that nodule turns out to be cancerous, the radiologist may be liable not just for missing the cancer but for ignoring AI’s advice.”
If radiologist follows AI and it’s wrong: The physician may be liable for failing to apply independent clinical judgment.
See also: AI Medical Device Adverse Events, Comprehensive device-level analysis
Epic Sepsis Model: The Most Documented Failure#
The Performance Crisis#
The Epic Sepsis Model (ESM) is deployed at hundreds of US hospitals. A landmark JAMA Internal Medicine study exposed catastrophic underperformance:
Study: University of Michigan, 27,697 patients, 38,455 hospitalizations
- 67% of sepsis cases missed despite generating alerts on 18% of patients
- AUC of 0.63 vs Epic’s reported 0.76-0.83
- Identified only 7% of sepsis cases that clinicians had missed
- Created massive alert fatigue without improving outcomes
Why It Failed#
Training vs Reality: The model was trained on synthetic sepsis definitions that don’t match real-world clinical presentations. When validated against Medicare/CDC-aligned definitions, performance collapsed.
No Independent Validation: Hundreds of hospitals deployed the algorithm without verifying its advertised 80% accuracy rate.
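What independent validation might look like in practice: a minimal sketch of checking a deployed model's advertised numbers against a hospital's own chart-reviewed encounters. The column names, the 0.5 alert threshold, the file name, and the pandas/scikit-learn dependencies are assumptions for illustration, not part of any vendor's workflow:

```python
# Local validation sketch: measure a deployed model's real-world sensitivity,
# alert burden, PPV, and AUC on the hospital's own encounters before trusting
# vendor-reported performance.
import pandas as pd
from sklearn.metrics import roc_auc_score

def validate_locally(df, score_col="model_score", label_col="sepsis", threshold=0.5):
    alerts = df[score_col] >= threshold
    septic = df[label_col] == 1

    sensitivity = (alerts & septic).sum() / septic.sum()   # share of sepsis cases alerted on
    alert_rate  = alerts.mean()                            # share of all encounters alerted on
    ppv         = (alerts & septic).sum() / alerts.sum()   # how often an alert is real sepsis
    auc         = roc_auc_score(df[label_col], df[score_col])

    return {"sensitivity": sensitivity, "alert_rate": alert_rate, "ppv": ppv, "auc": auc}

# Hypothetical usage with a retrospective, chart-reviewed extract
encounters = pd.read_csv("encounters_with_scores.csv")   # hypothetical file
print(validate_locally(encounters))   # compare against the advertised AUC of 0.76-0.83
```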
Epic’s Response: Epic disputed the findings, claiming hospitals needed to “tune” the model before deployment. In 2022, Epic released an updated version claiming better performance, but independent validation remains limited.
Liability Implications#
Hospitals deploying unvalidated AI for sepsis detection face potential liability for:
- Negligent implementation of unvalidated clinical tools
- Corporate negligence for systemic failure to validate vendor claims
- False Claims Act exposure if billing for AI-enhanced care that doesn’t meet standards
Dermatology AI: Racial Bias Crisis#
Documented Disparities#
Dermatology AI demonstrates some of the most severe documented racial bias in medical AI:
Performance Gap Evidence#
Northwestern University Study (2024):
- AI assistance improved diagnostic accuracy by 33% for dermatologists and 69% for primary care physicians
- However: the accuracy gap between light and dark skin tones widened with AI assistance
- Primary care physicians who see mostly white patients showed AI-exacerbated bias on darker skin
Training Data Problem:
- Medical textbooks and dermatology training materials lack darker skin tone examples
- AI systems trained on unrepresentative data systematically misdiagnose conditions in darker skin
- Skin cancer detection models trained only on lighter skin perform poorly on darker-skinned patients
Potential Solutions#
Research shows fine-tuning AI models on diverse datasets (like the DDI dataset) effectively closes the performance gap, but most commercial tools haven’t implemented these corrections.
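A hedged sketch of the kind of corrective fine-tuning the research describes: continue training a pretrained skin-lesion classifier on a diverse dataset such as DDI, then report accuracy per skin-tone group rather than a single pooled figure. The folder layout, model choice, and hyperparameters below are illustrative assumptions, not the published method or any vendor's pipeline:

```python
# Fine-tuning sketch: adapt a pretrained image classifier on a diverse dermatology
# dataset, then evaluate per skin-tone group so gaps cannot hide in pooled accuracy.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical folder layout: ddi/train/<class>/*.jpg
train_ds = datasets.ImageFolder("ddi/train", transform=tfm)
train_dl = DataLoader(train_ds, batch_size=32, shuffle=True)

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(train_ds.classes))  # new head for the lesion classes

opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):  # short fine-tune for illustration; real work needs tuning and augmentation
    for images, labels in train_dl:
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()

# Evaluation should then be reported per Fitzpatrick skin-tone group (not pooled),
# so degraded performance on darker skin is visible rather than averaged away.
```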
IBM Watson for Oncology: The $4 Billion Failure#
What Went Wrong#
IBM Watson for Oncology was marketed as revolutionary AI that would transform cancer treatment. Instead, it became a cautionary tale of AI overpromise.
Documented Failures:
| Issue | Example |
|---|---|
| Inappropriate recommendations | Recommended chemotherapy for patients whose cancer hadn’t spread to lymph nodes |
| Unexplainable reasoning | System couldn’t explain why it made recommendations outside normal protocols |
| Clinical trial failures | Consistently scored below human clinicians, sometimes under 50% |
| Alarming blind spots | Missed treatment considerations that oncologists routinely catch |
Legal and Liability Framework#
Legal scholars have debated Watson’s liability status:
Arguments for AI Personhood: Some suggest Watson warrants status analogous to a medical resident, requiring oversight but bearing some responsibility.
Current Reality: Because Watson was part of a physician team, it was never solely responsible for injuries. Each treating physician remains liable for their portion of damages.
The Outcome: IBM sold Watson Health’s data and analytics products for over $1 billion to Francisco Partners, a fraction of the $4 billion invested in development.
FDA Recalls and Safety Signals#
Recall Statistics (Through November 2024)#
A JAMA study analyzed FDA recalls of AI-enabled medical devices:
Recall Causes:
| Cause | Number of Recalls |
|---|---|
| Diagnostic/measurement errors | 109 |
| Functionality delay or loss | 44 |
| Physical hazards | 14 |
| Biochemical hazards | 13 |
Validation Gap Problem#
The FDA’s 510(k) clearance pathway, used for 97% of AI medical devices, doesn’t require prospective human testing:
- Devices enter market with limited or no clinical evaluation
- AI lacking validation data before FDA clearance is more likely to be recalled
- Even devices with strong premarket data “frequently” perform worse in real-world settings
- Public company devices recalled more frequently (92%) than private company devices (53%)
See also: AI Medical Device Adverse Events, MAUDE database analysis and FDA reporting gaps
Malpractice Verdicts and Settlements#
Cancer Misdiagnosis (2024)#
While not exclusively AI-related, radiology malpractice verdicts establish the liability framework AI systems will face:
| Amount | Case | Year |
|---|---|---|
| $9,000,000 | NY settlement: Radiologist failed to identify breast mass as cancer, 2.5-year delay | 2024 |
| $7,100,000 | PA verdict: Radiologist missed blood clots on CT, patient left legally blind | 2024 |
| $3,380,000 | MD verdict: CT scan misinterpretation led to stage I→IV cancer progression | 2024 |
| $3,000,000 | Judgment: Missed cancer diagnosis, terminal patient | 2024 |
| $2,000,000 | NY settlement: Post-lumpectomy MRI misread, second cancer missed | 2024 |
Average Values:
- Cancer misdiagnosis settlements average $300,000-$660,000
- 43% of breast cancer misdiagnosis defendants are radiologists
Emerging AI-Specific Litigation#
Direct AI diagnostic malpractice lawsuits remain rare, but the foundation is being established:
Theoretical Framework:
- Product liability claims against AI developers for design defects
- Malpractice claims against physicians for over-reliance on flawed AI
- Hospital negligence for deploying unvalidated AI systems
- Corporate manslaughter theories for gross negligence (UK precedent emerging)
Emerging Litigation Trends#
Product Liability for Diagnostic AI#
Following the Garcia v. Character Technologies precedent (treating AI as a “product”), diagnostic AI developers may face:
Design Defect Claims:
- AI trained on biased data
- Inadequate validation across patient populations
- Failure to perform as marketed
Failure to Warn:
- Inadequate disclosure of accuracy limitations
- Missing warnings about demographic performance gaps
- No disclosure of known failure modes
Manufacturing Defect:
- Training data contamination
- Version-specific bugs
- Data drift causing degraded performance
Hospital and Health System Liability#
Healthcare organizations deploying AI face potential claims for:
Negligent Selection:
- Choosing AI vendors without validating claims
- Deploying systems with known bias issues
- Ignoring FDA recall notices
Negligent Implementation:
- Failure to customize AI for local patient populations
- Inadequate training for clinical staff
- No override protocols for AI recommendations
Corporate Negligence:
- Systemic failure to monitor AI outcomes
- Prioritizing efficiency over patient safety
- Suppressing internal concerns about AI performance
Standard of Care for Diagnostic AI#
What Reasonable Use Looks Like#
Based on FDA guidance and emerging best practices:
Pre-Deployment:
- Independent validation in local patient population
- Bias testing across demographics
- Clear performance benchmarks vs human decision-making
- Override protocols for AI recommendations
Operational:
- Human review of all AI diagnostic recommendations
- Documentation of when AI is followed vs overridden
- Outcome monitoring by patient demographics (see the sketch after these lists)
- Alert fatigue management
Ongoing:
- Regular revalidation as patient populations change
- Tracking real-world performance vs marketed claims
- Reporting to FDA when performance degrades
- Updating based on new evidence
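The bias-testing and demographic outcome-monitoring items above lend themselves to simple stratified reporting. A minimal sketch, assuming an audit log of AI calls joined to confirmed outcomes; the column names and file name are hypothetical:

```python
# Stratified performance monitor: report sensitivity and false-positive rate per
# demographic group so degradation in any subgroup is visible, not averaged away.
import pandas as pd

def stratified_performance(df, group_col, pred_col="ai_positive", truth_col="confirmed_positive"):
    rows = []
    for group, g in df.groupby(group_col):
        tp = ((g[pred_col] == 1) & (g[truth_col] == 1)).sum()
        fn = ((g[pred_col] == 0) & (g[truth_col] == 1)).sum()
        fp = ((g[pred_col] == 1) & (g[truth_col] == 0)).sum()
        tn = ((g[pred_col] == 0) & (g[truth_col] == 0)).sum()
        rows.append({
            group_col: group,
            "n": len(g),
            "sensitivity": tp / (tp + fn) if (tp + fn) else None,
            "false_positive_rate": fp / (fp + tn) if (fp + tn) else None,
        })
    return pd.DataFrame(rows)

# Hypothetical usage: compare each subgroup against overall performance
log = pd.read_csv("ai_outcome_log.csv")   # hypothetical audit log
print(stratified_performance(log, group_col="race_ethnicity"))
```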
What Falls Below Standard#
Practices likely to support liability:
- Deploying AI without independent validation
- Using AI with known demographic performance gaps
- Following AI recommendations without clinical judgment
- Ignoring FDA recalls or safety signals
- Failing to track outcomes
- Over-relying on vendor marketing claims
Frequently Asked Questions#
Can I sue if AI misdiagnosed my condition?
Who is liable when diagnostic AI gets it wrong, the doctor or the AI company?
Are there any AI misdiagnosis lawsuits I can join?
How can I find out if AI was used in my diagnosis?
What should hospitals do about the Epic Sepsis Model performance issues?
Is dermatology AI safe for people with darker skin?
What are the risks of AI-ECG screening for atrial fibrillation?
Is pathology AI ready for clinical use?
Who is liable when AI triage in the emergency department fails?
How should cardiologists handle AI-ECG recommendations they disagree with?
Related Resources#
AI Liability Framework#
- AI Product Liability, Strict liability for AI systems
- AI Software as a Product, Garcia ruling analysis
- AI Litigation Landscape 2025, Overview of AI lawsuits
Healthcare AI#
- AI Medical Device Adverse Events, FDA reporting and device-level analysis
- Healthcare AI Standard of Care, Medical AI deployment standards
- Radiology AI Standard of Care, Imaging AI liability framework
- Cardiology AI Standard of Care, Cardiac AI liability framework
- Surgical Robotics Standard of Care, Robotic surgery liability
- AI Insurance Claim Denials, Health insurer AI lawsuits
Algorithmic Bias#
- Mobley v. Workday, AI discrimination class action
- AI Family Law and Custody, Algorithm bias in sensitive decisions
Concerned About AI Diagnostic Accuracy?
From radiology AI that misses cancers to sepsis models that fail most patients, diagnostic AI raises serious questions about patient safety and liability. Understanding when AI tools meet, or fall short of, the standard of care is essential for providers and patients alike.