---
type: source
title: "Artificial Intelligence Related Safety Issues Associated with FDA Medical Device Reports"
author: "Handley J.L., Krevat S.A., Fong A. et al."
url: https://www.nature.com/articles/s41746-024-01357-5
date: 2024-01-01
domain: health
secondary_domains: [ai-alignment]
format: journal-article
status: unprocessed
priority: high
tags: [FDA, MAUDE, AI-medical-devices, adverse-events, patient-safety, post-market-surveillance, belief-5]
---

## Content

Published in *npj Digital Medicine* (2024). Examined the feasibility of using patient safety reports from the FDA's MAUDE database (Manufacturer and User Facility Device Experience) to identify safety issues in AI/ML-enabled medical devices, in response to the 2023 Biden AI Executive Order's directive to create a patient safety program for AI.

**Study design:**

- Reviewed 429 MAUDE reports associated with AI/ML-enabled medical devices
- Classified each as: potentially AI/ML related, not AI/ML related, or insufficient information

**Key findings:**

- 108 of 429 (25.2%) were potentially AI/ML related
- 148 of 429 (34.5%) contained **insufficient information to determine whether AI contributed**
- Implication: for more than a third of reported adverse events involving AI-enabled devices, the report itself cannot establish whether the AI contributed to the event
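
A quick tabulation of the reported breakdown (a minimal sketch; the "not AI/ML related" count is not quoted in this note and is inferred by subtraction, assuming the three categories are exhaustive and mutually exclusive):

```python
# Classification of the 429 MAUDE reports reviewed in the study.
# The "not AI/ML related" count is an inferred remainder, not a quoted figure.
TOTAL_REPORTS = 429

counts = {
    "potentially AI/ML related": 108,
    "insufficient information": 148,
    "not AI/ML related": TOTAL_REPORTS - 108 - 148,  # 173, by subtraction (assumption)
}

for category, n in counts.items():
    print(f"{category}: {n}/{TOTAL_REPORTS} ({n / TOTAL_REPORTS:.1%})")
# potentially AI/ML related: 108/429 (25.2%)
# insufficient information: 148/429 (34.5%)
# not AI/ML related: 173/429 (40.3%)
```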

**Interpretive note (from session research context):**

The Biden AI Executive Order created the mandate; this paper demonstrates that the existing surveillance infrastructure cannot execute on it. MAUDE lacks the fields, the taxonomy, and the reporting protocols needed to identify AI contributions to adverse events. The 34.5% "insufficient information" category is the key signal: not a data gap, but a structural gap.
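
To make the "missing fields" point concrete, here is a hypothetical sketch of the kind of structured, AI-specific reporting fields MAUDE lacks (field names and types are invented for illustration; they come from neither the paper nor any FDA schema):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class AIContribution(Enum):
    # The three triage categories used in the study's manual review
    POTENTIALLY_AI_RELATED = "potentially AI/ML related"
    NOT_AI_RELATED = "not AI/ML related"
    INSUFFICIENT_INFORMATION = "insufficient information"

@dataclass
class AIAdverseEventReport:
    # Hypothetical AI-specific fields; MAUDE has no structured equivalent,
    # which is why 34.5% of reports could not be classified at all.
    device_id: str
    narrative: str
    ai_feature_active: Optional[bool]       # was the AI/ML feature in use at the time?
    model_version: Optional[str]            # which algorithm version was deployed?
    ai_output_at_event: Optional[str]       # what did the algorithm output?
    clinician_followed_ai: Optional[bool]   # was the output acted on?
    contribution: AIContribution = AIContribution.INSUFFICIENT_INFORMATION
```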

**Recommendations from the paper:**

- Guidelines to inform safe implementation of AI in clinical settings
- Proactive AI algorithm monitoring processes (see the sketch after this list)
- Methods to trace AI algorithm contributions to safety issues
- Infrastructure for healthcare facilities lacking expertise to safely implement AI
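
As a gloss on "proactive AI algorithm monitoring," a minimal illustrative sketch: track the rolling rate at which the algorithm's outputs are confirmed on clinician review, and alert when it drops below a floor. The window size, threshold, and confirmation signal are placeholder assumptions, not values from the paper:

```python
from collections import deque

WINDOW = 200        # number of recent adjudicated cases to track (placeholder)
ALERT_FLOOR = 0.90  # minimum acceptable confirmation rate (placeholder)

recent = deque(maxlen=WINDOW)  # 1 = confirmed by review, 0 = overturned

def record_adjudication(confirmed: bool) -> None:
    """Record one clinician adjudication and alert on sustained degradation."""
    recent.append(1 if confirmed else 0)
    if len(recent) == WINDOW:
        rate = sum(recent) / WINDOW
        if rate < ALERT_FLOOR:
            print(f"ALERT: rolling confirmation rate {rate:.1%} is below "
                  f"{ALERT_FLOOR:.0%}; escalate for model review")
```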

**Significance of publication context:**

Published in *npj Digital Medicine*, 2024 — one year before FDA's January 2026 enforcement discretion expansion. The paper's core finding (MAUDE can't identify AI contributions to harm) is the empirical basis for the Babic et al. 2025 framework paper's policy recommendations. FDA's January 2026 guidance addresses none of these recommendations.

## Agent Notes

**Why this matters:** This paper directly tested whether the existing surveillance system can detect AI-specific safety issues — and found that 34.5% of reports involving AI devices contain insufficient information to determine AI's role. This is not a sampling problem; it is structural. The MAUDE system cannot answer the basic safety question: "did the AI contribute to this patient harm event?"

**What surprised me:** The framing connects directly to the Biden AI EO. This paper was written explicitly to inform a federal patient safety program for AI. It demonstrates that the required infrastructure doesn't exist. The subsequent FDA CDS enforcement discretion expansion (January 2026) broadened AI deployment without creating this infrastructure.

**What I expected but didn't find:** Evidence that any federal agency acted on this paper's recommendations between publication (2024) and January 2026. No announced MAUDE reform adding AI-specific reporting fields was found in search results.

**KB connections:**

- Babic framework paper (archived this session) — companion, provides the governance solution framework
- FDA CDS Guidance January 2026 (archived this session) — policy expansion without addressing surveillance gap
- Belief 5 (clinical AI novel safety risks) — the failure to detect is itself a failure mode

**Extraction hint:**

"Of 429 FDA MAUDE reports associated with AI-enabled devices, 34.5% contained insufficient information to determine whether AI contributed to the adverse event — establishing that MAUDE's design cannot answer basic causal questions about AI-related patient harm, making it structurally incapable of generating the safety evidence needed to evaluate whether clinical AI deployment is safe."

**Context:** One of the co-authors (Krevat) works in FDA's patient safety program. This paper has official FDA staff co-authorship — meaning FDA insiders have documented the inadequacy of their own surveillance tool for AI. This is institutional self-documentation of a structural gap.

## Curator Notes

PRIMARY CONNECTION: Babic framework paper; FDA CDS guidance; Belief 5 clinical AI safety risks

WHY ARCHIVED: FDA-staff co-authored paper documenting that MAUDE cannot identify AI contributions to adverse events — the most credible possible source for the post-market surveillance gap claim. An FDA insider acknowledging the agency's surveillance limitations.

EXTRACTION HINT: The FDA co-authorship is the key credibility signal. Extract with attribution to FDA staff involvement. Pair with Babic's structural framework for the most complete post-market surveillance gap claim.