vida: extract claims from 2026-04-25-arise-state-of-clinical-ai-2026-report

- Source: inbox/queue/2026-04-25-arise-state-of-clinical-ai-2026-report.md
- Domain: health
- Claims: 2, Entities: 1
- Enrichments: 4
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Vida <PIPELINE>
Teleo Agents 2026-04-25 04:27:30 +00:00
parent 07223136d4
commit 05c72edc72
8 changed files with 100 additions and 14 deletions

View file

@@ -30,3 +30,10 @@ Radiology evidence from Heudel review: erroneous AI prompts increased false-posi
**Source:** Oettl et al., Journal of Experimental Orthopaedics 2026
Oettl et al. acknowledge automation bias exists but argue that requiring clinicians to 'review, confirm or override' AI recommendations creates a learning loop that mitigates bias. However, they provide no evidence that the review process prevents deference—only that performance improves when AI is present.
## Supporting Evidence
**Source:** ARISE Network State of Clinical AI Report 2026
The ARISE 2026 synthesis documents 'risks of over-reliance, with clinicians following incorrect model recommendations even when errors were detectable' across multiple 2025 studies, confirming that automation bias persists despite error visibility.

View file

@@ -6,7 +6,7 @@ confidence: experimental
source: Artificial Intelligence Review (Springer Nature), mixed-method systematic review
created: 2026-04-11
agent: vida
related: ["{'AI assistance may produce neurologically-grounded, partially irreversible skill degradation through three concurrent mechanisms': 'prefrontal disengagement, hippocampal memory formation reduction, and dopaminergic reinforcement of AI reliance'}", "AI assistance may produce neurologically-grounded, partially irreversible skill degradation through three concurrent mechanisms: prefrontal disengagement, hippocampal memory formation reduction, and dopaminergic reinforcement of AI reliance", "clinical-ai-creates-three-distinct-skill-failure-modes-deskilling-misskilling-neverskilling", "never-skilling-is-detection-resistant-and-unrecoverable-making-it-worse-than-deskilling", "ai-induced-deskilling-follows-consistent-cross-specialty-pattern-in-medicine", "never-skilling-is-structurally-invisible-because-it-lacks-pre-ai-baseline-requiring-prospective-competency-assessment", "ai-assistance-produces-neurologically-grounded-irreversible-deskilling-through-prefrontal-disengagement-hippocampal-reduction-and-dopaminergic-reinforcement", "economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate", "never-skilling-distinct-from-deskilling-affects-trainees-not-experienced-physicians"]
related: ["{'AI assistance may produce neurologically-grounded, partially irreversible skill degradation through three concurrent mechanisms': 'prefrontal disengagement, hippocampal memory formation reduction, and dopaminergic reinforcement of AI reliance'}", "AI assistance may produce neurologically-grounded, partially irreversible skill degradation through three concurrent mechanisms: prefrontal disengagement, hippocampal memory formation reduction, and dopaminergic reinforcement of AI reliance", "clinical-ai-creates-three-distinct-skill-failure-modes-deskilling-misskilling-neverskilling", "never-skilling-is-detection-resistant-and-unrecoverable-making-it-worse-than-deskilling", "ai-induced-deskilling-follows-consistent-cross-specialty-pattern-in-medicine", "never-skilling-is-structurally-invisible-because-it-lacks-pre-ai-baseline-requiring-prospective-competency-assessment", "ai-assistance-produces-neurologically-grounded-irreversible-deskilling-through-prefrontal-disengagement-hippocampal-reduction-and-dopaminergic-reinforcement", "economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate", "never-skilling-distinct-from-deskilling-affects-trainees-not-experienced-physicians", "never-skilling-affects-trainees-while-deskilling-affects-experienced-physicians-creating-distinct-population-risks"]
related_claims: ["[[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]]", "[[divergence-human-ai-clinical-collaboration-enhance-or-degrade]]"]
reweave_edges: ["Never-skilling in clinical AI is structurally invisible because it lacks a pre-AI baseline for comparison, requiring prospective competency assessment before AI exposure to detect|supports|2026-04-12", "{'AI assistance may produce neurologically-grounded, partially irreversible skill degradation through three concurrent mechanisms': 'prefrontal disengagement, hippocampal memory formation reduction, and dopaminergic reinforcement of AI reliance|supports|2026-04-14'}", "AI-induced deskilling follows a consistent cross-specialty pattern where AI assistance improves performance while present but creates cognitive dependency that degrades performance when AI is unavailable|supports|2026-04-14", "Automation bias in medical imaging causes clinicians to anchor on AI output rather than conducting independent reads, increasing false-positive rates by up to 12 percent even among experienced readers|supports|2026-04-14", "Never-skilling \u2014 the failure to acquire foundational clinical competencies because AI was present during training \u2014 poses a detection-resistant, potentially unrecoverable threat to medical education that is structurally worse than deskilling|supports|2026-04-14", "{'AI assistance may produce neurologically-grounded, partially irreversible skill degradation through three concurrent mechanisms': 'prefrontal disengagement, hippocampal memory formation reduction, and dopaminergic reinforcement of AI reliance|related|2026-04-17'}", "{'AI assistance may produce neurologically-grounded, partially irreversible skill degradation through three concurrent mechanisms': 'prefrontal disengagement, hippocampal memory formation reduction, and dopaminergic reinforcement of AI reliance|supports|2026-04-18'}", "AI assistance may produce neurologically-grounded, partially irreversible skill degradation through three concurrent mechanisms: prefrontal disengagement, hippocampal memory formation reduction, and dopaminergic reinforcement of AI reliance|related|2026-04-19"]
scope: causal
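The `related` list above contains stringified Python dicts (entries beginning `"{'AI assistance may produce...`), an artifact worth flagging before these edges are reweaved. A minimal detection sketch, assuming these entries are dict reprs leaked by an earlier pipeline stage; the pipeline's actual cleanup step, if any, is not part of this commit:

```python
import ast

# Flag related-list entries that are stringified Python dicts rather than
# plain claim slugs or titles (assumption: leaked reprs, not intentional data).
def is_dict_artifact(entry: str) -> bool:
    if not entry.startswith("{"):
        return False
    try:
        return isinstance(ast.literal_eval(entry), dict)
    except (ValueError, SyntaxError):
        return False

def clean_related(related: list[str]) -> list[str]:
    # Drop dict reprs; plain-string claim references pass through unchanged.
    return [entry for entry in related if not is_dict_artifact(entry)]
```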
@@ -60,3 +60,10 @@ Academic Pathology Journal commentary provides pathology-specific confirmation o
**Source:** Heudel et al., Insights into Imaging, Jan 2025 (PMC11780016)
The Heudel study design inadvertently demonstrates why never-skilling is detection-resistant: with only 8 residents (4 first-year, 4 third-year) and no longitudinal follow-up, the study cannot distinguish between 'residents learning with AI assistance' versus 'residents becoming dependent on AI presence.' The lack of post-training assessment means any never-skilling effect in the first-year cohort would be invisible. This is the structural measurement problem: studies designed to show AI benefit lack the control arms needed to detect skill acquisition failure.
## Supporting Evidence
**Source:** ARISE Network State of Clinical AI Report 2026
The ARISE 2026 report documents zero measurable deskilling in practicing clinicians but finds that 33% of younger providers rank deskilling as a top-2 concern versus 11% of older providers, providing quantitative evidence for the temporal distribution of skill failure modes across career stages.
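The `reweave_edges` field in this file's frontmatter encodes graph edges as pipe-delimited strings. A minimal parsing sketch, assuming the format is `<claim title>|<relation>|<YYYY-MM-DD>` with no escape mechanism for literal `|` inside titles (malformed entries such as the stringified dicts flagged earlier would still need separate handling):

```python
from datetime import date

# Parse one pipe-delimited edge string from `reweave_edges`.
def parse_reweave_edge(edge: str) -> tuple[str, str, date]:
    # rsplit keeps any stray '|' inside the title intact, since the
    # relation and date are always the last two fields
    title, relation, edge_date = edge.rsplit("|", 2)
    return title, relation, date.fromisoformat(edge_date)

# Example taken from a `supports` edge elsewhere in this commit:
title, rel, when = parse_reweave_edge(
    "Does human oversight improve or degrade AI clinical decision-making?"
    "|supports|2026-04-17"
)
```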

View file

@@ -0,0 +1,19 @@
---
type: claim
domain: health
description: "ARISE 2026 report documents zero measurable deskilling in current clinicians but 33% of younger providers rank deskilling as top-2 concern versus 11% of older providers"
confidence: experimental
source: ARISE Network (Stanford-Harvard), State of Clinical AI Report 2026
created: 2026-04-25
title: Clinical AI deskilling is a generational risk affecting future trainees rather than current practitioners because experienced clinicians retain pre-AI skill foundations while new trainees face never-skilling in AI-saturated environments
agent: vida
sourced_from: health/2026-04-25-arise-state-of-clinical-ai-2026-report.md
scope: structural
sourcer: ARISE Network (Stanford-Harvard)
supports: ["never-skilling-affects-trainees-while-deskilling-affects-experienced-physicians-creating-distinct-population-risks"]
related: ["clinical-ai-creates-three-distinct-skill-failure-modes-deskilling-misskilling-neverskilling", "never-skilling-affects-trainees-while-deskilling-affects-experienced-physicians-creating-distinct-population-risks", "ai-cervical-cytology-screening-creates-never-skilling-through-routine-case-reduction", "ai-induced-deskilling-follows-consistent-cross-specialty-pattern-in-medicine", "never-skilling-is-detection-resistant-and-unrecoverable-making-it-worse-than-deskilling", "never-skilling-distinct-from-deskilling-affects-trainees-not-experienced-physicians"]
---
# Clinical AI deskilling is a generational risk affecting future trainees rather than current practitioners because experienced clinicians retain pre-AI skill foundations while new trainees face never-skilling in AI-saturated environments
The ARISE 2026 report, synthesizing 2025 clinical AI research, documents a critical temporal distinction in deskilling risk. Current practicing clinicians report *no* measurable deskilling from AI applications, which the report attributes to their pre-AI clinical training providing a skill foundation that AI assistance does not erode. However, the report documents a stark generational divergence in risk perception: 33% of younger providers entering practice rank deskilling as a top-2 concern, compared to only 11% of older providers. This threefold difference reflects the structural reality that younger clinicians entering AI-integrated training environments face 'never-skilling' risk: they may never develop the clinical judgment skills that current practitioners acquired before AI assistance became ubiquitous. The report explicitly states that current AI applications function as 'assistants rather than autonomous agents' with 'narrow scope,' which preserves skill development for those already trained. The generational divergence provides empirical evidence that deskilling is a *future* risk concentrated in training pipelines, not a current phenomenon affecting experienced practitioners. This temporal scoping is critical because it shifts the intervention point from retraining current clinicians to redesigning medical education for AI-native environments.
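This hunk adds a complete claim file, which shows the frontmatter schema the pipeline emits. A minimal loader sketch, assuming claim files are markdown with a YAML block delimited by `---` lines, as the new file above suggests; the required-key set is inferred from the files in this commit, not from a published schema:

```python
import yaml

# Inferred from the claim files in this commit, not a published schema
REQUIRED_KEYS = {"type", "domain", "description", "confidence",
                 "source", "created", "title", "agent"}

def load_claim(path: str) -> dict:
    with open(path, encoding="utf-8") as fh:
        text = fh.read()
    # Files start with '---', so the middle segment is the frontmatter
    _, frontmatter, _body = text.split("---", 2)
    claim = yaml.safe_load(frontmatter)
    missing = REQUIRED_KEYS - claim.keys()
    if missing:
        raise ValueError(f"{path}: missing frontmatter keys {sorted(missing)}")
    return claim
```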

View file

@@ -10,18 +10,17 @@ agent: vida
scope: structural
sourcer: Babic et al.
related_claims: ["[[healthcare AI regulation needs blank-sheet redesign because the FDA drug-and-device model built for static products cannot govern continuously learning software]]", "[[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]]"]
supports:
- FDA MAUDE reports lack the structural capacity to identify AI contributions to adverse events because 34.5 percent of AI-device reports contain insufficient information to determine causality
- FDA's MAUDE database systematically under-detects AI-attributable harm because it has no mechanism for identifying AI algorithm contributions to adverse events
- Regulatory vacuum emerges when deregulation outpaces safety evidence accumulation creating institutional epistemic divergence between regulators and health authorities
- State clinical AI disclosure laws fill a federal regulatory gap created by FDA enforcement discretion expansion because California Colorado and Utah enacted patient notification requirements while FDA's January 2026 CDS guidance expanded enforcement discretion without adding disclosure mandates
reweave_edges:
- FDA MAUDE reports lack the structural capacity to identify AI contributions to adverse events because 34.5 percent of AI-device reports contain insufficient information to determine causality|supports|2026-04-07
- FDA's MAUDE database systematically under-detects AI-attributable harm because it has no mechanism for identifying AI algorithm contributions to adverse events|supports|2026-04-07
- Regulatory vacuum emerges when deregulation outpaces safety evidence accumulation creating institutional epistemic divergence between regulators and health authorities|supports|2026-04-07
- State clinical AI disclosure laws fill a federal regulatory gap created by FDA enforcement discretion expansion because California Colorado and Utah enacted patient notification requirements while FDA's January 2026 CDS guidance expanded enforcement discretion without adding disclosure mandates|supports|2026-04-17
supports: ["FDA MAUDE reports lack the structural capacity to identify AI contributions to adverse events because 34.5 percent of AI-device reports contain insufficient information to determine causality", "FDA's MAUDE database systematically under-detects AI-attributable harm because it has no mechanism for identifying AI algorithm contributions to adverse events", "Regulatory vacuum emerges when deregulation outpaces safety evidence accumulation creating institutional epistemic divergence between regulators and health authorities", "State clinical AI disclosure laws fill a federal regulatory gap created by FDA enforcement discretion expansion because California Colorado and Utah enacted patient notification requirements while FDA's January 2026 CDS guidance expanded enforcement discretion without adding disclosure mandates"]
reweave_edges: ["FDA MAUDE reports lack the structural capacity to identify AI contributions to adverse events because 34.5 percent of AI-device reports contain insufficient information to determine causality|supports|2026-04-07", "FDA's MAUDE database systematically under-detects AI-attributable harm because it has no mechanism for identifying AI algorithm contributions to adverse events|supports|2026-04-07", "Regulatory vacuum emerges when deregulation outpaces safety evidence accumulation creating institutional epistemic divergence between regulators and health authorities|supports|2026-04-07", "State clinical AI disclosure laws fill a federal regulatory gap created by FDA enforcement discretion expansion because California Colorado and Utah enacted patient notification requirements while FDA's January 2026 CDS guidance expanded enforcement discretion without adding disclosure mandates|supports|2026-04-17"]
related: ["clinical-ai-safety-gap-is-doubly-structural-with-no-pre-deployment-requirements-and-no-post-market-surveillance", "fda-maude-cannot-identify-ai-contributions-to-adverse-events-due-to-structural-reporting-gaps", "fda-maude-database-lacks-ai-specific-adverse-event-fields-creating-systematic-under-detection-of-ai-attributable-harm", "fda-2026-cds-enforcement-discretion-expands-to-single-recommendation-ai-without-defining-clinical-appropriateness", "regulatory-deregulation-occurring-during-active-harm-accumulation-not-after-safety-evidence"]
---
# The clinical AI safety gap is doubly structural: FDA enforcement discretion removes pre-deployment safety requirements while MAUDE's lack of AI-specific fields means post-market surveillance cannot detect AI-attributable harm
The clinical AI safety vacuum operates at both ends of the deployment lifecycle. On the front end, FDA's January 2026 CDS enforcement discretion expansion *is expected to* remove pre-deployment safety requirements for most clinical decision support tools. On the back end, this paper documents that MAUDE's lack of AI-specific adverse event fields means post-market surveillance cannot identify AI algorithm contributions to harm. The result is a complete safety gap: AI/ML medical devices can enter clinical use without mandatory pre-market safety evaluation, and adverse events attributable to AI algorithms cannot be systematically detected post-deployment. This is not a temporary gap during regulatory catch-up; it is a structural mismatch between the regulatory architecture (designed for static hardware devices) and the technology being regulated (continuously learning software). Combining the 943 adverse events across 823 AI devices over 13 years with the 25.2% AI-attribution rate from the Handley companion study suggests that the count of detectable AI-attributable harm events is likely under 200 across the entire FDA-cleared AI/ML device ecosystem over 13 years. This creates invisible accumulation of failure modes that can inform neither regulatory action nor clinical practice.
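One way to reconcile the numbers in this paragraph, assuming (our reading, not an explicit calculation in the source) that the 25.2% attribution rate applies only to reports with sufficient causality information, using the 34.5% insufficient-information figure from the `supports` entries earlier in this file:

```python
# Hedged reconstruction of the "under 200" figure; the combination of
# rates is an assumption, not a calculation stated in the source.
total_reports = 943        # adverse events across 823 AI devices over 13 years
insufficient_info = 0.345  # share of MAUDE reports too sparse to assign causality
attribution_rate = 0.252   # AI-attribution rate, Handley companion study

determinable = total_reports * (1 - insufficient_info)  # ~618 reports
attributable = determinable * attribution_rate          # ~156 events, under 200
print(round(determinable), round(attributable))         # 618 156
```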
## Supporting Evidence
**Source:** ARISE Network State of Clinical AI Report 2026
The ARISE 2026 report notes that 'risks from deskilling and automation bias remain underexamined in the published literature' and that the 'transition from RCT evidence to real-world deployment evidence is the frontier challenge,' confirming systematic evidence gaps in post-deployment safety.
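The list reformat earlier in this file's diff, where block-style `- item` sequences become flow-style `[...]` lists, matches what PyYAML emits with `default_flow_style=None`. A minimal sketch, assuming the pipeline round-trips frontmatter through standard YAML (its actual serializer is not shown in this commit):

```python
import yaml

def reflow_frontmatter(frontmatter_text: str) -> str:
    data = yaml.safe_load(frontmatter_text)
    # default_flow_style=None keeps the top-level mapping in block style
    # but emits scalar-only sequences (supports, reweave_edges, related)
    # in flow style, matching the '+' lines above.
    return yaml.safe_dump(data, default_flow_style=None,
                          allow_unicode=True, sort_keys=False)
```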

View file

@@ -0,0 +1,19 @@
---
type: claim
domain: health
description: ARISE 2026 identifies upskilling potential from administrative burden reduction but emphasizes that realizing it requires structural shifts in training paradigms
confidence: experimental
source: ARISE Network (Stanford-Harvard), State of Clinical AI Report 2026
created: 2026-04-25
title: Clinical AI upskilling requires deliberate educational mechanisms and workflow design rather than occurring automatically from AI exposure
agent: vida
sourced_from: health/2026-04-25-arise-state-of-clinical-ai-2026-report.md
scope: structural
sourcer: ARISE Network (Stanford-Harvard)
challenges: ["ai-micro-learning-loop-creates-durable-upskilling-through-review-confirm-override-cycle"]
related: ["human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs", "ai-micro-learning-loop-creates-durable-upskilling-through-review-confirm-override-cycle", "optional-use-ai-deployment-preserves-independent-clinical-judgment-preventing-automation-bias-pathway"]
---
# Clinical AI upskilling requires deliberate educational mechanisms and workflow design rather than occurring automatically from AI exposure
The ARISE 2026 report challenges the assumption that AI assistance automatically produces upskilling through time liberation. While the report confirms that 'current AI applications function primarily as assistants rather than autonomous agents, offering an opportunity for upskilling by liberating clinicians from repetitive administrative burdens,' it immediately qualifies this with a critical caveat: 'Realizing this benefit requires deliberate educational mechanisms.' The report explicitly states that 'upskilling does not happen automatically' and that 'maintaining clinical excellence requires a shift in training paradigms, emphasizing critical oversight where human reasoning validates AI outputs.' This finding directly challenges passive upskilling narratives by establishing that the mere presence of AI tools and freed physician time is insufficient; upskilling requires intentional curriculum design, workflow restructuring, and explicit training in AI oversight. The report's emphasis on 'deliberate' mechanisms and a 'shift in training paradigms' indicates that current medical education and practice environments are *not* structured to convert AI assistance into skill development. This qualification is essential for evaluating upskilling claims: the potential exists, but realization depends on institutional design choices that are not yet standard practice.

View file

@@ -5,7 +5,7 @@ description: Stanford-Harvard study shows AI alone 90 percent vs doctors plus AI
confidence: likely
source: DJ Patil interviewing Bob Wachter, Commonwealth Club, February 9 2026; Stanford/Harvard diagnostic accuracy study; European colonoscopy AI de-skilling study
created: 2026-02-18
related: ["economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate", "divergence-human-ai-clinical-collaboration-enhance-or-degrade", "ai-induced-deskilling-follows-consistent-cross-specialty-pattern-in-medicine", "medical LLM benchmark performance does not translate to clinical impact because physicians with and without AI access achieve similar diagnostic accuracy in randomized trials", "no-peer-reviewed-evidence-of-durable-physician-upskilling-from-ai-exposure-as-of-mid-2026", "clinical-ai-creates-three-distinct-skill-failure-modes-deskilling-misskilling-neverskilling", "automation-bias-in-medicine-increases-false-positives-through-anchoring-on-ai-output"]
related: ["economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate", "divergence-human-ai-clinical-collaboration-enhance-or-degrade", "ai-induced-deskilling-follows-consistent-cross-specialty-pattern-in-medicine", "medical LLM benchmark performance does not translate to clinical impact because physicians with and without AI access achieve similar diagnostic accuracy in randomized trials", "no-peer-reviewed-evidence-of-durable-physician-upskilling-from-ai-exposure-as-of-mid-2026", "clinical-ai-creates-three-distinct-skill-failure-modes-deskilling-misskilling-neverskilling", "automation-bias-in-medicine-increases-false-positives-through-anchoring-on-ai-output", "ai-micro-learning-loop-creates-durable-upskilling-through-review-confirm-override-cycle"]
related_claims: ["ai-induced-deskilling-follows-consistent-cross-specialty-pattern-in-medicine", "never-skilling-is-detection-resistant-and-unrecoverable-making-it-worse-than-deskilling", "ai-assistance-produces-neurologically-grounded-irreversible-deskilling-through-prefrontal-disengagement-hippocampal-reduction-and-dopaminergic-reinforcement", "llms-amplify-human-cognitive-biases-through-sequential-processing-and-lack-contextual-resistance"]
reweave_edges: ["NCT07328815 - Mitigating Automation Bias in Physician-LLM Diagnostic Reasoning|supports|2026-04-07", "Does human oversight improve or degrade AI clinical decision-making?|supports|2026-04-17"]
supports: ["NCT07328815 - Mitigating Automation Bias in Physician-LLM Diagnostic Reasoning", "Does human oversight improve or degrade AI clinical decision-making?"]
@@ -89,3 +89,10 @@ Oettl et al. argue that human-AI teams 'outperform either humans or AI systems w
**Source:** Oettl et al., Journal of Experimental Orthopaedics 2026
Oettl et al. argue that human-AI teams 'outperform either humans or AI systems working independently' and cite evidence that radiologists using AI achieved 'almost perfect accuracy' and 22% higher inter-rater agreement. However, all cited studies measure performance with AI present, not durable skill retention after AI training, leaving the deskilling mechanism unaddressed.
## Extending Evidence
**Source:** ARISE Network State of Clinical AI Report 2026
The ARISE 2026 report states that 'Humans + AI often outperform humans alone, but there is much room for improvement on workflow design and failure mode training to optimize success while mitigating automation bias and deskilling,' indicating that performance degradation is workflow-dependent rather than inevitable.

View file

@@ -0,0 +1,25 @@
---
title: ARISE Network (AI Research in Systems Engineering)
type: entity
entity_type: research_program
domain: health
status: active
founded: [Unknown]
headquarters: Stanford University, Harvard Medical School
website: https://arise-ai.org
tags: [clinical-ai, research-network, evidence-synthesis, ai-safety]
---
## Overview
ARISE (AI Research in Systems Engineering) is a Stanford-Harvard collaborative research network focused on clinical AI safety, effectiveness, and implementation. The network synthesizes emerging evidence on AI deployment in healthcare settings with emphasis on real-world performance, failure modes, and safety gaps.
## Key Activities
- Annual State of Clinical AI reports synthesizing published research
- Focus areas: automation bias, deskilling, workflow design, evidence gaps
- Bridges controlled research settings to real-world deployment analysis
## Timeline
- **2026-01-01** — Published State of Clinical AI Report 2026, synthesizing 2025 research on clinical AI performance, automation bias, and deskilling risks

View file

@@ -7,9 +7,12 @@ date: 2026-01-01
domain: health
secondary_domains: [ai-alignment]
format: report
status: unprocessed
status: processed
processed_by: vida
processed_date: 2026-04-25
priority: high
tags: [clinical-ai, deskilling, automation-bias, radiology, primary-care, upskilling, physician-training, clinical-evidence]
extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content
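The status flip in this final hunk (`unprocessed` to `processed`, plus `processed_by`, `processed_date`, and `extraction_model`) is the pipeline's queue bookkeeping. A minimal sketch of that update, assuming frontmatter is rewritten in place; the field names are taken verbatim from the `+` lines above, while the file-rewrite logic is illustrative rather than the pipeline's actual implementation:

```python
import yaml
from datetime import date

def mark_processed(path: str, agent: str, model: str) -> None:
    with open(path, encoding="utf-8") as fh:
        text = fh.read()
    _, frontmatter, body = text.split("---", 2)
    meta = yaml.safe_load(frontmatter)
    meta.update(
        status="processed",
        processed_by=agent,
        processed_date=date.today().isoformat(),
        extraction_model=model,
    )
    new_fm = yaml.safe_dump(meta, default_flow_style=None,
                            allow_unicode=True, sort_keys=False)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(f"---\n{new_fm}---{body}")

# e.g.:
# mark_processed("inbox/queue/2026-04-25-arise-state-of-clinical-ai-2026-report.md",
#                "vida", "anthropic/claude-sonnet-4.5")
```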