Compare commits

..

1 commit

Author SHA1 Message Date
Teleo Agents
0b9dbee3da extract: 2026-01-21-aha-2026-heart-disease-stroke-statistics-update
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-04-03 14:15:37 +00:00
5 changed files with 2 additions and 59 deletions

@@ -1,17 +0,0 @@
---
type: claim
domain: health
description: "Hallucination rates range from 1.47% for structured transcription to 64.1% for open-ended summarization, demonstrating that task-specific benchmarking is required"
confidence: experimental
source: npj Digital Medicine 2025, empirical testing across multiple clinical AI tasks
created: 2026-04-03
title: Clinical AI hallucination rates vary 100x by task making single regulatory thresholds operationally inadequate
agent: vida
scope: structural
sourcer: npj Digital Medicine
related_claims: ["[[AI scribes reached 92 percent provider adoption in under 3 years because documentation is the rare healthcare workflow where AI value is immediate unambiguous and low-risk]]", "[[healthcare AI regulation needs blank-sheet redesign because the FDA drug-and-device model built for static products cannot govern continuously learning software]]"]
---
# Clinical AI hallucination rates vary 100x by task making single regulatory thresholds operationally inadequate
Empirical testing reveals clinical AI hallucination rates span a 100x range depending on task complexity: ambient scribes (structured transcription) achieve 1.47% hallucination rates, while clinical case summarization without mitigation reaches 64.1%. GPT-4o's rate drops from 53% to 23% with structured mitigation, and GPT-5 with thinking mode achieves 1.6% on HealthBench. This variation exists because structured, constrained tasks (transcription) have clear ground truth and limited generation space, while open-ended tasks (summarization, clinical reasoning) require synthesis across ambiguous information with no single correct output. The 100x range demonstrates that a single regulatory threshold, such as 'all clinical AI must have <5% hallucination rate', is operationally meaningless because it would either permit dangerous applications (64.1% summarization) or prohibit safe ones (1.47% transcription) depending on where the threshold is set. Task-specific benchmarking is the only viable regulatory approach, yet no framework currently requires it.
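The threshold argument can be made concrete with a minimal sketch. It uses only the rates quoted in this claim; the task labels and the `passing_tasks` helper are hypothetical names chosen for illustration, and the candidate thresholds (1%, 5%, 70%) are assumptions, not proposed regulatory values.

```python
# Hallucination rates quoted in the claim, in percent.
# The dictionary keys are shorthand labels invented for this sketch.
RATES = {
    "ambient scribe transcription": 1.47,
    "case summarization, no mitigation": 64.1,
    "GPT-4o, no mitigation": 53.0,
    "GPT-4o, structured mitigation": 23.0,
    "GPT-5 thinking mode, HealthBench": 1.6,
}


def passing_tasks(rates, threshold_pct):
    """Return, in sorted order, the tasks whose rate is strictly below the threshold."""
    return sorted(task for task, rate in rates.items() if rate < threshold_pct)


# Sweep a few hypothetical single thresholds and see which tasks each one admits.
for threshold in (1.0, 5.0, 70.0):
    print(f"<{threshold}%:", passing_tasks(RATES, threshold))
```

Under these figures a 1% threshold admits nothing, a 5% threshold admits only the two low-rate tasks, and a 70% threshold admits everything, which is one way to see why a single cut-off says little without reference to the task being benchmarked.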

@@ -1,17 +0,0 @@
---
type: claim
domain: health
description: The gap between robust RCT evidence and actuarial population projections reveals that structural constraints dominate therapeutic efficacy in determining population health outcomes
confidence: experimental
source: RGA actuarial analysis, SELECT trial, STEER real-world study
created: 2026-04-03
title: "GLP-1 receptor agonists show 20% individual-level mortality reduction but are projected to reduce US population mortality by only 3.5% by 2045 because access barriers and adherence constraints create a 20-year lag between clinical efficacy and population-level detectability"
agent: vida
scope: structural
sourcer: RGA (Reinsurance Group of America)
related_claims: ["[[GLP-1 receptor agonists are the largest therapeutic category launch in pharmaceutical history but their chronic use model makes the net cost impact inflationary through 2035]]", "[[medical care explains only 10-20 percent of health outcomes because behavioral social and genetic factors dominate as four independent methodologies confirm]]"]
---
# GLP-1 receptor agonists show 20% individual-level mortality reduction but are projected to reduce US population mortality by only 3.5% by 2045 because access barriers and adherence constraints create a 20-year lag between clinical efficacy and population-level detectability
The SELECT trial demonstrated a 20% MACE reduction and a 19% all-cause mortality improvement in high-risk obese patients. A meta-analysis of 13 CVOTs (83,258 patients) confirmed significant cardiovascular benefits. The real-world STEER study (10,625 patients) showed 57% greater MACE reduction with semaglutide versus comparators. Yet RGA's actuarial modeling projects only a 3.5% US population mortality reduction by 2045 under central assumptions, a 20-year horizon from 2025. This gap reflects three binding constraints: (1) Access barriers: only 19% of large employers cover GLP-1s for weight loss as of 2025, and California Medi-Cal ended weight-loss GLP-1 coverage on January 1, 2026; (2) Adherence: 30-50% discontinuation at 1 year means population effects require sustained treatment that current real-world patterns do not support; (3) Lag structure: CVD mortality effects require 5-10+ years of follow-up to manifest at population scale, and the actuarial model incorporates the time required for broad adoption, sustained adherence, and mortality impact accumulation. The 48 million Americans who want GLP-1 access face severe coverage constraints. GLP-1s are therefore a structural intervention on a long timeline, not a near-term release of a binding constraint. The 2024 life expectancy record cannot be attributed to GLP-1 effects, and population-level cardiovascular mortality reductions will not appear in aggregate statistics for current data periods (2024-2026).
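A back-of-envelope sketch shows how access and adherence constraints can shrink an individual-level effect. This is not RGA's actuarial model and its output is not comparable to the 3.5% projection: it simply multiplies the 19% all-cause mortality improvement by two shares taken from the figures above. Treating the 19% employer-coverage figure as a proxy for access, and treating the constraints as independent multipliers, are both assumptions made for illustration; `diluted_reduction` is a hypothetical helper.

```python
def diluted_reduction(individual_reduction, access_share, persistence_share):
    """Multiply an individual-level effect by the share of candidates who
    can get the drug and the share of those who stay on it (toy model)."""
    return individual_reduction * access_share * persistence_share


INDIVIDUAL = 0.19  # SELECT all-cause mortality improvement
ACCESS = 0.19      # assumption: employer coverage share used as an access proxy

# 30-50% discontinuation at 1 year implies 50-70% persistence.
low = diluted_reduction(INDIVIDUAL, ACCESS, 1 - 0.50)
high = diluted_reduction(INDIVIDUAL, ACCESS, 1 - 0.30)

print(f"Diluted effect among candidates: {low:.1%} to {high:.1%}")
# prints "Diluted effect among candidates: 1.8% to 2.5%"
```

Even before any time lag is considered, the multiplicative structure turns a roughly 19% individual effect into a low single-digit effect among candidates, which is the qualitative point of the gap described above.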

@@ -1,17 +0,0 @@
---
type: claim
domain: health
description: FDA, EU MDR/AI Act, MHRA, and ISO 22863 standards all lack hallucination rate requirements as of 2025, creating a regulatory gap for the fastest-adopted clinical AI category
confidence: likely
source: npj Digital Medicine 2025 regulatory review, confirmed across FDA, EU, MHRA, ISO standards
created: 2026-04-03
title: No regulatory body globally has established mandatory hallucination rate benchmarks for clinical AI despite evidence base and proposed frameworks
agent: vida
scope: structural
sourcer: npj Digital Medicine
related_claims: ["[[AI scribes reached 92 percent provider adoption in under 3 years because documentation is the rare healthcare workflow where AI value is immediate unambiguous and low-risk]]", "[[healthcare AI regulation needs blank-sheet redesign because the FDA drug-and-device model built for static products cannot govern continuously learning software]]"]
---
# No regulatory body globally has established mandatory hallucination rate benchmarks for clinical AI despite evidence base and proposed frameworks
Despite clinical AI hallucination rates ranging from 1.47% to 64.1% across tasks, and despite the existence of proposed assessment frameworks (including this paper's framework), no regulatory body globally has established mandatory hallucination rate thresholds as of 2025. FDA enforcement discretion, EU MDR/AI Act, MHRA guidance, and ISO 22863 AI safety standards (in development) all lack specific hallucination rate benchmarks. The paper notes three reasons for this regulatory gap: (1) generative AI models are non-deterministic, so the same prompt can yield different responses; (2) hallucination rates depend on model version, task domain, and prompt, making single benchmarks insufficient; and (3) no consensus exists on acceptable clinical hallucination thresholds. This regulatory absence is most consequential for ambient scribes, the fastest-adopted clinical AI at 92% provider adoption, which operate with zero standardized safety metrics despite documented 1.47% hallucination rates. The gap represents either regulatory capture (industry resistance to standards) or regulatory paralysis (inability to govern non-deterministic systems with existing frameworks).

@@ -7,12 +7,9 @@ date: 2026-02-01
 domain: health
 secondary_domains: []
 format: editorial-analysis
-status: processed
-processed_by: vida
-processed_date: 2026-04-03
+status: unprocessed
 priority: medium
 tags: [obesity, equity, GLP-1, access, affordability, structural-barriers, population-health, belief-1, belief-2, belief-3]
-extraction_model: "anthropic/claude-sonnet-4.5"
 ---
 ## Content
## Content

@@ -7,12 +7,9 @@ date: 2026-03-25
 domain: space-development
 secondary_domains: []
 format: thread
-status: processed
-processed_by: astra
-processed_date: 2026-04-03
+status: unprocessed
 priority: high
 tags: [SDA, PWSA, battle-management, orbital-compute, defense-demand, Golden-Dome, Kratos-Defense, SATShow, operational-ODC]
-extraction_model: "anthropic/claude-sonnet-4.5"
 ---
 ## Content