pipeline: clean 7 stale queue duplicates
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
This commit is contained in:
parent
9566804906
commit
059714cab1
7 changed files with 0 additions and 517 deletions
@ -1,69 +0,0 @@
---
type: source
title: "Cell Reports Medicine 2025: Pharmacist + LLM Co-pilot Outperforms Pharmacist Alone by 1.5x for Serious Medication Errors"
author: "Multiple authors (Cell Reports Medicine, cross-institutional)"
url: https://pmc.ncbi.nlm.nih.gov/articles/PMC12629785/
date: 2025-10-15
domain: health
secondary_domains: [ai-alignment]
format: research-paper
status: null-result
priority: medium
tags: [clinical-ai-safety, centaur-model, medication-safety, llm-copilot, pharmacist, clinical-decision-support, rag, belief-5-counter-evidence]
processed_by: vida
processed_date: 2026-03-24
extraction_model: "anthropic/claude-sonnet-4.5"
extraction_notes: "LLM returned 1 claim, 1 rejected by validator"
---

## Content

Published in *Cell Reports Medicine*, October 2025 (doi: 10.1016/j.xcrm.2025.00396-9). Prospective, cross-over study. Published in PMC as PMC12629785.

**Study design:**

- 91 error scenarios based on 40 clinical vignettes across **16 medical and surgical specialties**
- LLM-based clinical decision support system (CDSS) using a retrieval-augmented generation (RAG) framework
- Three arms: (1) LLM-based CDSS alone, (2) Pharmacist + LLM co-pilot, (3) Pharmacist alone
- Outcome: accuracy in identifying medication safety errors

**Key findings:**

- **Pharmacist + LLM co-pilot:** 61% accuracy (precision 0.57, recall 0.61, F1 0.59)
- **Serious harm errors:** Co-pilot mode increased accuracy by **1.5-fold over pharmacist alone**
- Conclusion: "Effective LLM integration for complex tasks like medication chart reviews can enhance healthcare professional performance, improving patient safety"
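As a quick consistency check on the figures above: F1 is the harmonic mean of precision and recall, and the reported values line up.

```python
# Cross-check the reported co-pilot metrics: F1 should be the
# harmonic mean of precision (0.57) and recall (0.61).
precision = 0.57
recall = 0.61

f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 0.59, matching the reported F1
```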

**Implementation note:** This used a RAG architecture (retrieval-augmented generation), meaning the LLM retrieved drug information from a curated database rather than relying solely on parametric memory — reducing hallucination risk.
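The retrieve-then-generate pattern described in the implementation note can be sketched as follows. The study's actual system is not public, so the database entries, function names, and lookup logic here are purely illustrative.

```python
# Illustrative sketch of retrieval-augmented generation (RAG) for
# medication safety. The study's real system is not public; the
# database contents and function names below are hypothetical.

DRUG_DB = {  # stand-in for a curated drug-information database
    "warfarin": "Anticoagulant; major interaction with NSAIDs (bleeding risk).",
    "metformin": "Hold before iodinated contrast (lactic acidosis risk).",
}

def retrieve(query: str) -> list[str]:
    """Return curated entries whose drug name appears in the query."""
    return [info for drug, info in DRUG_DB.items() if drug in query.lower()]

def answer(query: str) -> str:
    """Ground the response in retrieved text instead of parametric memory."""
    context = retrieve(query)
    if not context:
        # Constrained retrieval: no curated match means no free-form answer.
        return "No curated entry found; escalate to pharmacist review."
    # In the real system an LLM would generate from `context`;
    # here we simply surface the retrieved evidence.
    return " ".join(context)

print(answer("Patient on warfarin prescribed ibuprofen, safe?"))
```

The key design property is the fallback branch: when retrieval finds nothing, the system abstains rather than generating from memory, which is the hallucination-reduction mechanism the note refers to.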

## Agent Notes

**Why this matters:** This is the clearest counter-evidence to Belief 5's pessimistic reading in the KB. Where NOHARM shows 22% severe error rates and the Oxford RCT shows zero improvement over controls, this study shows a POSITIVE centaur outcome: pharmacist + LLM outperforms pharmacist alone by 1.5x on the outcomes that matter most (serious harm errors). This is the centaur model working as intended.

**What surprised me:** The 1.5x improvement on serious harm specifically — not just average accuracy. This means the LLM helps most where the stakes are highest. That's the ideal safety profile: catching the worst errors. The RAG architecture may be key — this isn't a general chat LLM but a structured decision support tool with constrained information retrieval.

**What I expected but didn't find:** A clear statement of failure conditions. When does the co-pilot model FAIL to improve? The 61% accuracy ceiling suggests the co-pilot mode also misses ~39% of errors. The study doesn't clearly delineate what the LLM adds vs. what it misses.

**KB connections:**

- Counter-evidence to Sessions 8-11 clinical AI safety concern: the centaur model CAN work in specific conditions (RAG architecture, domain-expert+LLM combination, structured safety task)
- The centaur design requires domain expert + LLM — this is specifically a pharmacist co-pilot, not a physician being replaced
- Connects to NOHARM: NOHARM found 76.6% of severe errors are omissions. If the pharmacist+LLM catches errors the pharmacist alone misses, the omission-detection mechanism is real — but requires the pharmacist to be present and engaged (not automation bias mode)
- The RAG architecture is important: this isn't vulnerable to the misinformation propagation failure mode (Lancet DH 2026) the way a general LLM is, because it retrieves from a curated database
- Connects to the distinction between "clinical reasoning AI" (OE) and "structured CDSS with RAG" (this study) — these are different products with different safety profiles

**Extraction hints:**

- Primary claim: "LLM-based clinical decision support in co-pilot mode with a domain expert improves serious medication harm detection by 1.5x vs. pharmacist alone — evidence that centaur design works for structured safety tasks using RAG architecture"
- The constraint is important: centaur works when (a) the expert is engaged (not automation bias mode), (b) the LLM uses RAG (not parametric memory), (c) the task is structured (medication safety, 16 specialties)
- This limits the claim — it does NOT say "clinical AI is safe in general" — it says "LLM + expert in a structured RAG setting improves safety for a defined task"

**Context:** Cell Reports Medicine is a high-tier Cell Press journal for clinical translational research. Prospective cross-over design with clear comparison arms. 16 specialties gives the finding breadth across clinical contexts.

## Curator Notes

PRIMARY CONNECTION: Belief 5 counter-evidence — centaur model works under specific conditions

WHY ARCHIVED: Best positive clinical AI safety evidence found across 12 sessions; establishes the conditions under which centaur design improves outcomes

EXTRACTION HINT: Extract with explicit scope constraint: centaur + RAG + structured safety task = works; general CDSS + automation bias mode = doesn't work per other evidence

## Key Facts

- Cell Reports Medicine published prospective cross-over study in October 2025 (doi: 10.1016/j.xcrm.2025.00396-9, PMC12629785)
- Study tested 91 error scenarios based on 40 clinical vignettes across 16 medical and surgical specialties
- Pharmacist + LLM co-pilot achieved 61% accuracy (precision 0.57, recall 0.61, F1 0.59)
- Co-pilot mode increased accuracy by 1.5-fold over pharmacist alone for serious harm errors specifically
- LLM-based CDSS used retrieval-augmented generation (RAG) framework with curated drug database
@ -1,70 +0,0 @@
---
type: source
title: "JMIR 2025 Systematic Review: Knowledge-Practice Performance Gap in Clinical LLMs — Only 5% of 761 Studies Used Real Patient Data"
author: "JMIR authors (systematic review team)"
url: https://www.jmir.org/2025/1/e84120
date: 2025-11-01
domain: health
secondary_domains: [ai-alignment]
format: research-paper
status: enrichment
priority: medium
tags: [clinical-ai-safety, benchmark-performance-gap, llm-evaluation, knowledge-practice-gap, real-world-deployment, belief-5, systematic-review]
processed_by: vida
processed_date: 2026-03-24
enrichments_applied: ["medical LLM benchmark performance does not translate to clinical impact because physicians with and without AI access achieve similar diagnostic accuracy in randomized trials.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
---

## Content

Published in *Journal of Medical Internet Research* (JMIR), 2025, article e84120. Available in PMC as PMC12706444. Systematic review of 761 LLM evaluation studies across clinical medicine, analyzing 39 benchmarks.

**Key findings:**

- **Only 5%** of 761 LLM evaluation studies assessed performance on real patient care data
- Remaining 95% relied on medical examination questions (USMLE-style) or case vignettes
- Traditional knowledge-based benchmarks show saturation: leading models achieve 84-90% accuracy on USMLE
- **Conversational frameworks:** Diagnostic accuracy drops from 82% on traditional case vignettes to 62.7% on multi-turn patient dialogues — **a 19.3 percentage point decrease**
- LLMs demonstrate "markedly lower performance on script concordance testing (evaluating clinical reasoning) than on medical multiple-choice benchmarks"
- Review conclusion: "Recent audits reveal substantial disconnects from clinical reality and foundational gaps in construct validity, data integrity, and safety coverage"

**Related findings (npj Digital Medicine benchmark study):**

- Six LLMs evaluated: average total score 57.2%, safety score 54.7%, effectiveness 62.3%
- **13.3% performance drop in high-risk scenarios** vs. average scenarios

## Agent Notes

**Why this matters:** This is the methodological foundation under both the Oxford/Nature Medicine RCT (94.9% → 34.5% deployment gap) and the broader claim that OE's USMLE 100% benchmark performance doesn't predict clinical outcomes. The systematic review establishes that the benchmark-to-reality gap is systematic across the field, not anomalous. The 5% real-patient-data figure is particularly striking: 95% of clinical AI evaluation is done with examination-style questions rather than actual clinical workflows.

**What surprised me:** The 19.3 percentage point drop from case vignettes to multi-turn dialogues. This is the conversational complexity gap — the same model that answers discrete questions well fails in the back-and-forth of real clinical interaction. OE users query OE in conversational clinical language, making this gap directly relevant.

**What I expected but didn't find:** Any indication that the field is systematically correcting this — moving toward real-patient-data evaluation. The review documents the problem but doesn't identify a trend toward better evaluation practices.

**KB connections:**

- Methodological foundation for the Oxford/Nature Medicine RCT deployment gap finding
- Directly explains why OE's USMLE 100% benchmark performance (cited in Session 9) doesn't predict clinical safety
- Connects to NOHARM's finding that real clinical scenario evaluation (31 LLMs, complex vignettes) shows 22% severe error rates — vs. USMLE saturation at 84-90%
- The 13.3% performance drop in high-risk scenarios (npj Digital Medicine) maps to NOHARM's finding that omissions cluster in complex, high-acuity scenarios

**Extraction hints:**

- Primary claim: "95% of clinical LLM evaluation uses medical examination questions rather than real patient care data — a systematic evaluation methodology gap that makes benchmark performance (84-90% USMLE) uninterpretable as a clinical safety signal"
- Secondary: "Conversational frameworks reveal 19.3pp accuracy drop vs. case vignettes, demonstrating that LLMs fail in the back-and-forth interaction that defines actual clinical use"
- This could merge with the Oxford/Nature Medicine source as a unified "benchmark saturation and real-world deployment gap" claim

**Context:** JMIR is a leading peer-reviewed journal in digital health and health informatics. A systematic review of 761 studies is a large corpus. The PMC availability confirms peer review.

## Curator Notes

PRIMARY CONNECTION: Belief 5 — clinical AI safety evaluation methodology gap

WHY ARCHIVED: Provides systematic evidence that the KB's reliance on benchmark performance data (e.g., "OE scores 100% on USMLE") is epistemically weak — and establishes that the Oxford RCT deployment gap finding is part of a systematic pattern

EXTRACTION HINT: Extract the 5%/95% finding as a standalone methodological claim about the clinical AI evaluation field; pair with Oxford Nature Medicine RCT as empirical confirmation

## Key Facts

- JMIR systematic review analyzed 761 LLM evaluation studies across 39 benchmarks
- Only 5% of 761 studies assessed performance on real patient care data
- 95% of studies relied on medical examination questions (USMLE-style) or case vignettes
- Leading models achieve 84-90% accuracy on USMLE benchmarks
- Diagnostic accuracy drops from 82% on case vignettes to 62.7% on multi-turn dialogues (19.3pp decrease)
- npj Digital Medicine study: six LLMs averaged 57.2% total score, 54.7% safety score, 62.3% effectiveness
- 13.3% performance drop in high-risk scenarios versus average scenarios (npj Digital Medicine)
- LLMs show markedly lower performance on script concordance testing than on multiple-choice benchmarks
@ -1,78 +0,0 @@
---
type: source
title: "NHS England AI Scribing Supplier Registry (January 2026): 19 Vendors, DTAC + MHRA Class 1 Required — OpenEvidence Absent"
author: "NHS England / Digital Health Network"
url: https://www.digitalhealth.net/2026/01/nhs-england-launches-supplier-registry-for-ai-scribing-tech/
date: 2026-01-16
domain: health
secondary_domains: [ai-alignment]
format: news
status: null-result
priority: high
tags: [nhs-dtac, clinical-ai-safety, regulatory-compliance, openevidence, ambient-scribing, mhra, supplier-registry, uk-healthcare, belief-5]
processed_by: vida
processed_date: 2026-03-24
extraction_model: "anthropic/claude-sonnet-4.5"
extraction_notes: "LLM returned 2 claims, 2 rejected by validator"
---

## Content

NHS England published a self-certified supplier registry for AI-enabled ambient scribing (Ambient Voice Technology, AVT) on January 16, 2026. The registry was announced in early 2025 and launched following an open application process.

**Registry requirements for suppliers:**

- Completion of NHS DTAC (Digital Technology Assessment Criteria) assessment
- MHRA Class 1 Medical Device registration with evidence of post-market surveillance
- Proven impact and experience in healthcare environments
- Integration with existing NHS digital infrastructure
- Scalability
- Evidence of meeting stated clinical capabilities

**The 19 registered vendors (as of January 2026):**

33n, Accurx, Anathem, Aprobrium (Lexacom), Beam Up, Corti, Dictate IT, eConsult, HealthOrbit AI, Heidi Health, Lyrebird Health, Microsoft Dragon, Optum (EMIS), Pungo t/a Joy, Scribetech, Tandem, Tortus, T-Pro, X-On Health.

**Applications reopened February 3, 2026, and remain open indefinitely.**

**NHS DTAC V2 update (February 24, 2026):** NHS England published an updated DTAC form with 25% fewer questions, de-duplicated with DSPT and the pre-acquisition questionnaire. Deadline: ALL NHS digital health tool procurement must use the new form from April 6, 2026.

**NHS England April 2025 guidance on AI-enabled ambient scribing:** Mandates full clinical safety case (DCB0160), Data Protection Impact Assessment (DPIA), MHRA medical device determination, DTAC compliance.

**OpenEvidence "Visits" context:** In August 2025, OE launched "Visits" — a documentation tool that auto-generates clinical notes from patient encounters AND enriches notes with evidence-based guidelines. This is a hybrid documentation+CDSS tool that would need DTAC + MHRA Class 1 to be formally deployed in NHS settings. OE is **not on the 19-vendor registry.** OE's public website contains **no DTAC assessment and no MHRA registration evidence.**

## Agent Notes

**Why this matters:** The NHS supplier registry is the regulatory forcing function I hypothesized in Session 11. It's now operational: 19 vendors have met DTAC + MHRA Class 1 requirements. OpenEvidence "Visits" (documentation tool launched August 2025) would directly compete with tools on this registry — but OE has not completed the required compliance steps. OE's stated 2026 UK expansion plans require DTAC compliance for any NHS deployment. This creates a choice point for OE: formalize UK compliance (and thereby disclose clinical safety data) or remain UK individual-clinician only (informal use, not NHS-reimbursed).

**What surprised me:** OE's absence from the registry despite "Visits" being a clear ambient scribing competitor. The 19-vendor registry includes Microsoft Dragon and Accurx (major players) — OE would be a meaningful addition if it were compliance-ready. Its absence suggests either: (a) OE has not prioritized UK compliance, (b) OE has not completed DTAC assessment, or (c) OE is pursuing UK expansion through a different channel. Option (b) is consistent with all prior findings.

**What I expected but didn't find:** Any indication that OE has initiated a DTAC assessment or MHRA Class 1 registration process in anticipation of UK expansion. No press release from OE about EU or UK regulatory compliance has been found across 12 sessions.

**KB connections:**

- Directly relevant to OE model opacity finding (Sessions 8-11): DTAC compliance REQUIRES clinical safety case disclosure — this is the mechanism that could force the transparency the research literature has demanded
- Connects to NHS England's April 2025 ambient scribing guidance (DCB0160/0129) — OE Visits falls within scope
- Extends the regulatory track finding from Session 11 to a more concrete level: 19 vendors already complied; OE has not
- The DTAC V2 April 6 deadline (13 days from today) codifies the new form but doesn't create new substantive requirements — it's a procedural update

**Extraction hints:**

- Primary claim: "NHS England's January 2026 AI scribing supplier registry established DTAC completion and MHRA Class 1 registration as compliance requirements for clinical AI documentation tools in NHS settings — OpenEvidence 'Visits' is absent despite being a direct category competitor"
- Secondary claim: "DTAC assessment requires clinical safety case (DCB0160) disclosure — making NHS deployment an indirect forcing function for clinical AI safety transparency that market incentives have not produced"
- This is the UK regulatory equivalent of the EU AI Act (August 2026) for documentation tools specifically

**Context:** NHS England is the executive body of the NHS in England, responsible for overseeing and commissioning health services. DTAC is its baseline digital governance standard. The MHRA (Medicines and Healthcare products Regulatory Agency) is the UK equivalent of the FDA for medical devices.

## Curator Notes

PRIMARY CONNECTION: Session 11 regulatory track finding — NHS DTAC compliance is an observable forcing function

WHY ARCHIVED: Provides concrete evidence that the NHS regulatory compliance mechanism is operational (19 vendors), and that OE is choosing not to comply despite clear competitive incentive

EXTRACTION HINT: Focus on OE's conspicuous absence from registry + what DTAC compliance would require (clinical safety disclosure) — this is the structural gap claim

## Key Facts

- NHS England published AI scribing supplier registry on January 16, 2026
- 19 vendors completed DTAC + MHRA Class 1 requirements by registry launch
- Registry applications reopened February 3, 2026, remain open indefinitely
- DTAC V2 published February 24, 2026 with 25% fewer questions
- April 6, 2026 deadline: ALL NHS digital health procurement must use DTAC V2
- NHS England April 2025 guidance mandates DCB0160, DPIA, MHRA determination, DTAC for AI scribing
- OpenEvidence 'Visits' launched August 2025 as documentation + CDSS hybrid tool
- OpenEvidence is not on the 19-vendor NHS registry as of January 2026
- NHS DTAC requires DCB0160 clinical safety case including hazard identification, risk assessment, risk control measures, post-market surveillance
@ -1,74 +0,0 @@
---
type: source
title: "OBBBA Medicaid Work Requirements: 7 States With Pending Waivers, December 2026 Federal Mandate Deadline"
author: "Ballotpedia News / Georgetown CCF / NASHP / AMA"
url: https://news.ballotpedia.org/2026/01/23/mandatory-medicaid-work-requirements-are-coming-what-do-they-look-like-now/
date: 2026-01-23
domain: health
secondary_domains: []
format: news
status: null-result
priority: medium
tags: [obbba, medicaid, work-requirements, vbc, belief-3, structural-misalignment, enrollment-stability, vbc-attractor-state, state-policy]
processed_by: vida
processed_date: 2026-03-24
extraction_model: "anthropic/claude-sonnet-4.5"
extraction_notes: "LLM returned 0 claims, 0 rejected by validator"
---

## Content

As of January 23, 2026, implementation progress on OBBBA's Medicaid work requirements:

**Federal mandate:** All states must implement work requirements by **December 31, 2026**. States that need more time can request HHS extension to 2028.

**Work requirement terms:** Ages 19-64 must work or participate in qualifying activities ≥80 hours/month to maintain eligibility. Exemptions: parents of children ≤13, medically frail, and others.

**State-level progress (as of Jan 2026):**

- **7 states with pending Section 1115 waivers:** Arizona, Arkansas, Iowa, Montana, Ohio, South Carolina, Utah. All still pending at CMS as of January 23.
- **Nebraska:** Implementing via state plan amendment (without waiver), ahead of federal mandate.
- **Early implementation states** can proceed immediately; others have until December 31, 2026, or 2028 with extension.

**Federal funding:** $200M for HHS implementation, $200M for states in FY2026. Required state outreach to beneficiaries: June–August 2026.

**Scale context:** CBO projected 5.3M people losing Medicaid coverage; implementation timeline confirms this affects 2027 coverage losses (January 1, 2027 mandatory start date was confirmed in Session 8 analysis).

Supporting sources: Georgetown Center for Children and Families (CCF) analysis of how OBBBA changed the waiver landscape (July 2025); NASHP state-level policy update; AMA changes to Medicaid and ACA overview; King & Spalding detailed healthcare industry review.

## Agent Notes

**Why this matters:** The work requirements implementation timeline is on track for the disruption to VBC enrollment stability that Session 8 identified as the primary mechanism by which OBBBA threatens the attractor state thesis. The December 2026 deadline means observable effects will begin January 2027. The 7-state waiver pipeline shows early-mover states are actively pursuing implementation — this is not administrative stall.

**What surprised me:** The Nebraska precedent — implementing without a waiver via state plan amendment. This suggests states don't even need CMS waiver approval to proceed; they can use a state plan amendment if the OBBBA statutory requirement is self-executing. This accelerates the timeline.

**What I expected but didn't find:** Any substantial state-level resistance or legal challenges blocking implementation. The OBBBA work requirements appear to be proceeding through regulatory channels without the court injunctions that blocked earlier waiver-based work requirements. The political landscape has shifted.

**KB connections:**

- Directly extends Session 8 finding on OBBBA + VBC enrollment stability (Belief 3)
- The December 2026 deadline means VBC plan enrollment disruption begins Q1 2027 — this is the window to watch for BALANCE model implementation being tested against enrollment fragmentation
- Connects to OBBBA's 5.3M coverage loss (CBO) — these are disproportionately working-age adults with chronic conditions, exactly the population VBC risk-bearing plans need for prevention economics
- The June-August 2026 required state outreach is a potential signal point: if states fail to effectively notify beneficiaries, coverage loss will exceed CBO estimates

**Extraction hints:**

- This is an implementation status update for the Session 8 OBBBA claim — update the existing claim with: "seven states have pending waivers, Nebraska proceeding without waiver, December 2026 mandatory deadline confirmed"
- Primary new claim: "OBBBA Medicaid work requirements are on track for December 2026 implementation with 7 states seeking early waivers and Nebraska proceeding via state plan amendment — enrollment disruption for VBC prevention economics begins Q1 2027"
- Don't create a new claim; update the existing OBBBA source with this timeline confirmation

**Context:** Ballotpedia News provides nonpartisan tracking of state/federal policy; Georgetown CCF is the leading Medicaid policy research center. AMA and NASHP provide clinical/public health perspective. Cross-source consistency confirms the timeline.

## Curator Notes

PRIMARY CONNECTION: Belief 3 "structural misalignment" + OBBBA enrollment stability mechanism from Session 8

WHY ARCHIVED: Implementation update confirming that the December 2026 OBBBA enrollment disruption is on track — the KB needs to update confidence from "projected" to "in-progress"

EXTRACTION HINT: Update the existing OBBBA claim rather than creating a new one; the observation period is Q1 2027 when work requirements take full effect

## Key Facts

- As of January 23, 2026, 7 states have pending Section 1115 waivers for Medicaid work requirements: Arizona, Arkansas, Iowa, Montana, Ohio, South Carolina, Utah
- Nebraska is implementing work requirements via state plan amendment without waiver
- Federal mandate requires all states to implement by December 31, 2026, or request extension to 2028
- Work requirements: ages 19-64 must work or participate in qualifying activities ≥80 hours/month
- Exemptions include parents of children ≤13 and medically frail individuals
- Federal funding: $200M for HHS implementation, $200M for states in FY2026
- Required state outreach to beneficiaries: June-August 2026
- CBO projected 5.3M people losing Medicaid coverage
- Mandatory start date for states without extension: January 1, 2027
@ -1,74 +0,0 @@
---
type: source
title: "NHS DTAC V2 (February 2026): Updated Form With 25% Fewer Questions, Mandatory From April 6, 2026"
author: "NHS England / Periculo Cyber / Acorn Compliance"
url: https://www.periculo.co.uk/cyber-security-blog/dtac-version-2-what-digital-health-organisations-need-to-know-before-6th-april-2026
date: 2026-02-24
domain: health
secondary_domains: []
format: news
status: enrichment
priority: low
tags: [nhs-dtac, regulatory-compliance, digital-health, uk-healthcare, clinical-ai-safety, belief-5]
processed_by: vida
processed_date: 2026-03-24
extraction_model: "anthropic/claude-sonnet-4.5"
---

## Content

NHS England published an updated DTAC form on February 24, 2026. Key changes:

**What changed:**

- 25% reduction in questions
- De-duplicated with the DSPT (Data Security and Protection Toolkit) and the pre-acquisition questionnaire
- Clearer guidance on DTAC's purpose, scope, and how to complete assessments

**What DIDN'T change:**

- The five core DTAC domains: Clinical Safety, Data Protection, Technical Security, Interoperability, Usability & Accessibility
- The substantive clinical safety requirements (DCB0129/DCB0160)
- The requirement for all NHS digital health tool procurement to use DTAC assessment

**Implementation:**

- Previous version NOT to be used from April 6, 2026 onwards
- Suppliers already on NHS supplier registries must transition to the new form

**This is a PROCEDURAL update, not a new substantive requirement.** The compliance bar for clinical AI tools has not been raised or lowered — it's been streamlined.

Sources also: Periculo Cyber (cyber security compliance specialists), Acorn Compliance (healthtech compliance), NHS Transformation Directorate guidance portal.

## Agent Notes

**Why this matters (or why it matters less than I anticipated):** When researching the "April 6 deadline" from Session 11, I expected to find new substantive requirements. Instead, it's a form update — 25% fewer questions, better documentation. This is administrative streamlining, not a regulatory tightening. The "mandatory" framing in NHS communications made this sound like a new compliance gate; it's actually just a form swap.

**What surprised me:** The de-duplication with DSPT and the pre-acquisition questionnaire. This reduces friction for suppliers completing DTAC — it makes compliance EASIER, not harder. This partially undermines the "regulatory pressure forcing OE to disclose safety data" thesis from Session 11 — DTAC V2 is less burdensome, not more.

**What I expected but didn't find:** New Annex-III-style requirements for clinical AI specifically. The DTAC V2 update is general digital health governance (applies to apps, devices, platforms) — there's no AI-specific clinical safety update analogous to the EU AI Act's Annex III. That remains a gap in UK regulation.

**KB connections:**

- This corrects an overstatement from Session 11: "NHS DTAC V2 is a mandatory clinical safety standard" is accurate, but the "April 6, 2026 deadline" was framed as more consequential than it is
- The substantive compliance requirement is DCB0160 (clinical safety risk assessment) — unchanged
- The real regulatory pressure comes from the supplier registry (January 2026) and NHS procurement requirements — not DTAC V2 specifically
- Does NOT represent a new forcing function for OE safety disclosure; suppliers already using the previous DTAC form just switch forms

**Extraction hints:**

- Do NOT create a standalone claim for "DTAC V2 creates new compliance requirements" — it doesn't
- The relevant claim is already in the KB or in the supplier registry source: "NHS procurement of digital health tools requires DTAC assessment + clinical safety case (DCB0160)"
- This source is primarily a CORRECTION of Session 11's slightly elevated framing of the April 6 deadline

**Context:** Multiple compliance advisory firms (Periculo, Acorn) confirm this interpretation — DTAC V2 is an administrative update, not a new compliance threshold.

## Curator Notes

PRIMARY CONNECTION: Session 11 regulatory track finding — corrects overstatement about April 6 deadline significance

WHY ARCHIVED: Prevents future sessions from treating the DTAC V2 April 6 deadline as a major regulatory event — it's a form update, not a new substantive requirement

EXTRACTION HINT: Do not extract as a standalone claim; use as context correction for Session 11 regulatory track framing

## Key Facts

- NHS DTAC V2 published February 24, 2026
- DTAC V2 has 25% fewer questions than V1
- DTAC V2 de-duplicates with DSPT (Data Security and Protection Toolkit) and pre-acquisition questionnaire
- DTAC V2 mandatory from April 6, 2026
- Five core DTAC domains unchanged: Clinical Safety, Data Protection, Technical Security, Interoperability, Usability & Accessibility
- DCB0129/DCB0160 clinical safety requirements unchanged
- Previous DTAC version not to be used from April 6, 2026 onwards
@ -1,68 +0,0 @@
---
type: source
title: "PNAS 2026: US Life Expectancy Stagnation Rooted in Post-1970 Birth Cohort Mortality Deterioration"
author: "Abrams & Bramajo et al. (UTMB researchers)"
url: https://www.pnas.org/doi/full/10.1073/pnas.2519356123
date: 2026-03-10
domain: health
secondary_domains: []
format: research-paper
status: enrichment
priority: high
tags: [life-expectancy, deaths-of-despair, birth-cohort, cardiovascular-disease, cancer, external-causes, mortality-trends, healthspan, belief-1]
processed_by: vida
processed_date: 2026-03-24
enrichments_applied: ["Americas declining life expectancy is driven by deaths of despair concentrated in populations and regions most damaged by economic restructuring since the 1980s.md", "medical care explains only 10-20 percent of health outcomes because behavioral social and genetic factors dominate as four independent methodologies confirm.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
---

## Content

Published in *Proceedings of the National Academy of Sciences*, March 9–10, 2026, by UTMB researchers. Using Lexis diagrams, the study analyzed mortality changes from 1979–2023 for all-cause mortality and three cause groups (cardiovascular disease, cancer, external causes) across cohorts born between the 1890s and 1980s.
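The Lexis-diagram logic behind the cohort analysis can be sketched in a few lines: each (calendar year, age) cell of a mortality table lies on the diagonal of the birth cohort `cohort = period - age`, so cohort-level rates are obtained by pooling deaths and exposures along those diagonals. This is a minimal illustrative sketch with synthetic numbers, not the paper's actual data or code.

```python
# Minimal Lexis-grid sketch: map age-period mortality cells to birth cohorts.
# Synthetic counts for illustration only (assumption: not the PNAS data).

def birth_cohort(period, age):
    """On a Lexis diagram, birth cohort = calendar period - age."""
    return period - age

def cohort_rates(deaths, exposures):
    """Pool (period, age) death counts and person-years along cohort diagonals.

    deaths, exposures: dicts keyed by (period, age) tuples.
    Returns {cohort: deaths per person-year}.
    """
    d, e = {}, {}
    for (period, age), n in deaths.items():
        c = birth_cohort(period, age)
        d[c] = d.get(c, 0) + n
        e[c] = e.get(c, 0) + exposures[(period, age)]
    return {c: d[c] / e[c] for c in d}

# Two cells observed a decade apart that both belong to the 1970 cohort:
deaths = {(2010, 40): 30, (2020, 50): 80}
exposures = {(2010, 40): 10_000, (2020, 50): 10_000}
rates = cohort_rates(deaths, exposures)  # {1970: 0.0055}
```

Following a cohort's rate across ages like this, rather than comparing period snapshots, is what lets the authors separate the post-1970 cohort deterioration from the 2010 period effect.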
**Key findings:**

- The **1950s birth cohort** is the inflection point: general improvements in earlier cohorts gave way to deterioration in later cohorts.
- Cohorts born **since 1970** exhibit **increasing mortality in cardiovascular disease, cancer, AND external causes** compared to their predecessors — across all three cause groups simultaneously.
- A **broad period-based mortality deterioration beginning around 2010** affected nearly every living adult cohort at the time, driven primarily by cardiovascular disease mortality.
- These patterns portend **"an unprecedented longer-run stagnation, or even sustained decline, in US life expectancy."**
- Stagnating life expectancy is "not the result of a single cause but a complex convergence of rising chronic disease, shifting behavioral risks, and increases in certain cancers among younger adults."

Context: The CDC separately released 2024 life expectancy data showing US LE reached 79.0 years (up 0.6 from 78.4 in 2023) — a modest recovery from COVID and overdose mortality. But the PNAS cohort analysis shows this surface improvement masks structural deterioration embedded in younger cohorts.

Companion piece: the PNAS paper "Cohort mortality forecasts indicate signs of deceleration in life expectancy gains" (doi: 10.1073/pnas.2519179122), from the same period, uses cohort mortality forecasts to confirm the deceleration.

Coverage: News-Medical.net (March 10), UTMB newsroom (March 9), Subodh Verma MD on X summarizing the key cohort finding.

## Agent Notes

**Why this matters:** This is the strongest structural confirmation of Belief 1 (healthspan as civilization's binding constraint) in the past year. It is not just deaths of despair (drug overdoses, which temporarily surged and are now recovering) — it is a cohort-level deterioration across cardiovascular disease, cancer, AND external causes in Americans born after 1970. This is multi-causal, structural, and worsening.

**What surprised me:** The 2010 period effect deteriorating EVERY adult cohort simultaneously. This isn't just a younger-generation problem — something happened around 2010 that made ALL adult cohorts sicker. That's not a behavioral cohort story; it's a systemic environment story. This is highly relevant to the "compounding failure" framing of Belief 1.

**What I expected but didn't find:** Evidence of a genuine reversal or plateau in deaths of despair as a sign that the healthspan problem is self-correcting. The CDC's +0.6-year LE improvement in 2024 might have suggested recovery. The PNAS cohort analysis shows this is surface-level optimism — the structural problem is in the cohort trajectory.

**KB connections:**

- Directly strengthens Belief 1 ("Healthspan Is Civilization's Binding Constraint") — the compounding failure is confirmed across multiple cause categories
- Extends the deaths-of-despair framing: not just drug overdoses, but CVD and cancer also deteriorating in post-1970 cohorts
- Connects to Belief 2 (80-90% non-clinical determinants) — if this is "rising chronic disease, shifting behavioral risks," and behaviorally driven cancers, that's entirely within the non-clinical determinant zone
- The "2010 period effect" is a potential new claim candidate: something environmental/social changed system-wide around 2010

**Extraction hints:**

- Primary claim: "US life expectancy stagnation is driven by a cohort-level mortality deterioration in Americans born after 1970 spanning CVD, cancer, and external causes — not a single-cause problem"
- Secondary claim: "A period-based mortality deterioration beginning around 2010 affected nearly every adult US cohort simultaneously, suggesting systemic environmental/behavioral causes beyond cohort effects"
- Belief 1 update candidate: temporal language should shift from "binding constraint" to "worsening binding constraint with compounding cohort dynamics"
- Counter-note: CDC 2024 shows a +0.6-year LE recovery — should be noted as surface recovery from COVID/overdose mortality, not structural improvement

**Context:** UTMB = University of Texas Medical Branch. Lead researchers Abrams and Bramajo. Independently confirmed by the PNAS companion paper. This is a peer-reviewed, large-n historical analysis — the highest-quality evidence for longitudinal claims.

## Curator Notes

PRIMARY CONNECTION: Belief 1 "healthspan is civilization's binding constraint" — structural confirmation

WHY ARCHIVED: Direct disconfirmation target for Belief 1 in Session 12; result is that Belief 1 is CONFIRMED and STRENGTHENED, not disconfirmed

EXTRACTION HINT: Extract as TWO claims: (1) post-1970 cohort mortality deterioration across CVD+cancer+external causes; (2) 2010 period effect deteriorating all adult cohorts simultaneously — these have different causal implications

## Key Facts

- CDC released 2024 US life expectancy data showing 79.0 years (up 0.6 from 78.4 in 2023)
- PNAS published companion paper "Cohort mortality forecasts indicate signs of deceleration in life expectancy gains" (doi: 10.1073/pnas.2519179122)
- Study analyzed mortality changes from 1979–2023 for all-cause mortality and three cause groups (cardiovascular disease, cancer, external causes) across cohorts born between the 1890s and 1980s
- 1950s birth cohort identified as inflection point where mortality improvements gave way to deterioration
@ -1,84 +0,0 @@
---
type: source
title: "iatroX Clinical AI Insights 2026: OpenEvidence Has No DTAC Assessment or MHRA Registration for UK Deployment — US-Centric Corpus Adds Clinical Risk"
author: "iatroX Clinical AI Insights"
url: https://www.iatrox.com/blog/openevidence-chatgpt-5-medwise-ai-iatrox-uk-clinicians-dtac-nice-esf
date: 2026-03-20
domain: health
secondary_domains: []
format: blog-analysis
status: enrichment
priority: medium
tags: [openevidence, nhs-dtac, nice-esf, uk-healthcare, clinical-ai-safety, belief-5, regulatory-compliance, corpus-bias]
processed_by: vida
processed_date: 2026-03-24
enrichments_applied: ["OpenEvidence became the fastest-adopted clinical technology in history reaching 40 percent of US physicians daily within two years.md", "healthcare AI regulation needs blank-sheet redesign because the FDA drug-and-device model built for static products cannot govern continuously learning software.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
---

## Content

iatroX Clinical AI Insights is a UK-focused clinical AI review publication that evaluates tools through the lens of NHS governance requirements (DTAC, NICE Evidence Standards Framework, MHRA). Multiple 2025–2026 reviews compare OpenEvidence against UK-compliant alternatives.

**Key findings from multiple iatroX reviews:**

**1. OE UK governance status:**

- "OpenEvidence's UK-specific governance (DTAC/DCB) is not explicitly positioned on its public pages"
- OE qualifies as a US-focused tool being used informally by UK clinicians — not formally NHS-deployed
- OE has no published DTAC assessment, no MHRA Class 1 registration listed, and no NICE ESF submission

**2. US-centric corpus clinical risk:**

- OE is "built on a US-centric corpus"
- May cite AHA (American Heart Association) guidelines instead of NICE guidelines
- May suggest FDA-approved drugs that are (a) not licensed in the UK, or (b) not cost-effective for NHS prescribing (not on formulary)
- May reference dosing standards or treatment pathways that differ from the BNF (British National Formulary)
- This is a CLINICAL SAFETY RISK for UK physicians, distinct from the demographic bias or automation bias documented in prior sessions

**3. OE 2026 UK expansion signals:**

- OE has "signalled plans for global expansion as a key 2026 and beyond initiative"
- UK, Canada, and Australia identified as "English-first markets with lower regulatory barriers"
- But the "lower regulatory barriers" perception may be inaccurate for the UK: the NHS requires DTAC + MHRA Class 1 for formal deployment

**4. OE "Visits" documentation tool (August 2025):**

- OE Visits auto-generates clinical notes and enriches them with evidence-based guidelines
- Described as "hybrid documentation+CDSS" — directly competes with the 19 registered NHS AVT suppliers
- Not on NHS England's supplier registry (launched January 2026)
- Would require DTAC + MHRA Class 1 for formal NHS procurement

**5. UK landscape context:**

- UK-native compliant alternatives exist: iatroX, Medwise AI, Praktiki, Pathway — all DTAC-compliant with a UK guideline corpus
- NHS England's April 2025 ambient scribing guidance requires a clinical safety case (DCB0160), a DPIA, and mandatory human verification

## Agent Notes

**Why this matters:** iatroX provides the clearest independent assessment of what OE's governance gap means for UK clinical practice. The corpus risk is a different category from the demographic bias / automation bias concerns documented in prior sessions — it's not about LLM failure modes but about CONTENT misalignment with clinical practice guidelines. A UK physician querying OE about hypertension management may receive AHA recommendations (different thresholds than NICE) or be directed to drugs not available on the NHS formulary. This is immediately actionable clinical harm, not a probabilistic risk.

**What surprised me:** OE characterizing the UK as a market with "lower regulatory barriers" relative to the US. The UK NHS actually has MORE formal digital health procurement governance than the US (which has no federal-level equivalent to DTAC). OE's US-market framing may be a strategic misjudgment about UK regulatory requirements.

**What I expected but didn't find:** Any indication that OE has begun a DTAC assessment process in preparation for its stated 2026 UK expansion. Given the January 2026 supplier registry launch and the April 6 DTAC V2 deadline, OE has had 3+ months to begin compliance — and no announcement.

**KB connections:**

- New failure mode for OE in the UK context: US corpus → guideline mismatch → wrong recommendations for UK practice (distinct from demographic bias, automation bias, misinformation propagation)
- Directly extends the OE safety opacity thread from Sessions 8–11 into the UK market context
- The 19-vendor registry provides UK competitive context: OE Visits is behind UK-native tools in governance compliance
- Connects to the EU AI Act forcing function: if OE targets UK/EU expansion, regulatory compliance is not optional

**Extraction hints:**

- New claim: "OpenEvidence's US-centric corpus creates a clinical safety risk for UK physicians that is distinct from LLM failure modes: AHA vs. NICE guideline misalignment and off-formulary drug suggestions in a market where OE has no DTAC assessment or MHRA registration"
- This claim is PROVEN (the governance gap is documented; the corpus misalignment is documented; no counter-evidence from OE)
- This is a UK-specific extension of the Session 11 "OE model opacity" finding — different mechanism, same transparency gap

**Context:** iatroX is an independent UK clinical AI review publication, not affiliated with any AI company. Reviews are conducted from a clinical governance perspective. Multiple consistent reviews across 2025–2026 confirm the governance gap.

## Curator Notes

PRIMARY CONNECTION: OE model opacity thread (Sessions 8–11) — extended to UK clinical corpus mismatch

WHY ARCHIVED: Provides a previously undocumented clinical risk category for OE in non-US markets: guideline mismatch, not just LLM failure modes

EXTRACTION HINT: Extract as an "OE UK deployment risk" claim, keeping scope to UK clinical practice (NICE vs. AHA corpus misalignment); link to the DTAC absence finding

## Key Facts

- NHS England launched its ambient scribing supplier registry in January 2026 with 19 registered vendors
- NHS England's April 2025 ambient scribing guidance requires a clinical safety case (DCB0160), a DPIA, and mandatory human verification
- DTAC V2 deadline was April 6, 2026
- OpenEvidence Visits launched August 2025 as a hybrid documentation+CDSS tool
- UK-native DTAC-compliant alternatives include: iatroX, Medwise AI, Praktiki, Pathway