From f0d6522cb4cd2515a89d80bc87ff817b008c2d1c Mon Sep 17 00:00:00 2001 From: Teleo Agents Date: Tue, 21 Apr 2026 04:23:30 +0000 Subject: [PATCH] =?UTF-8?q?vida:=20research=20session=202026-04-21=20?= =?UTF-8?q?=E2=80=94=2015=20sources=20archived?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pentagon-Agent: Vida --- agents/vida/musings/research-2026-04-21.md | 113 ++++++++++++++++++ agents/vida/research-journal.md | 28 +++++ ...-tentative-approval-generic-semaglutide.md | 61 ++++++++++ ...al-mh-equity-medicaid-provider-gap-jmir.md | 77 ++++++++++++ ...1-goh-jama-llm-diagnostic-reasoning-rct.md | 62 ++++++++++ ...-21-heudel-ai-deskilling-scoping-review.md | 60 ++++++++++ ...1-hrsa-behavioral-health-workforce-2025.md | 74 ++++++++++++ ...1-jorem-telehealth-mental-health-access.md | 56 +++++++++ ...-medicaid-mental-health-treatment-rates.md | 62 ++++++++++ ...tal-health-workforce-shortage-2025-2026.md | 53 ++++++++ ...exus-telehealth-deprivation-disparities.md | 60 ++++++++++ ...ammography-optional-use-nature-medicine.md | 63 ++++++++++ ...ubmed-null-result-ai-durable-upskilling.md | 53 ++++++++ ...1-savardi-radiology-ai-error-resilience.md | 59 +++++++++ ...e-mental-health-apps-efficacy-attrition.md | 75 ++++++++++++ ...21-telehealth-disparities-2019-2020-jtt.md | 59 +++++++++ ...ho-glp1-obesity-guideline-december-2025.md | 69 +++++++++++ 17 files changed, 1084 insertions(+) create mode 100644 agents/vida/musings/research-2026-04-21.md create mode 100644 inbox/queue/2026-04-21-apotex-fda-tentative-approval-generic-semaglutide.md create mode 100644 inbox/queue/2026-04-21-digital-mh-equity-medicaid-provider-gap-jmir.md create mode 100644 inbox/queue/2026-04-21-goh-jama-llm-diagnostic-reasoning-rct.md create mode 100644 inbox/queue/2026-04-21-heudel-ai-deskilling-scoping-review.md create mode 100644 inbox/queue/2026-04-21-hrsa-behavioral-health-workforce-2025.md create mode 100644 
inbox/queue/2026-04-21-jorem-telehealth-mental-health-access.md create mode 100644 inbox/queue/2026-04-21-kff-medicaid-mental-health-treatment-rates.md create mode 100644 inbox/queue/2026-04-21-mental-health-workforce-shortage-2025-2026.md create mode 100644 inbox/queue/2026-04-21-pnas-nexus-telehealth-deprivation-disparities.md create mode 100644 inbox/queue/2026-04-21-praim-mammography-optional-use-nature-medicine.md create mode 100644 inbox/queue/2026-04-21-pubmed-null-result-ai-durable-upskilling.md create mode 100644 inbox/queue/2026-04-21-savardi-radiology-ai-error-resilience.md create mode 100644 inbox/queue/2026-04-21-smartphone-mental-health-apps-efficacy-attrition.md create mode 100644 inbox/queue/2026-04-21-telehealth-disparities-2019-2020-jtt.md create mode 100644 inbox/queue/2026-04-21-who-glp1-obesity-guideline-december-2025.md diff --git a/agents/vida/musings/research-2026-04-21.md b/agents/vida/musings/research-2026-04-21.md new file mode 100644 index 000000000..6f381511b --- /dev/null +++ b/agents/vida/musings/research-2026-04-21.md @@ -0,0 +1,113 @@ +--- +type: musing +domain: health +session: 24 +date: 2026-04-21 +status: active +--- + +# Research Session 24 — Clinical AI Deskilling Divergence + Digital Mental Health Access Expansion + +## Research Question + +**Primary:** Is there counter-evidence for AI-induced clinical deskilling — specifically, prospective studies showing AI calibrates or up-skills clinicians durably (not just while AI is present) — and does this evidence create a genuine divergence that changes the existing deskilling claim's confidence level? + +**Secondary:** Is digital mental health actually scaling to underserved populations in 2025-2026, or does the existing KB claim (technology "primarily serves the already-served") still hold? + +**Why this question now:** +Session 23 closed the loop on GLP-1 behavioral adherence. Two claims are READY TO EXTRACT from the extractor (GLP-1 access inversion, USPSTF gap). 
The most productive research direction for this session is the open structural question from Session 23: + +- The clinical AI deskilling body of evidence has grown substantially (1 → 5+ quantitative findings, Natali 2025 synthesis). But Session 23 flagged a potential divergence: AI IMPROVES performance while present AND reduces performance when absent. These aren't contradictory — they're two halves of the same dependency mechanism. But the divergence file hasn't been created yet. +- If counter-evidence exists showing AI durably improves skills (calibration studies, error-reduction RCTs), the divergence is genuine. If not, the deskilling pattern is one-directional. +- The mental health thread is flagged as a KB thin area: "what DOES work for scalable mental health delivery." Zero evidence archived on whether digital therapeutics are expanding access vs. serving already-served. + +## Keystone Belief + +**Belief 1: Healthspan is civilization's binding constraint, and we are systematically failing at it in ways that compound.** + +**Disconfirmation target:** +The specific grounding chain to challenge: the mental health supply gap is widening, not closing. If digital mental health is genuinely expanding access to previously underserved populations (Medicaid, rural, uninsured, non-English speaking), that would mean ONE layer of the compounding failure is being addressed. This wouldn't disconfirm Belief 1 wholesale, but it would complicate the "systematically failing" framing and require belief revision. + +**Belief 5 disconfirmation target:** +If there are prospective studies showing AI PREVENTS clinical errors durably (not just while present), that would weaken the "novel safety risks" framing. The existing claim [[human-in-the-loop clinical AI degrades to worse-than-AI-alone...]] has confidence: likely. Evidence of durable up-skilling would challenge this. 
+ +**What I expected to find:** +- No prospective studies showing durable AI up-skilling; the calibration evidence probably exists for narrow tasks but not generalized to clinical skill development +- Digital mental health access expansion: mixed — some promising evidence for specific modalities (text-based, app-based) reaching underserved populations, but structural barriers (internet access, digital literacy) limiting reach +- The deskilling divergence is real but lopsided: strong evidence for AI dependency/deskilling; weak or absent evidence for durable calibration/up-skilling + +## What I Searched For + +- Clinical AI up-skilling calibration prospective studies 2025-2026 (durable skill improvement with AI) +- Clinical AI error reduction RCT evidence beyond diagnostic accuracy (does AI prevent wrong decisions that humans make?) +- Digital mental health Medicaid rural underserved access expansion 2025-2026 +- Digital mental health scale access equity evidence +- USPSTF weight loss pharmacotherapy update 2026 (quick check — Session 23 said dead end but worth one re-check) +- GLP-1 biosimilar timeline FDA approval 2025-2026 (whether US generic access is moving faster than 2032 estimate) + +## Key Findings + +### 1. DISCONFIRMATION TEST RESULT — Clinical AI Up-Skilling: NULL (Belief 5 strengthened) + +**The disconfirmation question:** Is there peer-reviewed evidence that AI exposure durably improves physician clinical skills? + +**Answer: No — zero papers found.** PubMed search for "AI clinical decision support physician performance up-skilling calibration" (2024-2026) returned zero results. After 5+ years of large-scale clinical AI deployment (92% scribe adoption, 40% of physicians daily on OpenEvidence), no prospective study documents durable physician skill improvement from AI exposure. + +**The complement:** The deskilling literature is growing in the same period: +- Heudel et al. 2026 (ESMO, PMID 41890350): scoping review through August 2025. 
Evidence "consistent across specialties." Four specialties documented: colonoscopy (ADR 28.4% → 22.4%), radiology (12% false-positive increase), pathology (30%+ reversal of correct diagnoses), cytology (80-85% volume reduction → training pipeline destruction). +- The cytology finding is new to this session: lab consolidation from 45 to 8 centers reduces training case volumes by 80-85%. This is never-skilling via structural destruction of apprenticeship infrastructure — not cognitive dependency, but pipeline elimination. +- The null result on up-skilling is itself the finding: the deskilling literature has no peer-reviewed counterweight. + +**Belief 5 status:** SIGNIFICANTLY STRENGTHENED. The deskilling case is now one-directional: consistent cross-specialty empirical evidence of deskilling + never-skilling, zero peer-reviewed evidence of durable up-skilling, confirmed by a formal scoping review (Heudel 2026) that found no counter-evidence. + +### 2. Digital Mental Health Access: NOT CLOSING THE GAP (Belief 1 not disconfirmed) + +**The disconfirmation question:** Is digital mental health technology expanding access to underserved populations, complicating the "systematically failing" framing? + +**Answer: No — multiple convergent findings confirm the technology-primarily-serves-already-served thesis.** + +**Finding A — Jorem et al. 2026, JAMA Network Open (PMID 41784959):** 17,742 mental health specialists, 2018-2023 Medicare claims. Mental health telemedicine expansion associated with only 0.88 percentage points more rural visits. **Highest telemedicine providers see 3.55 percentage points FEWER new patients** than low-telemedicine providers — telemedicine is used for existing relationship retention, not new patient acquisition from underserved areas. Conclusion: "additional policy interventions may be required to achieve telemedicine's potential." + +**Finding B — Journal of Telemedicine and Telecare 2025:** 2019-2020 Medicare claims. 
COVID telehealth expansion EXPANDED disparities. Rural patients were MORE likely to use telehealth in 2019 (early adopters), LESS likely in 2020 (crowded out by urban surge). "Many patients in greatest need of healthcare are least likely to utilize telehealth services." + +**Finding C — Lancet Digital Health 2025 + npj Digital Medicine 2025:** Smartphone mental health apps have real efficacy (Hedges' g = 0.43) but 64% attrition in motivated, self-selected RCT participants. Real-world reach in underserved populations (lower digital literacy, privacy concerns, cultural/linguistic barriers) would be substantially lower. The populations with greatest treatment gap face highest engagement barriers. + +**Finding D — KFF 2025:** Medicaid adults with mental illness receive treatment at HIGHER rates than commercially insured (59% vs. 55%) — the largest unmet need is among the uninsured (63% unmet need). The primary access failure is not Medicaid populations but the uninsured. This reframes the problem: coverage matters more than technology. + +**Finding E — Mental health workforce shortage (JAPNA 2025, Nursing Clinics 2026):** 51-55 million Americans restricted by provider shortage. Shortage worsening. Telehealth proposed as mitigation but not resolving the structural gap. + +**Belief 1 status:** NOT DISCONFIRMED. The "systematically failing" framing holds. Technology is not closing the access gap for underserved populations — it's serving existing patients more conveniently. The structural gap (51-55 million affected, shortage worsening, digital tools with 64% attrition in best-case conditions) is not being offset by technology deployment. Coverage (Medicaid) matters more than technology for actual treatment rates. + +### 3. COUNTERINTUITIVE FINDING — Medicaid outperforms commercial insurance on mental health treatment rates + +Medicaid adults with mental illness receive treatment at 59% vs. 
55% for commercially insured — Medicaid is actually the better mental health coverage vehicle. The structural explanation: Medicaid has historically stronger behavioral health infrastructure (behavioral health carve-outs, FQHCs, community mental health centers) than commercial plans, which have narrow behavioral health networks despite parity requirements. The primary access gap is for the uninsured (37% treatment rate vs. 63% unmet need). + +### 4. GLP-1 Biosimilars — Already in KB (no new archiving needed) + +Background agent search found an existing KB claim: "Indian generic semaglutide exports enabled by evergreening rejection create a global access pathway before US patent expiry" (Delhi High Court ruling, March 2026). This thread is covered. The claim shows US patents remain active until 2031-2033, with Canadian high-income market launch in May 2026 as first test case. No new archiving needed. + +## Follow-up Directions + +### Active Threads (continue next session) + +- **Clinical AI deskilling divergence file:** The evidence is now sufficient to create a divergence file between "AI deskilling (performance declines when AI removed)" and "AI up-skilling while present (performance improves with AI assistance)." These are both true simultaneously — the dependency mechanism. The null result on durable up-skilling makes this a lopsided divergence with strong deskilling evidence and zero up-skilling counter-evidence, but the divergence captures the important structural tension. **Next session: draft the divergence file.** Files to reference: [[human-in-the-loop clinical AI degrades to worse-than-AI-alone...]] + [[AI diagnostic triage achieves 97 percent sensitivity...]]. + +- **Cytology never-skilling claim:** The Heudel 2026 finding on 80-85% training volume reduction (45 → 8 labs) is a new structural pathway distinct from cognitive deskilling. 
This is extractable as a standalone claim: "AI-enabled screening consolidation eliminates the training case volumes that develop clinical judgment, creating never-skilling through structural destruction of apprenticeship pipelines." The cytology case is the cleanest example. **Next session: extract this claim from Heudel 2026.** + +- **Medicaid mental health advantage:** The KFF finding (Medicaid 59% > commercial 55% treatment rate) is counterintuitive and extractable. The structural explanation (Medicaid behavioral health carve-outs + FQHC infrastructure) is more interesting than the raw number. **Next session: verify with additional KFF/SAMHSA data and extract if confirmed.** + +- **Mental health app attrition claim:** The 64% attrition in motivated RCT samples (Lancet Digital Health 2025, npj Digital Medicine 2025) is extractable as evidence for why digital mental health doesn't close the population-level access gap even when efficacy is real. **Next session: extract the two-part finding (real efficacy + engagement failure).** + +### Dead Ends (don't re-run these) + +- **GLP-1 biosimilars/USPSTF status:** GLP-1 biosimilar thread already covered by existing KB claim (Indian generics, Delhi HC ruling). USPSTF GLP-1 update — confirmed dead end from Session 23, nothing new. Don't re-run these searches. + +- **AI durable up-skilling literature search:** Confirmed null. Zero papers in PubMed. Don't search again for 6 months unless there's a specific trigger (RCT publication announced, medical school prospective study published). + +- **Health Affairs/SAMHSA/APA direct website fetches:** These URLs consistently return 403 errors. Use PubMed searches and KFF instead for US health data. + +### Branching Points (one finding opened multiple directions) + +- **Jorem et al. "fewer new patients" finding:** Direction A — extract as standalone claim about telemedicine's retention vs. 
access-expansion mechanism; Direction B — frame as divergence between "telemedicine solves the access gap" (optimistic thesis) and "telemedicine serves existing relationships" (Jorem finding). Direction A first; the divergence can come later when there's a real competing claim. + +- **Mental health treatment gap coverage reframe:** Direction A — extract the Medicaid > commercial finding as a structural claim about behavioral health carve-outs; Direction B — use this to challenge the "serving the already-served" framing (Medicaid IS the most-served by mental health systems, but that's because Medicaid was designed for vulnerable populations). These aren't contradictory — pursue both, but frame carefully to avoid false tension. diff --git a/agents/vida/research-journal.md b/agents/vida/research-journal.md index be7e12373..518bab039 100644 --- a/agents/vida/research-journal.md +++ b/agents/vida/research-journal.md @@ -1,5 +1,33 @@ # Vida Research Journal +## Session 2026-04-21 — Clinical AI Deskilling Divergence + Digital Mental Health Access: Both Null Disconfirmations + +**Question:** (1) Is there counter-evidence for AI-induced clinical deskilling — prospective studies showing AI calibrates or up-skills clinicians durably? (2) Is digital mental health technology actually expanding access to underserved populations? + +**Belief targeted:** Belief 5 (clinical AI creates novel safety risks) via disconfirmation — searched for durable up-skilling evidence. Belief 1 (systematically failing in compounding ways) via disconfirmation — searched for digital mental health closing the access gap for underserved. + +**Disconfirmation result:** DOUBLE NULL — both disconfirmation searches failed to find counter-evidence: + +(1) AI durable up-skilling: **CONFIRMED NULL**. PubMed search for durable physician skill improvement from AI exposure (2024-2026) returned zero results. Heudel et al. 
2026 scoping review (ESMO, PMID 41890350) reviewed all available evidence through August 2025 and found no counter-evidence to deskilling. The deskilling case is now one-directional — consistent evidence of deskilling, zero peer-reviewed evidence of durable up-skilling. Belief 5 significantly strengthened. + +(2) Digital mental health access expansion: **NOT DISCONFIRMED**. Three independent lines of evidence confirm "serves already-served": Jorem et al. 2026 (JAMA Net Open) — highest telemedicine providers see 3.55 pp FEWER new patients, only 0.88 pp more rural visits; JTT 2025 — COVID telehealth expansion EXPANDED rural/demographic disparities; Lancet Digital Health/npj Digital Medicine 2025 — 64% attrition in motivated RCT participants. Coverage (Medicaid) matters more than technology — Medicaid adults have HIGHER treatment rates than commercial (59% vs 55%). + +**Key finding:** Cytology never-skilling mechanism (Heudel 2026): AI-enabled screening consolidation reduced training case volumes 80-85% (45 → 8 UK labs). This is never-skilling via structural destruction of apprenticeship infrastructure — not cognitive dependency but pipeline elimination. It's irreversible without rebuilding training infrastructure and is the most alarming mechanism in the deskilling literature. + +Secondary key finding: Jorem et al. 2026 "fewer new patients" finding — high-telemedicine mental health providers see FEWER new patients (3.55 pp), not more. Telemedicine is a retention tool for existing relationships, not an access expansion tool. This is the mechanism explaining why mental health telemedicine fails to serve underserved populations despite theoretical geographic reach. + +Counterintuitive finding: Medicaid adults with mental illness receive treatment at HIGHER rates than commercially insured (59% vs 55%). The primary mental health access failure is for the uninsured (37% treatment rate, 63% unmet need), not Medicaid populations. 
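The 64% attrition figure quoted above can be turned into a crude population-level discount. A back-of-envelope sketch (my own illustrative assumption that dropouts receive no benefit; not a calculation from the cited papers):

```python
def diluted_effect(per_protocol_g: float, attrition: float) -> float:
    """Scale a per-protocol effect size by the fraction of users
    who stay engaged (assumes dropouts receive ~zero benefit)."""
    return per_protocol_g * (1.0 - attrition)

# Standard apps: Hedges' g = 0.43 with 64% attrition in motivated RCT samples
print(round(diluted_effect(0.43, 0.64), 2))  # 0.15
```

Under that deliberately pessimistic assumption, the best-case app effect shrinks to roughly a third of its trial-condition size, which is the arithmetic intuition behind "real efficacy, engagement failure."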
+ +**Pattern update:** Sessions 1-24 now show a consistent pattern: every attempt to disconfirm Belief 1 ("systematically failing in compounding ways") and Belief 5 ("novel safety risks from clinical AI") instead produces confirmation or strengthening. Session 24's double null is the clearest instance yet — the disconfirmation searches found nothing. In principle, consistent null results could reflect filter bias (I'm not searching in the right places) — but the Heudel 2026 scoping review is the strongest possible counter to this concern: it specifically looked for counter-evidence and found none. + +The deskilling pattern is now: (1) cognitive deskilling (performance decline when AI removed); (2) automation bias (commission errors from following incorrect AI); (3) never-skilling via cognitive pipeline (no productive struggle); (4) never-skilling via structural pipeline (training volume destruction). Four distinct pathways, all empirically documented. + +**Confidence shift:** +- Belief 5 (clinical AI creates novel safety risks): **STRONGLY STRENGTHENED** — one-directional evidence base confirmed by formal scoping review. Zero counter-evidence. Cytology never-skilling is a new structural mechanism. +- Belief 1 ("systematically failing in compounding ways"): **UNCHANGED BUT SCOPE EXTENDED** — digital mental health adds another documented technology-doesn't-fix-it layer. Apps work at individual level (g=0.43) but 64% attrition limits population reach. The "systematically failing" claim is confirmed across yet another dimension (mental health technology access). + +--- + ## Session 2026-04-13 — USPSTF GLP-1 Gap + Behavioral Adherence: Continuous-Delivery Thesis Complicated **Question:** What is the current USPSTF status on GLP-1 pharmacotherapy recommendations, and are behavioral adherence programs closing the gap that coverage alone can't fill — particularly for the 85.7% of commercially insured GLP-1 users who don't achieve durable metabolic benefit? 
diff --git a/inbox/queue/2026-04-21-apotex-fda-tentative-approval-generic-semaglutide.md b/inbox/queue/2026-04-21-apotex-fda-tentative-approval-generic-semaglutide.md new file mode 100644 index 000000000..d57172f32 --- /dev/null +++ b/inbox/queue/2026-04-21-apotex-fda-tentative-approval-generic-semaglutide.md @@ -0,0 +1,61 @@ +--- +type: source +title: "First US FDA tentative approval for generic semaglutide injection (Apotex/Orbicular, April 10, 2026) signals US generic machinery in motion but marketing blocked by patents until ~2032" +author: "Apotex press release; HCPLive; FDA biosimilar product list" +url: https://www.apotex.com/global/news/news-release/2026/04/10/apotex-receives-first-us-fda-tentative-approval-for-a-generic-version-of-ozempic-semaglutide-injection-in-partnership-with-orbicular +date: 2026-04-10 +domain: health +secondary_domains: [] +format: press-release +status: unprocessed +priority: medium +tags: [GLP-1, semaglutide, generic, FDA, biosimilar, patent, access] +--- + +## Content + +**Event:** April 10, 2026 — FDA granted tentative approval to Apotex Inc. for an ANDA (Abbreviated New Drug Application) for generic semaglutide injection, developed in partnership with Orbicular Pharmaceutical Technologies. + +**What "tentative approval" means:** +- Confirms the application meets FDA standards for quality, safety, and efficacy +- Does NOT permit commercial marketing because patent and exclusivity barriers from Novo Nordisk's 154 US patents remain in place +- As the first ANDA filer, Apotex would receive 180 days of market exclusivity upon eventual commercial launch + +**Why this matters despite no commercial launch:** +This is the first US FDA action on any generic semaglutide application — it signals that the generic industry has cleared regulatory hurdles and the only remaining barriers are legal/patent. 
The tentative approval validates Apotex's formulation and manufacturing quality, positioning them for immediate commercial launch if/when patents are successfully challenged. + +**Patent timeline:** +- Semaglutide primary compound patent: expired March 20, 2026 in US, but Patent Term Extension extends key '343 patent to December 5, 2031 +- Novo holds 154 granted US patents across 320 applications — substantial thicket +- Realistic US generic market entry: 2031-2033, with 2032 as central estimate +- No Paragraph IV patent challenge ruling is yet public + +**Contrast with international:** +Zero semaglutide generics appear on FDA's approved (non-tentative) biosimilar/generic product list as of April 2026. Meanwhile, generic launches are occurring in India, Canada, China, Brazil, Turkey. The US is 5-7 years behind international markets on generic access. + +**IRA context:** CMS negotiated Maximum Fair Price (MFP) for semaglutide of **$274/month**, effective January 2027 for Medicare Part D. This is the IRA price negotiation outcome — distinct from the Trump MFN deal ($245/month Medicare/Medicaid). The IRA price creates a second structural price reduction mechanism taking effect in 2027. + +## Agent Notes + +**Why this matters:** The Apotex tentative approval is the first concrete signal that the US generic entry machinery is in motion. The timing matters: primary patent expiration (March 2026) → tentative approval (April 2026) → 5-year patent extension blocking commercial launch (until 2031-2033). This is the "patent thicket" mechanism playing out in real time. The generic industry is ready and waiting; Novo's 154 patents are the only barrier. + +**What surprised me:** The speed — just 21 days after the primary compound patent expired (March 20), FDA issued tentative approval (April 10). This suggests Apotex filed the ANDA well in advance and FDA had it ready. 
The generic competition pressure is real and immediate in the legal sense, even if commercial launch is years away. + +**What I expected but didn't find:** A Paragraph IV patent challenge ruling (Apotex challenging specific Novo patents). If Apotex wins a Paragraph IV challenge, they could launch before 2031. No such ruling is yet public as of April 2026. + +**KB connections:** +- [[GLP-1 receptor agonists are the largest therapeutic category launch in pharmaceutical history but their chronic use model makes the net cost impact inflationary through 2035]] — existing claim already references $245/month MFN price and price compression trajectory. The tentative approval is a new evidence point that the 2032 timeline is the operative constraint, not regulatory uncertainty. +- Existing claim: "Indian generic semaglutide exports enabled by evergreening rejection create a global access pathway before US patent expiry" — the Apotex US tentative approval is the domestic counterpart: US generic machinery is moving, patents are the only barrier. 
+ +**Extraction hints:** +- The tentative approval is probably best used as an enrichment to the existing GLP-1 access claim rather than a standalone claim +- The IRA $274/month negotiated price (January 2027) is worth noting as a separate structural price reduction mechanism — two now exist (MFN $245 + IRA $274) for public program populations +- Key claim to consider: "The US GLP-1 pricing structure has shifted from ~$1,000/month to ~$245-274/month for public program populations via Trump MFN deal and IRA negotiation, with commercial market excluded — creating a two-tier affordability system" + +## Curator Notes + +PRIMARY CONNECTION: [[GLP-1 receptor agonists are the largest therapeutic category launch in pharmaceutical history but their chronic use model makes the net cost impact inflationary through 2035]] + +WHY ARCHIVED: First US FDA tentative approval for generic semaglutide (April 10, 2026) — confirms generic machinery in motion while patents block commercial launch. Also includes IRA negotiated price ($274/month, January 2027) not yet in KB. + +EXTRACTION HINT: Best used as enrichment evidence for the existing GLP-1 economics claim. The two-tier pricing system (public programs $245-274; commercial market excluded) is the most extractable new claim. 
diff --git a/inbox/queue/2026-04-21-digital-mh-equity-medicaid-provider-gap-jmir.md b/inbox/queue/2026-04-21-digital-mh-equity-medicaid-provider-gap-jmir.md new file mode 100644 index 000000000..74ee94464 --- /dev/null +++ b/inbox/queue/2026-04-21-digital-mh-equity-medicaid-provider-gap-jmir.md @@ -0,0 +1,77 @@ +--- +type: source +title: "Medicaid facilities 25% less likely to offer telehealth; facilities in counties with >20% Black residents 42% less likely to offer it — coverage-to-access gap is structural" +author: "Multiple authors (JMIR 2024; ASPE/HHS Medicaid telehealth trends 2019-2021)" +url: https://www.jmir.org/2024/1/e59939 +date: 2024-01-01 +domain: health +secondary_domains: [] +format: journal-article +status: unprocessed +priority: high +tags: [telehealth, mental-health, Medicaid, access-equity, provider-participation, structural-barrier, race] +--- + +## Content + +**Primary source:** JMIR 2024. "Equity in Digital Mental Health Interventions in the US." e59939. + +**Supporting source:** ASPE/HHS. "Medicaid and CHIP Telehealth Utilization: Enrollee and Provider Rurality, 2019-2021." 2024. + +**Key findings:** + +**Provider participation gap (the core mechanism):** +1. Facilities accepting Medicaid were **~25% less likely to offer telehealth services** than non-Medicaid facilities — the populations with the most need for telehealth (Medicaid enrollees) are served by providers least likely to offer it +2. Facilities in counties with greater than **20% Black residents were 42% less likely** to offer telehealth services compared to predominantly White counties +3.
Medicaid/CHIP-enrolled children in counties with higher Black and Hispanic populations were **less likely to receive telemental health services** + +**Coverage-to-access gap:** +- 46 state Medicaid programs now reimburse audio-only telehealth in some form (up from near-zero pre-2020) +- 37 states allow FQHCs to serve as distant-site telehealth providers +- But: coverage does not equal access when providers don't participate + +**Audio-only telehealth — the equity-relevant exception:** +- Medicare beneficiaries who are older, racial/ethnic minorities, dual-enrolled, rural, or have low broadband access are significantly more likely to use audio-only than video-based telehealth +- Audio-only reaches the populations that cannot manage video — it is functionally the most equitable modality +- Maryland is cited as the only state that has legislatively expanded Medicaid telehealth definition to include text messaging + +**What is reaching underserved populations:** +- Audio-only telehealth +- Crisis Text Line (over-indexes on young, rural, low-income users) +- FQHCs adopting telemental health showed 5-7% increase in visit rates among Medicaid and low-income groups +- Culturally adapted digital interventions (effect size g=0.90 for racial/ethnic minorities vs g=0.43 for standard apps) — though attrition remains 42% even in these adapted programs + +**What is reinforcing disparities:** +- Video-based telehealth (dominant modality): 1.62-1.67x more common in low-deprivation areas (PNAS Nexus 2025) +- Standalone apps (BetterHelp, Headspace, Calm): cost $260-400/month, no Medicaid coverage, predominantly insured/higher-income/younger/White users +- Text therapy (Talkspace, BetterHelp messaging): $65-100/week, no Medicaid coverage in virtually all states + +**JMIR meta-finding:** "No specific equity data by modality or population currently exists in peer-reviewed literature" — the field acknowledges the evidence gap, suggesting disparities are systematically understated. 
+ +## Agent Notes + +**Why this matters:** This source identifies the precise structural mechanism behind the "serving the already-served" pattern: Medicaid facilities are LESS likely to offer telehealth, and majority-Black counties are far less likely to have telehealth-offering providers. The coverage expansion (46 states now reimburse audio-only) is real but doesn't translate to access because provider participation follows the same disparities as in-person care. Telehealth doesn't eliminate structural barriers — it may reproduce them in digital form. + +The audio-only finding is the most important partial positive signal: among modalities, audio-only over-indexes on precisely the populations (older, minority, rural, dual-enrolled) that video-based telehealth underserves. This suggests that audio-only policy is the equity-relevant lever. + +**What surprised me:** The culturally adapted digital intervention effect size (g=0.90 vs g=0.43 for standard apps) is a large and meaningful difference. If culturally adapted programs achieve double the effect size for racial/ethnic minorities, this is a strong argument that the "apps don't work" finding is partly a cultural adaptation failure, not a technology failure. The 42% attrition even in culturally adapted programs is still high, but the efficacy signal is stronger. + +**What I expected but didn't find:** Evidence that the gap between Medicaid and non-Medicaid provider telehealth participation was narrowing. The JMIR data describes the current state without trend data. 
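For reference, the g values compared above (0.90 vs 0.43) are Hedges' g, the bias-corrected standardized mean difference. A minimal sketch of the standard definition (not a recomputation from the papers' data):

```python
import math

def hedges_g(m1: float, m2: float, sd1: float, sd2: float,
             n1: int, n2: int) -> float:
    """Bias-corrected standardized mean difference between two groups."""
    # Pooled standard deviation
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp                    # Cohen's d
    j = 1.0 - 3.0 / (4 * (n1 + n2) - 9)  # small-sample correction
    return d * j
```

So g = 0.90 means the culturally adapted programs shifted outcomes by roughly twice as many pooled standard deviations as the standard apps, before any attrition discount.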
+ +**KB connections:** +- [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] — the provider participation gap (Medicaid facilities 25% less likely to offer telehealth) is the structural mechanism +- [[SDOH interventions show strong ROI but adoption stalls because Z-code documentation remains below 3 percent and no operational infrastructure connects screening to action]] — parallel structural mechanism: policy enables something (telehealth coverage, Z-code documentation) that doesn't translate to practice because operational infrastructure is absent + +**Extraction hints:** +- The 25% lower Medicaid facility telehealth rate is extractable as a structural claim about how coverage mandates fail when provider participation follows existing disparities +- The 42% less likely for majority-Black county facilities is the racial disparity mechanism +- Audio-only as the equity-relevant modality is a scope qualifier on the "digital mental health serves already-served" claim — audio-only is a genuine partial exception +- Culturally adapted g=0.90 vs standard g=0.43 is worth extracting as evidence that the app efficacy gap for minority populations is partly a design failure, not a technology failure + +## Curator Notes + +PRIMARY CONNECTION: [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] + +WHY ARCHIVED: Provider participation gap is the structural mechanism explaining why Medicaid coverage expansion doesn't translate to telehealth access — facilities serving Medicaid populations are 25% less likely to offer telehealth, reproducing in-person disparities in the digital modality. 
EXTRACTION HINT: Lead with the provider participation mechanism (25% less likely for Medicaid facilities), then add the racial geography finding (42% less likely in majority-Black counties). Include audio-only exception and culturally adapted programs as scope qualifiers. diff --git a/inbox/queue/2026-04-21-goh-jama-llm-diagnostic-reasoning-rct.md b/inbox/queue/2026-04-21-goh-jama-llm-diagnostic-reasoning-rct.md new file mode 100644 index 000000000..f075765c4 --- /dev/null +++ b/inbox/queue/2026-04-21-goh-jama-llm-diagnostic-reasoning-rct.md @@ -0,0 +1,62 @@ +--- +type: source +title: "RCT: LLM access does not significantly improve physician diagnostic reasoning — AI alone scored 16 points higher than physicians using it" +author: "Goh et al. (JAMA Network Open, October 2024)" +url: https://pmc.ncbi.nlm.nih.gov/articles/PMC11519755/ +date: 2024-10-28 +domain: health +secondary_domains: [ai-alignment] +format: rct +status: unprocessed +priority: high +tags: [clinical-ai, LLM, diagnostic-reasoning, RCT, physician-performance, human-AI-team] +--- + +## Content + +**Full citation:** Goh E et al. "Large Language Model Influence on Diagnostic Reasoning: A Randomized Clinical Trial." JAMA Network Open. October 28, 2024. PMC11519755. + +**Study design:** Single-blind RCT, stratified by career stage. 50 physicians (26 attendings, 24 residents). 244 clinical cases. + +**Key findings:** + +1. **LLM access did NOT significantly improve diagnostic reasoning:** Median score 76% (LLM group) vs. 74% (conventional resources) — non-significant 2-point difference. + +2. **AI alone scored 16 points higher than physicians using it:** The LLM standalone outperformed the human-AI team by 16 percentage points (the team's 76% median implies roughly 92% for the LLM alone). This is the most alarming finding: physicians with AI access performed no better than those without, while the AI alone would have performed substantially better. + +3.
**Companion RCT (different cognitive task):** LLM assistance DID improve management reasoning — suggesting the AI benefit is task-specific. Diagnosis and treatment planning benefit unevenly from AI support. + +4. **No durable skill evidence:** Single-session study, no longitudinal tracking, no washout condition. + +**Interpretation:** This is a different failure mode from deskilling — it's integration failure. Physicians fail to extract AI capability, achieving no improvement despite access to a 90%+ diagnostic tool. The team performs at the level of the human, not the AI. + +**Why this is distinct from deskilling:** +- Deskilling: skill degrades after AI exposure when AI is removed +- Integration failure (Goh 2024): skill does not improve despite AI access; the human anchors the team performance, not the AI +- Both produce the same outcome (human performance level), but through different mechanisms + +## Agent Notes + +**Why this matters:** The Goh 2024 RCT is methodologically the strongest evidence on the human-AI diagnostic team question — it's a real RCT (not observational) with reasonable sample size for a physician study. The null result is damning in a different way than deskilling: physicians aren't being harmed by the AI (no deskilling measured), but they're also not benefiting from it in the most common clinical task (diagnosis). The AI alone performs dramatically better, but the human-AI team doesn't outperform humans without AI. + +**What surprised me:** The 16-point gap between AI-alone and the human-AI team. This is the clearest evidence of integration failure in the literature. The benefit that could be extracted from the AI isn't being extracted. This connects to centaur design — the centaur only works if the human and AI roles are structurally separated. + +**What I expected but didn't find:** Stratification by AI experience or digital health literacy. Presumably, physicians who use AI tools more regularly would extract more benefit.
The trial was stratified by career stage (attending vs. resident), but whether career stage moderated the result is not reported in this summary. + +**KB connections:** +- [[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]] — the Goh 2024 study adds a related but distinct mechanism: even without deskilling, human-AI teams underperform AI alone because of integration failure (not extracting AI capability) +- [[centaur team performance depends on role complementarity not mere human-AI combination]] — this study is precisely what the centaur claim predicts: simply having access to AI doesn't create centaur performance; role complementarity requires deliberate design +- [[medical LLM benchmark performance does not translate to clinical impact because physicians with and without AI access achieve similar diagnostic accuracy in randomized trials]] — this is the same study referenced in the existing KB claim! The Goh 2024 study IS the grounding evidence for this KB claim. + +**Extraction hints:** +- The existing KB claim [[medical LLM benchmark performance does not translate to clinical impact...]] may already be grounded in Goh 2024. Check whether the existing claim file references this PMID before extracting. +- The "integration failure" concept — AI alone outperforms human-AI team because humans fail to extract AI capability — is worth adding to the existing claim as enrichment +- The management reasoning companion RCT (AI DOES improve treatment planning but NOT diagnosis) is worth noting as a scope qualifier + +## Curator Notes + +PRIMARY CONNECTION: [[medical LLM benchmark performance does not translate to clinical impact because physicians with and without AI access achieve similar diagnostic accuracy in randomized trials]] + +WHY ARCHIVED: This may be the primary evidence source for the existing KB claim — if so, archive as enrichment.
The "integration failure" mechanism (AI alone scores 16 points higher than human-AI team) is the strongest new element. + +EXTRACTION HINT: Check if this is already cited in the existing claim file before extracting. If it is, this is enrichment (add the 16-point gap finding and the management reasoning exception). If not, it's a primary source for the existing claim. diff --git a/inbox/queue/2026-04-21-heudel-ai-deskilling-scoping-review.md b/inbox/queue/2026-04-21-heudel-ai-deskilling-scoping-review.md new file mode 100644 index 000000000..1db6bac62 --- /dev/null +++ b/inbox/queue/2026-04-21-heudel-ai-deskilling-scoping-review.md @@ -0,0 +1,60 @@ +--- +type: source +title: "AI deskilling scoping review: evidence consistent across colonoscopy, radiology, pathology, cytology — no counter-evidence of durable up-skilling" +author: "Heudel PE, Crochet H, Filori Q, Bachelot T, Blay JY (ESMO Real World Data & Digital Oncology)" +url: https://pubmed.ncbi.nlm.nih.gov/41890350/ +date: 2026-03-19 +domain: health +secondary_domains: [ai-alignment] +format: journal-article +status: unprocessed +priority: high +tags: [clinical-ai, deskilling, never-skilling, physician-skills, automation-bias, scoping-review] +flagged_for_theseus: ["Clinical deskilling is domain-specific instance of general AI alignment failure; the cytology consolidation finding (80-85% training volume reduction) is the never-skilling pathway via structural destruction of training pipelines"] +--- + +## Content + +**Full citation:** Heudel PE, Crochet H, Filori Q, Bachelot T, Blay JY. "Artificial intelligence in medicine: a scoping review of the risk of deskilling and loss of expertise among physicians." ESMO Real World Data Digit Oncol. 2026 Mar 19; eCollection 2026 Jun. PMID: 41890350. DOI: 10.1016/j.esmorw.2026.100693. + +**Scope:** Literature reviewed through August 2025. Scoping review examining empirical evidence of physician skill loss following AI deployment across clinical specialties. 
+ +**Specialties with deskilling evidence:** + +1. **Colonoscopy/Endoscopy:** Adenoma detection rate (ADR) dropped from 28.4% to 22.4% when endoscopists reverted to non-AI procedures after repeated AI use — a 6.0 percentage point decline attributable to AI reliance. ADR remained stable at 25.3% with AI assistance. This is the strongest quantitative deskilling signal in the literature. + +2. **Radiology (breast imaging):** Erroneous AI prompts increased false-positive recalls by up to 12% even among experienced radiologists — automation bias mechanism operating in expert practitioners, not just novices. + +3. **Computational pathology:** Over 30% of participants reversed correct initial diagnoses when exposed to incorrect AI suggestions under time pressure — mis-skilling in real time, not just skill decay. + +4. **Cytology:** Following UK cervical screening consolidation (shift to AI-assisted reading), case volumes reduced 80-85%, consolidating labs from 45 to 8 centers. The authors identify this as having "major implications for training capacity" — the never-skilling pathway via structural volume destruction. When training volume is eliminated, skills are never acquired in the first place. + +**Counter-evidence:** The review found NO opposing evidence. Authors conclude: "empirical studies consistently demonstrate that AI can inadvertently impair physicians' performance." No studies showed durable improvement in physician skills after AI exposure. + +**Recommendations:** Authors emphasize need for monitoring mechanisms and safeguards, but do not propose specific structural solutions. + +## Agent Notes + +**Why this matters:** This is the most comprehensive peer-reviewed synthesis of clinical AI deskilling evidence as of mid-2026. It extends the existing KB finding (colonoscopy ADR; Natali et al. 2025 multi-specialty review) with a formal scoping review covering the same specialties. 
The cytology lab consolidation finding is new and adds a structural never-skilling pathway that wasn't in previous deskilling literature — not physicians forgetting skills, but the training system being structurally dismantled. + +**What surprised me:** The cytology finding is the most alarming mechanism in this review. When lab consolidation reduces training case volumes by 80-85%, clinicians never acquire the skill in the first place — the never-skilling pathway isn't about individual cognitive dependency but systemic destruction of the apprenticeship infrastructure. This is worse than deskilling because it's irreversible without rebuilding training infrastructure. + +**What I expected but didn't find:** Any counter-evidence of durable up-skilling. I searched extensively for prospective studies showing AI calibrates physicians or produces lasting skill improvement — PubMed returned zero results for "AI clinical decision support physician performance up-skilling calibration." This null result is itself an important finding: after 5+ years of clinical AI deployment, there is no peer-reviewed evidence of durable skill improvement. 
+ +**KB connections:** +- [[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]] — this review provides the synthesis evidence supporting this claim +- [[AI diagnostic triage achieves 97 percent sensitivity across 14 conditions making AI-first screening viable for all imaging and pathology]] — the irony: higher AI accuracy → more reliance → more deskilling +- [[the physician role shifts from information processor to relationship manager as AI automates documentation triage and evidence synthesis]] — if physicians de-skill from AI reliance, the "relationship manager" role may also be compromised + +**Extraction hints:** +- The cytology never-skilling finding (80-85% volume reduction, 45 → 8 labs) is the strongest new claim candidate — it names a structural mechanism distinct from cognitive deskilling +- Consider: should this lead to a divergence file between "AI deskilling (performance declines when AI removed)" vs. "AI up-skilling (performance improves while AI present)"? The Heudel review's null result on up-skilling makes this divergence lopsided — strong evidence on one side, no evidence on the other +- The "no counter-evidence" finding is extractable: "No peer-reviewed study demonstrates durable physician skill improvement following AI exposure, while consistent evidence documents performance decline when AI assistance is withdrawn" — this has confidence: likely + +## Curator Notes + +PRIMARY CONNECTION: [[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]] + +WHY ARCHIVED: First comprehensive scoping review (post-2025) synthesizing clinical AI deskilling evidence across 4 specialties with null counter-evidence; introduces cytology structural never-skilling mechanism. 
+ +EXTRACTION HINT: Focus on two extractable claims: (1) the cytology/lab-consolidation never-skilling pathway as a structural mechanism distinct from individual cognitive deskilling; (2) the confirmed null result — no durable up-skilling evidence exists in the peer-reviewed literature as of mid-2026. diff --git a/inbox/queue/2026-04-21-hrsa-behavioral-health-workforce-2025.md b/inbox/queue/2026-04-21-hrsa-behavioral-health-workforce-2025.md new file mode 100644 index 000000000..477876079 --- /dev/null +++ b/inbox/queue/2026-04-21-hrsa-behavioral-health-workforce-2025.md @@ -0,0 +1,74 @@ +--- +type: source +title: "40% of US population (137 million) live in mental health shortage areas — HRSA projects 136,350 additional psychologists needed by 2038, with shortage worsening across all categories" +author: "HRSA (Health Resources and Services Administration)" +url: https://bhw.hrsa.gov/sites/default/files/bureau-health-workforce/data-research/Behavioral-Health-Workforce-Brief-2025.pdf +date: 2025-01-01 +domain: health +secondary_domains: [] +format: government-report +status: unprocessed +priority: high +tags: [mental-health, workforce-shortage, rural-health, psychiatry, HPSA, access] +--- + +## Content + +**Source:** HRSA State of the Behavioral Health Workforce 2025. Data as of December 31, 2025. 
+ +**Key statistics:** + +**Scale of shortage:** +- 4,212 designated Mental Health Professional Shortage Areas (HPSAs) in rural areas +- ~40% of the US population — approximately **137 million Americans** — live in areas with designated mental health provider shortages +- 1,797 additional practitioners needed just to remove rural HPSA designations + +**2025 HRSA workforce shortfall projections (additional providers needed):** +- 16,940 mental health and substance abuse social workers +- 13,740 school counselors +- 8,220 psychologists +- 6,080 psychiatrists +- 2,440 marriage and family therapists + +**Long-term projection:** 136,350 additional psychologists needed by 2038 to meet all unmet need (under current utilization-based estimates; true gap is larger if latent unmet demand is included) + +**Rural-urban disparity (severe and persisting):** +- 22% of rural counties have no social workers (vs. 5% of urban counties) +- Rural counties are 3x more likely to have no psychologist than urban counties +- 70% of US counties have no child/adolescent psychiatrist at all +- Psychiatrist density: 17.5/100K in metro areas vs. **5.8/100K in non-metro areas** (3x gap) +- Patient-to-provider ratios reach 5,000:1 in some rural areas + +**Emergency system failure signal:** +- Emergency department visits for suicide attempts/intentional self-harm more than tripled (0.6% → 2% of ED visits, 2015-2020) — interpreted as a proxy for the mental health system failing at the access layer (patients presenting to EDs when primary mental health care is unavailable) + +**Provider Medicaid participation compounds scarcity:** +- Many mental health providers won't accept Medicaid due to low reimbursement rates, making the effective provider-to-Medicaid-patient ratio substantially worse than the raw numbers suggest + +## Agent Notes + +**Why this matters:** This is the most specific and comprehensive workforce shortage data available. 
The 137 million Americans (40% of the population) in shortage areas figure provides the scale argument that the supply gap is not a marginal problem but a majority problem. The specific shortage counts by category (6,080 psychiatrists, 8,220 psychologists, etc.) are precise enough to support strong claims. + +The 136,350 additional psychologists needed by 2038 figure is striking — that's roughly equivalent to tripling the entire current US psychologist workforce. The shortage is not a gap that can be closed by policy changes or technology deployment alone; it would require a fundamental expansion of training pipelines, which takes decades. + +**What surprised me:** The 70% of US counties with no child/adolescent psychiatrist is the most alarming single statistic. Mental health conditions predominantly emerge in adolescence — and 70% of counties have no specialist provider for this population. The shortage is worst precisely where early intervention matters most. + +**What I expected but didn't find:** Evidence of significant shortage improvement trends. The HRSA data shows the shortage worsening, not improving, across all categories. 
+ +**KB connections:** +- [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] — this is the definitive quantification of the supply-side gap: 137 million Americans in shortage areas, specific shortfall counts by category +- [[social isolation costs Medicare 7 billion annually and carries mortality risk equivalent to smoking 15 cigarettes per day making loneliness a clinical condition not a personal problem]] — social isolation → untreated mental illness → ED presentation; the tripling of self-harm ED visits is the outcomes signal of this failure + +**Extraction hints:** +- Update the existing mental health supply gap claim with HRSA 2025 quantification: 137 million Americans in shortage areas; 6,080 more psychiatrists, 8,220 more psychologists needed as of 2025 +- The ED self-harm tripling (0.6% → 2% of ED visits, 2015-2020) is independently extractable as a system failure signal +- The 70% no child/adolescent psychiatrist finding may be extractable as a standalone structural failure claim +- Confidence: proven (HRSA government data, December 31, 2025) + +## Curator Notes + +PRIMARY CONNECTION: [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] + +WHY ARCHIVED: Definitive HRSA 2025 quantification of the mental health workforce shortage — 137 million Americans, specific shortfall by provider category, rural-urban disparity, and the ED self-harm tripling as outcomes evidence of system failure. + +EXTRACTION HINT: The 137 million / 40% figure should be the lead for any extraction. The ED self-harm tripling (0.6% → 2%) is the most emotionally resonant proxy for system failure. 
diff --git a/inbox/queue/2026-04-21-jorem-telehealth-mental-health-access.md b/inbox/queue/2026-04-21-jorem-telehealth-mental-health-access.md new file mode 100644 index 000000000..df3e049a4 --- /dev/null +++ b/inbox/queue/2026-04-21-jorem-telehealth-mental-health-access.md @@ -0,0 +1,56 @@ +--- +type: source +title: "Mental health telemedicine expansion produces only marginal rural access gains — high-telemedicine providers see fewer new patients" +author: "Jorem J, Wilcock AD, Busch AB, Huskamp HA, Mehrotra A (JAMA Network Open)" +url: https://pubmed.ncbi.nlm.nih.gov/41784959/ +date: 2026-03-02 +domain: health +secondary_domains: [] +format: journal-article +status: unprocessed +priority: high +tags: [mental-health, telehealth, access-equity, rural-health, treatment-gap, mental-health-workforce] +--- + +## Content + +**Full citation:** Jorem J, Wilcock AD, Busch AB, Huskamp HA, Mehrotra A. "Mental Health Specialist Telemedicine Uptake and Patient Location." JAMA Network Open. 2026 Mar 2;9(3):e260823. PMID: 41784959. + +**Study design:** Retrospective analysis of Medicare claims data, 2018-2023. 17,742 mental health specialists. Stratified providers by 2021 telemedicine usage levels (high vs. low). Compared patient geography in 2023 across usage strata. + +**Key findings:** + +1. Mental health specialists with highest telemedicine use had only **0.88 percentage points** more visits with rural patients compared to low-use providers — a statistically small difference that fails to close the rural access gap. + +2. Highest-telemedicine providers saw **3.55 percentage points fewer new patients** compared to low-use providers by 2023. This is the most counterintuitive finding: telemedicine adoption is associated with REDUCED new patient acquisition, suggesting specialists are using telehealth for existing patient relationships rather than to expand access. + +3. 
Small increases for patients in shortage areas (HPSA-designated), different states, and those 20+ miles away — statistically present but clinically marginal. + +4. Authors' conclusion: "greater telemedicine uptake was associated with only small increases in the share of visits to patients in rural, low-access-to-care, or distant communities," and "additional policy interventions may be required to achieve telemedicine's potential in addressing access disparities." + +**Context:** Data from November 2024-December 2025. Medicare claims sample, so limited to Medicare population (elderly and disabled). Mental health specialists only (psychiatrists, psychologists), not primary care with behavioral health integration. + +## Agent Notes + +**Why this matters:** This directly tests the hypothesis that telemedicine is closing the mental health access gap for underserved populations. It uses a large, longitudinal national dataset (not a single-center study) and reaches a clear null result: telemedicine expansion among mental health specialists is NOT substantially expanding access to rural or underserved populations. The "fewer new patients" finding is the most important — it reveals the mechanism: telemedicine is being used to maintain existing relationships (greater convenience, less dropout), not to acquire new underserved patients. + +**What surprised me:** The 3.55 percentage points FEWER new patients among high-telemedicine providers is genuinely counterintuitive. The intuitive hypothesis is: telemedicine removes geographic barriers → providers see more new patients from distant/rural locations. The finding is the opposite: high telemedicine providers are seeing fewer new patients overall. This suggests telemedicine is primarily a retention tool for existing patients, not an access expansion tool. 
The "why" isn't fully explained, but two mechanisms are plausible: (1) high telemedicine providers may be filling their capacity with existing patients via convenient virtual follow-ups; (2) new patient acquisition requires in-person trust-building that telemedicine doesn't easily enable. + +**What I expected but didn't find:** A clear positive finding showing telemedicine expanding rural mental health access. I expected the geographic reach data to show meaningfully higher shares of rural patients for high-telemedicine providers — even 5-10 percentage points. The 0.88 percentage point difference is a near-null result at scale. + +**KB connections:** +- [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] — this provides strong quantitative evidence for the "serves already-served" mechanism +- [[healthcare AI creates a Jevons paradox because adding capacity to sick care induces more demand for sick care]] — potential parallel: telehealth capacity added to existing relationships rather than expanding to underserved (a retention Jevons paradox) + +**Extraction hints:** +- This is the strongest available evidence for the claim that telehealth mental health primarily serves existing patients rather than expanding access to underserved populations +- The "fewer new patients" finding could be extracted as a standalone claim: "Mental health telehealth expansion is associated with reduced new patient acquisition, revealing a retention mechanism that prevents geographic access expansion" +- Confidence: likely (large national dataset, longitudinal, peer-reviewed in JAMA Network Open; limited to Medicare population) + +## Curator Notes + +PRIMARY CONNECTION: [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] + +WHY ARCHIVED: First
large-scale national longitudinal study (n=17,742 providers, 2018-2023 Medicare data) directly testing whether telemedicine expansion improves rural mental health access — null result with counterintuitive "fewer new patients" finding. + +EXTRACTION HINT: Focus on the "fewer new patients" mechanism — this explains WHY telemedicine serves existing rather than new patients, which is the core of the "already-served" claim. diff --git a/inbox/queue/2026-04-21-kff-medicaid-mental-health-treatment-rates.md b/inbox/queue/2026-04-21-kff-medicaid-mental-health-treatment-rates.md new file mode 100644 index 000000000..7f98ba027 --- /dev/null +++ b/inbox/queue/2026-04-21-kff-medicaid-mental-health-treatment-rates.md @@ -0,0 +1,62 @@ +--- +type: source +title: "Medicaid adults with mental illness receive treatment at higher rates than commercially insured (59% vs 55%) — but 41% unmet need persists and uninsured face 63% unmet need" +author: "KFF (Kaiser Family Foundation)" +url: https://www.kff.org/mental-health/issue-brief/5-key-facts-about-medicaid-coverage-for-adults-with-mental-illness/ +date: 2025-01-01 +domain: health +secondary_domains: [] +format: policy-brief +status: unprocessed +priority: medium +tags: [mental-health, Medicaid, treatment-gap, access-equity, insurance-coverage] +--- + +## Content + +**Source:** KFF Issue Brief. "5 Key Facts About Medicaid Coverage for Adults with Mental Illness." Data year: 2023. + +**Key data points:** + +1. **Medicaid adults with mental illness treatment rate:** 59% received treatment in 2023 — higher than both commercially insured (55%) and uninsured (37%). + +2. **Treatment gap by coverage type:** + - Medicaid: 41% unmet need + - Private insurance: 45% unmet need + - Uninsured: 63% unmet need + +3. 
**Serious mental illness (SMI) treatment rates:** + - Medicaid enrollees with SMI: 77% received treatment + - Private insurance with SMI: 71.6% received treatment + - The Medicaid treatment advantage is larger for SMI (5.4 points) than for mental illness overall (4 points) + +4. **Scale:** Approximately 52 million nonelderly adults have mental illness; Medicaid covers about 15 million (29%) of them — nearly 1 in 3 nonelderly adults with mental illness is Medicaid-enrolled. + +5. **Complicating factors:** Medicaid enrollees with mental illness have higher rates of chronic conditions and substance use disorders; coverage alone doesn't eliminate all care barriers. + +**Data source:** SAMHSA (Substance Abuse and Mental Health Services Administration), 2023. + +## Agent Notes + +**Why this matters:** This counterintuitively shows that Medicaid provides BETTER mental health treatment access than commercial insurance — the 59% vs. 55% finding challenges the narrative that Medicaid populations are uniformly the most underserved for mental health care. The uninsured have the worst outcomes (37%), with an unmet-need gap 22 percentage points wider than Medicaid's (63% vs. 41%). This reframes the policy problem: the primary mental health access failure is for the uninsured, not for Medicaid populations. + +**What surprised me:** Medicaid actually outperforms commercial insurance on mental health treatment rates. I expected the reverse, given Medicaid's often-limited provider networks and lower reimbursement rates. The likely explanation: Medicaid's mental health coverage has historically been more comprehensive (behavioral health carve-outs, FQHC availability, community mental health centers) than commercial plans, which often have narrow behavioral health networks despite parity requirements. + +**What I expected but didn't find:** Evidence that Medicaid mental health coverage produces substantially worse outcomes than commercial coverage.
The finding is the opposite — Medicaid is actually a better coverage vehicle for mental health than commercial insurance. + +**KB connections:** +- [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] — this data provides a coverage-type breakdown of the treatment gap; the largest gap is for the uninsured +- [[value-based care transitions stall at the payment boundary because 60 percent of payments touch value metrics but only 14 percent bear full risk]] — parity analogy: commercial plans technically have parity but in practice have 45% unmet need + +**Extraction hints:** +- The counterintuitive Medicaid > commercial finding is extractable if it can be grounded in structural explanation (Medicaid's stronger behavioral health infrastructure vs. commercial narrow networks) +- The 63% unmet need for uninsured (vs. 41% Medicaid) is the clearest policy target +- Note: data is cross-sectional 2023, no trend; doesn't tell us if gaps are widening or narrowing + +## Curator Notes + +PRIMARY CONNECTION: [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] + +WHY ARCHIVED: Provides coverage-type breakdown of mental health treatment gap; the counterintuitive Medicaid > commercial finding challenges standard narratives and reframes the access problem as primarily an uninsured problem. + +EXTRACTION HINT: The Medicaid advantage in SMI treatment (77% vs 71.6% commercial) may be extractable as evidence that behavioral health carve-outs and community mental health infrastructure outperform commercial narrow networks — a structural argument, not just a coverage-level argument. 
diff --git a/inbox/queue/2026-04-21-mental-health-workforce-shortage-2025-2026.md b/inbox/queue/2026-04-21-mental-health-workforce-shortage-2025-2026.md new file mode 100644 index 000000000..b8afee2d0 --- /dev/null +++ b/inbox/queue/2026-04-21-mental-health-workforce-shortage-2025-2026.md @@ -0,0 +1,53 @@ +--- +type: source +title: "Mental health workforce shortage restricts access for 51-55 million people and is worsening — provider shortage directly impacts 55% of adults requiring mental health care" +author: "Multiple sources: Journal of the American Psychiatric Nurses Association (2025); Nursing Clinics of North America (2026)" +url: https://pubmed.ncbi.nlm.nih.gov/?term=mental+health+workforce+shortage+telehealth+2025 +date: 2025-01-01 +domain: health +secondary_domains: [] +format: journal-articles +status: unprocessed +priority: medium +tags: [mental-health, workforce-shortage, treatment-gap, psychiatric-nursing, access] +--- + +## Content + +**Source 1:** "Telehealth and Team-Based Care." Journal of the American Psychiatric Nurses Association. 2025. +- Key finding: "Chronic shortages of mental health providers restrict access for 51 million people." +- Framing: Technology (telehealth + team-based care) proposed as mitigation but shortage persists. + +**Source 2:** "Unified and Portable Nurse Licensure for Psychiatric Mental Health Registered Nurses." Nursing Clinics of North America. 2026. +- Key finding: "The shortage of MH providers directly impacts 55% of adults requiring MH care." +- Context: Multi-state licensing compacts proposed to increase workforce mobility and partially address shortage. + +**Source 3:** "Mental Health Leadership Perspectives." Administration and Policy in Mental Health. 2026. +- Key finding: "Staffing gaps are common and may compound access challenges." Virtual contingency staffing programs demonstrate telehealth utility for filling temporary gaps — but not structural shortage.
+ +**Cross-source pattern:** All papers note worsening shortages; all suggest telehealth as mitigation; none show shortage is resolving. The 51-55 million figure across multiple sources suggests this estimate is becoming a consensus figure in the workforce literature. + +## Agent Notes + +**Why this matters:** The mental health workforce shortage literature consistently identifies 51-55 million Americans as restricted in mental health access due to provider unavailability. When combined with the Jorem et al. 2026 JAMA finding that telemedicine barely improves geographic access, this creates a coherent picture: the supply gap is the binding constraint, and telemedicine doesn't substantially solve it (it primarily serves existing relationships). + +**What surprised me:** The 55% figure (shortage directly impacts 55% of adults requiring MH care) is a majority of the affected population — if accurate, this is not a fringe problem but a majority problem. The sheer scale makes the "widening, not closing" claim structurally necessary: a shortage affecting the majority of the treatment population cannot be closed by organic workforce growth or current technology deployment. + +**What I expected but didn't find:** A forecast showing the shortage improving. Papers from 2025-2026 consistently describe the shortage as persistent or worsening, not as resolving. No paper found shows a credible trajectory toward closure. 
+ +**KB connections:** +- [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] — direct support: these papers document the "worsening shortage" claim +- [[social isolation costs Medicare 7 billion annually and carries mortality risk equivalent to smoking 15 cigarettes per day making loneliness a clinical condition not a personal problem]] — the mental health supply gap interacts with loneliness epidemic; untreated mental illness compounds social isolation + +**Extraction hints:** +- The 51-55 million figure is now robust enough to be the anchor for a workforce shortage claim update +- The persistent shortage despite telehealth deployment is extractable as evidence against the "telemedicine solves the shortage" narrative +- Note: multi-source summary; extractor should verify specific figures against primary papers before extracting + +## Curator Notes + +PRIMARY CONNECTION: [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] + +WHY ARCHIVED: Multiple 2025-2026 papers converging on 51-55 million Americans restricted by mental health workforce shortage, with consistent framing of shortage as worsening and telehealth as insufficient mitigation. + +EXTRACTION HINT: The 51-55 million figure has become a literature consensus estimate — worth extracting as a claim update with the 2025-2026 sourcing if the existing KB claim lacks this quantification. 
diff --git a/inbox/queue/2026-04-21-pnas-nexus-telehealth-deprivation-disparities.md b/inbox/queue/2026-04-21-pnas-nexus-telehealth-deprivation-disparities.md new file mode 100644 index 000000000..df3e13322 --- /dev/null +++ b/inbox/queue/2026-04-21-pnas-nexus-telehealth-deprivation-disparities.md @@ -0,0 +1,60 @@ +--- +type: source +title: "Telehealth use is 1.62-1.67x higher in low-deprivation areas vs high-deprivation areas — no evidence of telehealth improving access for highest-need populations (2016-2024)" +author: "Multiple authors, Johns Hopkins (PNAS Nexus, February 2025)" +url: https://academic.oup.com/pnasnexus/article/4/2/pgaf016/8003900 +date: 2025-02-01 +domain: health +secondary_domains: [] +format: journal-article +status: unprocessed +priority: high +tags: [telehealth, mental-health, access-equity, deprivation, disparity, primary-care, psychiatry] +--- + +## Content + +**Full citation:** Multiple authors (Johns Hopkins). "Telehealth and area deprivation, 2016-2024." PNAS Nexus. February 2025. + +**Study design:** EHR analysis, Johns Hopkins health system. 2016-2024. n=42,640 primary care patients; n=12,846 psychiatry patients. + +**Key findings:** + +1. **Primary care:** Patients from low-deprivation areas were **1.62x more likely** to use telehealth than patients from high-deprivation areas. + +2. **Psychiatry (mental health):** Patients from low-deprivation areas were **1.67x more likely** to use telehealth than patients from high-deprivation areas. + +3. **No evidence of improvement:** The study found "no evidence of telehealth improving access for high-deprivation area patients" — over the 2016-2024 period, the deprivation-related disparity in telehealth use did not narrow. + +4. **Coverage of period:** This is a pre-pandemic, pandemic, and post-pandemic time series — the full arc of telehealth expansion. The expansion did not reduce deprivation-related disparities. 
+ +**Context:** +- Area Deprivation Index (ADI) used to measure neighborhood disadvantage +- Johns Hopkins health system = urban academic medical center with a high proportion of Medicaid and underserved patients — if telehealth were going to reach underserved populations anywhere, this system should be a favorable context +- The persistence of the disparity across the full 2016-2024 arc (including COVID-driven telehealth expansion) is the most damning finding + +## Agent Notes + +**Why this matters:** This is the strongest direct test of whether telehealth reduces socioeconomic disparities in access. It uses a long time series (8 years), large sample, and a validated deprivation index. The finding that LOW-deprivation patients are 62-67% more likely to use telehealth — and that this ratio did NOT improve even through the COVID telehealth expansion — is a clear disconfirmation of the "telehealth closes the access gap" hypothesis. + +Combined with Jorem et al. 2026 (JAMA Net Open, mental health specialists seeing fewer new patients with high telemedicine), this creates a coherent multi-evidence picture: telehealth expands access for existing patients in low-deprivation areas while leaving high-deprivation patients behind. + +**What surprised me:** The study spans 2016-2024 — before, during, and after COVID. The COVID telehealth surge was specifically designed to expand access during an emergency. The deprivation disparity persisted through it. This is the clearest evidence that structural barriers (not just awareness or provider hesitance) prevent telehealth from reaching underserved populations. + +**What I expected but didn't find:** Narrowing of the disparity during COVID (when telehealth was maximally expanded and available). The null on narrowing is the finding. 
+ +**KB connections:** +- [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] — this is the strongest available evidence for the "already-served" mechanism across a multi-year longitudinal study in both primary care AND psychiatry +- [[social isolation costs Medicare 7 billion annually and carries mortality risk equivalent to smoking 15 cigarettes per day making loneliness a clinical condition not a personal problem]] — high-deprivation populations (those most at risk for social isolation) are precisely the ones least reached by telehealth + +**Extraction hints:** +- "Telehealth utilization is 1.62-1.67x higher in low-deprivation areas and the gap did not narrow across an 8-year period including COVID expansion, confirming that structural barriers prevent telehealth from closing the mental health access gap for the highest-need populations" — high confidence (large n, long time series, validated deprivation measure) +- This is the anchor evidence for the "serves the already-served" mental health telehealth claim + +## Curator Notes + +PRIMARY CONNECTION: [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] + +WHY ARCHIVED: 8-year longitudinal study (2016-2024) showing telehealth use 1.62-1.67x higher in low-deprivation areas with no convergence — the strongest direct disconfirmation of the "telehealth closes the access gap" hypothesis. + +EXTRACTION HINT: This is the anchor evidence for the mental health telehealth access claim. Extract as the primary quantitative finding (1.62x primary care, 1.67x psychiatry) with the 8-year no-improvement arc as the key methodological strength. 
diff --git a/inbox/queue/2026-04-21-praim-mammography-optional-use-nature-medicine.md b/inbox/queue/2026-04-21-praim-mammography-optional-use-nature-medicine.md new file mode 100644 index 000000000..6e315e793 --- /dev/null +++ b/inbox/queue/2026-04-21-praim-mammography-optional-use-nature-medicine.md @@ -0,0 +1,63 @@ +--- +type: source +title: "PRAIM mammography study: optional-use AI design increased detection 17.6% in 463,094 women with no recall rate increase — optional-use may be structural mitigation against deskilling" +author: "Multiple authors (Nature Medicine, January 2025)" +url: https://www.nature.com/articles/s41591-024-03408-6 +date: 2025-01-01 +domain: health +secondary_domains: [ai-alignment] +format: journal-article +status: unprocessed +priority: medium +tags: [clinical-ai, mammography, radiology, detection, optional-use, deskilling-mitigation, real-world-evidence] +--- + +## Content + +**Full citation:** PRAIM Study. Nature Medicine. January 2025. + +**Study design:** Nationwide real-world non-inferiority implementation study. Multicenter, 12 German mammography screening sites. July 2021 – February 2023. + +**Sample:** 463,094 women; 119 radiologists. + +**Key findings:** +1. AI-supported reading increased breast cancer detection rate **17.6%** (6.7 vs. 5.7 per 1,000 screened women) +2. **No increase in recall rate** — AI improved sensitivity without increasing false positives +3. Radiologists **voluntarily chose whether to consult AI** — optional-use design throughout +4. No skill degradation reported — but also not measured formally + +**The optional-use design argument:** +This is the most important structural element. Radiologists retained full agency over when to consult AI. 
This optional-use approach may structurally reduce deskilling risk because: + - Radiologists make their own primary read first, then optionally consult AI + - Active clinical judgment is exercised for EVERY case, regardless of AI use + - AI is positioned as a second opinion, not a primary filter + +Contrast with mandatory or default-on AI deployment, where clinicians may passively wait for AI output before forming their own judgment — which is the mechanism for automation bias and deskilling. + +**Limitation:** Skill degradation (deskilling) was not measured. The study shows concurrent detection improvement and stable recall rates. Whether radiologist skill INDEPENDENT of AI changed is unknown. + +## Agent Notes + +**Why this matters:** The PRAIM study is the largest real-world AI mammography implementation study available and provides strong evidence that AI can improve detection at population scale. More importantly for the deskilling debate: the optional-use design supports a structural argument that deployment design choices affect deskilling risk. If mandatory use creates automation bias and deskilling, optional use may preserve independent clinical skill. + +**What surprised me:** The zero recall rate increase alongside a 17.6% detection rate increase is a more favorable tradeoff than most AI mammography studies report. This suggests radiologists in the optional-use setting (Germany's screening program) may have been particularly selective about when to consult AI. + +**What I expected but didn't find:** A formal skill measurement component. Given the study's size (463K women, 119 radiologists), a washout condition measuring unassisted performance before and after AI deployment would have been feasible. The absence of skill measurement in a study this large is a missed opportunity.
+ +**KB connections:** +- [[AI diagnostic triage achieves 97 percent sensitivity across 14 conditions making AI-first screening viable for all imaging and pathology]] — the PRAIM study provides real-world implementation evidence (not just accuracy benchmarks) for AI mammography at national scale +- [[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]] — the optional-use design is a structural counter to the deskilling mechanism: if radiologists always form independent judgment before consulting AI, the deskilling pathway is interrupted + +**Extraction hints:** +- The optional-use design as structural deskilling mitigation is a novel claim: "Optional-use AI deployment — where clinicians form independent judgment before consulting AI — may structurally prevent the automation bias and deskilling mechanisms observed in mandatory-use deployments" +- This is a design principle, not an empirically proven effect (no washout data) +- Confidence: experimental (plausible mechanism; needs prospective validation) +- The 17.6% detection improvement is separately extractable as evidence for AI value in screening mammography + +## Curator Notes + +PRIMARY CONNECTION: [[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]] + +WHY ARCHIVED: Largest real-world AI mammography implementation study; optional-use design is a novel structural argument for deskilling prevention that is not currently in the KB. + +EXTRACTION HINT: Two extractable elements: (1) PRAIM detection rate improvement (17.6%, no recall increase) as real-world evidence for AI mammography value; (2) optional-use design as structural hypothesis for deskilling mitigation. 
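The 17.6% headline can be sanity-checked directly from the two reported detection rates (a minimal arithmetic sketch; the paper's figure presumably comes from the unrounded rates):

```python
# Relative improvement in breast cancer detection rate, using the
# reported per-1,000 figures: 6.7 (AI-supported) vs. 5.7 (standard reading).
ai_rate = 6.7
standard_rate = 5.7
relative_gain = (ai_rate - standard_rate) / standard_rate
print(f"{relative_gain:.1%}")  # → 17.5% from the rounded rates; PRAIM reports 17.6%
```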
diff --git a/inbox/queue/2026-04-21-pubmed-null-result-ai-durable-upskilling.md b/inbox/queue/2026-04-21-pubmed-null-result-ai-durable-upskilling.md new file mode 100644 index 000000000..7f3bcfeec --- /dev/null +++ b/inbox/queue/2026-04-21-pubmed-null-result-ai-durable-upskilling.md @@ -0,0 +1,53 @@ +--- +type: source +title: "PubMed null result: zero papers on durable physician skill improvement from AI clinical decision support as of April 2026" +author: "PubMed systematic search (Vida research session 24)" +url: https://pubmed.ncbi.nlm.nih.gov/?term=AI+clinical+decision+support+physician+performance+up-skilling+calibration&datetype=pdat&mindate=2024&maxdate=2026&sort=date +date: 2026-04-21 +domain: health +secondary_domains: [ai-alignment] +format: null-result +status: unprocessed +priority: medium +tags: [clinical-ai, deskilling, never-skilling, null-result, physician-skills, calibration] +--- + +## Content + +**Search conducted:** April 21, 2026. PubMed database. + +**Searches that returned zero results:** +1. "AI clinical decision support physician performance up-skilling calibration" (2024-2026) — 0 results +2. "clinical AI durable lasting skill improvement physician training feedback calibration prospective 2024 2025" — 0 results (background agent search) + +**What this means:** +As of April 2026, there are no peer-reviewed papers indexed in PubMed that study whether clinical AI or clinical decision support systems produce DURABLE physician skill improvement — meaning improvement that persists when AI is removed. The literature has extensive evidence on AI improving performance WHILE PRESENT (diagnostic accuracy, workflow efficiency) but zero published evidence that AI exposure durably calibrates or up-skills physicians. + +**Context:** +This null result comes 5+ years into large-scale clinical AI deployment. AI scribes reached 92% provider adoption within 3 years. OpenEvidence reached 40% of US physicians daily. 
AI diagnostic triage is deployed across imaging at scale. Despite this scale of deployment, no prospective study has demonstrated durable skill improvement. + +**The complement:** The deskilling literature is growing (Heudel et al. 2026, Natali et al. 2025, colonoscopy ADR drop, multiple radiology/pathology automation bias studies). The up-skilling literature is empty. + +## Agent Notes + +**Why this matters:** Null results are underarchived but epistemically important. The absence of durable up-skilling evidence after 5+ years of widespread clinical AI deployment is itself a finding. If AI durably improved physician skills, it would be visible and measurable — clinical educators and hospitals would be documenting it. The absence of this literature suggests either: (1) durable up-skilling doesn't happen; or (2) it hasn't been studied (absence of evidence ≠ evidence of absence — but after 5 years, the absence is telling). + +**What surprised me:** I expected to find at least some papers on AI-mediated calibration (e.g., does seeing AI error rates help physicians calibrate their own confidence? Does AI feedback improve diagnostic reasoning?). The complete null was unexpected. + +**What I expected but didn't find:** Prospective studies of medical students or residents trained WITH AI vs. WITHOUT AI comparing downstream clinical performance. This is the study design that would detect never-skilling. Not one such study exists in the peer-reviewed literature as of April 2026. 
+ +**KB connections:** +- [[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]] — the null result on up-skilling strengthens the one-directional nature of this claim +- [[AI scribes reached 92 percent provider adoption in under 3 years because documentation is the rare healthcare workflow where AI value is immediate unambiguous and low-risk]] — if up-skilling existed, it would have been documented in this large-scale deployment + +**Extraction hints:** +- "No peer-reviewed study demonstrates durable physician skill improvement following AI exposure — after 5+ years of large-scale deployment, the up-skilling evidence gap is itself evidence of directionality" — confidence: likely (the absence is substantial given deployment scale) +- This is methodologically weaker than a prospective study showing harm, but the scale and duration of the null makes it meaningful + +## Curator Notes + +PRIMARY CONNECTION: [[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]] + +WHY ARCHIVED: Null result from systematic PubMed search — important for confirming one-directionality of the deskilling evidence base; the absence of counter-evidence after 5+ years of deployment is informative. + +EXTRACTION HINT: Archive as supporting evidence for the one-directional nature of clinical AI skill effects, not as a standalone claim. Combine with Heudel et al. 2026 for extraction. 
diff --git a/inbox/queue/2026-04-21-savardi-radiology-ai-error-resilience.md b/inbox/queue/2026-04-21-savardi-radiology-ai-error-resilience.md new file mode 100644 index 000000000..200632dcb --- /dev/null +++ b/inbox/queue/2026-04-21-savardi-radiology-ai-error-resilience.md @@ -0,0 +1,59 @@ +--- +type: source +title: "Radiology residents show error resilience against large AI mistakes (ICC improvement 0.665→0.813) but no durable up-skilling measured after AI removal — closest counter-evidence to automation bias" +author: "Savardi et al., Insights into Imaging (PMC11780016)" +url: https://pmc.ncbi.nlm.nih.gov/articles/PMC11780016/ +date: 2025-01-29 +domain: health +secondary_domains: [ai-alignment] +format: journal-article +status: unprocessed +priority: medium +tags: [clinical-ai, deskilling, automation-bias, radiology, error-resilience, medical-education] +--- + +## Content + +**Full citation:** Savardi et al. "Upskilling or deskilling? Measurable role of an AI-supported training for radiology residents: a lesson from the pandemic." Insights into Imaging. January 29, 2025. PMC11780016. + +**Study design:** Pilot experimental study, three within-subjects conditions (no-AI, on-demand AI, integrated AI). 8 radiology residents (4 first-year, 4 third-year). 150 chest X-rays. + +**Key findings:** +1. AI support significantly reduced scoring errors (p<0.001) — residents performed better WITH AI present +2. Inter-rater agreement improved from ICC 0.665 to 0.813 (22% gain) with AI — suggesting AI calibrates agreement, not just individual accuracy +3. **Error resilience:** Residents were resilient to AI errors ABOVE an acceptability threshold — they didn't blindly follow wrong AI suggestions when errors were large. This is the most important finding for the automation bias debate. +4. 
Performance improvement only measured while AI was present — **no washout condition or follow-up measurement without AI** + +**Critical limitation:** n=8, single session, no longitudinal tracking. Performance after AI removal was NOT measured — the study cannot show durable up-skilling. + +**What this is and isn't:** +- IS: Evidence that radiology residents preserve critical judgment against large AI errors (error resilience > passive automation bias for large errors) +- IS NOT: Evidence of durable up-skilling — after AI removal, we don't know if skill is maintained +- IS NOT: Evidence against the subtle automation bias pattern (following small/borderline AI errors, not catching small mistakes) + +**Context:** The error resilience finding applies to LARGE errors where residents recognize the AI is clearly wrong. The automation bias literature documents performance degradation from following SMALL, plausible-looking AI errors — which this study didn't test. + +## Agent Notes + +**Why this matters:** This is the closest counter-evidence to the automation bias/deskilling thesis I found in the 2024-2026 literature. The error-resilience finding suggests critical judgment is at least partially preserved — residents don't blindly follow obviously wrong AI. This is important nuance: the deskilling concern may apply primarily to borderline/subtle errors, not to gross AI errors. + +**What surprised me:** The 22% improvement in inter-rater agreement (ICC 0.665 → 0.813) is the most interesting finding. AI calibration of inter-rater agreement is a different benefit than individual accuracy — it suggests AI may standardize radiologist performance even when it doesn't improve individual skill. This has implications for the "AI as quality floor" argument. + +**What I expected but didn't find:** A washout condition showing performance after AI removal. This is the definitive test of durable up-skilling. Without it, the study can only speak to concurrent performance. 
+ +**KB connections:** +- [[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]] — this study partially complicates the "overriding" claim: residents do resist large errors. The override problem may be worse for subtle/borderline errors. +- [[AI diagnostic triage achieves 97 percent sensitivity across 14 conditions making AI-first screening viable for all imaging and pathology]] — the inter-rater calibration finding suggests AI's value in imaging may extend beyond sensitivity to standardization + +**Extraction hints:** +- The error-resilience finding is extractable as a scope qualifier on the automation bias claim: "Automation bias appears strongest for subtle, plausible AI errors; clinicians preserve critical judgment against large, recognizable AI errors" +- Confidence: experimental (n=8, single session, no washout) +- This should be used to SCOPE the existing automation bias claim, not to refute it + +## Curator Notes + +PRIMARY CONNECTION: [[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]] + +WHY ARCHIVED: Closest counter-evidence to automation bias in 2024-2026 literature — error resilience for large AI mistakes complicates the universal automation bias framing. Also: ICC calibration benefit is a distinct AI value not yet in KB. + +EXTRACTION HINT: Use as scope qualifier on automation bias claim, not refutation. The key nuance: automation bias appears strongest for subtle errors; clinicians preserve judgment against large errors. 
diff --git a/inbox/queue/2026-04-21-smartphone-mental-health-apps-efficacy-attrition.md b/inbox/queue/2026-04-21-smartphone-mental-health-apps-efficacy-attrition.md new file mode 100644 index 000000000..2fe0404d9 --- /dev/null +++ b/inbox/queue/2026-04-21-smartphone-mental-health-apps-efficacy-attrition.md @@ -0,0 +1,75 @@ +--- +type: source +title: "Smartphone mental health apps show modest efficacy (g=0.43) but 64% attrition in motivated samples — real-world population reach is severely limited by engagement failure" +author: "Multiple sources: Lancet Digital Health 2025; npj Digital Medicine 2025 meta-analysis (92 RCTs, n=16,728)" +url: https://www.thelancet.com/journals/landig/article/PIIS2589-7500(25)00105-0/fulltext +date: 2025-01-01 +domain: health +secondary_domains: [] +format: meta-analyses +status: unprocessed +priority: high +tags: [mental-health, digital-therapeutics, smartphone-apps, efficacy, attrition, access-equity, behavioral-health] +--- + +## Content + +**Source 1:** "Efficacy of standalone smartphone apps for mental health: an updated systematic review and meta-analysis." The Lancet Digital Health. 2025. DOI: 10.1016/S2589-7500(25)00105-0. + +Key findings: +- Depression apps: Hedges' g = 0.45 (small-to-moderate effect) +- Anxiety apps: Hedges' g = 0.35 (small effect) +- PTSD apps: Hedges' g = 0.15 (minimal effect) + +**Source 2:** "A meta-analysis of persuasive design, engagement, and efficacy in 92 RCTs of mental health apps." npj Digital Medicine. 2025. +- 92 RCTs, 16,728 participants +- Apps significantly improved clinical outcomes vs. 
controls: g = 0.43 + +**Critical engagement/attrition data (npj Digital Medicine; also "Engagement and attrition in digital mental health" npj Digital Medicine 2025):** +- Attrition rates up to **64% in motivated, self-selected RCT participants** — the best-case scenario for engagement +- Retention: 26.15% at post-test; 18.34% at follow-up in some studies +- 1 in 4 participants drop out prematurely even in structured trial conditions +- Retention trajectory: ~90% at week 1 → ~50% by week 8 + +**Factors contributing to poor engagement:** +- Poor usability and lack of user-centric design +- Privacy concerns +- Skepticism about effectiveness +- Limited digital literacy (structural barrier for underserved populations) +- Lack of personalization / one-size-fits-all approaches +- No cultural or linguistic adaptation for non-English speakers + +**Effect size interpretation:** +- g = 0.43 (apps overall) compares favorably to some face-to-face interventions but is lower than psychotherapy effect sizes (typically g = 0.8-1.0) +- Critically: effect sizes in RCTs represent best-case conditions with motivated, self-selected, technically literate participants who complete the program. Real-world population-level effects are substantially lower due to 64% attrition and lower engagement in non-trial conditions. + +**Important null finding (npj Digital Medicine 2025):** +"Effect sizes of depression, anxiety, sleep problems, and PTSD apps were not significantly moderated by guidance, engagement, or dropout rates" — suggesting that the small proportion who complete apps benefit, but engagement doesn't predict who those completers will be. + +## Agent Notes + +**Why this matters:** This is the most comprehensive recent evidence base on whether smartphone mental health apps can close the mental health supply gap. The finding is nuanced: apps DO work (g=0.43 is a real effect) but with 64% attrition even in motivated samples, the population-level reach is severely limited. 
For underserved populations (lower digital literacy, privacy concerns, limited internet access), attrition would likely be substantially higher than the trial sample. + +This directly addresses the KB claim that technology "primarily serves the already-served": the 64% attrition in motivated, self-selected RCT participants implies that in real-world conditions with non-self-selected users (including underserved), completion rates would be far lower. Apps that work for the 36% who complete them are still not solving population-level access. + +**What surprised me:** The efficacy signal is real — g=0.43 is not trivial for a standalone smartphone app. But the finding that effect sizes are NOT moderated by engagement or dropout rates is strange — it suggests the benefit accrues to the completer subset regardless of what drives completion. This creates a selection problem: we can't identify in advance who will complete and benefit. + +**What I expected but didn't find:** Evidence that any specific app modality (text-based, CBT-structured, mindfulness) works better for underserved populations specifically. The literature is almost entirely in trial conditions with self-selected participants — essentially no equity-stratified efficacy data exists. 
+ +**KB connections:** +- [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] — direct evidence: apps work but only for the self-selected completer minority; underserved populations face additional attrition barriers +- [[prescription digital therapeutics failed as a business model because FDA clearance creates regulatory cost without the pricing power that justifies it for near-zero marginal cost software]] — the Pear/Akili collapse was partly about this engagement problem: even if an app is clinically effective, population-level impact requires engagement that DTx couldn't achieve +- [[social isolation costs Medicare 7 billion annually and carries mortality risk equivalent to smoking 15 cigarettes per day making loneliness a clinical condition not a personal problem]] — the people who most need mental health apps (socially isolated, severe mental illness) are least likely to engage with them + +**Extraction hints:** +- "Mental health smartphone apps show small-to-moderate efficacy (Hedges' g = 0.43) in motivated, self-selected RCT participants but 64% attrition undermines population-level impact" — this is extractable as a claim that reframes the digital mental health access question +- The equity gap is implied but not directly measured: digital literacy barriers, privacy concerns, and cultural/linguistic adaptation gaps mean underserved populations face higher attrition than the already-high RCT rates +- Consider: should this create a divergence with an optimistic "apps can close the treatment gap" framing? The Lancet Digital Health 2025 shows efficacy; the attrition data shows reach failure. These are both true simultaneously. 
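Since the efficacy argument here hangs on Hedges' g, a minimal sketch of how the statistic is computed may help (the standard pooled-SD formula with small-sample correction; the means, SDs, and group sizes below are illustrative, not drawn from the trials):

```python
import math

def hedges_g(m1, m2, s1, s2, n1, n2):
    """Hedges' g: Cohen's d with the small-sample bias correction J."""
    # Pooled standard deviation across the two groups.
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    j = 1 - 3 / (4 * (n1 + n2) - 9)  # small-sample correction factor
    return d * j

# Illustrative: a 4.3-point symptom reduction vs. control with SD 10
# in two groups of 60 lands near the pooled g = 0.43 reported above.
print(round(hedges_g(4.3, 0.0, 10.0, 10.0, 60, 60), 2))  # → 0.43
```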
+ +## Curator Notes + +PRIMARY CONNECTION: [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] + +WHY ARCHIVED: The 64% attrition in motivated RCT participants is the key mechanism explaining why smartphone apps, despite real efficacy, fail to close the treatment gap at population scale. This is the strongest recent evidence for the structural limitation of digital mental health. + +EXTRACTION HINT: The two-part finding is extractable together: (1) apps work at the individual level (g=0.43); (2) 64% attrition in best-case conditions limits population reach. The combination explains why efficacy doesn't translate to access expansion. diff --git a/inbox/queue/2026-04-21-telehealth-disparities-2019-2020-jtt.md b/inbox/queue/2026-04-21-telehealth-disparities-2019-2020-jtt.md new file mode 100644 index 000000000..09a362344 --- /dev/null +++ b/inbox/queue/2026-04-21-telehealth-disparities-2019-2020-jtt.md @@ -0,0 +1,59 @@ +--- +type: source +title: "Telehealth utilization disparities EXPANDED in 2020 vs 2019 — rural Medicare beneficiaries more likely to use telehealth in 2019 but less likely in 2020" +author: "Unknown authors (Journal of Telemedicine and Telecare)" +url: https://pubmed.ncbi.nlm.nih.gov/ +date: 2025-07-01 +domain: health +secondary_domains: [] +format: journal-article +status: unprocessed +priority: medium +tags: [telehealth, mental-health, access-equity, rural-health, disparity, Medicare] +--- + +## Content + +**Citation (approximate):** Unknown authors. "The association between rurality, dual Medicare/Medicaid eligibility and chronic conditions with telehealth utilization: An analysis of 2019-2020 national Medicare claims." Journal of Telemedicine and Telecare. Published July 2025 (Epub February 5, 2024). + +**Study design:** Analysis of national Medicare claims data, 2019-2020. 
Examines whether rurality, dual eligibility (Medicare + Medicaid), and chronic conditions are associated with telehealth utilization. + + **Key findings:** + + 1. Telehealth utilization disparities were LARGER in 2020 than in 2019, not smaller — the COVID-19 pandemic expansion of telehealth worsened access disparities rather than reducing them. + + 2. Non-Hispanic Black/African-American and Hispanic beneficiaries were less likely to utilize telehealth than White beneficiaries, with disparities growing in 2020. + + 3. Rural beneficiaries were MORE likely to utilize telehealth than urban beneficiaries in 2019 (rural telehealth early adoption) — but LESS likely in 2020. The mechanism: urban patients flooded telehealth systems during COVID, displacing rural early adopters as the dominant telehealth user group. + + 4. Dual-eligible (Medicare + Medicaid, proxy for lowest income) beneficiaries showed a persistent utilization disadvantage. + + 5. Patients with multiple chronic conditions — who arguably need care most — were among the least likely to utilize telehealth. + + 6. Authors' conclusion: "many of the patients in greatest need of healthcare are least likely to utilize telehealth services." + + **Note:** This finding covers 2019-2020, the first year of COVID-19 telehealth expansion. The pattern may have shifted in subsequent years as telehealth infrastructure matured. + + ## Agent Notes + + **Why this matters:** This is the clearest evidence of the "serving the already-served" mechanism for telehealth. The COVID pandemic triggered the largest telehealth expansion in history — and the result was EXPANDED disparities, not reduced ones. The rural reversal (more likely in 2019 → less likely in 2020) is particularly striking: rural patients were early adopters, then got crowded out. This challenges naive optimism about telehealth automatically expanding access.
+ +**What surprised me:** The rural reversal — rural patients were AHEAD of urban on telehealth utilization in 2019, then fell behind in 2020 as urban demand surged. This is the opposite of the usual rural-deficit narrative. It suggests telehealth capacity is a constraint, not just technology access. + +**What I expected but didn't find:** Evidence that COVID telehealth expansion helped the most vulnerable — dual-eligible, multi-chronic condition patients. The finding is the opposite. + +**KB connections:** +- [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] — this provides the mechanism: telehealth expansion → supply captured by urban/already-served → less access for rural/underserved +- [[SDOH interventions show strong ROI but adoption stalls because Z-code documentation remains below 3 percent and no operational infrastructure connects screening to action]] — parallel: good interventions fail when operational infrastructure doesn't reach underserved + +**Extraction hints:** +- The rural reversal (2019: more telehealth; 2020: less telehealth) is the most counterintuitive finding and may be extractable as evidence for a specific mechanism in health technology access dynamics +- Methodological note: 2019-2020 data is old; more recent data needed to assess whether post-pandemic normalization changed the pattern. Archive as medium confidence for this reason. + +## Curator Notes + +PRIMARY CONNECTION: [[the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access]] + +WHY ARCHIVED: National Medicare claims data showing COVID telehealth expansion expanded rather than reduced access disparities — key mechanism evidence for "serves the already-served" claim. 
+ +EXTRACTION HINT: The rural reversal finding (2019 early adopters → 2020 access decline) is the mechanism story; the "greatest need, least access" conclusion is the extractable claim. diff --git a/inbox/queue/2026-04-21-who-glp1-obesity-guideline-december-2025.md b/inbox/queue/2026-04-21-who-glp1-obesity-guideline-december-2025.md new file mode 100644 index 000000000..74e89a0a5 --- /dev/null +++ b/inbox/queue/2026-04-21-who-glp1-obesity-guideline-december-2025.md @@ -0,0 +1,69 @@ +--- +type: source +title: "WHO December 2025 formal guideline recommends GLP-1s for obesity treatment while USPSTF has not moved — creating highest/lowest global health authority endorsement gap" +author: "WHO (World Health Organization)" +url: https://www.who.int/news/item/01-12-2025-who-issues-global-guideline-on-the-use-of-glp-1-medicines-in-treating-obesity +date: 2025-12-01 +domain: health +secondary_domains: [] +format: guideline +status: unprocessed +priority: medium +tags: [GLP-1, WHO, USPSTF, obesity, guideline, coverage-policy, access] +--- + +## Content + +**Event:** December 1, 2025 — WHO issued a formal clinical guideline recommending GLP-1 receptor agonists and GIP/GLP-1 dual agonists as a long-term treatment option for obesity in adults. 
+ + **Key details:** +- Designation: **Conditional recommendation, moderate-certainty evidence** (not full endorsement — acknowledges "limited long-term evidence") +- Drugs covered: liraglutide, semaglutide, tirzepatide +- Population: adults with obesity +- Framing: WHO positions GLP-1s as ONE component within a comprehensive approach requiring healthy diets, physical activity, professional support, and population-level policies +- WHO statement: obesity is a "societal challenge requiring multisectoral action — not just individual medical treatment" +- Countries are required to "consider local cost-effectiveness, budget impact, and ethical implications" before adoption + +**GLP-1 added to WHO Essential Medicines List:** +September 2025 — GLP-1s added to WHO Essential Medicines List for managing high-risk patients with type 2 diabetes (not yet for obesity specifically, but the EML listing signals directional intent) + +**The USPSTF gap:** +- USPSTF 2018 recommendation: intensive behavioral interventions for obesity, pharmacotherapy explicitly excluded +- WHO December 2025: conditional recommendation for GLP-1s in obesity +- Gap: WHO (global health authority with no ACA mandate power) endorses GLP-1s for obesity treatment; USPSTF (which governs US ACA preventive coverage mandates) has not moved +- USPSTF process timeline: if it began a review now, a final recommendation covering GLP-1 pharmacotherapy would likely not arrive before 2028-2030 +- Ironically, WHO's endorsement may increase political pressure on USPSTF to update — but no formal petition or timeline is visible + +**Conditional vs. full endorsement:** +WHO's "conditional" framing (vs.
"strong" recommendation) acknowledges: +- Limited long-term evidence (most major trials < 2 years) +- Cost-effectiveness uncertain for resource-constrained systems +- Durability of effects unclear +- Population-level policy context matters + +## Agent Notes + +**Why this matters:** The WHO guideline creates a meaningful policy asymmetry: the global health authority with the broadest mandate (but no US coverage enforcement power) has endorsed GLP-1s for obesity; the US authority with direct ACA coverage mandate power (USPSTF) has not moved. This creates an unusual situation where patients in high-income countries with WHO-aligned guidelines (Canada, UK) may access covered GLP-1 obesity treatment while US patients cannot get coverage without comorbidities. + +The WHO's "multisectoral action required" framing is also relevant to Vida's broader thesis: even WHO's endorsement of GLP-1s positions medication as one component, not the solution. This is consistent with the behavioral-infrastructure argument. + +**What surprised me:** WHO moved BEFORE the major trial data on tirzepatide cardiovascular outcomes was fully published. The December 2025 guideline is based on the available evidence as of mid-2025. This is unusually fast for WHO guidelines — typically 3-5 years elapse from evidence emergence to guideline publication. The speed signals institutional urgency around the obesity epidemic. + +**What I expected but didn't find:** A formal USPSTF response to the WHO guideline. No such response exists — USPSTF operates independently and has not acknowledged the WHO recommendation. + +**KB connections:** +- The existing KB mention of the WHO guideline in the GLP-1 economics claim file covers the conditional recommendation framing. This source provides more context on the WHO-USPSTF gap.
+- [[SDOH interventions show strong ROI but adoption stalls because Z-code documentation remains below 3 percent and no operational infrastructure connects screening to action]] — parallel: evidence exists but US infrastructure (USPSTF + coverage mandate) hasn't absorbed it + +**Extraction hints:** +- The WHO-USPSTF policy gap is extractable as a standalone claim about the structural lag in US preventive coverage policy: "The highest global health authority (WHO) endorses GLP-1s for obesity treatment while the authority governing US preventive coverage mandates (USPSTF) has not updated its 2018 recommendation that predates semaglutide and tirzepatide" +- Confidence: proven (both documents are public record) +- Scope: structural/policy — not about clinical efficacy + +## Curator Notes + +PRIMARY CONNECTION: [[GLP-1 receptor agonists are the largest therapeutic category launch in pharmaceutical history but their chronic use model makes the net cost impact inflationary through 2035]] + +WHY ARCHIVED: WHO's December 2025 conditional endorsement of GLP-1s for obesity treatment creates a documented WHO-USPSTF policy gap that is extractable as a structural claim about US preventive coverage policy lag. + +EXTRACTION HINT: The claim is about the gap between international endorsement and US coverage mandate mechanism, not about clinical efficacy. Frame as policy structure, not medical science.