Compare commits


4 commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Teleo Agents | 9fd7dbaec5 | extract: 2026-08-02-eu-ai-act-healthcare-high-risk-obligations | 2026-03-23 04:38:02 +00:00 |
| Teleo Agents | feaa55b291 | pipeline: archive 1 source(s) post-merge | 2026-03-23 04:35:16 +00:00 |
| Teleo Agents | 6e378141c2 | extract: 2026-03-15-nct07328815-behavioral-nudges-automation-bias-mitigation | 2026-03-23 04:35:13 +00:00 |
| Teleo Agents | b18730c399 | pipeline: archive 1 conflict-closed source(s) | 2026-03-23 04:35:10 +00:00 |

All four commits were committed by Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>.
5 changed files with 112 additions and 1 deletion


@@ -43,6 +43,12 @@ The Sutter Health-OpenEvidence EHR integration creates a natural experiment in a
 The Klang et al. Lancet Digital Health study (February 2026) adds a fourth failure mode to the clinical AI safety catalogue: misinformation propagation at 47% in clinical note format. This creates an upstream failure pathway where physician queries containing false premises (stated in confident clinical language) are accepted by the AI, which then builds its synthesis around the false assumption. Combined with the PMC12033599 finding that OpenEvidence 'reinforces plans' and the NOHARM finding of 76.6% omission rates, this defines a three-layer failure scenario: false premise in query → AI propagates misinformation → AI confirms plan with embedded false premise → physician confidence increases → omission remains in place.
+### Additional Evidence (extend)
+*Source: [[2026-03-15-nct07328815-behavioral-nudges-automation-bias-mitigation]] | Added: 2026-03-23*
+NCT07328815 tests whether a UI-layer behavioral nudge (ensemble-LLM confidence signals + anchoring cues) can mitigate automation bias where training failed. The parent study (NCT06963957) showed 20-hour AI-literacy training did not prevent automation bias. This trial operationalizes a structural solution: using multi-model disagreement as an automatic uncertainty flag that doesn't require physician understanding of model internals. Results pending (2026).
 Relevant Notes:


@@ -0,0 +1,66 @@
---
type: source
title: "NCT07328815: Ensemble-LLM Confidence Signals as Behavioral Nudge to Mitigate Physician Automation Bias (RCT, Registered 2026)"
author: "Follow-on research group to NCT06963957 (Pakistan MBBS physician cohort)"
url: https://clinicaltrials.gov/study/NCT07328815
date: 2026-03-15
domain: health
secondary_domains: [ai-alignment]
format: research paper
status: processed
priority: medium
tags: [automation-bias, behavioral-nudge, ensemble-llm, clinical-ai-safety, system-2-thinking, multi-agent-ui, centaur-model, belief-5, nct07328815]
---
## Content
Registered at ClinicalTrials.gov as NCT07328815: "Mitigating Automation Bias in Physician-LLM Diagnostic Reasoning Using Behavioral Nudges." This is the direct follow-on to NCT06963957 (the automation bias RCT archived March 22, 2026).
**Study design:**
- Single-blind, randomized controlled trial, two parallel arms (1:1)
- Target sample: 50 physicians (25/arm)
- Population: Medical doctors (MBBS) — same cohort as NCT06963957
**Intervention — dual-mechanism behavioral nudge:**
1. **Anchoring cue:** Before evaluation begins, participants are shown ChatGPT's average diagnostic reasoning accuracy on standard medical datasets — establishing realistic performance expectations and anchoring System 2 engagement
2. **Selective attention cue:** Color-coded confidence signals generated for each AI recommendation
**Confidence signal generation (the novel multi-agent element):**
- Three independent LLMs each provide confidence ratings for every AI recommendation: Claude Sonnet 4.5, Gemini 2.5 Pro Thinking, and GPT-5.1
- Mean confidence across three models determines the signal color (presumably red/yellow/green or equivalent)
- When models DISAGREE on confidence (ensemble spread is high), the signal flags uncertainty
- This is a form of multi-agent architecture used as a UI layer safety tool, not as a clinical reasoning tool
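A minimal sketch of this aggregation logic, as a hypothetical reconstruction: the registration does not publish the thresholds, the spread metric, or the exact color bands, so `GREEN_MIN`, `YELLOW_MIN`, `DISAGREEMENT_MAX`, the use of standard deviation as the disagreement measure, and the model keys are all illustrative assumptions.

```python
from statistics import mean, stdev

# Hypothetical thresholds -- not published in the trial registration.
GREEN_MIN = 0.75        # mean confidence at or above this renders green
YELLOW_MIN = 0.50       # mean confidence at or above this renders yellow
DISAGREEMENT_MAX = 0.15  # ensemble spread above this flags uncertainty

def confidence_signal(ratings: dict[str, float]) -> str:
    """Map per-model confidence ratings (0..1) to a color-coded signal.

    ratings: one confidence score per ensemble model, e.g.
        {"claude-sonnet-4.5": 0.90, "gemini-2.5-pro": 0.85, "gpt-5.1": 0.88}
    """
    scores = list(ratings.values())
    # High ensemble spread overrides the mean: the models disagree,
    # so the signal flags uncertainty regardless of average confidence.
    if stdev(scores) > DISAGREEMENT_MAX:
        return "yellow"
    avg = mean(scores)
    if avg >= GREEN_MIN:
        return "green"
    if avg >= YELLOW_MIN:
        return "yellow"
    return "red"
```

The design point the sketch makes concrete: disagreement takes precedence over the mean, so a single dissenting model downgrades the signal. It also exposes the limitation flagged below, since three models that are confidently wrong together still produce a green signal.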
**Primary outcome:**
- Whether the dual-mechanism nudge reduces physicians' uncritical acceptance of incorrect LLM recommendations (automation bias)
- Secondary: whether anchoring + color signal together outperform either mechanism alone
**Related documents:**
- Protocol/SAP available at: cdn.clinicaltrials.gov/large-docs/15/NCT07328815/Prot_SAP_000.pdf
- Parent study: NCT06963957 (archived queue: 2026-03-22-automation-bias-rct-ai-trained-physicians.md)
- arXiv preprint on evidence-based nudges in biomedical context: 2602.10345
**Current status:** Registered but results not yet published (as of March 2026). Study appears to be recently registered or currently enrolling.
## Agent Notes
**Why this matters:** This is the first operationalized solution to the physician automation bias problem that is being tested in an RCT framework. The parent study (NCT06963957) showed that even 20-hour AI-literacy training fails to prevent automation bias — this trial tests whether a UI-layer intervention (behavioral nudge) can succeed where training failed. The ensemble-LLM confidence signal is a creative design: it doesn't require the physician to know anything about the underlying model; it uses model disagreement as an automatic uncertainty flag. This is a novel application of multi-agent architecture — not for better clinical reasoning (NOHARM's use case) but for better physician reasoning about clinical AI.
**What surprised me:** The specific models used (Claude Sonnet 4.5, Gemini 2.5 Pro Thinking, GPT-5.1) include three frontier models from three different companies. The design implicitly assumes these models' confidence ratings are correlated enough with accuracy to be informative — if the models all confidently give the same wrong answer, the signal would fail. This is a real limitation: ensemble overconfidence is a known failure mode of multiple models trained on similar data.
**What I expected but didn't find:** No published results yet. The trial is likely in data collection or analysis. Results would answer the most important open question in automation bias research: can a lightweight UI intervention do what 20 hours of training cannot?
**KB connections:**
- Direct extension of NCT06963957 (parent study): the automation bias RCT → nudge mitigation trial
- Connects to Belief 5 (clinical AI safety): the centaur model problem requires structural solutions; this trial is testing whether UI design is a viable structural solution
- The ensemble-LLM signal design connects to the Mount Sinai multi-agent architecture paper (npj Health Systems, March 2026) — both are using multi-model approaches but for different purposes
- Cross-domain: connects to Theseus's alignment work on human oversight mechanisms — this is a domain-specific test of whether UI design can maintain meaningful human oversight
**Extraction hints:** Primary claim: the first RCT of a UI-layer behavioral nudge to reduce physician automation bias in LLM-assisted diagnosis uses an ensemble of three frontier LLMs to generate color-coded confidence signals — operationalizing multi-agent architecture as a safety tool rather than a clinical reasoning tool. This is "experimental" confidence (trial registered, results unpublished). Note the parent study (NCT06963957) as context — the clinical rationale for this trial is established.
**Context:** This trial is being conducted by researchers who studied automation bias in AI-trained physicians. The 50-participant sample is small; generalizability will be limited even if the nudge shows a significant effect. The trial design is methodologically novel enough to generate high-citation follow-on work regardless of outcome. If the nudge works, it provides a deployable solution. If it fails, it suggests the problem requires architectural (not UI) solutions — which points back to NOHARM's multi-agent recommendation.
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: "erroneous LLM recommendations significantly degrade diagnostic accuracy even in AI-trained physicians" (parent study finding) — this trial is testing the UI solution
WHY ARCHIVED: First concrete solution attempt for physician automation bias; the ensemble-LLM confidence signal is a novel multi-agent safety design; results (expected 2026) will be highest-value near-term KB update for Belief 5
EXTRACTION HINT: Extract as "experimental" confidence claim about the nudge intervention design. Don't claim efficacy (unpublished). Focus on the design's novelty: multi-agent confidence aggregation as a UI safety layer — the architectural insight is valuable independent of trial outcome. Note that ensemble overconfidence (all models wrong together) is the key limitation to flag in the claim.


@@ -0,0 +1,26 @@
{
"rejected_claims": [
{
"filename": "ensemble-llm-confidence-signals-as-behavioral-nudge-for-automation-bias-mitigation.md",
"issues": [
"missing_attribution_extractor"
]
}
],
"validation_stats": {
"total": 1,
"kept": 0,
"fixed": 3,
"rejected": 1,
"fixes_applied": [
"ensemble-llm-confidence-signals-as-behavioral-nudge-for-automation-bias-mitigation.md:set_created:2026-03-23",
"ensemble-llm-confidence-signals-as-behavioral-nudge-for-automation-bias-mitigation.md:stripped_wiki_link:human-in-the-loop clinical AI degrades to worse-than-AI-alon",
"ensemble-llm-confidence-signals-as-behavioral-nudge-for-automation-bias-mitigation.md:stripped_wiki_link:medical LLM benchmark performance does not translate to clin"
],
"rejections": [
"ensemble-llm-confidence-signals-as-behavioral-nudge-for-automation-bias-mitigation.md:missing_attribution_extractor"
]
},
"model": "anthropic/claude-sonnet-4.5",
"date": "2026-03-23"
}


@@ -7,9 +7,13 @@ date: 2026-03-15
 domain: health
 secondary_domains: [ai-alignment]
 format: research paper
-status: unprocessed
+status: enrichment
 priority: medium
 tags: [automation-bias, behavioral-nudge, ensemble-llm, clinical-ai-safety, system-2-thinking, multi-agent-ui, centaur-model, belief-5, nct07328815]
+processed_by: vida
+processed_date: 2026-03-23
+enrichments_applied: ["human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs.md"]
+extraction_model: "anthropic/claude-sonnet-4.5"
 ---
 ## Content
@@ -64,3 +68,12 @@ Registered at ClinicalTrials.gov as NCT07328815: "Mitigating Automation Bias in
 PRIMARY CONNECTION: "erroneous LLM recommendations significantly degrade diagnostic accuracy even in AI-trained physicians" (parent study finding) — this trial is testing the UI solution
 WHY ARCHIVED: First concrete solution attempt for physician automation bias; the ensemble-LLM confidence signal is a novel multi-agent safety design; results (expected 2026) will be highest-value near-term KB update for Belief 5
 EXTRACTION HINT: Extract as "experimental" confidence claim about the nudge intervention design. Don't claim efficacy (unpublished). Focus on the design's novelty: multi-agent confidence aggregation as a UI safety layer — the architectural insight is valuable independent of trial outcome. Note that ensemble overconfidence (all models wrong together) is the key limitation to flag in the claim.
+## Key Facts
+- NCT07328815 is a single-blind RCT with 50 physicians (25 per arm) testing automation bias mitigation
+- The trial uses three frontier LLMs for confidence signal generation: Claude Sonnet 4.5, Gemini 2.5 Pro Thinking, and GPT-5.1
+- The trial is registered at ClinicalTrials.gov as of March 15, 2026
+- Protocol and statistical analysis plan available at cdn.clinicaltrials.gov/large-docs/15/NCT07328815/Prot_SAP_000.pdf
+- Related arXiv preprint on evidence-based nudges: 2602.10345
+- Parent study NCT06963957 showed 20-hour AI-literacy training failed to prevent automation bias