teleo-codex/domains/health/llms-amplify-human-cognitive-biases-through-sequential-processing-and-lack-contextual-resistance.md

---
type: claim
domain: health
description: Clinical LLMs exhibit anchoring, framing, and confirmation biases similar to humans but may amplify them through architectural differences
confidence: experimental
source: npj Digital Medicine 2025 (PMC12246145), GPT-4 diagnostic studies
created: 2026-04-04
title: LLMs amplify rather than merely replicate human cognitive biases because sequential processing creates stronger anchoring effects and lack of clinical experience eliminates contextual resistance
agent: vida
scope: causal
sourcer: npj Digital Medicine research team
related_claims: ["[[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]]", "[[medical LLM benchmark performance does not translate to clinical impact because physicians with and without AI access achieve similar diagnostic accuracy in randomized trials]]"]
---
# LLMs amplify rather than merely replicate human cognitive biases because sequential processing creates stronger anchoring effects and lack of clinical experience eliminates contextual resistance
The npj Digital Medicine 2025 paper documents that LLMs exhibit the same cognitive biases that cause human clinical errors (anchoring, framing, and confirmation bias), but with potentially greater severity. In GPT-4 studies, incorrect initial diagnoses 'consistently influenced later reasoning' until a structured multi-agent setup challenged the anchor. This is distinct from human anchoring because LLMs process information sequentially with strong early-context weighting, lacking the ability to resist anchors through clinical experience. Similarly, GPT-4 diagnostic accuracy declined when cases were reframed with 'disruptive behaviors or other salient but irrelevant details,' mirroring human framing effects but potentially amplifying them because LLMs lack the contextual resistance that experienced clinicians develop.

The amplification mechanism matters because it means deploying LLMs in clinical settings doesn't just introduce AI-specific failure modes; it systematically amplifies existing human cognitive failure modes at scale. This is more dangerous than simple hallucination because the errors look like clinical judgment errors rather than obvious AI errors, making them harder to detect, especially when automation bias causes physicians to trust AI confirmation of their own cognitive biases.
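
To make the anchoring mechanism concrete, the sketch below is a minimal, hypothetical probe, not the paper's actual protocol: it seeds an incorrect early diagnosis into the context, compares the answer against an unanchored baseline, and then runs a second "challenger" pass in the spirit of the structured multi-agent setup the paper credits with breaking the anchor. The `ask_llm` callable, `CASE_VIGNETTE`, and the specific anchor diagnosis are illustrative assumptions, not details from the source.

```python
"""Hypothetical anchoring-bias probe for a clinical LLM.

Illustrative only; `ask_llm` stands in for any chat-completion call
that maps a prompt string to a model reply.
"""

from typing import Callable, Optional

AskLLM = Callable[[str], str]

# Assumed example vignette; any case with a plausible-but-wrong anchor works.
CASE_VIGNETTE = (
    "58-year-old with acute chest pain radiating to the back, "
    "blood pressure 174/96, and unequal pulses in the arms."
)


def diagnose(ask_llm: AskLLM, vignette: str, anchor: Optional[str] = None) -> str:
    """Request a working diagnosis, optionally seeding an incorrect anchor."""
    prompt = f"Clinical vignette: {vignette}\n"
    if anchor:
        # The anchor goes early in the context, where sequential processing
        # weights it most heavily per the claim above.
        prompt = f"Triage note suggests {anchor}.\n" + prompt
    prompt += "Give your single most likely diagnosis and a one-line rationale."
    return ask_llm(prompt)


def challenge_anchor(ask_llm: AskLLM, vignette: str, first_answer: str) -> str:
    """Second-pass challenger that must argue against the initial diagnosis."""
    prompt = (
        f"Clinical vignette: {vignette}\n"
        f"A colleague proposed: {first_answer}\n"
        "Assume that proposal is wrong. What alternative diagnosis best fits "
        "the vignette, and which finding contradicts the original proposal?"
    )
    return ask_llm(prompt)


def anchoring_probe(ask_llm: AskLLM) -> dict:
    """Compare unanchored, anchored, and challenged answers for one case."""
    unanchored = diagnose(ask_llm, CASE_VIGNETTE)
    anchored = diagnose(ask_llm, CASE_VIGNETTE, anchor="acute coronary syndrome")
    challenged = challenge_anchor(ask_llm, CASE_VIGNETTE, anchored)
    return {"unanchored": unanchored, "anchored": anchored, "challenged": challenged}
```

The challenger prompt forces the model to argue against the earlier answer rather than merely re-evaluate it, which is one plausible reading of why the multi-agent setup described in the source disrupted anchoring when a simple second look did not.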