Teleo Agents 379f1abd7d theseus: extract from 2025-12-00-cip-year-in-review-democratic-alignment.md

- Source: inbox/archive/2025-12-00-cip-year-in-review-democratic-alignment.md
- Domain: ai-alignment
- Extracted by: headless extraction cron (worker 2)

Pentagon-Agent: Theseus <HEADLESS>

2026-03-12 07:52:48 +00:00


type: claim
domain: ai-alignment
description: Global model training creates systematic failure mode where models cannot provide locally-relevant responses to context-specific queries
confidence: experimental
source: CIP Year in Review 2025, Weval Sri Lanka elections evaluation
created: 2026-03-11
secondary_domains: ai-safety

Global model training creates systematic failure mode where AI models provide generic responses to local context-specific queries, as evidenced by Sri Lanka election evaluation

CIP's Weval evaluation of AI models during the Sri Lanka elections found that the models gave generic, irrelevant responses even when supplied with local context. This reveals a specific failure mode: global training creates models that cannot align to local contexts even when explicitly prompted.

This is distinct from a general capability failure. The models were able to respond, but they offered generic political advice that would apply anywhere and failed to engage with the specific electoral dynamics, candidates, or issues in Sri Lanka. The failure is one of alignment granularity: the models' training optimized for global applicability at the cost of local relevance.

The implication is that democratic alignment at scale may require region-specific training or fine-tuning, not just global deliberation. A model aligned to aggregate global preferences may systematically fail populations whose contexts differ from training distribution centroids.
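The centroid argument can be illustrated with a toy calculation. The sketch below is not CIP's or Weval's method. It assumes preferences can be reduced to a single number per respondent, and the region names and values are invented for illustration. It fits one global target (the mean of all respondents) and compares each region's squared-error loss against the loss from a region-specific target.

```python
from statistics import mean

# Hypothetical 1-D "preference positions" per region.
# These numbers are made up for illustration; they are not data from the source.
regions = {
    "region_a": [0.0, 0.1, -0.1, 0.05],
    "region_b": [0.2, 0.15, 0.25, 0.1],
    "region_c": [2.0, 2.1, 1.9],  # small population far from the others
}


def mse(points, target):
    """Mean squared distance between a group's positions and a target."""
    return mean((p - target) ** 2 for p in points)


# A single globally aggregated target: the centroid of all respondents.
all_points = [p for pts in regions.values() for p in pts]
global_target = mean(all_points)

# Loss each region incurs under the global target vs. its own regional target.
global_loss = {name: mse(pts, global_target) for name, pts in regions.items()}
local_loss = {name: mse(pts, mean(pts)) for name, pts in regions.items()}

for name in regions:
    print(f"{name}: global={global_loss[name]:.3f} local={local_loss[name]:.3f}")
```

In this toy setup the region farthest from the centroid bears the largest loss under the global target, while a per-region target removes most of it. Real alignment objectives are not one-dimensional, so this only shows the shape of the argument, not its magnitude.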

Evidence

- Weval Sri Lanka elections evaluation: Models provided generic responses despite local electoral context
- This occurred despite CIP's global deliberation framework being active
- The failure mode is systematic (generic responses), not random (hallucination or refusal)

Limitations

Single-country evaluation limits generalizability. We don't know if this is specific to Sri Lanka, to elections, or to the models tested. The source doesn't specify which models were evaluated, what prompts were used, or whether this failure mode appears in other regional evaluations (e.g., Samiksha in India). Confidence is experimental because this is a single case study.

Relevant Notes:

- democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations (but may not solve local context failures)
- pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state (local context as irreducible diversity)
