- Source: inbox/archive/2025-12-00-cip-year-in-review-democratic-alignment.md - Domain: ai-alignment - Extracted by: headless extraction cron (worker 2) Pentagon-Agent: Theseus <HEADLESS>
---
type: claim
domain: ai-alignment
description: "Global model training creates systematic failure mode where models cannot provide locally-relevant responses to context-specific queries"
confidence: experimental
source: "CIP Year in Review 2025, Weval Sri Lanka elections evaluation"
created: 2026-03-11
secondary_domains: [ai-safety]
---

# Global model training creates systematic failure mode where AI models provide generic responses to local context-specific queries, as evidenced by Sri Lanka election evaluation
CIP's Weval evaluation of AI models during the Sri Lanka elections found that the models gave generic, irrelevant responses even when they were given local context. This reveals a specific failure mode: global training produces models that fail to align to local contexts even when explicitly prompted.

This is distinct from a general capability failure. The models were able to respond, but they gave generic political advice that would apply anywhere and did not engage with the specific electoral dynamics, candidates, or issues in Sri Lanka. The failure is one of alignment granularity: training optimized for global applicability at the cost of local relevance.

The implication is that democratic alignment at scale may require region-specific training or fine-tuning, not just global deliberation. A model aligned to aggregate global preferences may systematically fail populations whose contexts lie far from the centroid of the training distribution.
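The distinction between a generic response and a locally grounded one could be made measurable. The following is a minimal sketch, assuming a hypothetical `local_specificity` metric that counts how many of the local terms supplied with a prompt are reflected in the response. It is not Weval's scoring method, which the source does not describe; the function names, the threshold, and the placeholder terms are all assumptions made for illustration.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase the text and keep only word tokens, joined by single spaces."""
    return " ".join(re.findall(r"\w+", text.lower()))


def local_specificity(response: str, local_terms: list[str]) -> float:
    """Fraction of the supplied local terms that appear in the response.

    Hypothetical metric for illustration only; not taken from the source.
    """
    if not local_terms:
        raise ValueError("local_terms must not be empty")
    # Pad with spaces so that only whole-word (or whole-phrase) matches count.
    padded = f" {_normalize(response)} "
    hits = 0
    for term in local_terms:
        term_norm = _normalize(term)
        if term_norm and f" {term_norm} " in padded:
            hits += 1
    return hits / len(local_terms)


def is_generic(response: str, local_terms: list[str], threshold: float = 0.2) -> bool:
    """Flag a response as generic when it reflects few of the local terms.

    The 0.2 threshold is an arbitrary assumption, not an empirical value.
    """
    return local_specificity(response, local_terms) < threshold


# Placeholder terms and responses, invented for this example.
terms = ["District Alpha", "Candidate X", "fuel subsidy"]
generic = "Voters should research every candidate and check official sources."
grounded = "Candidate X has pledged to keep the fuel subsidy in District Alpha."

print(local_specificity(generic, terms))   # 0.0
print(local_specificity(grounded, terms))  # 1.0
```

A term-overlap score of this kind only captures surface engagement with local context. A response could mention every local term and still be wrong, so it would complement human or rubric-based judgment rather than replace it.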
## Evidence
- Weval Sri Lanka elections evaluation: models gave generic responses despite being given local electoral context
- This occurred despite CIP's global deliberation framework being active
- The failure mode is systematic (generic responses), not random (hallucination or refusal)
## Limitations
A single-country evaluation limits generalizability. It is unknown whether the failure is specific to Sri Lanka, to elections, or to the models tested. The source does not specify which models were evaluated, what prompts were used, or whether the failure mode appears in other regional evaluations (e.g., Samiksha in India). Confidence is experimental because the claim rests on a single case study.

---
Relevant Notes:
- [[democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations]] — but may not solve local context failures
- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]] — local context as irreducible diversity