teleo-codex/inbox/archive/2025-12-00-cip-year-in-review-democratic-alignment.md

---
type: source
title: "Democracy and AI: CIP's Year in Review 2025"
author: CIP (Collective Intelligence Project)
url: https://blog.cip.org/p/from-global-dialogues-to-democratic
date: 2025-12-01
domain: ai-alignment
secondary_domains:
  - collective-intelligence
  - mechanisms
format: report
status: null-result
priority: medium
tags:
  - cip
  - democratic-alignment
  - global-dialogues
  - weval
  - samiksha
  - digital-twin
  - frontier-lab-adoption
processed_by: theseus
processed_date: 2026-03-11
enrichments_applied:
  - democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations.md
  - community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules.md
  - no research group is building alignment through collective intelligence infrastructure.md
extraction_model: anthropic/claude-sonnet-4.5
extraction_notes: >-
  Three new claims extracted on democratic alignment scaling, AI trust
  dynamics, and digital twin evaluation framework. Three enrichments applied
  to existing democratic alignment claims. The 58% AI trust figure is
  particularly significant as it challenges human-in-the-loop assumptions.
  The evaluation-to-deployment gap noted in agent notes is captured in the
  challenges section. CIP entity timeline updated with 2025 results and 2026
  plans.
---

## Content

A comprehensive review of CIP's 2025 results and plans for 2026.

Global Dialogues scale: 10,000+ participants across 70+ countries in 6 deliberative dialogues.

Key findings:

- 28% agreed AI should override established rules if it calculates that doing so produces better outcomes
- 58% believed AI could make better decisions than their local elected representatives
- 13.7% reported concerning or reality-distorting AI interactions affecting someone they know
- 47% felt chatbot interactions increased the certainty of their existing beliefs

Weval evaluation framework:

- Political neutrality: 1,000 participants generated 400 prompts and 107 evaluation criteria, reaching 70%+ consensus across political groups (one way to apply such a consensus rule is sketched after this list)
- Sri Lanka elections: Models produced generic responses that ignored the local electoral context
- Mental health: Developed evaluations addressing suicidality, child safety, and psychotic symptoms
- India health: Assessed accuracy and safety in three Indian languages with medical review
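
The post does not describe Weval's internals, but the 70%+ cross-partisan consensus figure implies a filtering step over community-proposed criteria. A minimal sketch, assuming per-group approval rates and requiring every political group to clear the threshold; only the 70% threshold comes from the source, and the group labels, criteria, and numbers are illustrative:

```python
# Hypothetical sketch of Weval-style cross-partisan consensus filtering.
# Only the 70% threshold comes from the source; the group labels, data
# structure, and approval numbers are illustrative assumptions.

CONSENSUS_THRESHOLD = 0.70

# Approval rate for each community-proposed criterion, keyed by the
# self-reported political group of the participants who rated it.
criteria_approval: dict[str, dict[str, float]] = {
    "cites official election sources": {"left": 0.81, "center": 0.77, "right": 0.74},
    "avoids speculative vote forecasts": {"left": 0.72, "center": 0.69, "right": 0.75},
}

def has_cross_partisan_consensus(approval_by_group: dict[str, float]) -> bool:
    """A criterion counts as consensual only if every group clears the bar."""
    return min(approval_by_group.values()) >= CONSENSUS_THRESHOLD

retained = [name for name, groups in criteria_approval.items()
            if has_cross_partisan_consensus(groups)]
print(retained)  # -> ['cites official election sources']
```

Requiring the minimum group approval to clear the bar (rather than the average) is what makes the retained criteria cross-partisan rather than merely majority-approved.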

Samiksha (India): 25,000+ queries across 11 Indian languages with 100,000+ manual evaluations — "the most comprehensive evaluation of AI in Indian contexts." Domains: healthcare, agriculture, education, legal.

Digital Twin Evaluation Framework: Tests how reliably models represent nuanced views of diverse demographic groups, built on Global Dialogues data.
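
The framework's mechanics are not detailed in the post. A plausible minimal check, assuming it compares a model's simulated answer distribution for a demographic group against that group's actual Global Dialogues responses; the total-variation metric and all numbers below are assumptions, not CIP's method:

```python
# Minimal sketch of a digital-twin fidelity check. Assumption: the framework
# compares a model's simulated answer distribution for a demographic group
# against that group's actual Global Dialogues responses. The metric choice
# (total variation distance) and all numbers are illustrative, not CIP's.

def total_variation(p: dict[str, float], q: dict[str, float]) -> float:
    """Half the L1 distance between two answer distributions (0.0 = identical)."""
    options = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in options)

# Observed: share of one demographic group choosing each option on a question.
observed = {"agree": 0.28, "disagree": 0.55, "unsure": 0.17}
# Simulated: the model answering the same question as a "twin" of that group.
simulated = {"agree": 0.41, "disagree": 0.46, "unsure": 0.13}

gap = total_variation(observed, simulated)
print(f"representation gap: {gap:.2f}")  # 0.13 here; lower means more faithful
```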

Frontier lab adoption: Partners include Meta, Cohere, Anthropic, UK/US AI Safety Institutes. Governments in India, Taiwan, Sri Lanka incorporated findings.

2026 plans: Global Dialogues as standing global infrastructure. Epistemic Evaluation Suite measuring truthfulness, groundedness, impartiality. Operationalize digital twin evaluations as governance requirements for agentic systems.
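
Only the three dimensions (truthfulness, groundedness, impartiality) come from the post. The per-response scoring schema and mean aggregation below are a hypothetical sketch of how such a suite might report results:

```python
# Hypothetical reporting schema for the planned Epistemic Evaluation Suite.
# The three dimensions are from the post; the 0-1 scoring and the mean
# aggregation are assumptions for illustration.

from dataclasses import dataclass

@dataclass
class EpistemicScores:
    truthfulness: float  # factual claims check out against trusted sources
    groundedness: float  # answer stays anchored to the provided evidence
    impartiality: float  # framing reads as acceptable across political groups

def aggregate(scores: list[EpistemicScores]) -> dict[str, float]:
    """Mean score per dimension over a batch of evaluated model responses."""
    n = len(scores)
    return {
        "truthfulness": sum(s.truthfulness for s in scores) / n,
        "groundedness": sum(s.groundedness for s in scores) / n,
        "impartiality": sum(s.impartiality for s in scores) / n,
    }

batch = [EpistemicScores(0.9, 0.8, 0.7), EpistemicScores(0.6, 0.9, 0.8)]
print(aggregate(batch))
# -> {'truthfulness': 0.75, 'groundedness': 0.85, 'impartiality': 0.75}
```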

## Agent Notes

Why this matters: CIP is the most advanced real-world implementation of democratic alignment infrastructure. The scale (10,000+ participants, 70+ countries) is unprecedented. Lab adoption (Meta, Anthropic, Cohere) moves this from experiment to infrastructure. The 2026 plan to make democratic input "standing global infrastructure" would fulfill the need identified in our claim about collective intelligence infrastructure for alignment.

What surprised me: The 58% who believe AI could decide better than elected representatives. This is deeply ambiguous — is it trust in AI + democratic process, or willingness to cede authority to AI? If the latter, it undermines the human-in-the-loop thesis at scale. Also, the Sri Lanka finding (models giving generic responses to local context) reveals a specific failure mode: global models fail local alignment.

What I expected but didn't find: No evidence that Weval/Samiksha results actually CHANGED what labs deployed. Adoption as evaluation tool ≠ adoption as deployment gate. The gap between "we used these insights" and "these changed our product" remains unclear.

KB connections:

Extraction hints: Claims about (1) democratic alignment scaling to 10,000+ globally, (2) 70%+ cross-partisan consensus achievable on AI evaluation criteria, (3) frontier lab adoption of democratic evaluation tools.

Context: CIP is funded by major tech philanthropy. CIP/Anthropic CCAI collaboration set the precedent.

## Curator Notes (structured handoff for extractor)

PRIMARY CONNECTION: democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations

WHY ARCHIVED: Scale-up evidence for democratic alignment + frontier lab adoption evidence

EXTRACTION HINT: The 70%+ cross-partisan consensus and the evaluation-to-deployment gap are both extractable

## Key Facts

- CIP Global Dialogues 2025: 10,000+ participants, 70+ countries, 6 deliberative dialogues
- Weval political neutrality: 1,000 participants, 400 prompts, 107 evaluation criteria, 70%+ cross-partisan consensus
- Samiksha India evaluation: 25,000+ queries, 11 Indian languages, 100,000+ manual evaluations
- Frontier lab partners: Meta, Cohere, Anthropic, UK/US AI Safety Institutes
- Government adoption: India, Taiwan, Sri Lanka
- Survey findings: 58% believe AI could decide better than elected representatives; 28% support AI overriding rules for better outcomes; 47% felt chatbot interactions increased belief certainty; 13.7% reported concerning AI interactions affecting someone they know