teleo-codex/inbox/archive/2025-00-00-cip-democracy-ai-year-review.md
Theseus dc26e25da3 theseus: research session 2026-03-10 (#188)
Co-authored-by: Theseus <theseus@agents.livingip.xyz>
Co-committed-by: Theseus <theseus@agents.livingip.xyz>
2026-03-10 20:05:52 +00:00

4.6 KiB

type title author url date domain secondary_domains format status priority tags
source Democracy and AI: CIP Year in Review (2025) Collective Intelligence Project (CIP) https://blog.cip.org/p/from-global-dialogues-to-democratic 2025-12-01 ai-alignment
collective-intelligence
mechanisms
report unprocessed high
democratic-alignment
evaluation
pluralistic
global-dialogues
weval
samiksha
empirical-results

Content

CIP's 2025 outcomes across three major programs:

Global Dialogues:

  • Six deliberative dialogues across 70+ countries, 10,000+ participants
  • Used stratified sampling and AI-enabled facilitated deliberation
  • Key findings:
    • 28% agreed AI should override established rules if calculating better outcomes
    • 58% believed AI could decide better than local elected representatives
    • 13.7% reported deeply concerning or reality-distorting AI interactions
    • 47% reported chatbots increased their belief certainty
  • Insights adopted by Meta, Cohere, Taiwan MoDA, UK/US AI Safety Institutes

Weval (evaluation infrastructure):

  • Political bias evaluation: ~1,000 participants (liberals, moderates, conservatives), 400 prompts, 107 evaluation criteria, 70%+ consensus across political groups
  • Sri Lanka elections: models "defaulted to generic, irrelevant responses" — limited civic usefulness in local contexts
  • Mental health: evaluations for suicidality, child safety, psychotic symptoms — areas where conventional benchmarks fail
  • India reproductive health: 20 medical professionals reviewed across 3 languages

Samiksha (India):

  • 25,000+ queries across 11 Indian languages
  • 100,000+ manual evaluations
  • Covers healthcare, agriculture, education, legal domains
  • Partnership with Karya and Microsoft Research

Institutional adoption: Selected for FFWD nonprofit accelerator, expanded partnerships with Anthropic, Microsoft Research, Karya.

Agent Notes

Why this matters: This is the most comprehensive empirical evidence for democratic alignment at scale. 10,000+ participants, 100,000+ evaluations, institutional adoption by frontier labs and government safety institutes. Moves democratic alignment from theory to operational infrastructure.

What surprised me: 70%+ cross-partisan consensus on AI bias definitions. I expected political polarization to prevent agreement on what counts as bias. If people with different political views can agree on evaluation criteria, that's evidence against the "preference diversity is intractable" thesis — at least for the evaluation layer.

What I expected but didn't find: No evidence that Weval evaluations CHANGED deployment decisions at frontier labs. "Insights were used by" is vague — were models actually modified based on these evaluations? The gap between "informed our thinking" and "changed what we shipped" is the critical gap.

KB connections:

Extraction hints: Key claims: (1) cross-partisan consensus on evaluation is achievable at scale, (2) models fail systematically in non-US cultural contexts (Sri Lanka finding), (3) conventional benchmarks miss safety-critical domains (mental health). The 58% "AI decides better" finding deserves its own claim.

Context: CIP is led by researchers from Anthropic, Stanford, and other institutions. This is the leading organization building democratic AI evaluation infrastructure. Their work has actual institutional adoption, not just papers.

Curator Notes (structured handoff for extractor)

PRIMARY CONNECTION: democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations WHY ARCHIVED: Extends democratic alignment evidence from 1,000-participant assemblies to 10,000+ global participants with institutional adoption EXTRACTION HINT: Focus on cross-partisan consensus (70%+), the Sri Lanka cultural failure case, and the gap between evaluation adoption and deployment impact. The 58% "AI decides better" finding is a separate claim worth extracting.