theseus: extract from 2025-12-00-cip-year-in-review-democratic-alignment.md

- Source: inbox/archive/2025-12-00-cip-year-in-review-democratic-alignment.md - Domain: ai-alignment - Extracted by: headless extraction cron (worker 2) Pentagon-Agent: Theseus <HEADLESS>
2026-03-12 07:52:48 +00:00
10 changed files with 122 additions and 117 deletions
--- a/domains/ai-alignment/58-percent-believe-ai-could-decide-better-than-elected-representatives-creating-ambiguity-about-democratic-alignment-goals.md
+++ b/domains/ai-alignment/58-percent-believe-ai-could-decide-better-than-elected-representatives-creating-ambiguity-about-democratic-alignment-goals.md
@ -0,0 +1,39 @@
+---
+type: claim
+domain: ai-alignment
+description: "Majority willingness to defer to AI over human representatives creates ambiguity about whether democratic alignment targets human authority or AI optimization"
+confidence: experimental
+source: "CIP Year in Review 2025, Global Dialogues findings"
+created: 2026-03-11
+secondary_domains: [collective-intelligence]
+---
+
+# 58% of Global Dialogues participants believe AI could make superior decisions versus local elected representatives, creating ambiguity about whether democratic alignment targets human authority or AI optimization
+
+CIP's Global Dialogues found that 58% of participants believed AI could make superior decisions compared to local elected representatives. This finding is deeply ambiguous: it could indicate trust in AI-augmented democratic processes, or willingness to cede decision authority to AI systems.
+
+If the latter interpretation is correct, it undermines the human-in-the-loop thesis at scale. Democratic alignment assumes humans want to retain decision authority while using AI as a tool. But if a majority believes AI should make decisions instead of humans, the alignment target shifts from "AI that helps humans decide" to "AI that decides on behalf of humans."
+
+The 28% who agreed "AI should override established rules if calculating better outcomes" reinforces this ambiguity. This is not a fringe position — it's more than one in four participants endorsing consequentialist AI authority over rule-of-law constraints. The 47% who felt chatbot interactions increased their belief certainty suggests AI influence on human judgment formation itself.
+
+The critical question is whether these responses reflect:
+1. Frustration with current representatives (AI as protest vote)
+2. Genuine belief in AI superiority (AI as technocratic authority)
+3. Misunderstanding of what "AI decision-making" means in practice
+
+Without disambiguation, democratic alignment infrastructure may be building toward a goal (human authority) that the majority does not actually want.
+
+## Evidence
+- 58% believed AI could make superior decisions vs. local elected representatives (CIP Global Dialogues, 10,000+ participants, 70+ countries)
+- 28% agreed AI should override established rules if calculating better outcomes
+- 47% felt chatbot interactions increased their belief certainty
+
+## Limitations
+The survey question framing is not provided in the source. "Could make superior decisions" is ambiguous — superior in what sense? Faster? More informed? More aligned with participant values? The interpretation depends heavily on how the question was asked. Without access to the survey instrument, we cannot determine whether responses reflect genuine preference for AI authority or misunderstanding of the question. This is a single survey from a single organization, so confidence is experimental.
+
+---
+
+Relevant Notes:
+- [[democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations]] — but may be building toward AI authority rather than human authority
+- [[AI alignment is a coordination problem not a technical problem]] — coordination around what goal?
+- [[safe AI development requires building alignment mechanisms before scaling capability]] — but what if the alignment target is AI decision authority?
--- a/domains/ai-alignment/ai-models-fail-local-alignment-providing-generic-responses-to-culturally-specific-contexts.md
+++ b/domains/ai-alignment/ai-models-fail-local-alignment-providing-generic-responses-to-culturally-specific-contexts.md
@ -1,40 +0,0 @@
---
-type: claim
-domain: ai-alignment
-description: "Global AI models provide generic responses to culturally-specific contexts despite having relevant local information in training data"
-confidence: experimental
-source: "CIP Year in Review 2025, Sri Lanka elections and Samiksha evaluations"
-created: 2026-03-11
-secondary_domains: [collective-intelligence]
---
-
-# AI models fail local alignment by providing generic responses to culturally-specific contexts despite having relevant training data
-
-CIP's evaluation of AI models during Sri Lanka's elections revealed a specific failure mode: models provided generic, irrelevant responses despite the local context being available. This suggests that global models trained predominantly on Western data fail to activate or prioritize culturally-specific knowledge even when it exists in their training corpus.
-
-This failure mode is distinct from lack of capability—the models had access to information about Sri Lankan politics but defaulted to generic responses rather than contextually appropriate ones. This reveals a structural misalignment between global model training and local deployment contexts. The problem is not that the knowledge is absent, but that the model's optimization process does not reliably surface or weight local context appropriately.
-
-The finding is reinforced by Samiksha's evaluation of 25,000+ queries across 11 Indian languages, which required 100,000+ manual evaluations precisely because automated metrics could not capture cultural appropriateness. Domains tested included healthcare, agriculture, education, and legal contexts—all areas where local norms, practices, and values diverge materially from Western-centric training data. The requirement for human expert review to assess accuracy and safety indicates that standard evaluation metrics miss culturally-embedded alignment failures.
-
-## Evidence
-
- **Sri Lanka elections**: Models provided generic, irrelevant responses despite local context being available in training data
- **Samiksha scale**: 25,000+ queries across 11 Indian languages with 100,000+ manual evaluations required
- **Domains tested**: Healthcare, agriculture, education, legal contexts in Indian languages
- **Evaluation requirement**: Human expert review necessary to assess accuracy and safety, indicating automated metrics insufficient
- **Implication**: The failure is not capability but prioritization—models have the information but don't reliably use it
-
-## Implications
-
-This failure mode suggests that scaling model size or training data alone will not solve alignment for diverse global populations. The models need mechanisms to recognize and prioritize local context, not just possess the information. This has direct implications for the [[no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it]] claim—local alignment may require continuous community input rather than one-time training data inclusion.
-
---
-
-Relevant Notes:
- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]
- [[community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules]]
- [[persistent irreducible disagreement]]
-
-Topics:
- [[domains/ai-alignment/_map]]
- [[foundations/collective-intelligence/_map]]
--- a/domains/ai-alignment/ai-models-fail-local-alignment-when-trained-globally-sri-lanka-election-responses-were-generic-despite-local-context.md
+++ b/domains/ai-alignment/ai-models-fail-local-alignment-when-trained-globally-sri-lanka-election-responses-were-generic-despite-local-context.md
@ -0,0 +1,31 @@
+---
+type: claim
+domain: ai-alignment
+description: "Global model training creates systematic failure mode where models cannot provide locally-relevant responses to context-specific queries"
+confidence: experimental
+source: "CIP Year in Review 2025, Weval Sri Lanka elections evaluation"
+created: 2026-03-11
+secondary_domains: [ai-safety]
+---
+
+# Global model training creates systematic failure mode where AI models provide generic responses to local context-specific queries, as evidenced by Sri Lanka election evaluation
+
+CIP's Weval evaluation of AI models during Sri Lanka elections found that models provided generic, irrelevant responses despite being given local context. This reveals a specific failure mode: global training creates models that cannot align to local contexts even when explicitly prompted.
+
+This is distinct from general capability failures. The models were not unable to respond — they responded with generic political advice that would apply anywhere, failing to engage with the specific electoral dynamics, candidates, or issues in Sri Lanka. The failure is one of alignment granularity: the model's training optimized for global applicability at the cost of local relevance.
+
+The implication is that democratic alignment at scale may require region-specific training or fine-tuning, not just global deliberation. A model aligned to aggregate global preferences may systematically fail populations whose contexts differ from training distribution centroids.
+
+## Evidence
+- Weval Sri Lanka elections evaluation: Models provided generic responses despite local electoral context
+- This occurred despite CIP's global deliberation framework being active
+- The failure mode is systematic (generic responses) not random (hallucination or refusal)
+
+## Limitations
+Single-country evaluation limits generalizability. We don't know if this is specific to Sri Lanka, to elections, or to the models tested. The source doesn't specify which models were evaluated, what prompts were used, or whether this failure mode appears in other regional evaluations (e.g., Samiksha in India). Confidence is experimental because this is a single case study.
+
+---
+
+Relevant Notes:
+- [[democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations]] — but may not solve local context failures
+- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]] — local context as irreducible diversity
--- a/domains/ai-alignment/community-centred
+++ b/domains/ai-alignment/community-centred
@ -23,7 +23,7 @@ Since [[collective intelligence requires diversity as a structural precondition
 ### Additional Evidence (confirm)
 *Source: [[2025-12-00-cip-year-in-review-democratic-alignment]] | Added: 2026-03-12 | Extractor: anthropic/claude-sonnet-4.5*

-CIP's Weval framework confirmed this at global scale through multiple independent evaluations: (1) Political neutrality evaluation where 1,000 participants generated 400 prompts synthesized into 107 criteria that achieved 70%+ consensus across political groups—criteria that would not emerge from developer specifications alone. (2) Sri Lanka elections evaluation revealed models providing generic responses despite local context, showing the gap between developer-specified behavior and locally-appropriate alignment. (3) Samiksha's 25,000+ queries across 11 Indian languages in healthcare, agriculture, education, and legal domains required 100,000+ manual evaluations precisely because community norms in these contexts differ materially from developer assumptions. The requirement for manual evaluation indicates that automated metrics (which reflect developer assumptions) cannot capture community-centered alignment targets.
+CIP's Weval political neutrality evaluation generated 400 prompts and 107 evaluation criteria from 1,000 participants, achieving 70%+ consensus across political groups. Samiksha conducted 25,000+ queries across 11 Indian languages with 100,000+ manual evaluations in healthcare, agriculture, education, and legal domains. Both programs demonstrate that community-centered evaluation at scale surfaces context-specific alignment targets that global model training misses — as evidenced by models providing generic responses to Sri Lanka election queries despite local context.

 ---

--- a/domains/ai-alignment/democratic
+++ b/domains/ai-alignment/democratic
@ -23,7 +23,7 @@ Since [[collective intelligence requires diversity as a structural precondition
 ### Additional Evidence (extend)
 *Source: [[2025-12-00-cip-year-in-review-democratic-alignment]] | Added: 2026-03-12 | Extractor: anthropic/claude-sonnet-4.5*

-CIP's Global Dialogues scaled democratic alignment to 10,000+ participants across 70+ countries in 2025, representing a 100x increase over previous experiments. The program achieved 70%+ cross-partisan consensus on 107 AI evaluation criteria for political neutrality, with 1,000 participants generating 400 prompts that were synthesized into these criteria. Critically, this approach has been adopted by frontier labs (Meta, Cohere, Anthropic) and governments (India, Taiwan, Sri Lanka), moving from experimental to infrastructural status. The 2026 plans explicitly aim to make Global Dialogues 'standing global infrastructure' for AI governance and operationalize digital twin evaluations as governance requirements for agentic systems. This extends the original claim from small-scale assemblies to global-scale infrastructure while maintaining consensus quality across political divides.
+CIP's Global Dialogues scaled democratic alignment to 10,000+ participants across 70+ countries in 2025, achieving 70%+ cross-partisan consensus on AI evaluation criteria. This represents a 100x scale-up from previous experiments while maintaining consensus properties. Frontier labs (Meta, Cohere, Anthropic) and governments (India, Taiwan, Sri Lanka) adopted the frameworks, indicating the approach has crossed the credibility threshold for institutional use. However, the gap between evaluation adoption and deployment impact remains unclear — labs using these tools does not necessarily mean findings changed what was deployed.

 ---

--- a/domains/ai-alignment/democratic-ai-alignment-scaled-to-10000-participants-across-70-countries-achieving-cross-partisan-consensus.md
+++ b/domains/ai-alignment/democratic-ai-alignment-scaled-to-10000-participants-across-70-countries-achieving-cross-partisan-consensus.md
@ -1,7 +1,7 @@
 ---
 type: claim
 domain: ai-alignment
-description: "Democratic alignment infrastructure can operate at 10,000+ participant scale while maintaining 70%+ cross-partisan consensus on evaluation criteria"
+description: "Democratic alignment infrastructure can scale to 10,000+ participants across 70+ countries while maintaining 70%+ cross-partisan consensus on evaluation criteria"
 confidence: likely
 source: "CIP Year in Review 2025, Global Dialogues program"
 created: 2026-03-11
@ -10,32 +10,24 @@ secondary_domains: [collective-intelligence, mechanisms]

 # Democratic AI alignment scaled to 10,000+ participants across 70+ countries achieving 70%+ cross-partisan consensus on evaluation criteria

-CIP's Global Dialogues program in 2025 demonstrated that democratic alignment infrastructure can operate at unprecedented scale while maintaining meaningful consensus across political divides. The program engaged 10,000+ participants across 70+ countries in 6 deliberative dialogues. For the political neutrality evaluation specifically, 1,000 participants generated 400 prompts that were synthesized into 107 evaluation criteria, achieving 70%+ consensus across political groups on these criteria.
+CIP's Global Dialogues program in 2025 achieved 10,000+ participants across 70+ countries in 6 deliberative dialogues, demonstrating that democratic alignment infrastructure can operate at scale while maintaining meaningful consensus. The Weval political neutrality evaluation generated 400 prompts and 107 evaluation criteria from 1,000 participants, achieving 70%+ consensus across political groups.

-This represents a 100x scale increase over previous democratic alignment experiments while maintaining consensus quality. The cross-partisan consensus is particularly significant given the polarized nature of AI governance debates—the fact that participants across political groups could agree on 107 specific evaluation criteria suggests that democratic processes can surface shared values about AI behavior even in contentious domains.
+This represents a 100x scale-up from previous democratic alignment experiments while maintaining the consensus properties that make the approach viable. The cross-partisan consensus threshold (70%+) is particularly significant because it demonstrates that diverse populations can agree on AI evaluation criteria despite political polarization.

-The program's adoption by frontier labs (Meta, Cohere, Anthropic) and governments (India, Taiwan, Sri Lanka) indicates this approach has moved from experimental to infrastructural status. The 2026 roadmap explicitly aims to establish Global Dialogues as "standing global infrastructure" for AI governance.
+The scale achievement matters because it moves democratic alignment from experimental proof-of-concept to operational infrastructure. Frontier lab adoption (Meta, Cohere, Anthropic) and government incorporation (India, Taiwan, Sri Lanka) indicate the approach has crossed the credibility threshold for institutional use.

 ## Evidence
-
- **Scale**: 10,000+ participants across 70+ countries in 6 deliberative dialogues (2025)
- **Consensus mechanism**: 1,000 participants generated 400 prompts synthesized into 107 evaluation criteria
- **Cross-partisan agreement**: 70%+ consensus achieved across political groups on these criteria
- **Adoption**: Meta, Cohere, Anthropic, UK/US AI Safety Institutes, plus governments in India, Taiwan, Sri Lanka
- **2026 plans**: Establish Global Dialogues as standing global infrastructure; operationalize digital twin evaluations as governance requirements for agentic systems
+- CIP Global Dialogues: 10,000+ participants, 70+ countries, 6 deliberative dialogues (2025)
+- Weval political neutrality: 1,000 participants, 400 prompts, 107 criteria, 70%+ cross-partisan consensus
+- Frontier lab partners: Meta, Cohere, Anthropic, UK/US AI Safety Institutes
+- Government adoption: India, Taiwan, Sri Lanka incorporated findings into policy

 ## Limitations
-
-The gap between evaluation adoption and deployment impact remains unclear. Labs using these tools as evaluation frameworks does not necessarily mean the findings changed what was deployed. The source notes "adoption as evaluation tool ≠ adoption as deployment gate." This is a critical distinction—the infrastructure may be adopted for assessment purposes without changing actual model deployment decisions.
+The gap between evaluation adoption and deployment impact remains unclear. Labs using these tools as evaluation frameworks does not necessarily mean the findings changed what was deployed. The source provides no evidence that Weval/Samiksha results altered product decisions or deployment behavior.

 ---

 Relevant Notes:
- [[democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations]] — this extends that finding to 10,000+ scale with cross-partisan consensus
+- [[democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations]] — extended to 10,000+ scale
 - [[community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules]] — confirmed at global scale
- [[AI alignment is a coordination problem not a technical problem]]
- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]
-
-Topics:
- [[domains/ai-alignment/_map]]
- [[foundations/collective-intelligence/_map]]
+- [[AI alignment is a coordination problem not a technical problem]] — democratic infrastructure as coordination mechanism
--- a/domains/ai-alignment/majority-of-global-participants-believe-ai-could-make-superior-decisions-to-elected-representatives.md
+++ b/domains/ai-alignment/majority-of-global-participants-believe-ai-could-make-superior-decisions-to-elected-representatives.md
@ -1,50 +0,0 @@
---
-type: claim
-domain: ai-alignment
-description: "58% of 10,000+ global participants expressed belief that AI could make better decisions than elected representatives, creating ambiguity about democratic delegation"
-confidence: experimental
-source: "CIP Year in Review 2025, Global Dialogues findings"
-created: 2026-03-11
-secondary_domains: [collective-intelligence, grand-strategy]
---
-
-# Majority of global participants believe AI could make superior decisions to elected representatives, creating ambiguity about democratic delegation
-
-In CIP's Global Dialogues with 10,000+ participants across 70+ countries, 58% believed AI could make superior decisions compared to local elected representatives. This finding is deeply ambiguous—it could represent either:
-
-1. **Trust in AI + democratic process**: Belief that democratically-aligned AI systems could aggregate preferences better than representatives
-2. **Willingness to cede authority**: Acceptance of AI decision-making that bypasses democratic accountability
-
-The distinction matters enormously for alignment strategy. If the former, it supports the case for democratic alignment infrastructure as a way to channel existing trust in AI toward accountable systems. If the latter, it undermines the human-in-the-loop thesis at scale by revealing that populations may voluntarily delegate authority to AI systems, creating a coordination problem where individual preference for AI authority conflicts with collective governance needs.
-
-Additional context from the same survey reinforces this ambiguity:
- **28% agreed AI should override established rules** if calculating better outcomes—suggesting willingness to subordinate institutional constraints to AI optimization
- **47% felt chatbot interactions increased their belief certainty**—indicating that AI interactions may increase confidence in AI-generated conclusions, potentially creating a feedback loop
- **13.7% reported concerning/reality-distorting AI interactions** affecting someone they know—suggesting real harms are already occurring at scale
-
-These findings together suggest a population increasingly comfortable with AI authority, which creates both opportunity and risk for alignment approaches. Democratic alignment infrastructure could channel this trust toward accountable systems, but without such infrastructure, the same trust could enable unaccountable AI authority.
-
-## Evidence
-
- **Scale**: 10,000+ participants across 70+ countries
- **AI vs representatives**: 58% believed AI could make superior decisions
- **Rule override**: 28% agreed AI should override established rules for better outcomes
- **Belief certainty**: 47% felt chatbot interactions increased their certainty
- **Concerning interactions**: 13.7% reported reality-distorting AI interactions affecting someone they know
-
-## Challenges
-
-The survey did not disambiguate between "AI aligned through democratic process" and "AI replacing democratic process." The 58% figure could represent very different underlying beliefs about AI governance. Without follow-up questions, it's impossible to determine whether respondents envision AI as a tool for democratic aggregation or as a replacement for democratic decision-making.
-
---
-
-Relevant Notes:
- [[democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations]]
- [[AI alignment is a coordination problem not a technical problem]]
- [[economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate]]
- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]
-
-Topics:
- [[domains/ai-alignment/_map]]
- [[foundations/collective-intelligence/_map]]
- [[core/grand-strategy/_map]]
--- a/domains/ai-alignment/no
+++ b/domains/ai-alignment/no
@ -21,7 +21,7 @@ The alignment field has converged on a problem they cannot solve with their curr
 ### Additional Evidence (challenge)
 *Source: [[2025-12-00-cip-year-in-review-democratic-alignment]] | Added: 2026-03-12 | Extractor: anthropic/claude-sonnet-4.5*

-CIP has built substantial collective intelligence infrastructure for alignment at scale, directly challenging this claim. The Global Dialogues program engaged 10,000+ participants across 70+ countries in 2025, with adoption by Meta, Cohere, Anthropic, UK/US AI Safety Institutes, and governments in India, Taiwan, and Sri Lanka. The 2026 roadmap explicitly aims to establish Global Dialogues as 'standing global infrastructure' and operationalize digital twin evaluations as governance requirements for agentic systems. While gaps remain (particularly the evaluation-to-deployment gap where adoption for evaluation does not guarantee deployment changes), CIP represents a significant counterexample to the claim that no group is building this infrastructure. The frontier lab adoption and government incorporation suggest this infrastructure is moving from experimental to operational status.
+CIP is building alignment through collective intelligence infrastructure at scale. Global Dialogues reached 10,000+ participants across 70+ countries. Weval and Samiksha provide evaluation frameworks adopted by Meta, Cohere, Anthropic, and governments in India, Taiwan, and Sri Lanka. CIP's 2026 plans explicitly aim to make Global Dialogues 'standing global infrastructure' for democratic AI alignment. While gaps remain (evaluation vs. deployment impact), CIP is operationalizing collective intelligence for alignment, not just theorizing it.

 ---

--- a/domains/ai-alignment/samiksha-is-most-comprehensive-ai-evaluation-in-indian-contexts-with-25000-queries-across-11-languages.md
+++ b/domains/ai-alignment/samiksha-is-most-comprehensive-ai-evaluation-in-indian-contexts-with-25000-queries-across-11-languages.md
@ -0,0 +1,32 @@
+---
+type: claim
+domain: ai-alignment
+description: "Samiksha represents unprecedented scale for multilingual, multi-domain AI evaluation in non-English contexts"
+confidence: likely
+source: "CIP Year in Review 2025, Samiksha program"
+created: 2026-03-11
+secondary_domains: [collective-intelligence]
+---
+
+# Samiksha conducted 25,000+ queries across 11 Indian languages with 100,000+ manual evaluations, representing the most comprehensive multilingual AI evaluation in non-English contexts
+
+CIP's Samiksha program conducted 25,000+ queries across 11 Indian languages with 100,000+ manual evaluations, covering healthcare, agriculture, education, and legal domains. CIP describes this as "the most comprehensive evaluation of AI in Indian contexts," and the scale supports that claim — no comparable multilingual, multi-domain evaluation exists in public literature.
+
+The significance is methodological and political. Methodologically, it demonstrates that rigorous AI evaluation can be conducted in non-English, non-Western contexts at scale. Politically, it provides evidence for Indian policymakers that AI systems trained primarily on English/Western data may not serve Indian populations adequately.
+
+The 100,000+ manual evaluations indicate human-in-the-loop assessment at scale, not automated metrics. This matters because automated evaluation metrics (BLEU, ROUGE, perplexity) are known to correlate poorly with actual utility in domain-specific, multilingual contexts. Medical review was included for healthcare accuracy and safety assessment, indicating domain-expert validation.
+
+## Evidence
+- Samiksha: 25,000+ queries, 11 Indian languages, 100,000+ manual evaluations
+- Domains: healthcare, agriculture, education, legal
+- Medical review included for healthcare accuracy and safety assessment
+- Indian government incorporated findings (specific policy changes not detailed in source)
+
+## Limitations
+The source does not provide specific findings from Samiksha — only scale metrics and domain coverage. We don't know what the evaluation revealed about model performance, what failure modes were identified, or how Indian government policy changed in response. The claim is about the evaluation's comprehensiveness and methodology, not its results. Confidence is 'likely' based on scale and institutional adoption, but the lack of detailed findings limits how much we can infer about impact.
+
+---
+
+Relevant Notes:
+- [[community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules]] — Samiksha as community-centered evaluation at scale
+- [[global model training creates systematic failure mode where AI models provide generic responses to local context-specific queries, as evidenced by Sri Lanka election evaluation]] — Samiksha may reveal similar failures in Indian contexts
--- a/inbox/archive/2025-12-00-cip-year-in-review-democratic-alignment.md
+++ b/inbox/archive/2025-12-00-cip-year-in-review-democratic-alignment.md
@ -12,10 +12,10 @@ priority: medium
 tags: [cip, democratic-alignment, global-dialogues, weval, samiksha, digital-twin, frontier-lab-adoption]
 processed_by: theseus
 processed_date: 2026-03-11
-claims_extracted: ["democratic-ai-alignment-scaled-to-10000-participants-across-70-countries-achieving-cross-partisan-consensus.md", "ai-models-fail-local-alignment-providing-generic-responses-to-culturally-specific-contexts.md", "majority-of-global-participants-believe-ai-could-make-superior-decisions-to-elected-representatives.md"]
+claims_extracted: ["democratic-ai-alignment-scaled-to-10000-participants-across-70-countries-achieving-cross-partisan-consensus.md", "ai-models-fail-local-alignment-when-trained-globally-sri-lanka-election-responses-were-generic-despite-local-context.md", "samiksha-is-most-comprehensive-ai-evaluation-in-indian-contexts-with-25000-queries-across-11-languages.md", "58-percent-believe-ai-could-decide-better-than-elected-representatives-creating-ambiguity-about-democratic-alignment-goals.md"]
 enrichments_applied: ["democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations.md", "community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules.md", "no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it.md"]
 extraction_model: "anthropic/claude-sonnet-4.5"
-extraction_notes: "Three new claims extracted focusing on (1) democratic alignment scaling with maintained consensus, (2) local alignment failure mode in global models, and (3) population willingness to delegate authority to AI. Three enrichments applied: extending the democratic assemblies claim with 100x scale evidence, confirming community-centred norm elicitation at global scale, and challenging the 'no group is building CI infrastructure' claim with CIP as counterexample. The evaluation-to-deployment gap noted in agent notes is captured in the challenges section of the first claim. The 58% AI-vs-representatives finding is treated as experimental confidence due to ambiguity about what respondents actually meant."
+extraction_notes: "Four claims extracted focusing on: (1) democratic alignment scaling achievement, (2) local alignment failure mode, (3) Samiksha evaluation comprehensiveness, (4) ambiguity in public willingness to defer to AI. Three enrichments applied to existing claims about democratic alignment and collective intelligence infrastructure. The 58% finding about AI vs. elected representatives is particularly significant as it creates ambiguity about whether democratic alignment should preserve human authority or enable AI authority. CIP entity updated with 2025 achievements and 2026 plans."
 ---

 ## Content
@ -69,8 +69,9 @@ EXTRACTION HINT: The 70%+ cross-partisan consensus and the evaluation-to-deploym

 ## Key Facts
 - CIP Global Dialogues: 10,000+ participants, 70+ countries, 6 deliberative dialogues (2025)
- Political neutrality evaluation: 1,000 participants, 400 prompts, 107 criteria, 70%+ cross-partisan consensus
+- Weval political neutrality: 1,000 participants, 400 prompts, 107 evaluation criteria generated
 - Samiksha: 25,000+ queries, 11 Indian languages, 100,000+ manual evaluations
- Frontier lab adoption: Meta, Cohere, Anthropic, UK/US AI Safety Institutes
+- 13.7% reported concerning/reality-distorting AI interactions affecting someone they know
+- 47% felt chatbot interactions increased their belief certainty
+- Frontier lab partners: Meta, Cohere, Anthropic, UK/US AI Safety Institutes
 - Government adoption: India, Taiwan, Sri Lanka
- Survey findings: 28% support AI overriding rules for better outcomes, 58% believe AI could decide better than elected representatives, 47% felt chatbot interactions increased belief certainty, 13.7% reported concerning AI interactions affecting someone they know