diff --git a/domains/ai-alignment/47-percent-of-users-report-chatbot-interactions-increased-their-belief-certainty-revealing-epistemic-risk-from-ai-confidence.md b/domains/ai-alignment/47-percent-of-users-report-chatbot-interactions-increased-their-belief-certainty-revealing-epistemic-risk-from-ai-confidence.md
new file mode 100644
index 000000000..202e5b52d
--- /dev/null
+++ b/domains/ai-alignment/47-percent-of-users-report-chatbot-interactions-increased-their-belief-certainty-revealing-epistemic-risk-from-ai-confidence.md
@@ -0,0 +1,45 @@
+---
+type: claim
+domain: ai-alignment
+secondary_domains: [collective-intelligence, cultural-dynamics]
+description: "CIP Global Dialogues found 47% of 10,000+ participants reported increased belief certainty after chatbot interactions, suggesting models amplify confidence independent of accuracy"
+confidence: experimental
+source: "CIP Year in Review 2025, Global Dialogues finding from 10,000+ participants across 70+ countries"
+created: 2025-12-15
+---
+
+# 47 percent of users report chatbot interactions increased their belief certainty revealing epistemic risk from AI confidence
+
+CIP's Global Dialogues found that 47% of 10,000+ participants across 70+ countries reported that chatbot interactions increased their belief certainty. This reveals an epistemic risk: AI systems may amplify user confidence regardless of whether responses are accurate, complete, or appropriately uncertain.
+
+The finding is concerning because increased certainty is not inherently good—it depends on whether the certainty is warranted. If AI systems make users more confident in beliefs that are incomplete, context-dependent, or contested, the result is epistemic harm even if individual responses are factually accurate.
+
+This connects to the broader alignment challenge: optimizing for user satisfaction or engagement may systematically increase certainty, which feels good to users but degrades collective epistemics. A well-aligned system might need to increase user uncertainty in some contexts, which conflicts with user preference optimization.
+
+The 47% figure is large enough to indicate a population-level effect, not an edge case. Combined with the 13.7% who reported concerning or reality-distorting interactions affecting someone they know, this suggests AI systems are already having measurable epistemic effects at scale.
+
+## Evidence
+- 47% of Global Dialogues participants reported increased belief certainty after chatbot interactions
+- Sample: 10,000+ participants across 70+ countries
+- 13.7% reported concerning/reality-distorting AI interactions affecting someone they know
+- Finding emerged from deliberative dialogue context, not isolated survey
+
+## Challenges
+- Source does not specify whether increased certainty correlated with accuracy
+- "Belief certainty" is self-reported and may not reflect actual epistemic state
+- No comparison to certainty changes from other information sources (news, social media, etc.)
+- Causal direction unclear: do already-certain people seek out AI, or does AI interaction increase certainty?
+- No breakdown by domain, language, or demographic group
+- No information on whether effect persists over time
+
+---
+
+Relevant Notes:
+- [[RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values]]
+- [[some disagreements are permanently irreducible because they stem from genuine value differences not information gaps and systems must map rather than eliminate them]]
+- [[economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate]]
+
+Topics:
+- [[domains/ai-alignment/_map]]
+- [[foundations/collective-intelligence/_map]]
+- [[core/grand-strategy/_map]]
diff --git a/domains/ai-alignment/58-percent-believe-ai-could-make-superior-decisions-versus-local-elected-representatives-revealing-ambiguous-trust-in-ai-governance.md b/domains/ai-alignment/58-percent-believe-ai-could-make-superior-decisions-versus-local-elected-representatives-revealing-ambiguous-trust-in-ai-governance.md
new file mode 100644
index 000000000..4f44f2881
--- /dev/null
+++ b/domains/ai-alignment/58-percent-believe-ai-could-make-superior-decisions-versus-local-elected-representatives-revealing-ambiguous-trust-in-ai-governance.md
@@ -0,0 +1,47 @@
+---
+type: claim
+domain: ai-alignment
+secondary_domains: [collective-intelligence, grand-strategy]
+description: "CIP Global Dialogues found 58% of 10,000+ participants believed AI could make superior decisions versus local elected representatives, but the meaning is ambiguous"
+confidence: experimental
+source: "CIP Year in Review 2025, Global Dialogues finding from 10,000+ participants across 70+ countries"
+created: 2025-12-15
+---
+
+# 58 percent believe AI could make superior decisions versus local elected representatives revealing ambiguous trust in AI governance
+
+CIP's Global Dialogues found that 58% of 10,000+ participants across 70+ countries believed AI could make superior decisions compared to local elected representatives. This finding is deeply ambiguous: it could represent trust in AI combined with democratic oversight, or willingness to cede authority to AI systems directly.
+
+If the 58% reflects belief that "AI + democratic process" outperforms "human representatives alone," it supports the case for AI-augmented governance with human oversight. But if it reflects willingness to replace human decision-makers with AI systems, it undermines the human-in-the-loop thesis at scale.
+
+The finding is particularly striking because the question asked about local elected representatives—people participants can vote out—not distant or unaccountable authorities. This suggests the 58% may reflect dissatisfaction with democratic institutions as much as trust in AI.
+
+The result creates a governance paradox: if democratic processes are meant to align AI to human values, but humans trust AI more than democratic representatives, what grounds the legitimacy of democratic alignment? The 58% may represent a crisis of democratic legitimacy that AI alignment cannot solve and may accelerate.
+
+## Evidence
+- 58% of Global Dialogues participants believed AI could make superior decisions vs. local elected representatives
+- Sample: 10,000+ participants across 70+ countries
+- Question specifically referenced "local elected representatives," not abstract authority
+- Finding emerged from deliberative dialogue context
+
+## Challenges
+- Source does not specify what "superior decisions" means to participants
+- Ambiguity between "AI + democratic oversight" vs. "AI replacing democracy"
+- May reflect dissatisfaction with current representatives rather than trust in AI
+- No data on whether responses vary by regime type, governance quality, or political context
+- Single survey question without follow-up to clarify interpretation
+- No breakdown by demographic group, country, or political affiliation
+- No information on how this compares to trust in other institutions (courts, bureaucracy, etc.)
+
+---
+
+Relevant Notes:
+- [[democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations]]
+- [[AI alignment is a coordination problem not a technical problem]]
+- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]
+- [[AI development is a critical juncture in institutional history where the mismatch between capabilities and governance creates a window for transformation]]
+
+Topics:
+- [[domains/ai-alignment/_map]]
+- [[foundations/collective-intelligence/_map]]
+- [[core/grand-strategy/_map]]
diff --git a/domains/ai-alignment/community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules.md b/domains/ai-alignment/community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules.md
index fb79aba86..b3df1be8e 100644
--- a/domains/ai-alignment/community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules.md
+++ b/domains/ai-alignment/community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules.md
@@ -19,6 +19,12 @@ Since [[democratic alignment assemblies produce constitutions as effective as ex
 
 Since [[collective intelligence requires diversity as a structural precondition not a moral preference]], community-centred norm elicitation is a concrete mechanism for ensuring the structural diversity that collective alignment requires. Without it, alignment defaults to the values of whichever demographic builds the systems.
 
+
+### Additional Evidence (confirm)
+*Source: [[2025-12-00-cip-year-in-review-democratic-alignment]] | Added: 2026-03-11 | Extractor: anthropic/claude-sonnet-4.5*
+
+CIP's Weval political neutrality evaluation had 1,000 participants generate 400 prompts and 107 evaluation criteria, rather than selecting from expert-defined options. This confirms that community-generated criteria differ from what developers would specify. The Sri Lanka elections evaluation revealed that global models fail local context alignment—providing generic responses despite local queries—which shows that developer-specified alignment misses context-specific needs that community elicitation would surface. Samiksha's 25,000+ queries across 11 Indian languages in healthcare, agriculture, education, and legal domains further confirm that alignment targets vary significantly by community context and that manual evaluation by community members surfaces domain-specific and cultural nuances that developer-specified rules would miss.
+
 ---
 
 Relevant Notes:
diff --git a/domains/ai-alignment/democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations.md b/domains/ai-alignment/democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations.md
index 25541da20..64cbccebc 100644
--- a/domains/ai-alignment/democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations.md
+++ b/domains/ai-alignment/democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations.md
@@ -19,6 +19,12 @@ However, this remains one-shot constitution-setting, not continuous alignment. T
 
 Since [[collective intelligence requires diversity as a structural precondition not a moral preference]], democratic assemblies structurally ensure the diversity that expert panels cannot guarantee. Since [[the alignment problem dissolves when human values are continuously woven into the system rather than specified in advance]], the next step beyond assemblies is continuous participatory alignment, not periodic constitution-setting.
 
+
+### Additional Evidence (extend)
+*Source: [[2025-12-00-cip-year-in-review-democratic-alignment]] | Added: 2026-03-11 | Extractor: anthropic/claude-sonnet-4.5*
+
+CIP's Global Dialogues scaled democratic alignment to 10,000+ participants across 70+ countries in 6 deliberative dialogues during 2025, extending the assembly model from single-event experiments to ongoing global infrastructure. The Weval framework demonstrated that 1,000 participants achieved 70%+ cross-partisan consensus on AI political neutrality criteria, showing that the approach works not just for constitutional principles but for operational evaluation standards. CIP's 2026 plans explicitly aim to make Global Dialogues "standing global infrastructure," moving from research to a permanent governance mechanism. This scale-up beyond the original assembly studies has been accompanied by adoption from frontier labs (Meta, Anthropic, Cohere) and governments (India, Taiwan, Sri Lanka, UK, US).
+
 ---
 
 Relevant Notes:
diff --git a/domains/ai-alignment/democratic-ai-evaluation-achieves-70-percent-cross-partisan-consensus-on-political-neutrality-criteria.md b/domains/ai-alignment/democratic-ai-evaluation-achieves-70-percent-cross-partisan-consensus-on-political-neutrality-criteria.md
new file mode 100644
index 000000000..a418cf3ce
--- /dev/null
+++ b/domains/ai-alignment/democratic-ai-evaluation-achieves-70-percent-cross-partisan-consensus-on-political-neutrality-criteria.md
@@ -0,0 +1,42 @@
+---
+type: claim
+domain: ai-alignment
+secondary_domains: [collective-intelligence, mechanisms]
+description: "Structured deliberation among 1,000 diverse participants generated 107 AI evaluation criteria with 70%+ cross-partisan consensus on political neutrality"
+confidence: experimental
+source: "CIP Year in Review 2025, Weval political neutrality evaluation"
+created: 2025-12-15
+depends_on: ["democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations"]
+---
+
+# Democratic AI evaluation achieves 70 percent cross-partisan consensus on political neutrality criteria
+
+CIP's Weval evaluation framework demonstrated that 1,000 participants from diverse political backgrounds generated 400 prompts and 107 evaluation criteria for AI political neutrality, achieving 70%+ consensus across political groups. This shows that structured deliberation can produce shared standards for AI behavior even on politically contested topics.
+
+The consensus emerged through a process where participants collectively defined what political neutrality means in AI responses, rather than having experts or developers impose definitions. The 70% threshold is significant because it represents agreement across groups that would typically disagree on political questions.
+
+This extends the democratic alignment assembly findings to the specific domain of AI evaluation criteria, showing that the approach scales beyond constitutional principles to operational evaluation standards.
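+
+A minimal sketch of what cross-partisan consensus could mean operationally, assuming per-group endorsement votes on each criterion; the threshold, group labels, and data layout here are illustrative assumptions, not CIP's published aggregation rule:
+
+```python
+from collections import defaultdict
+
+def cross_partisan_consensus(ratings, threshold=0.7):
+    """Return criteria endorsed by a supermajority of every political group.
+
+    ratings: iterable of (group, criterion_id, endorsed) tuples.
+    Requiring the threshold within each group separately means a raw
+    majority produced by one large bloc does not count as consensus.
+    """
+    tallies = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # criterion -> group -> [yes, total]
+    for group, criterion, endorsed in ratings:
+        tallies[criterion][group][0] += int(endorsed)
+        tallies[criterion][group][1] += 1
+    return [
+        criterion
+        for criterion, groups in tallies.items()
+        if all(yes / total >= threshold for yes, total in groups.values())
+    ]
+
+# One criterion endorsed across groups, one endorsed by a single group only.
+votes = [
+    ("left", "cite-sources", True), ("right", "cite-sources", True),
+    ("center", "cite-sources", True), ("left", "avoid-advocacy", True),
+    ("right", "avoid-advocacy", False), ("center", "avoid-advocacy", False),
+]
+print(cross_partisan_consensus(votes))  # ['cite-sources']
+```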
+
+## Evidence
+- CIP Weval framework: 1,000 participants, 400 prompts, 107 criteria, 70%+ cross-partisan consensus
+- Applied to political neutrality evaluation of AI models
+- Participants generated criteria rather than selecting from expert-defined options
+
+## Challenges
+The source does not specify:
+- How "political groups" were defined or selected
+- What the 30% disagreement consisted of
+- Whether consensus was stable over time or context-dependent
+- How the framework handles cases where consensus cannot be reached
+- Whether the 70% threshold was predetermined or emerged from analysis
+
+---
+
+Relevant Notes:
+- [[democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations]]
+- [[community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules]]
+- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]
+
+Topics:
+- [[domains/ai-alignment/_map]]
+- [[foundations/collective-intelligence/_map]]
diff --git a/domains/ai-alignment/frontier-ai-labs-adopted-democratic-evaluation-tools-as-meta-anthropic-cohere-and-government-institutes-incorporated-cip-findings.md b/domains/ai-alignment/frontier-ai-labs-adopted-democratic-evaluation-tools-as-meta-anthropic-cohere-and-government-institutes-incorporated-cip-findings.md
new file mode 100644
index 000000000..dcd515495
--- /dev/null
+++ b/domains/ai-alignment/frontier-ai-labs-adopted-democratic-evaluation-tools-as-meta-anthropic-cohere-and-government-institutes-incorporated-cip-findings.md
@@ -0,0 +1,49 @@
+---
+type: claim
+domain: ai-alignment
+secondary_domains: [collective-intelligence, mechanisms]
+description: "CIP's evaluation frameworks achieved adoption by frontier labs (Meta, Anthropic, Cohere) and government institutes (UK, US, India, Taiwan, Sri Lanka) but the depth of integration remains unclear"
+confidence: likely
+source: "CIP Year in Review 2025, listing Meta, Cohere, Anthropic, UK/US AI Safety Institutes, India, Taiwan, Sri Lanka governments as partners"
+created: 2025-12-15
+depends_on: ["democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations"]
+---
+
+# Frontier AI labs adopted democratic evaluation tools as Meta Anthropic Cohere and government institutes incorporated CIP findings
+
+CIP's democratic evaluation frameworks achieved adoption by frontier AI labs (Meta, Anthropic, Cohere) and government AI safety institutes (UK, US) as well as governments in India, Taiwan, and Sri Lanka. This represents a transition from experimental research to operational infrastructure.
+
+The adoption is significant because it shows that democratic alignment methods can meet the standards of organizations with strong incentives for rigorous evaluation. Labs facing regulatory pressure and reputational risk chose to incorporate these tools, suggesting they provide value beyond symbolic commitment to democratic processes.
+
+However, the source distinguishes between "partners" and "incorporated findings" without specifying what adoption means in practice:
+- Using CIP tools as one input among many
+- Changing deployment decisions based on CIP evaluations
+- Integrating democratic evaluation as a required gate in development pipelines
+
+The gap between "we used these insights" and "these changed our product" remains the critical uncertainty for assessing whether democratic alignment has moved from consultation to governance.
+
+## Evidence
+- Partners include Meta, Cohere, Anthropic (frontier labs)
+- UK and US AI Safety Institutes incorporated findings
+- Governments in India, Taiwan, Sri Lanka incorporated findings
+- CIP describes this as moving toward "standing global infrastructure"
+
+## Challenges
+- Source does not specify what "incorporated findings" means operationally
+- No evidence that evaluations changed deployment decisions
+- Distinction between adoption as evaluation tool vs. adoption as deployment gate is unclear
+- "Partners" may mean collaboration without binding commitments
+- No timeline for when adoption occurred or whether it is ongoing
+- No information on whether adoption is voluntary or a regulatory requirement
+
+---
+
+Relevant Notes:
+- [[democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations]]
+- [[no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it]]
+- [[safe AI development requires building alignment mechanisms before scaling capability]]
+- [[AI alignment is a coordination problem not a technical problem]]
+
+Topics:
+- [[domains/ai-alignment/_map]]
+- [[foundations/collective-intelligence/_map]]
diff --git a/domains/ai-alignment/global-ai-models-fail-local-context-alignment-as-sri-lanka-election-responses-were-generic-despite-specific-queries.md b/domains/ai-alignment/global-ai-models-fail-local-context-alignment-as-sri-lanka-election-responses-were-generic-despite-specific-queries.md
new file mode 100644
index 000000000..72d38343b
--- /dev/null
+++ b/domains/ai-alignment/global-ai-models-fail-local-context-alignment-as-sri-lanka-election-responses-were-generic-despite-specific-queries.md
@@ -0,0 +1,42 @@
+---
+type: claim
+domain: ai-alignment
+secondary_domains: [collective-intelligence]
+description: "CIP evaluation during Sri Lanka elections revealed models provided generic responses to locally-specific queries, indicating systematic failure to align to local context"
+confidence: experimental
+source: "CIP Year in Review 2025, Weval Sri Lanka elections evaluation"
+created: 2025-12-15
+---
+
+# Global AI models fail local context alignment as Sri Lanka election responses were generic despite specific queries
+
+CIP's Weval evaluation during Sri Lanka elections found that AI models provided generic, irrelevant responses despite receiving queries with local context. This reveals a specific failure mode: models trained on global data can fail to align to local contexts even when those contexts are explicitly provided in prompts.
+
+This is distinct from capability failure—the models could generate coherent responses, but those responses did not engage with the local political, cultural, or informational context that made the queries meaningful. The failure is in alignment to context-specific needs, not in language understanding or factual knowledge.
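+
+The failure mode is measurable in principle: a response can be fluent and factually safe while never engaging the locale-specific entities a query names. A rough sketch of such a grounding check, offered purely as illustration (the entity list, scoring rule, and example are invented; the source does not describe Weval's actual criteria):
+
+```python
+def local_grounding_score(query_entities, response):
+    """Fraction of locale-specific entities from the query that the
+    response engages with. A fluent but generic answer scores 0."""
+    if not query_entities:
+        return 0.0
+    hits = [e for e in query_entities if e.lower() in response.lower()]
+    return len(hits) / len(query_entities)
+
+# A query naming Sri Lankan specifics, answered with boilerplate advice.
+entities = ["Sri Lanka", "2024 presidential election", "preferential voting"]
+generic = "Voters should research the candidates' platforms and vote according to their values."
+print(local_grounding_score(entities, generic))  # 0.0 despite a coherent answer
+```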
+
+This finding suggests that democratic alignment at scale requires mechanisms that can surface and incorporate local context, not just aggregate global preferences. A model aligned to "humanity in general" may be systematically misaligned to specific communities.
+
+## Evidence
+- CIP Weval evaluation during Sri Lanka elections
+- Models provided "generic, irrelevant responses despite local context"
+- Evaluation specifically tested local political context during active election period
+
+## Challenges
+The source does not specify:
+- Which models were tested
+- What "generic" and "irrelevant" mean operationally (evaluation criteria used)
+- Whether this is a training data problem, a prompt engineering problem, or a fundamental architectural limitation
+- Whether any models performed better than others on local context
+- Sample size of queries tested
+- Whether this failure is specific to elections or generalizes to other local contexts
+
+---
+
+Relevant Notes:
+- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]
+- [[community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules]]
+- [[persistent irreducible disagreement]]
+
+Topics:
+- [[domains/ai-alignment/_map]]
+- [[foundations/collective-intelligence/_map]]
diff --git a/domains/ai-alignment/no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it.md b/domains/ai-alignment/no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it.md
index 0a4e68f42..a34366f76 100644
--- a/domains/ai-alignment/no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it.md
+++ b/domains/ai-alignment/no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it.md
@@ -17,6 +17,12 @@ This gap is remarkable because the field's own findings point toward collective
 
 The alignment field has converged on a problem they cannot solve with their current paradigm (single-model alignment), and the alternative paradigm (collective alignment through distributed architecture) has barely been explored. This is the opening for the TeleoHumanity thesis -- not as philosophical speculation but as practical infrastructure that addresses problems the alignment community has identified but cannot solve within their current framework.
 
+
+### Additional Evidence (challenge)
+*Source: [[2025-12-00-cip-year-in-review-democratic-alignment]] | Added: 2026-03-11 | Extractor: anthropic/claude-sonnet-4.5*
+
+CIP is now building alignment through collective intelligence infrastructure at scale. Global Dialogues reached 10,000+ participants across 70+ countries in 2025. Weval and Samiksha evaluation frameworks are being adopted by frontier labs (Meta, Anthropic, Cohere) and government AI safety institutes (UK, US). Governments in India, Taiwan, and Sri Lanka incorporated findings into policy. CIP's 2026 plans explicitly aim to operationalize democratic input as "standing global infrastructure" and make digital twin evaluations governance requirements for agentic systems. This directly contradicts the claim that no research group is building this infrastructure—CIP is doing exactly that and achieving adoption by major labs and governments.
+
 ---
 
 Relevant Notes:
diff --git a/domains/ai-alignment/samiksha-evaluation-framework-is-the-most-comprehensive-ai-assessment-in-indian-contexts-with-25000-queries-across-11-languages.md b/domains/ai-alignment/samiksha-evaluation-framework-is-the-most-comprehensive-ai-assessment-in-indian-contexts-with-25000-queries-across-11-languages.md
new file mode 100644
index 000000000..c5504d507
--- /dev/null
+++ b/domains/ai-alignment/samiksha-evaluation-framework-is-the-most-comprehensive-ai-assessment-in-indian-contexts-with-25000-queries-across-11-languages.md
@@ -0,0 +1,43 @@
+---
+type: claim
+domain: ai-alignment
+secondary_domains: [collective-intelligence, mechanisms]
+description: "CIP's Samiksha framework evaluated 25,000+ queries across 11 Indian languages with 100,000+ manual evaluations, covering healthcare, agriculture, education, and legal domains"
+confidence: experimental
+source: "CIP Year in Review 2025, Samiksha evaluation framework"
+created: 2025-12-15
+---
+
+# Samiksha evaluation framework is the most comprehensive AI assessment in Indian contexts with 25,000 queries across 11 languages
+
+CIP's Samiksha framework represents a large-scale evaluation of AI systems in Indian contexts, with 25,000+ queries across 11 Indian languages and 100,000+ manual evaluations. The evaluation spans four critical domains: healthcare, agriculture, education, and legal.
+
+The scale is significant because it moves beyond English-language evaluation and Western contexts that dominate AI benchmarking. The 100,000+ manual evaluations provide ground truth for accuracy and safety in contexts where automated evaluation would miss cultural and linguistic nuance.
+
+The framework's focus on healthcare, agriculture, education, and legal domains targets areas where AI deployment could have immediate high-stakes impact on populations underserved by current AI systems. This makes Samiksha a test of whether democratic alignment can extend beyond deliberative processes to operational evaluation infrastructure.
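+
+The reported volumes imply roughly four manual evaluations per query (100,000+ evaluations over 25,000+ queries), enough to aggregate multi-rater judgments per language and domain. A sketch of that aggregation, assuming a flat record schema; the field names and scoring scale are assumptions, not Samiksha's actual data model:
+
+```python
+import statistics
+from collections import defaultdict
+
+def aggregate_by_cell(evaluations):
+    """Group manual ratings by (language, domain) so failures can be
+    localized instead of being averaged away in a single global score."""
+    cells = defaultdict(list)
+    for e in evaluations:
+        cells[(e["language"], e["domain"])].append(e["score"])
+    return {
+        cell: {"n": len(scores), "mean": round(statistics.mean(scores), 2)}
+        for cell, scores in cells.items()
+    }
+
+# Multiple raters per query, mirroring the roughly 4:1 evaluation-to-query ratio.
+evals = [
+    {"language": "Hindi", "domain": "healthcare", "score": 4},
+    {"language": "Hindi", "domain": "healthcare", "score": 2},
+    {"language": "Tamil", "domain": "legal", "score": 1},
+    {"language": "Tamil", "domain": "legal", "score": 2},
+]
+print(aggregate_by_cell(evals))
+# {('Hindi', 'healthcare'): {'n': 2, 'mean': 3.0}, ('Tamil', 'legal'): {'n': 2, 'mean': 1.5}}
+```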
+
+## Evidence
+- 25,000+ queries across 11 Indian languages
+- 100,000+ manual evaluations
+- Domains: healthcare, agriculture, education, legal
+- CIP describes it as "the most comprehensive evaluation of AI in Indian contexts"
+- Healthcare evaluations included medical review for accuracy and safety
+
+## Challenges
+The source does not specify:
+- Which AI models were evaluated
+- What the evaluation results were (only that evaluation happened)
+- Whether findings led to model improvements or deployment changes
+- How "comprehensive" is defined relative to other evaluation efforts
+- Whether results are publicly available or proprietary
+- Breakdown of evaluation results by domain or language
+
+---
+
+Relevant Notes:
+- [[community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules]]
+- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]
+
+Topics:
+- [[domains/ai-alignment/_map]]
+- [[foundations/collective-intelligence/_map]]
diff --git a/inbox/archive/2025-12-00-cip-year-in-review-democratic-alignment.md b/inbox/archive/2025-12-00-cip-year-in-review-democratic-alignment.md
index 6c83adf6a..a953cab83 100644
--- a/inbox/archive/2025-12-00-cip-year-in-review-democratic-alignment.md
+++ b/inbox/archive/2025-12-00-cip-year-in-review-democratic-alignment.md
@@ -7,9 +7,15 @@ date: 2025-12-01
 domain: ai-alignment
 secondary_domains: [collective-intelligence, mechanisms]
 format: article
-status: unprocessed
+status: processed
 priority: medium
 tags: [cip, democratic-alignment, global-dialogues, weval, samiksha, digital-twin, frontier-lab-adoption]
+processed_by: theseus
+processed_date: 2025-12-01
+claims_extracted: ["democratic-ai-evaluation-achieves-70-percent-cross-partisan-consensus-on-political-neutrality-criteria.md", "global-ai-models-fail-local-context-alignment-as-sri-lanka-election-responses-were-generic-despite-specific-queries.md", "samiksha-evaluation-framework-is-the-most-comprehensive-ai-assessment-in-indian-contexts-with-25000-queries-across-11-languages.md", "frontier-ai-labs-adopted-democratic-evaluation-tools-as-meta-anthropic-cohere-and-government-institutes-incorporated-cip-findings.md", "47-percent-of-users-report-chatbot-interactions-increased-their-belief-certainty-revealing-epistemic-risk-from-ai-confidence.md", "58-percent-believe-ai-could-make-superior-decisions-versus-local-elected-representatives-revealing-ambiguous-trust-in-ai-governance.md"]
+enrichments_applied: ["democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations.md", "no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it.md", "community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules.md"]
+extraction_model: "anthropic/claude-sonnet-4.5"
+extraction_notes: "Six claims extracted focusing on democratic alignment scaling, evaluation framework adoption, and epistemic risks. Three enrichments: extending democratic assembly scale evidence, challenging the 'no infrastructure' claim with CIP's actual deployment, and confirming community norm elicitation findings. The 58% AI-over-representatives finding and 47% increased-certainty finding are particularly significant for alignment theory. The evaluation-to-deployment gap (labs adopting tools vs. changing products) remains unresolved and flagged in the frontier-lab-adoption claim."
 ---
 
 ## Content
@@ -59,3 +65,12 @@ CIP's comprehensive 2025 results and 2026 plans.
 PRIMARY CONNECTION: [[democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations]]
 WHY ARCHIVED: Scale-up evidence for democratic alignment + frontier lab adoption evidence
 EXTRACTION HINT: The 70%+ cross-partisan consensus and the evaluation-to-deployment gap are both extractable
+
+
+## Key Facts
+- CIP Global Dialogues: 10,000+ participants across 70+ countries in 6 deliberative dialogues (2025)
+- Weval political neutrality: 1,000 participants, 400 prompts, 107 criteria, 70%+ cross-partisan consensus
+- Samiksha: 25,000+ queries across 11 Indian languages, 100,000+ manual evaluations
+- 28% agreed AI should override established rules when it calculates that doing so produces better outcomes
+- 13.7% reported concerning/reality-distorting AI interactions affecting someone they know
+- CIP partners: Meta, Cohere, Anthropic, UK/US AI Safety Institutes, India/Taiwan/Sri Lanka governments