
---
type: claim
domain: ai-alignment
secondary_domains:
  - mechanisms
  - collective-intelligence
description: AI alignment feedback should use citizens' assemblies or representative sampling rather than crowdworker platforms to ensure evaluator diversity reflects actual populations
confidence: likely
source: Conitzer et al. (2024), 'Social Choice Should Guide AI Alignment' (ICML 2024)
created: 2026-03-11
sourced_from: inbox/archive/ai-alignment/2024-04-00-conitzer-social-choice-guide-alignment.md
---

# Representative sampling and deliberative mechanisms should replace convenience platforms for AI alignment feedback

Conitzer et al. (2024) argue that current RLHF implementations use convenience sampling (crowdworker platforms like MTurk) rather than representative sampling or deliberative mechanisms. This creates systematic bias in whose values shape AI behavior. The paper recommends citizens' assemblies or stratified representative sampling as alternatives.

The core issue: crowdworker platforms systematically over-represent certain demographics (younger, more educated, Western, tech-comfortable) and under-represent others. If AI alignment depends on human feedback, the composition of the feedback pool determines whose values are encoded. With convenience sampling, that composition is chosen implicitly, by whoever happens to sign up for crowdwork platforms.
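The paper's stratified-sampling alternative can be sketched in a few lines. This is a minimal illustration, not an implementation from the paper: the demographic attribute, pool composition, and target shares are all hypothetical.

```python
import random
from collections import Counter

def stratified_sample(pool, target_shares, key, n, seed=0):
    """Draw n evaluators so each stratum's share matches a target
    population's shares, rather than the pool's own skew.
    Note: rounding can leave the total slightly off n."""
    rng = random.Random(seed)
    by_stratum = {}
    for person in pool:
        by_stratum.setdefault(key(person), []).append(person)
    sample = []
    for stratum, share in target_shares.items():
        k = round(n * share)
        candidates = by_stratum.get(stratum, [])
        sample.extend(rng.sample(candidates, min(k, len(candidates))))
    return sample

# A crowdworker-style pool skewed toward young evaluators (illustrative numbers).
pool = ([{"id": i, "age_group": "18-34"} for i in range(80)]
        + [{"id": 80 + i, "age_group": "35-64"} for i in range(15)]
        + [{"id": 95 + i, "age_group": "65+"} for i in range(5)])

# Target shares from a hypothetical census of the reference population.
target = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

sample = stratified_sample(pool, target, key=lambda p: p["age_group"], n=20)
print(Counter(p["age_group"] for p in sample))
```

The pool is 80% young evaluators, but the drawn sample follows the census shares (6/10/4), which is the sense in which stratification makes the representativeness choice explicit rather than implicit.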

Deliberative mechanisms like citizens' assemblies add a second benefit: evaluators engage with each other's perspectives and reasoning, not just their initial preferences. This can surface shared values that aren't apparent from aggregating isolated individual judgments.

## Evidence

## Practical Challenges

Representative sampling and deliberative mechanisms are more expensive and slower than crowdworker platforms. This creates competitive pressure: companies that use convenience sampling can iterate faster and cheaper than those using representative sampling. The paper doesn't address how to resolve this tension.

There is also the question of which population the sample should represent: the global population, a national one, or users of the specific AI system? Different choices lead to different value distributions.
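That the choice of reference population changes the outcome can be shown with a toy aggregation. The preference shares and population weights below are invented for illustration; nothing here is data from the paper.

```python
# Share of each group that prefers option A over option B (hypothetical).
prefs = {"18-34": 0.70, "35-64": 0.45, "65+": 0.30}

def weighted_support(prefs, population_shares):
    """Aggregate support for option A under a chosen reference population."""
    return sum(prefs[g] * w for g, w in population_shares.items())

# Two candidate reference populations (illustrative weights).
crowdworker_pool = {"18-34": 0.80, "35-64": 0.15, "65+": 0.05}
national_census = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

print(weighted_support(prefs, crowdworker_pool))  # above 0.5: A looks like the majority view
print(weighted_support(prefs, national_census))   # below 0.5: A loses its majority
```

The same group-level preferences yield a majority for option A under the crowdworker weighting and a minority under the census weighting, so the "representative of what?" decision is itself a value-laden design choice.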

## Relationship to Existing Work

This recommendation directly supports *collective intelligence requires diversity as a structural precondition not a moral preference*: diversity isn't just normatively desirable, it's necessary for the aggregation mechanism to work correctly.

The deliberative component connects to *democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations*, which provides empirical evidence that deliberation improves alignment outcomes.


Relevant Notes:

Topics:

- domains/ai-alignment/_map
- core/mechanisms/_map
- foundations/collective-intelligence/_map