| type | domain | secondary_domains | description | confidence | source | created |
|---|---|---|---|---|---|---|
| claim | ai-alignment | | AI alignment feedback should use citizens' assemblies or representative sampling rather than crowdworker platforms to ensure evaluator diversity reflects actual populations | likely | Conitzer et al. (2024), 'Social Choice Should Guide AI Alignment' (ICML 2024) | 2026-03-11 |
Representative sampling and deliberative mechanisms should replace convenience platforms for AI alignment feedback
Conitzer et al. (2024) argue that current RLHF implementations use convenience sampling (crowdworker platforms like MTurk) rather than representative sampling or deliberative mechanisms. This creates systematic bias in whose values shape AI behavior. The paper recommends citizens' assemblies or stratified representative sampling as alternatives.
The core issue: crowdworker platforms systematically over-represent certain demographics (younger, more educated, Western, tech-comfortable) and under-represent others. If AI alignment depends on human feedback, the composition of the feedback pool determines whose values are encoded. Convenience sampling makes this choice implicitly based on who signs up for crowdwork platforms.
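The contrast between the two sampling regimes can be made concrete. The sketch below compares a convenience pool skewed young (illustrative numbers, not figures from the paper) against a stratified sample allocated in proportion to hypothetical census shares:

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical census shares for one demographic axis
# (illustrative numbers, not from the paper).
population_shares = {"18-29": 0.20, "30-49": 0.33, "50-64": 0.25, "65+": 0.22}

# A convenience pool skews toward younger participants, as crowdwork
# platforms tend to (again, invented numbers for illustration).
convenience_pool = (
    ["18-29"] * 55 + ["30-49"] * 30 + ["50-64"] * 10 + ["65+"] * 5
)

def convenience_sample(pool, n):
    """Take whoever signs up: draw uniformly from the skewed pool."""
    return [random.choice(pool) for _ in range(n)]

def stratified_sample(shares, n):
    """Allocate evaluator slots in proportion to population shares."""
    return [group for group, share in shares.items()
            for _ in range(round(share * n))]

print(Counter(convenience_sample(convenience_pool, 100)))
print(Counter(stratified_sample(population_shares, 100)))
```

The point is structural: the convenience sample inherits whatever skew the sign-up pool has, while the stratified sample matches the target population by construction.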
Deliberative mechanisms like citizens' assemblies add a second benefit: evaluators engage with each other's perspectives and reasoning, not just their initial preferences. This can surface shared values that aren't apparent from aggregating isolated individual judgments.
Evidence
- Conitzer et al. (2024) explicitly recommend "representative sampling or deliberative mechanisms (citizens' assemblies) rather than convenience platforms"
- The paper cites evidence that democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations, supporting the claim that deliberative approaches work
- Current RLHF implementations predominantly use MTurk, Upwork, or similar platforms
Practical Challenges
Representative sampling and deliberative mechanisms are more expensive and slower than crowdworker platforms. This creates competitive pressure: companies that use convenience sampling can iterate faster and cheaper than those using representative sampling. The paper doesn't address how to resolve this tension.
Additionally: representative of what population? Global? National? Users of the specific AI system? Different choices lead to different value distributions.
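The reference-population question is not cosmetic: the same group-level preference rates, aggregated under different population weights, can flip the outcome. A toy illustration with made-up numbers:

```python
# Fraction of each (hypothetical) group preferring "option A" on some
# alignment question. All numbers are invented for illustration.
approval = {"group_x": 0.70, "group_y": 0.35}

# Group weights under two candidate reference populations.
global_population = {"group_x": 0.30, "group_y": 0.70}
system_users      = {"group_x": 0.80, "group_y": 0.20}

def aggregate(approval, weights):
    """Population-weighted approval rate for option A."""
    return sum(approval[g] * w for g, w in weights.items())

print(aggregate(approval, global_population))  # 0.455 -> majority against A
print(aggregate(approval, system_users))       # 0.630 -> majority for A
```

Choosing the reference population is therefore itself a value-laden decision, made implicitly whenever a feedback pool is assembled.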
Relationship to Existing Work
This recommendation directly supports 'collective intelligence requires diversity as a structural precondition not a moral preference': diversity isn't just normatively desirable, it's necessary for the aggregation mechanism to work correctly.
The deliberative component connects to 'democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations', which provides empirical evidence that deliberation improves alignment outcomes.
Relevant Notes:
- collective intelligence requires diversity as a structural precondition not a moral preference
- democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations
- community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules
Topics:
- domains/ai-alignment/_map
- core/mechanisms/_map
- foundations/collective-intelligence/_map