| type | domain | description | confidence | source | created | secondary_domains |
|---|---|---|---|---|---|---|
| claim | ai-alignment | Creating multiple AI systems reflecting incompatible values may be preferable to aggregating all preferences into a single aligned system when values are irreducibly diverse | experimental | Conitzer et al. 2024, 'Social Choice Should Guide AI Alignment' (ICML 2024) | 2026-03-11 | |
|
Pluralistic AI alignment through multiple systems preserves value diversity better than forced consensus
When human values are genuinely incompatible—not merely different but irreducibly in conflict—attempting to aggregate them into a single aligned AI system may be worse than creating multiple AI systems that reflect different value sets. This "pluralism option" treats value diversity as a structural feature to preserve rather than a problem to solve through aggregation.
The Conitzer et al. paper proposes this as an explicit alternative to standard alignment approaches: rather than asking "how do we aggregate all human preferences into one reward function?", ask "when should we create multiple systems serving different value communities?"
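The contrast between the two framings can be made concrete with a toy example (the community names and utility numbers below are hypothetical, not from the paper): two value communities assign opposed utilities to the same outcomes, and maximizing mean utility selects a compromise outcome that neither community ranks first, while per-community systems each serve their own group.

```python
# Toy illustration (hypothetical values): two communities with opposed
# utilities over three outcomes; "z" is a bland compromise.
communities = {
    "A": {"x": 1.0, "y": 0.0, "z": 0.6},
    "B": {"x": 0.0, "y": 1.0, "z": 0.6},
}
outcomes = ["x", "y", "z"]

# Single-system framing: one reward function = summed utility across communities.
aggregated = max(outcomes,
                 key=lambda o: sum(u[o] for u in communities.values()))

# Pluralism framing: one system per community, each serving its own values.
per_community = {name: max(outcomes, key=u.get)
                 for name, u in communities.items()}

print(aggregated)      # "z": top of the aggregate, first choice of no one
print(per_community)   # {'A': 'x', 'B': 'y'}
```

The aggregated optimum "z" is exactly the "incoherent compromise" discussed below: acceptable on average, preferred by no one.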
This aligns closely with the collective superintelligence thesis: collective superintelligence is the alternative to monolithic AI controlled by a few. The pluralism option is structurally similar—instead of one superintelligent system controlled by whoever wins the aggregation game, multiple systems reflecting genuine value diversity.
The key insight: some disagreements are permanently irreducible because they stem from genuine value differences, not information gaps. When preferences are fundamentally incompatible, forcing them into a single system either:
- Creates a dictatorial outcome (one value set dominates)
- Creates an incoherent compromise (satisfies no one)
- Creates instability (the system oscillates between incompatible behaviors)
Multiple systems avoid this by accepting pluralism as the goal rather than treating it as a failure of aggregation.
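The "incoherent compromise" and "instability" failure modes echo a classic social-choice result: with cyclic majority preferences (a Condorcet cycle), no option beats every other, so a single system deferring to "the majority" has no stable target. A minimal sketch, using a hypothetical three-voter profile:

```python
# Condorcet cycle: pairwise majority voting yields A > B > C > A,
# so no coherent single "majority will" exists for one system to follow.
ballots = [
    ["A", "B", "C"],  # voter 1 ranks A > B > C
    ["B", "C", "A"],  # voter 2 ranks B > C > A
    ["C", "A", "B"],  # voter 3 ranks C > A > B
]

def majority_prefers(x, y):
    """True if a strict majority of ballots ranks x above y."""
    wins = sum(1 for b in ballots if b.index(x) < b.index(y))
    return wins > len(ballots) / 2

# Each option wins one pairwise contest and loses another.
print(majority_prefers("A", "B"), majority_prefers("B", "C"),
      majority_prefers("C", "A"))   # True True True

# A Condorcet winner would beat every rival; here none exists.
winners = [c for c in "ABC"
           if all(majority_prefers(c, o) for o in "ABC" if o != c)]
print(winners)  # []
```

A system committed to honoring each pairwise majority would oscillate A → B → C → A indefinitely, which is the instability case above.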
Evidence
- Conitzer et al. (2024) explicitly propose the "pluralism option" as an alternative to single-system alignment
- The paper frames this as appropriate when values are "genuinely incompatible" rather than merely diverse
- This is presented alongside, not instead of, formal aggregation methods—suggesting pluralism for irreducible conflicts, aggregation for resolvable diversity
Challenges
The paper does not address:
- How to determine when values are irreducibly incompatible vs. merely diverse
- How multiple systems would interact or compete
- Whether this creates coordination problems (multiple AIs with conflicting goals)
- Resource allocation between systems (who gets which AI?)
These remain open questions, which is why this claim is tagged experimental rather than likely.
Relevant Notes:
- collective superintelligence is the alternative to monolithic AI controlled by a few
- pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state
- persistent irreducible disagreement
- some disagreements are permanently irreducible because they stem from genuine value differences not information gaps and systems must map rather than eliminate them
Topics:
- domains/ai-alignment/_map
- foundations/collective-intelligence/_map