| type | domain | secondary_domains | description | confidence | source | created |
|---|---|---|---|---|---|---|
| claim | ai-alignment | | Creating multiple AI systems reflecting genuinely incompatible values may be structurally superior to aggregating all preferences into one aligned system | experimental | Conitzer et al. (2024), 'Social Choice Should Guide AI Alignment' (ICML 2024) | 2026-03-11 |
Pluralistic AI alignment through multiple systems preserves value diversity better than forced consensus
Conitzer et al. (2024) propose a "pluralism option": rather than forcing all human values into a single aligned AI system through preference aggregation, create multiple AI systems that reflect genuinely incompatible value sets. This structural approach to pluralism may better preserve value diversity than any aggregation mechanism.
The paper positions this as an alternative to the standard alignment framing, which assumes a single AI system must be aligned with aggregated human preferences. When values are irreducibly diverse—not just different but fundamentally incompatible—attempting to merge them into one system necessarily distorts or suppresses some values. Multiple systems allow each value set to be faithfully represented.
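The distortion claim follows directly from classical social choice. A minimal sketch (with hypothetical agents and options) is the Condorcet paradox: three agents with cyclic majority preferences over options A, B, C. Any single aggregate ranking must overturn at least one pairwise majority, so aggregation necessarily suppresses some agents' values:

```python
# Condorcet paradox: cyclic majority preferences make every
# aggregate ranking contradict some pairwise majority.
# Agents and options are illustrative, not from the paper.
from itertools import permutations

# Each agent's ranking, best-first.
rankings = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]

def majority_prefers(x, y):
    """True if a strict majority of agents rank x above y."""
    votes = sum(r.index(x) < r.index(y) for r in rankings)
    return votes > len(rankings) / 2

def contradicted_majorities(order):
    """Pairwise majorities that the aggregate order overturns."""
    return [
        (x, y)
        for x in "ABC"
        for y in "ABC"
        if x != y
        and majority_prefers(x, y)
        and order.index(x) > order.index(y)
    ]

# Majorities form a cycle: A > B, B > C, C > A.
# Hence every possible aggregate ranking contradicts at least one.
for order in permutations("ABC"):
    assert contradicted_majorities(list(order))
```

Keeping three separate systems, one per ranking, represents every agent faithfully; any single merged ranking cannot.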
This connects directly to the collective superintelligence thesis: rather than one monolithic aligned AI, an ecosystem of specialized systems with different value orientations, coordinating through explicit mechanisms. The paper doesn't fully develop this direction but identifies it as a viable path.
Evidence
- Conitzer et al. (2024) explicitly propose "creating multiple AI systems reflecting genuinely incompatible values rather than forcing artificial consensus"
- The paper cites persistent irreducible disagreement as a structural feature that aggregation cannot resolve
- Stuart Russell's co-authorship signals this is a serious position within mainstream AI safety, not a fringe view
Relationship to Collective Superintelligence
This is the closest mainstream AI alignment has come to the collective superintelligence thesis articulated in collective superintelligence is the alternative to monolithic AI controlled by a few. The paper doesn't use the term "collective superintelligence" but the structural logic is identical: value diversity is preserved through system plurality rather than aggregation.
The key difference: Conitzer et al. frame this as an option among several approaches, while the collective superintelligence thesis argues this is the only path that preserves human agency at scale. The paper's pluralism option is permissive ("we could do this"), not prescriptive ("we must do this").
Open Questions
- How do multiple value-aligned systems coordinate when their values conflict in practice?
- What governance mechanisms determine which value sets get their own system?
- Does this approach scale to thousands of value clusters or only to a handful?
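One concrete shape the first question could take, purely as a hypothetical sketch (the system names, scores, and threshold below are invented for illustration, not from the paper): each value-aligned system scores candidate joint actions under its own values, and an action is taken only if every system finds it acceptable, preserving each value set's veto rather than averaging values away.

```python
# Hypothetical coordination mechanism among value-aligned systems:
# unanimity threshold plus maximin selection. All names and scores
# are illustrative assumptions.

THRESHOLD = 0.5  # minimum acceptability per system (assumption)

# Each system's value function, as a score table over candidate actions.
systems = {
    "welfare-maximizer": {"act_a": 0.9, "act_b": 0.6, "act_c": 0.2},
    "rights-protector":  {"act_a": 0.3, "act_b": 0.7, "act_c": 0.8},
    "tradition-keeper":  {"act_a": 0.4, "act_b": 0.6, "act_c": 0.9},
}

def acceptable_to_all(action):
    """An action passes only if no system scores it below threshold."""
    return all(scores[action] >= THRESHOLD for scores in systems.values())

def coordinate(actions):
    """Among actions acceptable to all systems, pick the one with the
    highest minimum score (maximin), so no value set is sacrificed."""
    viable = [a for a in actions if acceptable_to_all(a)]
    if not viable:
        return None  # genuine conflict: escalate to governance
    return max(viable, key=lambda a: min(s[a] for s in systems.values()))

print(coordinate(["act_a", "act_b", "act_c"]))  # -> act_b
```

The maximin rule here is one design choice among many; the open question is precisely which such rule, if any, scales past a handful of value clusters.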
Relevant Notes:
- collective superintelligence is the alternative to monolithic AI controlled by a few
- pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state
- persistent irreducible disagreement
- some disagreements are permanently irreducible because they stem from genuine value differences not information gaps and systems must map rather than eliminate them
Topics:
- domains/ai-alignment/_map
- foundations/collective-intelligence/_map
- core/mechanisms/_map