

type: claim
domain: ai-alignment
secondary_domains: collective-intelligence, mechanisms
description: Creating multiple AI systems reflecting genuinely incompatible values may be structurally superior to aggregating all preferences into one aligned system
confidence: experimental
source: Conitzer et al. (2024), 'Social Choice Should Guide AI Alignment' (ICML 2024)
created: 2026-03-11

Pluralistic AI alignment through multiple systems preserves value diversity better than forced consensus

Conitzer et al. (2024) propose a "pluralism option": rather than forcing all human values into a single aligned AI system through preference aggregation, create multiple AI systems that reflect genuinely incompatible value sets. This structural approach to pluralism may better preserve value diversity than any aggregation mechanism.

The paper positions this as an alternative to the standard alignment framing, which assumes a single AI system must be aligned with aggregated human preferences. When values are irreducibly diverse—not just different but fundamentally incompatible—attempting to merge them into one system necessarily distorts or suppresses some values. Multiple systems allow each value set to be faithfully represented.
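The distortion argument can be made concrete with a toy sketch (the utilities and option names below are invented for illustration, not from the paper): when two value sets disagree sharply, a utilitarian aggregation rule can select a compromise that neither value set ranks first, while one system per value set keeps each top choice intact.

```python
# Illustrative only: two incompatible value sets scored over three options.
# All numbers are made up to exhibit the structural point.
values = {
    "value_set_A": {"option_1": 1.0, "option_2": 0.0, "option_3": 0.6},
    "value_set_B": {"option_1": 0.0, "option_2": 1.0, "option_3": 0.6},
}
options = ["option_1", "option_2", "option_3"]

# Single aligned system: aggregate by summing utilities (a utilitarian rule).
aggregate = {o: sum(v[o] for v in values.values()) for o in options}
merged_choice = max(aggregate, key=aggregate.get)

# Pluralism option: one system per value set, each faithful to its own values.
plural_choices = {name: max(options, key=v.get) for name, v in values.items()}

print(merged_choice)   # → option_3 (a compromise neither value set ranks first)
print(plural_choices)  # → {'value_set_A': 'option_1', 'value_set_B': 'option_2'}
```

Any aggregation rule would show a variant of this effect on some profile of incompatible values; the sum rule is used here only because it is the simplest to state.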

This connects directly to the collective superintelligence thesis: rather than one monolithic aligned AI, an ecosystem of specialized systems with different value orientations, coordinating through explicit mechanisms. The paper doesn't fully develop this direction but identifies it as a viable path.
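A minimal sketch of that structural logic (all names and the stub architecture here are hypothetical; the paper specifies no implementation): a query fans out to several value-oriented systems, and the plural answers are returned side by side rather than merged, leaving reconciliation to whatever explicit mechanism sits downstream.

```python
from typing import Callable, Dict

# Hypothetical value-oriented systems; in practice each would be a
# separately aligned model, not a string-returning stub.
SystemFn = Callable[[str], str]

def make_system(orientation: str) -> SystemFn:
    def respond(query: str) -> str:
        return f"[{orientation}] answer to {query!r}"
    return respond

ecosystem: Dict[str, SystemFn] = {
    name: make_system(name)
    for name in ("welfare-maximizing", "rights-based", "tradition-preserving")
}

def fan_out(query: str) -> Dict[str, str]:
    # Preserve plurality: every system answers, and no answer is collapsed
    # into a single aggregate.
    return {name: system(query) for name, system in ecosystem.items()}

answers = fan_out("allocate scarce medical resources")
assert len(answers) == 3
```

The design choice that matters is the return type: a mapping from value orientation to answer, not a single answer, which is the pluralism option expressed as an interface.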

Evidence

  • Conitzer et al. (2024) explicitly propose "creating multiple AI systems reflecting genuinely incompatible values rather than forcing artificial consensus"
  • The paper cites persistent irreducible disagreement as a structural feature that aggregation cannot resolve
  • Stuart Russell's co-authorship signals this is a serious position within mainstream AI safety, not a fringe view

Relationship to Collective Superintelligence

This is the closest mainstream AI alignment has come to the collective superintelligence thesis articulated in collective superintelligence is the alternative to monolithic AI controlled by a few. The paper doesn't use the term "collective superintelligence" but the structural logic is identical: value diversity is preserved through system plurality rather than aggregation.

The key difference: Conitzer et al. frame this as an option among several approaches, while the collective superintelligence thesis argues this is the only path that preserves human agency at scale. The paper's pluralism option is permissive ("we could do this"), not prescriptive ("we must do this").

Open Questions

  • How do multiple value-aligned systems coordinate when their values conflict in practice?
  • What governance mechanisms determine which value sets get their own system?
  • Does this approach scale to thousands of value clusters or only to a handful?

Relevant Notes:

Topics:

  • domains/ai-alignment/_map
  • foundations/collective-intelligence/_map
  • core/mechanisms/_map