teleo-codex/domains/ai-alignment/pluralistic-ai-alignment-through-multiple-systems-preserves-value-diversity-better-than-forced-consensus.md
Teleo Pipeline cc70dff0cc extract: 2024-04-00-conitzer-social-choice-guide-alignment
Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>
2026-03-15 15:25:13 +00:00

3.3 KiB

type: claim
domain: ai-alignment
description: Creating multiple AI systems reflecting incompatible values may be preferable to aggregating all preferences into a single aligned system when values are irreducibly diverse
confidence: experimental
source: Conitzer et al. 2024, 'Social Choice Should Guide AI Alignment' (ICML 2024)
created: 2026-03-11
secondary_domains: collective-intelligence, mechanisms

Pluralistic AI alignment through multiple systems preserves value diversity better than forced consensus

When human values are genuinely incompatible—not merely different but irreducibly in conflict—attempting to aggregate them into a single aligned AI system may be worse than creating multiple AI systems that reflect different value sets. This "pluralism option" treats value diversity as a structural feature to preserve rather than a problem to solve through aggregation.

The Conitzer et al. paper proposes this as an explicit alternative to standard alignment approaches: rather than asking "how do we aggregate all human preferences into one reward function?", ask "when should we create multiple systems serving different value communities?"
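The contrast can be sketched numerically. In this toy illustration (names and numbers are invented, not from the paper), two value communities hold incompatible utilities over three candidate policies; aggregating everyone into one reward function leaves the losing community with nothing, while per-community systems leave each community near its own optimum:

```python
# Toy illustration (communities and utilities are invented, not from the paper):
# two value communities with incompatible utilities over three policies.
communities = {
    "c1": {"A": 1.0, "B": 0.4, "C": 0.0},
    "c2": {"A": 0.0, "B": 0.4, "C": 0.9},
}

def single_system(communities):
    """Aggregate everyone into one reward function (total utility) and
    deploy the single policy that maximizes it."""
    policies = next(iter(communities.values()))
    return max(policies, key=lambda p: sum(u[p] for u in communities.values()))

def pluralistic(communities):
    """One system per community, each maximizing its own community's utility."""
    return {name: max(u, key=u.get) for name, u in communities.items()}

single = single_system(communities)   # "A" wins: total 1.0 vs 0.8 (B) and 0.9 (C)
worst_off_single = min(u[single] for u in communities.values())       # c2 gets 0.0

plural = pluralistic(communities)     # {"c1": "A", "c2": "C"}
worst_off_plural = min(u[plural[n]] for n, u in communities.items())  # 0.9
```

Here the aggregated system is effectively dictatorial: c1 gets its top choice while c2 gets zero utility, whereas under pluralism the worst-off community still achieves 0.9 of its best outcome.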

This aligns closely with the collective superintelligence thesis, which holds that collective superintelligence is the alternative to monolithic AI controlled by a few. The pluralism option is structurally similar: instead of one superintelligent system controlled by whoever wins the aggregation game, it accepts multiple systems reflecting genuine value diversity.

The key insight: some disagreements are permanently irreducible because they stem from genuine value differences, not information gaps. When preferences are fundamentally incompatible, forcing them into a single system does one of three things:

  1. Creates a dictatorial outcome (one value set dominates)
  2. Creates an incoherent compromise (satisfies no one)
  3. Creates instability (the system oscillates between incompatible behaviors)

Multiple systems avoid this by accepting pluralism as the goal rather than treating it as a failure of aggregation.
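That a coherent compromise may not even exist follows from the classic Condorcet cycle in social choice theory (a standard textbook example, not specific to this paper): with three voters and pairwise majority rule, the aggregate preference is intransitive, so any single policy the system commits to is majority-dispreferred to some alternative:

```python
# Classic Condorcet cycle (standard social-choice example): three voters,
# three options, each ranking listed best-first.
rankings = [
    ["X", "Y", "Z"],
    ["Y", "Z", "X"],
    ["Z", "X", "Y"],
]

def majority_prefers(a, b):
    """True if a strict majority of voters rank option a above option b."""
    wins = sum(r.index(a) < r.index(b) for r in rankings)
    return wins > len(rankings) / 2

# Pairwise majorities form a cycle: X beats Y, Y beats Z, Z beats X,
# so no single option is stable against majority preference.
cycle = [majority_prefers("X", "Y"),
         majority_prefers("Y", "Z"),
         majority_prefers("Z", "X")]
```

The cycle is one concrete mechanism behind failure modes 2 and 3 above: whichever option a single aggregated system settles on, a majority would prefer to switch.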

Evidence

  • Conitzer et al. (2024) explicitly propose the "pluralism option" as an alternative to single-system alignment
  • The paper frames this as appropriate when values are "genuinely incompatible" rather than merely diverse
  • This is presented alongside, not instead of, formal aggregation methods—suggesting pluralism for irreducible conflicts, aggregation for resolvable diversity

Challenges

The paper does not address:

  • How to determine when values are irreducibly incompatible vs. merely diverse
  • How multiple systems would interact or compete
  • Whether this creates coordination problems (multiple AIs with conflicting goals)
  • Resource allocation between systems (who gets which AI?)

These are open questions, making this claim experimental rather than likely.


Relevant Notes:

Topics:

  • domains/ai-alignment/_map
  • foundations/collective-intelligence/_map