Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>
| type | domain | secondary_domains | description | confidence | source | created |
|---|---|---|---|---|---|---|
| claim | ai-alignment | | Practical voting methods like Borda Count and Ranked Pairs avoid Arrow's impossibility by sacrificing IIA rather than claiming to overcome the theorem | proven | Conitzer et al. (2024), 'Social Choice Should Guide AI Alignment' (ICML 2024) | 2026-03-11 |
Post-Arrow social choice mechanisms work by weakening independence of irrelevant alternatives
Arrow's impossibility theorem proves that no ordinal preference aggregation method can simultaneously satisfy unrestricted domain, Pareto efficiency, independence of irrelevant alternatives (IIA), and non-dictatorship. Rather than claiming to overcome this theorem, post-Arrow social choice theory has spent 70 years developing practical mechanisms that work by deliberately weakening IIA.
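The obstruction Arrow's theorem generalizes can be seen directly in the classic Condorcet paradox: pairwise majority voting over a hypothetical three-voter profile (the ballots below are illustrative, not from the paper) can produce a strict preference cycle, so no ordering of candidates is consistent with all pairwise majorities.

```python
# The Condorcet paradox: three ballots, each ranking candidates best-to-worst.
# This profile is a standard textbook illustration, not data from the paper.
ballots = [
    ("A", "B", "C"),
    ("B", "C", "A"),
    ("C", "A", "B"),
]

def majority_prefers(x, y, ballots):
    """True if a strict majority of ballots ranks x above y."""
    wins = sum(1 for b in ballots if b.index(x) < b.index(y))
    return wins > len(ballots) / 2

# Pairwise majorities form a cycle: A beats B, B beats C, C beats A,
# so majority preference yields no coherent collective ranking.
print(majority_prefers("A", "B", ballots))  # True
print(majority_prefers("B", "C", ballots))  # True
print(majority_prefers("C", "A", ballots))  # True
```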
Conitzer et al. (2024) emphasize this key insight: "for ordinal preference aggregation, in order to avoid dictatorships, oligarchies and vetoers, one must weaken IIA." Practical voting methods like Borda Count, Instant Runoff Voting, and Ranked Pairs all sacrifice IIA to achieve other desirable properties. This is not a failure—it's a principled tradeoff that enables functional collective decision-making.
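What "sacrificing IIA" means in practice can be shown with a minimal Borda Count sketch on a hypothetical five-voter profile (the profile and names are illustrative). Removing candidate C changes no voter's relative A-versus-B ranking, yet the Borda winner flips, which is exactly an IIA violation:

```python
from collections import Counter

def borda(ballots):
    """Borda Count: with m candidates, a ballot awards m-1 points to its
    first choice, m-2 to its second, ..., 0 to its last."""
    scores = Counter()
    for ballot in ballots:
        m = len(ballot)
        for rank, cand in enumerate(ballot):
            scores[cand] += m - 1 - rank
    return scores.most_common(1)[0][0]

# Hypothetical profile: 3 voters rank A > B > C, 2 voters rank B > C > A.
with_c = [("A", "B", "C")] * 3 + [("B", "C", "A")] * 2
# Delete C from every ballot; each voter's A-vs-B ordering is unchanged.
without_c = [tuple(x for x in b if x != "C") for b in with_c]

print(borda(with_c))     # B  (scores: A=6, B=7, C=2)
print(borda(without_c))  # A  (scores: A=3, B=2)
```

The "irrelevant" alternative C altered the outcome between A and B, so Borda violates IIA; Arrow's theorem says some such sacrifice is unavoidable for any non-dictatorial method on unrestricted ordinal domains.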
The paper recommends examining specific voting methods that have been formally analyzed for their properties rather than searching for a mythical "perfect" aggregation method that Arrow proved cannot exist. Different methods make different tradeoffs, and the choice should depend on the specific alignment context.
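The point that different methods embody different tradeoffs can be made concrete by running two of the named methods on one hypothetical 15-voter profile (again illustrative, not from the paper): Borda Count rewards broad consensus and picks B, while Instant Runoff Voting rewards strong first-place support and picks C.

```python
from collections import Counter

def borda_winner(ballots):
    """Borda Count winner: positional scoring, m-1 points down to 0."""
    scores = Counter()
    for b in ballots:
        for rank, cand in enumerate(b):
            scores[cand] += len(b) - 1 - rank
    return max(scores, key=scores.get)

def irv_winner(ballots):
    """Instant Runoff winner: repeatedly eliminate the candidate with the
    fewest first-place votes until someone holds a strict majority."""
    remaining = set(ballots[0])
    while True:
        firsts = Counter({c: 0 for c in remaining})
        for b in ballots:
            firsts[next(c for c in b if c in remaining)] += 1
        top, votes = firsts.most_common(1)[0]
        if votes * 2 > len(ballots):
            return top
        remaining.discard(min(firsts, key=firsts.get))

# 6 voters: A > B > C; 5 voters: C > B > A; 4 voters: B > C > A.
profile = [("A", "B", "C")] * 6 + [("C", "B", "A")] * 5 + [("B", "C", "A")] * 4
print(borda_winner(profile))  # B (everyone's first or second choice)
print(irv_winner(profile))    # C (B is eliminated first, transfers to C)
```

Neither answer is "wrong"; each method's winner reflects the properties it preserves at IIA's expense, which is why the choice should depend on the deployment context.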
Evidence
- Arrow's impossibility theorem (1951) establishes the fundamental constraint
- Conitzer et al. (2024) explicitly state: "Rather than claiming to overcome Arrow's theorem, the paper leverages post-Arrow social choice theory"
- Specific mechanisms recommended: Borda Count, Instant Runoff, Ranked Pairs—all formally analyzed for their properties
- The paper proposes RLCHF variants that use these established social welfare functions rather than inventing new aggregation methods
Practical Implications
This resolves a common confusion in AI alignment discussions: people often cite Arrow's theorem as proof that preference aggregation is impossible, when the actual lesson is that perfect aggregation is impossible and we must choose which properties to prioritize. The 70-year history of social choice theory provides a menu of well-understood options.
For AI alignment, this means: (1) stop searching for a universal aggregation method, (2) explicitly choose which Arrow conditions to relax based on the deployment context, (3) use established voting methods with known properties rather than ad-hoc aggregation.
Relevant Notes:
- designing coordination rules is categorically different from designing coordination outcomes as nine intellectual traditions independently confirm
- collective intelligence requires diversity as a structural precondition not a moral preference
- persistent irreducible disagreement
Topics:
- domains/ai-alignment/_map
- core/mechanisms/_map
- foundations/collective-intelligence/_map