---
type: claim
domain: ai-alignment
secondary_domains: [critical-systems]
description: "If fair intelligence benchmarks are mathematically impossible, then alignment efforts lack a coherent target specification independent of the preference aggregation problem"
confidence: experimental
source: "Derived from Oswald, Ferguson, & Bringsjord (2025), 'On the Arrowian Impossibility of Machine Intelligence Measures', AGI 2025, Springer LNCS vol. 16058"
created: 2026-03-11
enrichments: ["arrows-impossibility-theorem-applies-to-machine-intelligence-measurement-making-fair-benchmarks-mathematically-impossible.md"]
---

# Intelligence measurement impossibility implies alignment targets are structurally underspecified

The Oswald et al. (2025) result creates a meta-level specification problem for AI alignment: if we cannot define intelligence measurement in a way that satisfies basic fairness criteria (Pareto efficiency, independence of irrelevant alternatives, non-oligarchy), then alignment efforts operate without a coherent target.

The standard alignment framing assumes:

1. We can measure AI capability/intelligence
2. We can specify human values/preferences
3. The challenge is aligning (1) to (2)

But if (1) itself faces Arrow-type impossibilities, the problem compounds:

- We cannot aggregate diverse human preferences into a single objective ([[safe AI development requires building alignment mechanisms before scaling capability]])
- We cannot measure intelligence in a way that satisfies fairness conditions ([[arrows-impossibility-theorem-applies-to-machine-intelligence-measurement-making-fair-benchmarks-mathematically-impossible]])
- Therefore, we cannot even specify what "a capable system aligned to human values" means in a way that satisfies basic coherence requirements

This is not merely a technical measurement challenge; it is a structural impossibility in defining the alignment target itself.

## Implications for Benchmark-Driven Development

Current AI development relies heavily on benchmarks (ARC, MMLU, HumanEval, etc.) as proxies for intelligence and capability. If these benchmarks face Arrow-type impossibilities, then:

1. **Benchmark gaming is structurally inevitable.** Any fixed benchmark suite lets a small set of environments dominate the overall intelligence ranking (an oligarchy in Arrow's sense), and optimizing against exactly those environments is rewarded.
2. **Capability claims are measurement-dependent.** Whether "GPT-5 is more intelligent than GPT-4" holds depends on which fairness condition the chosen measure sacrifices.
3. **Safety evaluations inherit the impossibility.** If we cannot measure intelligence fairly, we cannot measure alignment fairly either.

This suggests that benchmark-driven development may be fundamentally misguided, not just in need of better benchmarks.
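The measurement-dependence is easy to demonstrate concretely. The sketch below uses hypothetical data and model names (not drawn from the paper): each benchmark environment acts as one of Arrow's "voters" that ranks models, and the rankings are aggregated with a Borda count, a positional scoring rule of the kind leaderboards implicitly use. Removing a third model that places last either way flips the verdict between the top two, which is the independence-of-irrelevant-alternatives failure the theorem says cannot be ruled out without giving up another fairness condition.

```python
# A minimal, self-contained illustration. The environment rankings, model
# names, and the choice of Borda aggregation are all assumptions for this
# sketch, not the construction used by Oswald et al.

def borda(rankings):
    """Aggregate strict rankings with a Borda count: in a field of n
    candidates, k-th place earns n-1-k points."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for place, model in enumerate(ranking):
            scores[model] = scores.get(model, 0) + (n - 1 - place)
    return sorted(scores, key=scores.get, reverse=True)

def restrict(rankings, keep):
    """Drop every model not in `keep`, preserving each environment's order."""
    return [[m for m in r if m in keep] for r in rankings]

# Five benchmark environments, each ranking three models (best first).
env_rankings = [
    ["model_A", "model_B", "model_C"],
    ["model_A", "model_B", "model_C"],
    ["model_A", "model_B", "model_C"],
    ["model_B", "model_C", "model_A"],
    ["model_B", "model_C", "model_A"],
]

print(borda(env_rankings))
# -> ['model_B', 'model_A', 'model_C']   (B scores 7, A scores 6, C scores 2)

print(borda(restrict(env_rankings, {"model_A", "model_B"})))
# -> ['model_A', 'model_B']   (A beats B in 3 of 5 environments head-to-head)
# Dropping model_C, which finished last in both cases, reverses the A/B
# ranking: an independence-of-irrelevant-alternatives violation.
```

Any leaderboard that collapses per-task rankings into a single score makes an aggregation choice like this one, and the impossibility result says no such choice satisfies all the fairness conditions at once.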
## Relationship to Pluralistic Alignment

The measurement impossibility strengthens the case for [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]. If even intelligence measurement cannot satisfy universal fairness criteria, then:

- Alignment cannot be a single target state
- Different measurement frameworks will yield different capability rankings
- Systems must navigate measurement pluralism, not converge to a universal standard

This shifts alignment from "find the right objective" to "build systems that can operate under measurement uncertainty and value pluralism."

## Open Questions

- Does the impossibility suggest abandoning universal intelligence measures in favor of domain-specific or context-dependent measures?
- Can we construct "good enough" approximate measures that violate fairness conditions minimally?
- Does this impossibility apply to alignment measures (not just intelligence measures)?

---

Relevant Notes:
- [[arrows-impossibility-theorem-applies-to-machine-intelligence-measurement-making-fair-benchmarks-mathematically-impossible]]
- [[safe AI development requires building alignment mechanisms before scaling capability]]
- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]
- [[specifying human values in code is intractable because our goals contain hidden complexity comparable to visual perception]]
- [[AI alignment is a coordination problem not a technical problem]]