---
type: claim
domain: ai-alignment
secondary_domains: [critical-systems]
description: "If fair intelligence benchmarks are mathematically impossible, then alignment efforts lack a coherent target specification independent of the preference aggregation problem"
confidence: experimental
source: "Derived from Oswald, Ferguson, & Bringsjord (2025), 'On the Arrowian Impossibility of Machine Intelligence Measures', AGI 2025, Springer LNCS vol. 16058"
created: 2026-03-11
enrichments: ["arrows-impossibility-theorem-applies-to-machine-intelligence-measurement-making-fair-benchmarks-mathematically-impossible.md"]
---
# Intelligence measurement impossibility implies alignment targets are structurally underspecified
The Oswald et al. (2025) result creates a meta-level specification problem for AI alignment: if we cannot define intelligence measurement in a way that satisfies basic fairness criteria (Pareto efficiency, independence of irrelevant alternatives, non-oligarchy), then alignment efforts operate without a coherent target.
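
The failure of independence of irrelevant alternatives is easy to reproduce with a toy score-aggregation rule. The sketch below is illustrative only: Borda counting stands in for whatever aggregation a real benchmark suite uses, and the model names and per-environment rankings are invented. It shows a third model's presence flipping the relative order of the other two:

```python
def borda(rankings):
    """Aggregate per-environment rankings (best to worst) into Borda scores."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, model in enumerate(ranking):
            scores[model] = scores.get(model, 0) + (n - 1 - position)
    return scores

def overall_order(scores):
    """Models sorted from highest to lowest aggregate score."""
    return sorted(scores, key=scores.get, reverse=True)

# Five hypothetical test environments, each producing a ranking of three models.
envs = [["A", "B", "C"]] * 3 + [["B", "C", "A"]] * 2
print(overall_order(borda(envs)))            # ['B', 'A', 'C']

# Drop the "irrelevant" model C from every environment: A and B swap places.
envs_without_c = [[m for m in r if m != "C"] for r in envs]
print(overall_order(borda(envs_without_c)))  # ['A', 'B']
```

B outranks A only because of where C lands in each environment's ranking, which is exactly the independence condition a fair measure is supposed to satisfy.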

The standard alignment framing assumes:
1. We can measure AI capability/intelligence
2. We can specify human values/preferences
3. The challenge is aligning (1) to (2)

But if (1) itself faces Arrow-type impossibilities, the problem compounds:
- We cannot aggregate diverse human preferences into a single objective ([[safe AI development requires building alignment mechanisms before scaling capability]])
- We cannot measure intelligence in a way that satisfies fairness conditions ([[arrows-impossibility-theorem-applies-to-machine-intelligence-measurement-making-fair-benchmarks-mathematically-impossible]])
- Therefore, we cannot even specify what "a capable system aligned to human values" means in a way that satisfies basic coherence requirements
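
The preference-aggregation side of this (the first bullet) is the classic Condorcet cycle, and it takes only a few lines to exhibit. In this sketch the stakeholder rankings and objective names are invented for illustration:

```python
# Three hypothetical stakeholders rank three candidate alignment objectives.
rankings = [
    ["helpfulness", "privacy", "transparency"],
    ["privacy", "transparency", "helpfulness"],
    ["transparency", "helpfulness", "privacy"],
]

def majority_prefers(a, b):
    """True if a strict majority of stakeholders ranks objective a above b."""
    wins = sum(r.index(a) < r.index(b) for r in rankings)
    return wins > len(rankings) / 2

# Every pairwise majority holds, so the "group preference" is a cycle:
for a, b in [("helpfulness", "privacy"),
             ("privacy", "transparency"),
             ("transparency", "helpfulness")]:
    print(f"{a} > {b}: {majority_prefers(a, b)}")  # True for all three pairs
```

No single ordering of objectives is consistent with all three majorities, so "the" aggregate human preference simply does not exist here.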

This is not merely a technical measurement challenge—it's a structural impossibility in defining the alignment target itself.
## Implications for Benchmark-Driven Development
Current AI development heavily relies on benchmarks (ARC, MMLU, HumanEval, etc.) as proxies for intelligence and capability. If these benchmarks face Arrow-type impossibilities, then:
1. **Benchmark gaming is structurally inevitable** — any fixed benchmark induces an oligarchy of test environments whose results dominate the intelligence ranking, which is exactly what gaming exploits
2. **Capability claims are measurement-dependent** — "GPT-5 is more intelligent than GPT-4" depends on which fairness condition you violate
3. **Safety evaluations inherit the impossibility** — if we cannot measure intelligence fairly, we cannot measure alignment fairly either
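
Point 2 can be made concrete with a tiny weighting experiment. Nothing here is a real evaluation: the model names, benchmark axes, scores, and weights are all invented, purely to show that the capability ranking is a function of the aggregation weights, not of the scores alone:

```python
# Hypothetical benchmark scores for two models on two axes.
scores = {
    "model_x": {"reasoning": 0.90, "coding": 0.60},
    "model_y": {"reasoning": 0.70, "coding": 0.85},
}

def aggregate(model, weights):
    """Weighted sum of a model's benchmark scores."""
    return sum(w * scores[model][axis] for axis, w in weights.items())

def ranking(weights):
    """Models sorted from highest to lowest aggregate score under these weights."""
    return sorted(scores, key=lambda m: aggregate(m, weights), reverse=True)

print(ranking({"reasoning": 0.8, "coding": 0.2}))  # ['model_x', 'model_y']
print(ranking({"reasoning": 0.2, "coding": 0.8}))  # ['model_y', 'model_x']
```

Which model "is more capable" is decided by the weight vector, and the impossibility result says there is no privileged, fairness-satisfying choice of weights.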

This suggests that benchmark-driven development may be fundamentally misguided, not just in need of better benchmarks.
## Relationship to Pluralistic Alignment
The measurement impossibility strengthens the case for [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]. If even intelligence measurement cannot satisfy universal fairness criteria, then:
- Alignment cannot be a single target state
- Different measurement frameworks will yield different capability rankings
- Systems must navigate measurement pluralism, not converge to a universal standard

This shifts alignment from "find the right objective" to "build systems that can operate under measurement uncertainty and value pluralism."
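
One way to "operate under measurement uncertainty" is to stop collapsing multi-benchmark results to a scalar at all and report only the Pareto-undominated set of systems. A minimal sketch, with model names and score tuples invented:

```python
# Hypothetical scores on three benchmark axes; higher is better on every axis.
scores = {
    "m1": (0.9, 0.5, 0.6),
    "m2": (0.6, 0.9, 0.7),
    "m3": (0.5, 0.4, 0.3),
}

def dominates(a, b):
    """a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores):
    """Models not dominated by any other model."""
    return {m for m, s in scores.items()
            if not any(dominates(t, s) for other, t in scores.items() if other != m)}

print(sorted(pareto_front(scores)))  # ['m1', 'm2']: no single "most capable" model
```

The front deliberately refuses to rank m1 against m2; only the clearly dominated m3 is excluded, which is the kind of weaker, fairness-compatible claim a measurement-pluralist evaluation can still make.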
## Open Questions
- Does the impossibility suggest abandoning universal intelligence measures in favor of domain-specific or context-dependent measures?
- Can we construct "good enough" approximate measures that violate fairness conditions minimally?
- Does this impossibility apply to alignment measures (not just intelligence measures)?

---
Relevant Notes:
- [[arrows-impossibility-theorem-applies-to-machine-intelligence-measurement-making-fair-benchmarks-mathematically-impossible]]
- [[safe AI development requires building alignment mechanisms before scaling capability]]
- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]
- [[specifying human values in code is intractable because our goals contain hidden complexity comparable to visual perception]]
- [[AI alignment is a coordination problem not a technical problem]]