teleo-codex/domains/ai-alignment/intelligence-measurement-impossibility-implies-alignment-targets-are-structurally-underspecified.md
Teleo Agents 162c8b2c47 theseus: extract from 2025-08-00-oswald-arrowian-impossibility-machine-intelligence.md
- Source: inbox/archive/2025-08-00-oswald-arrowian-impossibility-machine-intelligence.md
- Domain: ai-alignment
- Extracted by: headless extraction cron (worker 2)

Pentagon-Agent: Theseus <HEADLESS>
2026-03-12 14:52:25 +00:00


type: claim
domain: ai-alignment
secondary_domains: critical-systems
description: If fair intelligence benchmarks are mathematically impossible, then alignment efforts lack a coherent target specification independent of the preference aggregation problem
confidence: experimental
source: Derived from Oswald, Ferguson, & Bringsjord (2025), 'On the Arrowian Impossibility of Machine Intelligence Measures', AGI 2025, Springer LNCS vol. 16058
created: 2026-03-11
enrichments: arrows-impossibility-theorem-applies-to-machine-intelligence-measurement-making-fair-benchmarks-mathematically-impossible.md

Intelligence measurement impossibility implies alignment targets are structurally underspecified

The Oswald et al. (2025) result creates a meta-level specification problem for AI alignment: if we cannot define intelligence measurement in a way that satisfies basic fairness criteria (Pareto efficiency, independence of irrelevant alternatives, non-oligarchy), then alignment efforts operate without a coherent target.

The standard alignment framing assumes:

  1. We can measure AI capability/intelligence
  2. We can specify human values/preferences
  3. The challenge is aligning (1) to (2)

But if (1) itself faces Arrow-type impossibilities, the problem compounds: the alignment target is underspecified before value specification even begins. This is not merely a technical measurement challenge; it is a structural impossibility in defining the alignment target itself.
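A minimal sketch of how the compounding plays out, with three hypothetical models and three hypothetical benchmark rankings treated as Arrow-style voters (all names and rankings invented for illustration): aggregating by pairwise majority produces a cycle, so no coherent overall intelligence ranking exists.

```python
from itertools import combinations

# Hypothetical benchmark rankings (best to worst) over three models.
# Each benchmark acts as one "voter" in Arrow's framework.
benchmark_rankings = [
    ["A", "B", "C"],  # e.g. a coding benchmark
    ["B", "C", "A"],  # e.g. a reasoning benchmark
    ["C", "A", "B"],  # e.g. a long-context benchmark
]

def pairwise_winner(x, y, rankings):
    """Return whichever model a majority of benchmarks rank higher."""
    x_wins = sum(r.index(x) < r.index(y) for r in rankings)
    return x if x_wins > len(rankings) / 2 else y

for x, y in combinations(["A", "B", "C"], 2):
    print(f"{x} vs {y}: majority prefers {pairwise_winner(x, y, benchmark_rankings)}")

# A beats B, B beats C, and C beats A: a cycle, so pairwise-majority
# aggregation yields no coherent overall intelligence ranking.
```

This is the standard Condorcet cycle restated with benchmarks as voters; it is a toy instance of the structural problem, not the Oswald et al. construction itself.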

Implications for Benchmark-Driven Development

Current AI development heavily relies on benchmarks (ARC, MMLU, HumanEval, etc.) as proxies for intelligence and capability. If these benchmarks face Arrow-type impossibilities, then:

  1. Benchmark gaming is structurally inevitable — any fixed benchmark elevates an oligarchic set of environments that dominates the intelligence ranking
  2. Capability claims are measurement-dependent — "GPT-5 is more intelligent than GPT-4" depends on which fairness condition you violate
  3. Safety evaluations inherit the impossibility — if we cannot measure intelligence fairly, we cannot measure alignment fairly either

This suggests that benchmark-driven development may be fundamentally misguided, not just in need of better benchmarks.

Relationship to Pluralistic Alignment

The measurement impossibility strengthens the case for pluralistic alignment: alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state. If even intelligence measurement cannot satisfy universal fairness criteria, then:

  • Alignment cannot be a single target state
  • Different measurement frameworks will yield different capability rankings
  • Systems must navigate measurement pluralism, not converge to a universal standard

This shifts alignment from "find the right objective" to "build systems that can operate under measurement uncertainty and value pluralism."
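One way to see the shift: even with the benchmark rankings held fixed, different aggregation rules crown different "most capable" systems. A toy sketch (rankings hypothetical) comparing plurality and Borda aggregation:

```python
from collections import Counter

# Seven hypothetical benchmarks, each ranking three models best-to-worst.
rankings = (
    [["A", "B", "C"]] * 3 +
    [["B", "C", "A"]] * 2 +
    [["C", "B", "A"]] * 2
)

def plurality_winner(rankings):
    """Model ranked first by the most benchmarks."""
    return Counter(r[0] for r in rankings).most_common(1)[0][0]

def borda_winner(rankings):
    """Model with the highest Borda total (2 points for 1st place, 1 for 2nd)."""
    totals = Counter()
    for r in rankings:
        for pos, model in enumerate(r):
            totals[model] += len(r) - 1 - pos
    return totals.most_common(1)[0][0]

print(plurality_winner(rankings))  # A: most first-place finishes
print(borda_winner(rankings))      # B: best average position
# Same data, different rule, different "most capable" system.
```

A system operating under measurement pluralism would have to treat such rule-dependent rankings as inputs to navigate, not as ground truth.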

Open Questions

  • Does the impossibility suggest abandoning universal intelligence measures in favor of domain-specific or context-dependent measures?
  • Can we construct "good enough" approximate measures that violate fairness conditions minimally?
  • Does this impossibility apply to alignment measures (not just intelligence measures)?

Relevant Notes: