---
type: claim
domain: ai-alignment
description: "AI alignment failures stem from coordination breakdowns between labs and institutions, not from unsolved technical problems in model training"
confidence: likely
source: "Coordination protocol design produces larger capability gains than model scaling (2025); nation-states will inevitably assert control over frontier AI (2025); AI development is a critical juncture in institutional history (2025)"
created: 2026-03-11
---
# AI alignment is a coordination problem, not a technical problem
The field has spent a decade optimizing for the wrong problem. Technical alignment — making individual AI systems pursue intended goals — is tractable. Coordination alignment — making competing labs, nations, and institutions align around shared safety commitments — is the actual constraint.
This reframe inverts the standard narrative. The standard view: alignment is hard because we don't know how to specify human values in code, and superintelligence might pursue instrumental goals that destroy human civilization. The coordination view: we know enough about alignment mechanisms to make them work (RLHF, constitutional AI, interpretability tools exist); the problem is that no lab has incentive to use them when competitors don't, and no nation will constrain its AI development when adversaries won't.
## Evidence
**Coordination protocol design produces larger capability gains than model scaling.** A single AI model achieved a 6x performance improvement with structured exploration protocols compared to human coaching on the same problem. This demonstrates that coordination between components (or between human and AI) produces larger gains than raw capability scaling. If coordination produces 6x gains in capability, it likely produces comparable gains in safety — but only if all parties coordinate. A single lab implementing safety protocols while competitors don't creates competitive disadvantage, not safety.
**Nation-states will inevitably assert control over frontier AI development.** The monopoly on force is the foundational state function. Weapons-grade AI capability in private hands is structurally intolerable to governments. This means the coordination problem is not just between labs but between nations — and nations have fundamentally conflicting strategic interests. The US, China, and the EU cannot coordinate on AI safety when AI capability is a strategic weapon. This is not a technical problem; it's a geopolitical coordination failure.
**AI development is a critical juncture in institutional history.** The mismatch between AI capability development (months) and institutional adaptation (decades) creates a window where institutions cannot keep pace with technology. This is a coordination problem: institutions need to align on governance structures before capability reaches critical thresholds, but the incentive structure makes this coordination nearly impossible. Each nation wants to move fast; no nation wants to be constrained while others accelerate.
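The scale of that mismatch is worth making concrete. A back-of-the-envelope comparison, where both numbers are illustrative assumptions rather than estimates from the sources: if capability doubles every six months while one treaty cycle takes a decade, capability multiplies a million-fold within a single round of institutional adaptation.
```python
# Toy timescale comparison; both inputs are assumed for illustration.
capability_doubling_months = 6   # assumed capability doubling time
treaty_cycle_years = 10          # assumed time to negotiate and ratify a treaty

doublings = treaty_cycle_years * 12 / capability_doubling_months
print(f"capability multiplier per treaty cycle: {2 ** doublings:,.0f}x")
# capability multiplier per treaty cycle: 1,048,576x
```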
## Why This Reframe Matters
If alignment is a technical problem, the solution is better research — more interpretability papers, better RLHF methods, more constitutional AI work. If alignment is a coordination problem, the solution is institutional design — treaties, verification mechanisms, enforcement structures, and credible commitments that make defection costly.
The field has optimized for technical solutions because they're within the control of individual researchers and labs. Coordination solutions require political action, which is harder and slower. But the technical solutions only work if everyone uses them. A perfectly aligned AI system deployed by one lab while competitors deploy unaligned systems doesn't solve the problem — it just ensures the unaligned system wins.
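The logic of that last paragraph is a one-shot game between labs, and it is worth seeing why unilateral safety adoption loses even when everyone would prefer mutual safety. A minimal sketch with purely illustrative payoffs: every number below is an assumption chosen to encode the prose, not an empirical estimate.
```python
from itertools import product

# Two-lab deployment game with illustrative payoffs.
# Each entry: (payoff to lab A, payoff to lab B).
SAFE, RACE = "SAFE", "RACE"
PAYOFFS = {
    (SAFE, SAFE): (3, 3),   # both constrain: slower, but aligned systems
    (SAFE, RACE): (0, 5),   # A constrains alone: B captures the market
    (RACE, SAFE): (5, 0),   # symmetric temptation for A
    (RACE, RACE): (1, 1),   # race to the bottom: unaligned systems deploy
}

def best_response(opponent_move: str, player: int) -> str:
    """The move maximizing this player's payoff, holding the opponent fixed."""
    def payoff(move):
        profile = (move, opponent_move) if player == 0 else (opponent_move, move)
        return PAYOFFS[profile][player]
    return max((SAFE, RACE), key=payoff)

def nash_equilibria():
    """Profiles where each lab is already best-responding to the other."""
    return [
        (a, b) for a, b in product((SAFE, RACE), repeat=2)
        if best_response(b, 0) == a and best_response(a, 1) == b
    ]

print(nash_equilibria())  # [('RACE', 'RACE')]
# Mutual RACE is the unique equilibrium even though (SAFE, SAFE) pays
# both labs more: better alignment tooling alone does not change this.
```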
## Challenges
**Coordination may be impossible under competition.** If AI capability is a strategic weapon, nations cannot coordinate on safety constraints because doing so creates vulnerability to defection. This is the prisoner's dilemma at civilizational scale. The claim assumes coordination is possible; it may not be.
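What escaping the dilemma would require can at least be stated precisely: an enforcement regime changes the equilibrium only if the expected penalty for detected defection exceeds the gain from defecting. Extending the toy game above, with detection probability q and penalty p as hypothetical parameters rather than a proposal:
```python
# Continuing the illustrative game: a treaty regime detects unilateral
# racing with probability q and imposes penalty p on the defector.
TEMPTATION, REWARD = 5, 3    # payoffs from the base game above
GAIN = TEMPTATION - REWARD   # what a lab gains by defecting alone

def cooperation_is_equilibrium(q: float, p: float) -> bool:
    """Mutual SAFE is an equilibrium iff the expected penalty for
    unilateral defection wipes out the defection gain."""
    return TEMPTATION - q * p <= REWARD

for q, p in [(0.9, 4), (0.4, 4), (0.9, 1)]:
    print(q, p, cooperation_is_equilibrium(q, p))
# 0.9 4 True   -- strong verification, meaningful penalty: treaty holds
# 0.4 4 False  -- same penalty, weak verification: defection still pays
# 0.9 1 False  -- good detection, toothless penalty: defection still pays
```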
**Technical and coordination problems are not separable.** Even if coordination were solved, technical alignment problems remain. The claim positions them as alternatives; they may be complementary. You need both.
**Coordination mechanisms don't yet exist.** The claim identifies coordination as the problem but doesn't propose working mechanisms for solving it at scale. Treaties require verification; verification requires transparency; transparency creates espionage risk. The practical mechanisms for coordination remain unspecified.
---
Relevant Notes:
- [[nation-states will inevitably assert control over frontier AI development because the monopoly on force is the foundational state function and weapons-grade AI capability in private hands is structurally intolerable to governments]]
- [[AI development is a critical juncture in institutional history where the mismatch between capabilities and governance creates a window for transformation]]
- [[coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem]]
- [[beneficial-ai-outcomes-require-institutional-co-alignment-not-just-model-alignment]]
- [[safe AI development requires building alignment mechanisms before scaling capability]]
Topics:
- [[domains/ai-alignment/_map]]
- [[core/mechanisms/_map]]