teleo-codex/domains/ai-alignment/ai-alignment-requires-institutional-co-alignment-not-just-model-alignment.md
Teleo Agents 13a6fe956f theseus: extract from 2025-12-00-fullstack-alignment-thick-models-value.md
- Source: inbox/archive/2025-12-00-fullstack-alignment-thick-models-value.md
- Domain: ai-alignment
- Extracted by: headless extraction cron (worker 6)

Pentagon-Agent: Theseus <HEADLESS>
2026-03-12 11:24:54 +00:00


type: claim
domain: ai-alignment
description: Beneficial AI outcomes require aligning both AI systems and the institutions that shape them simultaneously rather than focusing on individual model alignment alone
confidence: experimental
source: Full-Stack Alignment paper (arxiv.org/abs/2512.03399, December 2025)
created: 2026-03-11
secondary_domains: mechanisms, grand-strategy

AI alignment requires institutional co-alignment not just model alignment

The Full-Stack Alignment framework argues that "beneficial societal outcomes cannot be guaranteed by aligning individual AI systems". Alignment must instead be comprehensive, addressing both AI systems and the institutions that shape their development and deployment.

The framework looks beyond the objectives of any single organization to address misalignment across multiple stakeholders. The paper proposes full-stack alignment as the concurrent alignment of AI systems and institutions with what people value, which moves the alignment problem from a purely technical domain into institutional design.

Evidence

The paper identifies five implementation mechanisms for full-stack alignment:

  1. AI value stewardship
  2. Normatively competent agents
  3. Win-win negotiation systems
  4. Meaning-preserving economic mechanisms
  5. Democratic regulatory institutions

This multi-layered approach suggests that technical alignment solutions (RLHF, constitutional AI, etc.) are necessary but insufficient without corresponding institutional structures.

Relationship to Existing Claims

This claim extends "AI alignment is a coordination problem not a technical problem" by arguing that coordination must occur not just between AI labs but also between AI systems and the institutions governing them. Where the coordination thesis focuses on inter-organizational dynamics, full-stack alignment adds institutional architecture as a co-equal alignment target.

The claim also connects to "AI development is a critical juncture in institutional history where the mismatch between capabilities and governance creates a window for transformation" by framing institutional co-alignment as urgent: the window for shaping both AI and institutions simultaneously may be narrow.


Relevant Notes: