teleo-codex/domains/ai-alignment/ai-alignment-requires-institutional-co-alignment-not-just-model-alignment.md
- Source: inbox/archive/2025-12-00-fullstack-alignment-thick-models-value.md
- Domain: ai-alignment
- Extracted by: headless extraction cron (worker 4)

type: claim
domain: ai-alignment
description: Full-stack alignment requires concurrent alignment of both AI systems and institutions, not model alignment alone, because institutional misalignment can produce harmful outcomes even when individual systems are technically aligned
confidence: experimental
source: Multiple authors, "Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value" (December 2025)
created: 2026-03-11
secondary_domains: mechanisms, grand-strategy

AI alignment requires institutional co-alignment, not just model alignment, because beneficial societal outcomes cannot be guaranteed by aligning individual AI systems alone

The full-stack alignment framework proposes that achieving beneficial AI outcomes requires concurrent alignment of both AI systems and the institutions that shape, deploy, and govern them. This extends beyond single-organization objectives to address misalignment across multiple stakeholders.

The paper argues that focusing solely on aligning individual AI models is insufficient because:

  1. Institutional context shapes deployment — AI systems operate within institutional contexts that determine how they are deployed, governed, and scaled. Technical alignment at the model level does not constrain institutional choices about deployment.

  2. Misalignment can occur at institutional level — Even when individual systems are technically aligned with narrow objectives, institutions can misalign them through incentive structures, regulatory capture, or competing stakeholder interests. The paper explicitly states: "beneficial societal outcomes cannot be guaranteed by aligning individual AI systems" alone.

  3. Coordination problems across stakeholders — Multiple stakeholders with competing interests create coordination problems that model-level alignment cannot solve; the toy game sketched after this list illustrates the dynamic. Full-stack alignment addresses this by proposing concurrent work on both AI capabilities and institutional structures.

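A minimal sketch of this coordination failure, with made-up payoffs and names that do not come from the paper: two labs each deploy an agent perfectly aligned with its own lab's objective, yet the only equilibrium of the resulting game is collectively worse than mutual caution.

```python
# Toy illustration (not from the paper): a two-lab deployment game in which
# agents individually aligned with their own institution's objective still
# reach a collectively worse outcome. Payoff numbers are invented for the example.
from itertools import product

ACTIONS = ["cautious", "rush"]

# payoffs[(lab_A_action, lab_B_action)] = (lab_A_payoff, lab_B_payoff)
payoffs = {
    ("cautious", "cautious"): (3, 3),   # both wait: good societal outcome
    ("cautious", "rush"):     (0, 4),   # the rushing lab captures the market
    ("rush",     "cautious"): (4, 0),
    ("rush",     "rush"):     (1, 1),   # race to the bottom
}

def best_response(player, other_action):
    """Action maximizing this lab's own payoff, holding the other lab fixed."""
    if player == 0:
        return max(ACTIONS, key=lambda a: payoffs[(a, other_action)][0])
    return max(ACTIONS, key=lambda a: payoffs[(other_action, a)][1])

# Pure-strategy Nash equilibria: profiles where neither lab wants to deviate.
equilibria = [
    (a, b) for a, b in product(ACTIONS, ACTIONS)
    if best_response(0, b) == a and best_response(1, a) == b
]

print("Nash equilibria:", equilibria)                                 # [('rush', 'rush')]
print("Welfare at equilibrium:", sum(payoffs[("rush", "rush")]))      # 2
print("Welfare if both stay cautious:", sum(payoffs[("cautious", "cautious")]))  # 6
```

No amount of model-level alignment changes this payoff structure; only an institutional mechanism that binds both labs (a treaty, regulator, or enforceable agreement) can shift the equilibrium.
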
The framework proposes five implementation mechanisms for institutional co-alignment:

  • AI value stewardship
  • Normatively competent agents
  • Win-win negotiation systems (a toy sketch follows this list)
  • Meaning-preserving economic mechanisms
  • Democratic regulatory institutions

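As a purely illustrative example of the third mechanism, here is a hypothetical sketch of the kind of computation a win-win negotiation system might perform, using the standard Nash bargaining solution. The candidate agreements, utility numbers, and function names are invented for this sketch and do not appear in the paper.

```python
# Hypothetical sketch: pick the agreement maximizing the product of each party's
# gain over its no-deal (disagreement) payoff, i.e. the Nash bargaining solution.

# (utility_to_party_A, utility_to_party_B) for each candidate agreement
options = {
    "status quo deployment":        (5.0, 1.0),
    "deployment with audit rights": (4.0, 3.5),
    "moratorium":                   (1.0, 4.0),
}
disagreement = (2.0, 2.0)  # payoffs if negotiation breaks down

def nash_product(utilities):
    gain_a = utilities[0] - disagreement[0]
    gain_b = utilities[1] - disagreement[1]
    # Only agreements that leave both parties no worse off than no-deal are admissible.
    if gain_a < 0 or gain_b < 0:
        return float("-inf")
    return gain_a * gain_b

best = max(options, key=lambda name: nash_product(options[name]))
print(best)  # "deployment with audit rights": the only option improving on no-deal for both
```
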
This reframes alignment from a purely technical problem (how to specify and optimize for human values in code) to a sociotechnical coordination challenge requiring simultaneous work on AI systems and the institutions that govern them.

Evidence

From "Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value" (December 2025): The paper explicitly defines full-stack alignment as "concurrent alignment of AI systems and institutions with what people value" and argues that "beneficial societal outcomes cannot be guaranteed by aligning individual AI systems" alone. The five implementation mechanisms are presented as concrete pathways for achieving this institutional co-alignment.

Relationship to Existing Claims

This claim extends "AI alignment is a coordination problem not a technical problem.md" by arguing that coordination must occur not just between AI labs but between AI systems and the institutions that govern them. It also connects to "AI development is a critical juncture in institutional history where the mismatch between capabilities and governance creates a window for transformation.md" by proposing that institutions themselves must be transformed alongside AI capabilities.


Relevant Notes: