teleo-codex/domains/ai-alignment/ai-alignment-requires-institutional-co-alignment-not-just-model-alignment.md
- Source: inbox/archive/2025-12-00-fullstack-alignment-thick-models-value.md
- Domain: ai-alignment
- Extracted by: headless extraction cron (worker 4)



- type: claim
- domain: ai-alignment
- description: Beneficial AI outcomes require simultaneously aligning both AI systems and the institutions that govern them, rather than focusing on individual model alignment alone
- confidence: speculative
- source: Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value (December 2025), arxiv.org/abs/2512.03399
- created: 2026-03-11
- secondary_domains: mechanisms, grand-strategy

AI alignment requires institutional co-alignment, not just model alignment

The Full-Stack Alignment framework proposes that "beneficial societal outcomes cannot be guaranteed by aligning individual AI systems" alone. Instead, the paper argues, alignment must simultaneously address both AI systems and the institutions that shape their development and deployment.

The framework extends the alignment problem beyond single-organization objectives to misalignment across multiple stakeholders. It defines "full-stack alignment" as the concurrent alignment of AI systems and institutions with what people value, reframing the problem from technical model alignment to system-level institutional coordination.

Implementation Mechanisms

The paper identifies five mechanisms for achieving full-stack alignment:

  1. AI value stewardship — institutional structures for stewarding AI development
  2. Normatively competent agents — AI systems capable of normative reasoning
  3. Win-win negotiation systems — mechanisms for resolving stakeholder conflicts
  4. Meaning-preserving economic mechanisms — economic structures that preserve human values
  5. Democratic regulatory institutions — governance structures that embed democratic input

Relationship to Existing Alignment Work

Full-stack alignment makes a stronger claim than approaches focused solely on coordination between AI labs. Rather than improving coordination protocols among existing actors, it argues that the institutions themselves require structural alignment with human values.

Evidence and Limitations

The paper provides architectural framing and mechanism proposals rather than empirical validation or formal proofs. Confidence is speculative: this December 2025 paper proposes a framework without implementation results, independent verification, or engagement with formal impossibility results. The paper is architecturally ambitious but lacks technical specificity on how thick models of value would be operationalized or how institutional alignment would be measured.


Related claims: