- Source: inbox/archive/2025-12-00-fullstack-alignment-thick-models-value.md - Domain: ai-alignment
| type | domain | description | confidence | source | created | secondary_domains |
|---|---|---|---|---|---|---|
| claim | ai-alignment | Beneficial AI outcomes require simultaneously aligning both AI systems and the institutions that govern them rather than focusing on individual model alignment alone | speculative | Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value (December 2025), arxiv.org/abs/2512.03399 | 2026-03-11 | |
|
AI alignment requires institutional co-alignment, not just model alignment
The Full-Stack Alignment framework proposes that "beneficial societal outcomes cannot be guaranteed by aligning individual AI systems" alone. The paper argues alignment must be comprehensive, addressing both AI systems and the institutions that shape their development and deployment simultaneously.
This extends the alignment problem beyond single-organization objectives to misalignment across multiple stakeholders. The framework defines "full-stack alignment" as the concurrent alignment of AI systems and institutions with what people value, reframing the problem from technical model alignment to system-level institutional coordination.
Implementation Mechanisms
The paper identifies five mechanisms for achieving full-stack alignment:
- AI value stewardship — institutional structures for stewarding AI development
- Normatively competent agents — AI systems capable of normative reasoning
- Win-win negotiation systems — mechanisms for resolving stakeholder conflicts
- Meaning-preserving economic mechanisms — economic structures that preserve human values
- Democratic regulatory institutions — governance structures that embed democratic input
Relationship to Existing Alignment Work
This is a stronger claim than approaches focused solely on coordination between AI labs. Rather than improving coordination protocols between existing actors, full-stack alignment argues that the institutions themselves require structural alignment with human values.
Evidence and Limitations
The paper provides architectural framing and mechanism proposals rather than empirical validation or formal proofs. Confidence is rated speculative because this December 2025 paper proposes a framework without implementation results, independent verification, or engagement with formal impossibility results. The paper is architecturally ambitious but lacks technical specificity on how thick value models would be operationalized or how institutional alignment would be measured.
Related claims:
- AI alignment is a coordination problem, not a technical problem — this claim extends the coordination thesis to institutions themselves
- AI development is a critical juncture in institutional history where the mismatch between capabilities and governance creates a window for transformation — institutional alignment directly addresses this capability-governance gap
- safe AI development requires building alignment mechanisms before scaling capability — institutional co-alignment is proposed as one such mechanism