---
type: claim
domain: ai-alignment
description: "Full-stack alignment requires concurrent co-alignment of AI systems and institutions, not model alignment alone"
confidence: experimental
source: "Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value (arxiv.org/abs/2512.03399), December 2025"
created: 2026-03-11
secondary_domains: [mechanisms, grand-strategy]
---

# Beneficial AI outcomes require concurrent alignment of systems and institutions, not model alignment alone
The full-stack alignment framework argues that "beneficial societal outcomes cannot be guaranteed by aligning individual AI systems" in isolation. Alignment must address both AI systems AND the institutions that shape their development and deployment. This extends beyond single-organization objectives to address misalignment across multiple stakeholders.
The paper proposes five implementation mechanisms for institutional co-alignment:

1. AI value stewardship
2. Normatively competent agents
3. Win-win negotiation systems
4. Meaning-preserving economic mechanisms
5. Democratic regulatory institutions

The core argument: even perfectly aligned individual AI systems can produce harmful outcomes through misaligned deployment contexts, competitive dynamics between organizations, or governance failures at the institutional level. Alignment is therefore a system-level coordination problem where institutional structures must co-evolve with AI capabilities.
## Evidence
The paper provides architectural reasoning grounded in the observation that institutional incentives often conflict with individual system alignment. However, the framework lacks empirical validation: no deployment data, no formal verification, and no engagement with existing technical alignment approaches (RLHF, constitutional AI, bridging-based mechanisms). The five mechanisms are proposed as necessary but remain technically underspecified.
This is a conceptually ambitious framework from a recent paper (December 2025); it extends rather than replaces existing alignment work.
---
Relevant Notes:
- [[AI alignment is a coordination problem not a technical problem]] — full-stack alignment extends this thesis from inter-lab coordination to AI-institution co-alignment
- [[AI development is a critical juncture in institutional history where the mismatch between capabilities and governance creates a window for transformation]] — directly addresses the institutional governance gap
- [[safe AI development requires building alignment mechanisms before scaling capability]] — institutional alignment is proposed as one such mechanism
- [[superorganism organization extends effective lifespan substantially at each organizational level which means civilizational intelligence operates on temporal horizons that individual-preference alignment cannot serve]] — related argument about system-level vs individual alignment
Topics:
- [[domains/ai-alignment/_map]]