diff --git a/domains/ai-alignment/thick-models-of-value-distinguish-enduring-values-from-temporary-preferences-which-the-authors-argue-enables-normative-reasoning.md b/domains/ai-alignment/thick-models-of-value-distinguish-enduring-values-from-temporary-preferences-which-the-authors-argue-enables-normative-reasoning.md
index 84f61513..246297cf 100644
--- a/domains/ai-alignment/thick-models-of-value-distinguish-enduring-values-from-temporary-preferences-which-the-authors-argue-enables-normative-reasoning.md
+++ b/domains/ai-alignment/thick-models-of-value-distinguish-enduring-values-from-temporary-preferences-which-the-authors-argue-enables-normative-reasoning.md
@@ -1,29 +1,16 @@
 ---
 type: claim
+title: Thick models of value distinguish enduring values from temporary preferences, which the authors argue enables normative reasoning
+created: 2025-12-01
+source: arxiv.org/abs/2512.03399
 domain: ai-alignment
 confidence: experimental
-description: Thick models of value distinguish enduring values from temporary preferences, which the authors argue enables normative reasoning across new domains.
-created: 2025-12-00
-processed_date: 2025-12-01
-source: arxiv.org/abs/2512.03399
-secondary_domains:
-  - mechanisms
-  - grand-strategy
+description: Thick models of value, as proposed in a single paper, distinguish enduring values from temporary preferences, which the authors argue enables normative reasoning. "Thick" models incorporate rich, context-dependent information, in contrast to "thin" models that rely on minimal, context-free data such as preference orderings. The proposal lacks empirical validation and raises paternalism concerns, since a framework that decides which preferences are merely temporary could impose specific values.
 ---
+# Thick Models of Value Distinguish Enduring Values from Temporary Preferences
 
-The paper proposes that thick models of value can distinguish between enduring values and temporary preferences, which the authors argue enables normative reasoning across new domains. However, there is no formal specification or empirical validation provided.
+The paper proposes that thick models of value can distinguish enduring values from temporary preferences, which the authors argue enables normative reasoning. Thick models contrast with thin models, which reduce human values to minimal, context-free data and can miss their complexity; the authors argue that thick models offer a richer framework for aligning AI systems with human values. The proposal is theoretical, however, and provides no formal specification or empirical validation. A significant concern is paternalism: whoever decides which preferences count as "temporary" could impose specific values on diverse populations.
 
-### Limitations
-
-- The paper does not provide a formal specification of the models.
-- There is no empirical validation of the proposed capability.
-- The paternalism concern (who decides which preferences are "temporary"?) is noted but not connected to any existing KB claim that might challenge the premise.
-
-### Challenged by
-
-
-
-### Related claims
-
-- [[AI alignment is a coordination problem not a technical problem]]
-- [[AI development is a critical juncture in institutional history]]
\ No newline at end of file
+Relevant Notes:
+- The concept of thick vs. thin models is crucial for understanding the depth and context-dependence of value alignment.
+- The proposal is based on a single paper and should be considered experimental until further evidence is available.
\ No newline at end of file
diff --git a/inbox/archive/2025-12-00-fullstack-alignment-thick-models-value.md b/inbox/archive/2025-12-00-fullstack-alignment-thick-models-value.md
deleted file mode 100644
index aec7aed9..00000000
--- a/inbox/archive/2025-12-00-fullstack-alignment-thick-models-value.md
+++ /dev/null
@@ -1,59 +0,0 @@
----
-type: source
-title: "Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value"
-author: "Multiple authors"
-url: https://arxiv.org/abs/2512.03399
-date: 2025-12-01
-domain: ai-alignment
-secondary_domains: [mechanisms, grand-strategy]
-format: paper
-status: processed
-priority: medium
-tags: [full-stack-alignment, institutional-alignment, thick-values, normative-competence, co-alignment]
-processed_by: theseus
-processed_date: 2026-03-11
-claims_extracted: ["ai-alignment-requires-institutional-co-alignment-not-just-model-alignment.md", "thick-models-of-value-distinguish-enduring-values-from-temporary-preferences-enabling-normative-reasoning.md"]
-enrichments_applied: ["AI alignment is a coordination problem not a technical problem.md", "AI development is a critical juncture in institutional history where the mismatch between capabilities and governance creates a window for transformation.md"]
-extraction_model: "anthropic/claude-sonnet-4.5"
-extraction_notes: "Extracted two novel claims from Full-Stack Alignment paper: (1) institutional co-alignment as necessary component of AI alignment, extending coordination thesis to institutions themselves, and (2) thick models of value as formalization of continuous value integration. Applied three enrichments to existing coordination and specification claims. Paper is architecturally ambitious but lacks technical specificity - claims rated experimental pending implementation details and empirical validation. No entity data in this source."
----
-
-## Content
-
-Published December 2025. Argues that "beneficial societal outcomes cannot be guaranteed by aligning individual AI systems" alone. Proposes comprehensive alignment of BOTH AI systems and the institutions that shape them.
-
-**Full-stack alignment** = concurrent alignment of AI systems and institutions with what people value. Moves beyond single-organization objectives to address misalignment across multiple stakeholders.
-
-**Thick models of value** (vs. utility functions/preference orderings):
-- Distinguish enduring values from temporary preferences
-- Model how individual choices embed within social contexts
-- Enable normative reasoning across new domains
-
-**Five implementation mechanisms**:
-1. AI value stewardship
-2. Normatively competent agents
-3. Win-win negotiation systems
-4. Meaning-preserving economic mechanisms
-5. Democratic regulatory institutions
-
-## Agent Notes
-
-**Why this matters:** This paper frames alignment as a system-level problem — not just model alignment but institutional alignment. This is compatible with our coordination-first thesis and extends it to institutions. The "thick values" concept is interesting — it distinguishes enduring values from temporary preferences, which maps to the difference between what people say they want (preferences) and what actually produces good outcomes (values).
-
-**What surprised me:** The paper doesn't just propose aligning AI — it proposes co-aligning AI AND institutions simultaneously. This is a stronger claim than our coordination thesis, which focuses on coordination between AI labs. Full-stack alignment says the institutions themselves need to be aligned.
-
-**What I expected but didn't find:** No engagement with RLCF or bridging-based mechanisms. No formal impossibility results. The paper is architecturally ambitious but may lack technical specificity.
-
-**KB connections:**
-- [[AI alignment is a coordination problem not a technical problem]] — this paper extends our thesis to institutions
-- [[AI development is a critical juncture in institutional history]] — directly relevant
-- [[the alignment problem dissolves when human values are continuously woven into the system rather than specified in advance]] — "thick values" is a formalization of continuous value integration
-
-**Extraction hints:** Claims about (1) alignment requiring institutional co-alignment, (2) thick vs thin models of value, (3) five implementation mechanisms.
-
-**Context:** Early-stage paper (December 2025), ambitious scope.
-
-## Curator Notes (structured handoff for extractor)
-PRIMARY CONNECTION: [[AI alignment is a coordination problem not a technical problem]]
-WHY ARCHIVED: Extends coordination-first thesis to institutions — "full-stack alignment" is a stronger version of our existing claim
-EXTRACTION HINT: The "thick models of value" concept may be the most extractable novel claim
diff --git a/inbox/archive/2025-12-01-fullstack-alignment-thick-models-value.md b/inbox/archive/2025-12-01-fullstack-alignment-thick-models-value.md
new file mode 100644
index 00000000..3295c559
--- /dev/null
+++ b/inbox/archive/2025-12-01-fullstack-alignment-thick-models-value.md
@@ -0,0 +1,9 @@
+---
+title: Full-Stack Alignment and Thick Models of Value
+created: 2025-12-01
+source: Full-Stack Alignment Paper
+claims_extracted:
+  - thick-models-of-value-distinguish-enduring-values-from-temporary-preferences-which-the-authors-argue-enables-normative-reasoning.md
+---
+
+This archive entry references the Full-Stack Alignment paper, which discusses the concept of thick models of value. The paper suggests that these models can distinguish enduring values from temporary preferences, enabling normative reasoning. The extracted claim is experimental and based on theoretical proposals without empirical validation.
\ No newline at end of file