Compare commits
1 commit
10aae2f4d9...d696365872
| Author | SHA1 | Date | |
|---|---|---|---|
| | d696365872 | | |
2 changed files with 4 additions and 10 deletions
@@ -21,12 +21,6 @@ This phased approach is also a practical response to the observation that since
Anthropic's RSP rollback demonstrates the opposite pattern in practice: the company scaled capability while weakening its pre-commitment to adequate safety measures. The original RSP required guaranteeing safety measures were adequate *before* training new systems. The rollback removes this forcing function, allowing capability development to proceed with safety work repositioned as aspirational ('we hope to create a forcing function') rather than mandatory. This provides empirical evidence that even safety-focused organizations prioritize capability scaling over alignment-first development when competitive pressure intensifies, suggesting the claim may be normatively correct but descriptively violated by actual frontier labs under market conditions.
### Additional Evidence (extend)
*Source: [[2026-02-00-yamamoto-full-formal-arrow-impossibility]] | Added: 2026-03-12 | Extractor: anthropic/claude-sonnet-4.5*
Arrow's impossibility theorem now has a full formal representation using proof calculus in formal logic (Yamamoto, PLOS One, February 2026). This provides machine-checkable verification of the theorem's validity, strengthening the mathematical foundation for claims that universal alignment faces fundamental constraints. The formal proof complements existing computer-aided proofs (AAAI 2008) and simplified proofs via Condorcet's paradox, but represents the first complete logical derivation revealing the global structure of the social welfare function central to the theorem.
---
Relevant Notes:
@@ -7,14 +7,14 @@ date: 2026-02-01
domain: ai-alignment
secondary_domains: [critical-systems]
format: paper
status: enrichment
status: null-result
priority: medium
tags: [arrows-theorem, formal-proof, proof-calculus, social-choice]
processed_by: theseus
processed_date: 2026-03-11
enrichments_applied: ["safe AI development requires building alignment mechanisms before scaling capability.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
extraction_notes: "Pure formal verification paper with no direct AI alignment discussion. Strengthens mathematical foundation for existing Arrow's impossibility claims by providing machine-checkable proof. No new claims warranted—this is infrastructure for existing arguments rather than novel insight. Curator correctly identified this as enrichment material rather than standalone claim."
extraction_notes: "Pure formal verification paper with no direct AI alignment discussion. Enriches existing Arrow's impossibility claim by providing machine-checkable proof foundation. No new claims warranted—this is methodological advancement (formal verification) rather than novel theoretical insight. The timing (Feb 2026) is notable as formal proof tradition catches up to applied alignment work, but the paper itself contains no KB-relevant arguable propositions beyond strengthening the mathematical rigor of existing claims."
---
## Content
@@ -39,5 +39,5 @@ EXTRACTION HINT: Likely enrichment to existing claim rather than standalone —
## Key Facts
- Arrow's impossibility theorem received full formal representation using proof calculus (Yamamoto, PLOS One, February 2026)
- Formal proof complements AAAI 2008 computer-aided proofs and Condorcet's paradox simplifications
- Derivation reveals global structure of social welfare function central to the theorem
- Formal proof complements existing computer-aided proofs from AAAI 2008 and simplified proofs via Condorcet's paradox
- Paper published in PLOS One (open-access, peer-reviewed journal)