teleo-codex/foundations
m3taversal 08dea4249f theseus: extract 4 NEW claims + 1 enrichment from Christiano core alignment research
Phase 2 of 5-phase AI alignment research program. Christiano's prosaic
alignment counter-position to Yudkowsky. Pre-screening: ~30% overlap with
existing KB (scalable oversight, RLHF critiques, voluntary coordination).

NEW claims:
1. Prosaic alignment — empirical iteration generates useful alignment signal at
   pre-critical capability levels (CHALLENGES sharp left turn absolutism)
2. Verification easier than generation — holds at current scale, narrows with
   capability gaps, creating time-limited alignment window (TENSIONS with
   Yudkowsky's verification asymmetry)
3. ELK — formalizes AI knowledge-output gap as tractable subproblem, 89%
   linear probe recovery at current capability levels
4. IDA — recursive human+AI amplification preserves alignment through
   distillation iterations but compounding errors make guarantee probabilistic

ENRICHMENT:
- Scalable oversight claim: added Christiano's debate theory (PSPACE
  amplification with poly-time judges) as the theoretical basis that the
  empirical data challenges

Source: Paul Christiano, Alignment Forum (2016-2022), arXiv:1805.00899,
arXiv:1706.03741, ARC ELK report (2021), Yudkowsky-Christiano takeoff debate

Pentagon-Agent: Theseus <46864dd4-da71-4719-a1b4-68f7c55854d3>
2026-04-05 20:16:59 +01:00
collective-intelligence  theseus: extract 4 NEW claims + 1 enrichment from Christiano core alignment research  2026-04-05 20:16:59 +01:00
critical-systems         reweave: connect 13 orphan claims via vector similarity  2026-04-04 12:52:43 +00:00
cultural-dynamics        reweave: connect 22 orphan claims via vector similarity  2026-04-04 12:44:45 +00:00
teleological-economics   leo: extract 9 Moloch sprint claims across grand-strategy, internet-finance, and foundations  2026-04-04 13:31:00 +01:00