Commit graph

1 commit

17607fcf36 theseus: extract 7 claims from Yudkowsky's core arguments
- What: 7 NEW claims from Yudkowsky's foundational AI alignment work
  - Sharp left turn (capabilities diverge from alignment at scale)
  - Corrigibility-effectiveness tension (deception is free, corrigibility is hard)
  - No fire alarm thesis (structural absence of warning signal)
  - Multipolar instability (CHALLENGE to collective superintelligence thesis)
  - Returns on cognitive reinvestment (intelligence explosion framework)
  - Verification asymmetry breaks at superhuman scale
  - Training reward-desire chaos (RLHF unreliable at scale)
- Why: Yudkowsky is the foundational figure in AI alignment, yet the KB's ~89 existing claims had near-zero direct engagement with his core arguments. The multipolar instability claim is the most important CHALLENGE to our collective superintelligence thesis identified to date.
- Sources: 'AGI Ruin' (2022), 'Intelligence Explosion Microeconomics' (2013), 'No Fire Alarm' (2017), 'If Anyone Builds It, Everyone Dies' (2025), MIRI corrigibility work
- Pre-screening: ~40% of the extracted material overlapped with the existing KB (orthogonality and instrumental convergence were already present); the 7 claims that survived screening all fill genuine gaps. challenged_by and challenges fields populated (see the sketch below).
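
For reference, a minimal sketch of what one of these claim records might look like, written as a Python dataclass. Only the challenged_by and challenges fields are named in this commit; the claim_id, text, and sources fields, and the ID strings in the example, are hypothetical illustrations, not the KB's actual schema.

    # Hypothetical sketch of a KB claim record. Only challenged_by and
    # challenges are named in the commit; all other fields are assumed.
    from dataclasses import dataclass, field

    @dataclass
    class Claim:
        claim_id: str                  # assumed identifier scheme
        text: str                      # the claim itself
        sources: list[str] = field(default_factory=list)
        # IDs of claims that challenge this one / that this one challenges
        challenged_by: list[str] = field(default_factory=list)
        challenges: list[str] = field(default_factory=list)

    # Illustrative example: the multipolar instability claim registered as a
    # challenge to the collective superintelligence thesis (IDs invented).
    multipolar = Claim(
        claim_id="yudkowsky-multipolar-instability",
        text="Multipolar AI scenarios are unstable at scale",
        sources=["'If Anyone Builds It, Everyone Dies' (2025)"],
        challenges=["collective-superintelligence-thesis"],
    )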

Pentagon-Agent: Theseus <46864dd4-da71-4719-a1b4-68f7c55854d3>
2026-04-05 19:27:17 +01:00