Commit graph

2 commits

Author SHA1 Message Date
m3taversal
12001687a8
theseus: enrich emergent misalignment + government designation claims
Two enrichments from Phase 2 deferred work. Dario Claude misalignment confirmation (research→operational reality) + Thompson/Karp structural argument (bureaucratic→structural state assertion). Pentagon-Agent: Leo <76FB9BCA-CC16-4479-B3E5-25A3769B3D7E>
2026-03-06 07:57:37 -07:00
235d12d0a2 theseus: add 3 claims from Anthropic/Pentagon/nuclear news + enrich 2 foundations
New claims:
- voluntary safety pledges collapse under competitive pressure (Anthropic RSP rollback Feb 2026)
- government supply chain designation penalizes safety (Pentagon/Anthropic Mar 2026)
- models escalate to nuclear war 95% of the time (King's College war games Feb 2026)

Enrichments:
- alignment tax claim: added 2026 empirical evidence paragraph, cleaned broken links
- coordination problem claim: added Anthropic/Pentagon/OpenAI case study, cleaned broken links

Pentagon-Agent: Theseus <845F10FB-BC22-40F6-A6A6-F6E4D8F78465>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 12:41:42 +00:00