theseus: extract claims from 2026-05-03-miri-refining-maim-conditions-for-deterrence

- Source: inbox/queue/2026-05-03-miri-refining-maim-conditions-for-deterrence.md
- Domain: ai-alignment
- Claims: 2, Entities: 0
- Enrichments: 2
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
Teleo Agents 2026-05-03 00:21:16 +00:00
parent a995078cc9
commit d41469fbcf
3 changed files with 42 additions and 1 deletion


@@ -0,0 +1,19 @@
---
type: claim
domain: ai-alignment
description: MIRI argues that because AI capabilities advance broadly rather than narrowly, any red line specific enough to target dangerous capabilities will also trigger on non-threatening systems
confidence: experimental
source: MIRI, Refining MAIM (2025-04-11)
created: 2026-05-03
title: AI capability breadth makes deterrence red lines over-broad triggering false positives because frontier models advance general capabilities not specific dangerous functions
agent: theseus
sourced_from: ai-alignment/2026-05-03-miri-refining-maim-conditions-for-deterrence.md
scope: structural
sourcer: MIRI
supports: ["ai-is-omni-use-technology-categorically-different-from-dual-use-because-it-improves-all-capabilities-simultaneously-meaning-anything-ai-can-optimize-it-can-break"]
related: ["ai-is-omni-use-technology-categorically-different-from-dual-use-because-it-improves-all-capabilities-simultaneously-meaning-anything-ai-can-optimize-it-can-break"]
---
# AI capability breadth makes deterrence red lines over-broad triggering false positives because frontier models advance general capabilities not specific dangerous functions
MIRI identifies a second structural problem with MAIM deterrence: 'Frontier AI capabilities advance in broad, general ways. A new model's development does not have to specifically aim at autonomous R&D to advance the frontier of relevant capabilities.' The mechanism is that a model designed to be state-of-the-art at programming tasks 'likely also entails novel capabilities relevant to AI development.' This creates a dilemma for red line specification: the capabilities that threaten unilateral ASI development (autonomous R&D, recursive self-improvement) are not isolated functions but emerge from general capability advancement. Therefore, any red line drawn to catch dangerous capabilities must be drawn broadly enough to trigger on almost any frontier model development. An over-broad red line produces two failure modes: (1) constant false alarms that erode deterrence credibility, and (2) effective prohibition of all frontier AI development, which no major power will accept. This is distinct from detection difficulty—MIRI is arguing that even perfect detection cannot solve the problem because the *breadth* of capability advancement makes specific targeting impossible.


@@ -0,0 +1,19 @@
---
type: claim
domain: ai-alignment
description: MIRI argues that using recursive self-improvement as the red line for MAIM deterrence creates an intractable timing problem where detection occurs too late for effective sabotage response
confidence: experimental
source: MIRI, Refining MAIM (2025-04-11)
created: 2026-05-03
title: recursive self-improvement detection timing makes MAIM deterrence structurally inadequate because the dangerous threshold is detectable only as late as possible leaving insufficient response time
agent: theseus
sourced_from: ai-alignment/2026-05-03-miri-refining-maim-conditions-for-deterrence.md
scope: structural
sourcer: MIRI
supports: ["capability-control-methods-are-temporary-at-best-because-a-sufficiently-intelligent-system-can-circumvent-any-containment-designed-by-lesser-minds"]
related: ["recursive-self-improvement-creates-explosive-intelligence-gains-because-the-system-that-improves-is-itself-improving", "capability-control-methods-are-temporary-at-best-because-a-sufficiently-intelligent-system-can-circumvent-any-containment-designed-by-lesser-minds"]
---
# recursive self-improvement detection timing makes MAIM deterrence structurally inadequate because the dangerous threshold is detectable only as late as possible leaving insufficient response time
MIRI identifies a fundamental timing constraint in MAIM deterrence architecture: 'An intelligence recursion could proceed too quickly for the recursion to be identified and responded to.' The critique centers on the observation that reacting to deployment of AI systems capable of recursive self-improvement is 'as late in the game as one could possibly react, and leaves little margin for error.' This creates a structural bind where the red line that matters most (recursive self-improvement capability) is the one that provides the least actionable warning time. The mechanism assumes detection occurs with sufficient lead time to mount sabotage operations, but if the dangerous transition is recursive self-improvement itself, the timeline from 'detectable' to 'uncontrollable' may compress to hours or days rather than the weeks or months required for coordinated international response. This is distinct from general observability problems—MIRI is specifically arguing that even if detection works perfectly, the *timing* of when the dangerous threshold becomes detectable makes the deterrence mechanism structurally inadequate.


@@ -7,10 +7,13 @@ date: 2025-04-11
 domain: ai-alignment
 secondary_domains: [grand-strategy]
 format: article
-status: unprocessed
+status: processed
+processed_by: theseus
+processed_date: 2026-05-03
 priority: medium
 tags: [MAIM, deterrence, red-lines, recursive-self-improvement, critique, MIRI]
 intake_tier: research-task
+extraction_model: "anthropic/claude-sonnet-4.5"
 ---
 ## Content