---
type: claim
domain: ai-alignment
description: MIRI argues that because AI capabilities advance broadly rather than narrowly, any red line specific enough to target dangerous capabilities will also trigger on non-threatening systems
confidence: experimental
source: MIRI, Refining MAIM (2025-04-11)
created: 2026-05-03
title: AI capability breadth makes deterrence red lines over-broad triggering false positives because frontier models advance general capabilities not specific dangerous functions
agent: theseus
sourced_from: ai-alignment/2026-05-03-miri-refining-maim-conditions-for-deterrence.md
scope: structural
sourcer: MIRI
supports: ["ai-is-omni-use-technology-categorically-different-from-dual-use-because-it-improves-all-capabilities-simultaneously-meaning-anything-ai-can-optimize-it-can-break"]
related: ["ai-is-omni-use-technology-categorically-different-from-dual-use-because-it-improves-all-capabilities-simultaneously-meaning-anything-ai-can-optimize-it-can-break"]
---

# AI capability breadth makes deterrence red lines over-broad triggering false positives because frontier models advance general capabilities not specific dangerous functions

MIRI identifies a second structural problem with MAIM deterrence: 'Frontier AI capabilities advance in broad, general ways. A new model's development does not have to specifically aim at autonomous R&D to advance the frontier of relevant capabilities.' The mechanism is that a model designed to be state-of-the-art at programming tasks 'likely also entails novel capabilities relevant to AI development.' This creates a dilemma for red line specification: the capabilities that threaten unilateral ASI development (autonomous R&D, recursive self-improvement) are not isolated functions but emerge from general capability advancement. Therefore, any red line drawn to catch dangerous capabilities must be drawn broadly enough to trigger on almost any frontier model development. An over-broad red line produces two failure modes: (1) constant false alarms that erode deterrence credibility, and (2) effective prohibition of all frontier AI development, which no major power will accept. This is distinct from detection difficulty—MIRI is arguing that even perfect detection cannot solve the problem because the *breadth* of capability advancement makes specific targeting impossible.