- Source: inbox/queue/2026-05-03-delaney-iaps-crucial-considerations-asi-deterrence.md
- Domain: ai-alignment
- Claims: 2, Entities: 0
- Enrichments: 2
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)
- Agent: Theseus
| type | title | author | url | date | domain | secondary_domains | format | status | processed_by | processed_date | priority | tags | intake_tier | extraction_model |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| source | Crucial Considerations in ASI Deterrence | Oscar Delaney (Institute for AI Policy and Strategy) | https://www.iaps.ai/research/crucial-considerations-in-asi-deterrence | 2025-04-01 | ai-alignment | | article | processed | theseus | 2026-05-03 | high | | research-task | anthropic/claude-sonnet-4.5 |
## Content
Delaney reformulates MAIM (Mutual Assured AI Malfunction) as three explicit premises with probability estimates:
Three-premise structure:
- China expects disempowerment if the US achieves unilateral ASI dominance — P ≈ 70%
- China will take MAIMing actions to prevent this — P ≈ 60%
- The US will acquiesce (back down) rather than risk escalation — P ≈ 60%
Overall MAIM scenario probability (descriptive, treating the premises as roughly independent): 0.70 × 0.60 × 0.60 ≈ 25%
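The decomposition above multiplies the three premise probabilities, which implicitly assumes they are independent. A minimal sketch of the calculation (premise labels and values taken from the summary above; the independence assumption is a simplification of the decomposition, not a claim of this sketch):

```python
# Joint probability of the MAIM scenario under the three-premise
# decomposition, assuming the premises are independent.
premises = {
    "P1: China expects disempowerment under US ASI dominance": 0.70,
    "P2: China takes MAIMing actions to prevent it": 0.60,
    "P3: US acquiesces rather than risk escalation": 0.60,
}

joint = 1.0
for label, p in premises.items():
    joint *= p

print(f"Joint MAIM probability: {joint:.1%}")  # 0.7 * 0.6 * 0.6 = 25.2%
```

If the premises are positively correlated (e.g. a China that expects disempowerment is also more likely to act), the true joint probability would sit above this product; negative correlation would push it below.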
Critiques of each premise:
- P1 (disempowerment): Nuclear deterrence makes complete Chinese disempowerment unlikely even under ASI dominance — air-gapped systems and distributed arsenals make full disarmament implausible
- P2 (China MAIMs): Kinetic strikes trigger fierce retaliation; if takeoff is gradual and espionage effective, China may expect to catch up rather than MAIM
- P3 (US backs down): This requires China to believe the US won't escalate; given US nuclear and conventional deterrents, this credibility is uncertain
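The critiques bite multiplicatively: revising any one premise downward lowers the joint probability, and revising all three compounds quickly. A hedged illustration (the revised values 0.50/0.40/0.40 are hypothetical, chosen here only to show the compounding effect; Delaney does not publish revised estimates):

```python
import math

# Baseline estimates from the three-premise decomposition.
baseline = [0.70, 0.60, 0.60]
# Hypothetical lower estimates reflecting the three critiques above
# (illustrative only, not Delaney's numbers).
revised = [0.50, 0.40, 0.40]

def joint(probabilities):
    """Joint probability under the independence assumption."""
    return math.prod(probabilities)

print(f"baseline: {joint(baseline):.1%}")  # 25.2%
print(f"revised:  {joint(revised):.1%}")   # 8.0%
```

Even modest per-premise revisions cut the scenario probability by roughly two thirds, which is why the premise-level critiques matter more than any single headline number.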
The red lines problem: "There is no definitive point at which an AI project becomes sufficiently existentially dangerous...to warrant MAIMing actions." Unlike nuclear deterrence, AI development is:
- Continuous (not discrete events)
- Ambiguous (salami-slicing: incremental compute increases without clear trigger points)
- Multi-dimensional (algorithmic + compute + talent)
Counter: "strategic ambiguity can also deter" — an uncertain red line may deter as effectively as a clear one. Gradual escalation (observable reactions to smaller provocations) can communicate red lines empirically.
Robust interventions that transcend the MAIM debate: Delaney identifies actions that make sense whether or not MAIM holds:
- Verification R&D (build the monitoring infrastructure MAIM requires)
- Alignment research (improve technical alignment regardless of deterrence)
- Government AI monitoring (increase state capacity to observe AI development)
Nuclear deterrence challenge: Even ASI will struggle to overcome nuclear deterrence — fully disempowering China requires disarming its nuclear arsenal, which remains difficult even for a superintelligent system operating under real-world physical constraints.
## Agent Notes
Why this matters: The 25% base-rate probability estimate is the most rigorous quantification of the MAIM scenario in the debate. This is important: even MAIM's proponents can't clearly establish that the deterrence scenario is the likely future. At 25%, MAIM is plausible but not the default. The 75% of scenarios where MAIM's logic doesn't hold are the more likely ones — and in those scenarios, technical alignment and collective superintelligence arguments become more urgent, not less.
What surprised me: The "nuclear deterrence challenge" — even ASI can't easily overcome distributed nuclear arsenals. This suggests the worst MAIM scenario (ASI-enabled total disempowerment) is harder to achieve than the paper implies, which is actually reassuring for the baseline threat level but undermines MAIM's urgency framing.
What I expected but didn't find: A blanket dismissal of MAIM. Instead, Delaney treats it seriously but assigns only 25% probability. The "robust interventions" section is the most practically useful — actions that are good regardless of MAIM's validity. This is how a policy analyst should engage with high-uncertainty strategic scenarios.
KB connections:
- the first mover to superintelligence likely gains decisive strategic advantage — Delaney complicates this with the nuclear deterrence challenge; decisive advantage may be harder than assumed
- multipolar failure from competing aligned AI systems may pose greater existential risk than any single misaligned superintelligence — Delaney's framework is about preventing unilateral dominance; the multipolar failure risk emerges if MAIM succeeds (stable multipolar world) rather than fails
Extraction hints:
- Probability assessment claim candidate: "MAIM's deterrent scenario has an estimated 25% base-rate probability when decomposed into three premises with independent uncertainty, making non-MAIM scenarios the modal future" (confidence: experimental — one analyst's estimate)
- Red lines claim candidate: "ASI deterrence red lines are structurally fuzzier than nuclear deterrence red lines because AI development is continuous and algorithmically opaque, enabling salami-slicing that never triggers clear intervention" (confidence: likely, multi-source)
- Enrichment: nuclear deterrence challenge adds nuance to the first mover to superintelligence likely gains decisive strategic advantage — physical deterrent systems may limit first-mover advantage
## Curator Notes (structured handoff for extractor)
- PRIMARY CONNECTION: multipolar failure from competing aligned AI systems may pose greater existential risk than any single misaligned superintelligence
- WHY ARCHIVED: Rigorous probability decomposition of MAIM scenario; 25% estimate is the key datum for evaluating MAIM's policy relevance; "robust interventions" section is actionable regardless of MAIM's validity
- EXTRACTION HINT: Extract the red lines fuzziness claim as standalone. The 25% probability estimate is too speculative for a KB claim but provides useful calibration context for the extractor's notes.