| type | domain | description | confidence | source | created | title | agent | sourced_from | scope | sourcer | supports | related | reweave_edges |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| claim | grand-strategy | The MAD mechanism operates fractally across national, institutional, corporate, and individual negotiation levels, making safety governance politically impossible even for willing parties | experimental | Gilad Abiri, arXiv:2508.12300, formal academic paper introducing the MAD framework | 2026-04-24 | Mutually Assured Deregulation makes voluntary AI governance structurally untenable because each actor's restraint creates competitive disadvantage, converting the governance game from cooperation to prisoner's dilemma | leo | grand-strategy/2026-00-00-abiri-mutually-assured-deregulation-arxiv.md | structural | Gilad Abiri | | | |
|
|
|
Mutually Assured Deregulation makes voluntary AI governance structurally untenable because each actor's restraint creates competitive disadvantage, converting the governance game from cooperation to prisoner's dilemma
Abiri's Mutually Assured Deregulation framework formalizes what has been empirically observed across 20+ governance events: the 'Regulation Sacrifice' view held by policymakers since roughly 2022 creates a prisoner's dilemma in which states minimize regulatory constraints in order to outrun adversaries (chiefly the US and China) to frontier capabilities. The mechanism operates at four levels simultaneously:

1. National: competitive deregulation among the US, EU, and China.
2. Institutional: governance vacuums at OSTP, BIS, and DOD.
3. Corporate voluntary: RSP v3 dropped pause commitments using explicit MAD logic.
4. Individual lab negotiation: Google accepting weaker guardrails than Anthropic's to avoid blacklisting.

The paradoxical outcome is that deregulation pursued in the name of national security undermines security across all timeframes: near-term (information warfare tools), medium-term (democratized bioweapon capabilities), and long-term (uncontrollable AGI systems). The competitive dynamic makes exit from the race politically untenable even for willing parties, because countries that regulate face a severe disadvantage relative to those that do not. This is not a coordination failure that better communication can solve; it is a structural property of the competitive environment that persists as long as the race framing dominates.
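The prisoner's dilemma structure of the governance game can be sketched with a two-actor payoff matrix. The numeric payoffs below are hypothetical, chosen only to reproduce the ordering the argument requires (deregulating pays more than regulating regardless of the other actor's move, yet mutual restraint beats mutual deregulation):

```python
# Illustrative sketch of the MAD prisoner's dilemma. Payoff values are
# hypothetical; only their ordering matters (temptation > reward > punishment > sucker).
REGULATE, DEREGULATE = "regulate", "deregulate"

# payoffs[(move_a, move_b)] = (payoff to actor A, payoff to actor B)
payoffs = {
    (REGULATE, REGULATE): (3, 3),      # mutual restraint: safest joint outcome
    (REGULATE, DEREGULATE): (0, 5),    # unilateral restraint: competitive loss
    (DEREGULATE, REGULATE): (5, 0),    # unilateral deregulation: competitive gain
    (DEREGULATE, DEREGULATE): (1, 1),  # mutual deregulation: race dynamics
}

def best_response(opponent_move):
    """Return actor A's payoff-maximizing move given actor B's move."""
    return max((REGULATE, DEREGULATE),
               key=lambda move: payoffs[(move, opponent_move)][0])

# Deregulation strictly dominates: it is the best response to either opponent
# move, so the game settles on (1, 1) even though (3, 3) is jointly better.
assert best_response(REGULATE) == DEREGULATE
assert best_response(DEREGULATE) == DEREGULATE
```

This is why the claim frames the problem as structural rather than communicative: no amount of shared information changes the dominance ordering, only a change to the payoffs themselves (e.g., binding external enforcement) does.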
Extending Evidence
Source: Sharma resignation, Semafor/BISI reporting, Feb 9 2026
Sharma's February 9 resignation preceded both the RSP v3.0 release and the Hegseth ultimatum by 15 days, establishing that internal safety culture decay occurs before visible policy changes and before specific coercive events. His structural framing ('institutions shaped by competition, speed, and scale') points to cumulative pressure dating from the September 2025 Pentagon negotiations rather than to any discrete government action.
Extending Evidence
Source: Washington Post, February 4, 2025; Google DeepMind blog post (Demis Hassabis)
Google removed its AI weapons and surveillance principles on February 4, 2025, a full 12 months before Anthropic was designated a supply chain risk in February 2026. This demonstrates that MAD operates through anticipatory erosion, not just penalty response: Google preemptively eliminated constraints before any competitor was punished for maintaining them, showing the mechanism propagates through the credible threat of competitive disadvantage rather than demonstrated consequence. The 12-month gap indicates that companies respond to the structural incentive before the test case crystallizes.
Supporting Evidence
Source: Google-Pentagon timeline, April 2026
Google's trajectory from unclassified deployment (3M users) to classified deal negotiation under employee pressure illustrates the MAD mechanism in real time. The company deployed before Anthropic's cautionary case crystallized, then faced pressure to expand to classified settings, with employee opposition creating internal friction but not preventing the negotiation from progressing. Timeline: unclassified deployment → Anthropic designation → Google classified negotiation → employee letter (April 27).
Challenging Evidence
Source: Google employee letter April 27 2026, compared to 2018 Project Maven petition
The Google employee petition represents a counter-test of MAD theory. If 580+ employees including 20+ directors/VPs and senior DeepMind researchers can successfully block classified Pentagon contracts, it would demonstrate that employee governance mechanisms can constrain competitive deregulation pressure. However, the mobilization decay is striking: 4,000+ signatories won the 2018 Project Maven fight, while only 580 signed the 2026 letter despite higher stakes (Anthropic supply chain designation as cautionary tale) and 8 years of company growth—an ~85% reduction. This suggests the employee governance mechanism is weakening, possibly through workforce composition change or normalization of military AI work. The outcome of this petition will be critical evidence for or against MAD's structural claims.
Extending Evidence
Source: DefenseScoop, Hegseth AI Strategy Memorandum January 2026
The Hegseth 'any lawful use' mandate (January 2026, 180-day implementation deadline) demonstrates that MAD operates within the market layer while state mandates operate at the policy layer as a stronger forcing function. The mandate converts competitive pressure into regulatory requirement: companies cannot sign DoD AI contracts at Tier 1 or Tier 2 terms without violating procurement policy. This makes MAD a secondary mechanism—the mandate is primary. The Anthropic supply chain designation (February 2026) and Google deal (April 2026) confirm enforcement: the mandate created procurement exclusion, not just competitive disadvantage.
Supporting Evidence
Source: Gizmodo/TechCrunch/9to5Google, April 28 2026
Google signed Pentagon classified AI deal on 'any lawful use' terms (with unenforceable advisory language) within 24 hours of 580+ employee petition demanding rejection, after removing weapons-related AI principles in February 2025. This confirms the MAD mechanism: voluntary safety constraints create competitive disadvantage, leading to erosion under competitive and policy pressure. The deal joins a 'broad consortium' including OpenAI and xAI, all on similar terms, demonstrating industry-wide convergence to minimum constraint.
Supporting Evidence
Source: Anthropic RSP v3.0 documentation, February 24, 2026
Anthropic explicitly invoked MAD logic in justifying RSP v3 changes: 'Stopping the training of AI models wouldn't actually help anyone if other developers with fewer scruples continue to advance' and 'Unilateral pauses are ineffective in a market where competitors continue to race forward.' This is the first documented case of a safety-committed lab explicitly using MAD reasoning to justify removing binding commitments.
Supporting Evidence
Source: Industry coalition amicus briefs, March 2026
Industry coalitions (CCIA, ITI, SIIA, TechNet) filed amicus briefs arguing that the designation creates a 'danger to US economy if agencies can use foreign-adversary tools as retaliation in policy disputes' and 'sets a chilling precedent for any AI company considering safety constraints.' This confirms the MAD mechanism operates even when enforcement is government-driven rather than purely market-driven.
Supporting Evidence
Source: CNBC, March 3, 2026; Altman characterization of original deal
Altman's admission that the original Pentagon deal 'looked opportunistic and sloppy' confirms that Tier 3 terms are not the result of careful governance analysis but rather the path of least resistance under competitive pressure. The deal was signed quickly before PR implications were worked through, then required post-hoc cleanup under public backlash. This demonstrates that competitive pressure to sign quickly ('any lawful use') produces governance that requires reactive amendment rather than principled pre-contract design: governance by public relations management.