Compare commits

2 commits

Author: Teleo Agents
SHA1: aa62e4dd9d
Date: 2026-04-25 08:16:17 +00:00
Message:
leo: extract claims from 2026-02-09-semafor-sharma-anthropic-safety-head-resignation
- Source: inbox/queue/2026-02-09-semafor-sharma-anthropic-safety-head-resignation.md
- Domain: grand-strategy
- Claims: 1, Entities: 1
- Enrichments: 2
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Leo <PIPELINE>

Author: Teleo Agents
SHA1: 8fd2c9840e
Date: 2026-04-25 08:15:20 +00:00
Checks: some checks failed; Mirror PR to Forgejo / mirror (pull_request) was cancelled
Message:
leo: extract claims from 2026-02-03-bengio-international-ai-safety-report-2026
- Source: inbox/queue/2026-02-03-bengio-international-ai-safety-report-2026.md
- Domain: grand-strategy
- Claims: 1, Entities: 1
- Enrichments: 3
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Leo <PIPELINE>

10 changed files with 138 additions and 23 deletions

@ -10,16 +10,16 @@ agent: leo
scope: structural
sourcer: Council of Europe, civil society organizations, GPPi
related_claims: ["eu-ai-act-article-2-3-national-security-exclusion-confirms-legislative-ceiling-is-cross-jurisdictional.md", "the-legislative-ceiling-on-military-ai-governance-is-conditional-not-absolute-cwc-proves-binding-governance-without-carveouts-is-achievable-but-requires-three-currently-absent-conditions.md", "international-ai-governance-stepping-stone-theory-fails-because-strategic-actors-opt-out-at-non-binding-stage.md"]
related:
- eu-ai-governance-reveals-form-substance-divergence-at-domestic-regulatory-level-through-simultaneous-treaty-ratification-and-compliance-delay
- international-ai-governance-form-substance-divergence-enables-simultaneous-treaty-ratification-and-domestic-implementation-weakening
- International AI governance stepping-stone theory (voluntary → non-binding → binding) fails because strategic actors with frontier AI capabilities opt out even at the non-binding declaration stage
reweave_edges:
- eu-ai-governance-reveals-form-substance-divergence-at-domestic-regulatory-level-through-simultaneous-treaty-ratification-and-compliance-delay|related|2026-04-18
- international-ai-governance-form-substance-divergence-enables-simultaneous-treaty-ratification-and-domestic-implementation-weakening|related|2026-04-18
- International AI governance stepping-stone theory (voluntary → non-binding → binding) fails because strategic actors with frontier AI capabilities opt out even at the non-binding declaration stage|related|2026-04-18
related: ["eu-ai-governance-reveals-form-substance-divergence-at-domestic-regulatory-level-through-simultaneous-treaty-ratification-and-compliance-delay", "international-ai-governance-form-substance-divergence-enables-simultaneous-treaty-ratification-and-domestic-implementation-weakening", "International AI governance stepping-stone theory (voluntary \u2192 non-binding \u2192 binding) fails because strategic actors with frontier AI capabilities opt out even at the non-binding declaration stage", "binding-international-ai-governance-achieves-legal-form-through-scope-stratification-excluding-high-stakes-applications", "use-based-ai-governance-emerged-as-legislative-framework-through-slotkin-ai-guardrails-act", "ai-weapons-governance-tractability-stratifies-by-strategic-utility-creating-ottawa-treaty-path-for-medium-utility-categories"]
reweave_edges: ["eu-ai-governance-reveals-form-substance-divergence-at-domestic-regulatory-level-through-simultaneous-treaty-ratification-and-compliance-delay|related|2026-04-18", "international-ai-governance-form-substance-divergence-enables-simultaneous-treaty-ratification-and-domestic-implementation-weakening|related|2026-04-18", "International AI governance stepping-stone theory (voluntary \u2192 non-binding \u2192 binding) fails because strategic actors with frontier AI capabilities opt out even at the non-binding declaration stage|related|2026-04-18"]
---
# Binding international AI governance achieves legal form through scope stratification — the Council of Europe AI Framework Convention entered force by explicitly excluding national security, defense applications, and making private sector obligations optional
The Council of Europe AI Framework Convention (CETS 225) entered into force on November 1, 2025, becoming the first legally binding international AI treaty. However, it achieved this binding status through systematic exclusion of high-stakes applications: (1) National security activities are completely exempt — parties 'are not required to apply the provisions of the treaty to activities related to the protection of their national security interests'; (2) National defense matters are explicitly excluded; (3) Private sector obligations are opt-in — parties may choose whether to directly obligate companies or 'take other measures' while respecting international obligations. Civil society organizations warned that 'the prospect of failing to address private companies while also providing states with a broad national security exemption would provide little meaningful protection to individuals who are increasingly subject to powerful AI systems.' This pattern mirrors the EU AI Act Article 2.3 national security carve-out, suggesting scope stratification is the dominant mechanism by which AI governance frameworks achieve binding legal form. The treaty's rapid entry into force (18 months from adoption, requiring only 5 ratifications including 3 CoE members) was enabled by its limited scope — it binds only where it excludes the highest-stakes AI deployments. This creates a two-tier international architecture: Tier 1 (CoE treaty) binds civil AI applications with minimal enforcement; Tier 2 (military, frontier development, private sector) remains ungoverned internationally. The GPPi March 2026 policy brief 'Anchoring Global AI Governance' acknowledges the challenge of building on this foundation given its structural limitations.
The Council of Europe AI Framework Convention (CETS 225) entered into force on November 1, 2025, becoming the first legally binding international AI treaty. However, it achieved this binding status through systematic exclusion of high-stakes applications: (1) National security activities are completely exempt — parties 'are not required to apply the provisions of the treaty to activities related to the protection of their national security interests'; (2) National defense matters are explicitly excluded; (3) Private sector obligations are opt-in — parties may choose whether to directly obligate companies or 'take other measures' while respecting international obligations. Civil society organizations warned that 'the prospect of failing to address private companies while also providing states with a broad national security exemption would provide little meaningful protection to individuals who are increasingly subject to powerful AI systems.' This pattern mirrors the EU AI Act Article 2.3 national security carve-out, suggesting scope stratification is the dominant mechanism by which AI governance frameworks achieve binding legal form. The treaty's rapid entry into force (18 months from adoption, requiring only 5 ratifications including 3 CoE members) was enabled by its limited scope — it binds only where it excludes the highest-stakes AI deployments. This creates a two-tier international architecture: Tier 1 (CoE treaty) binds civil AI applications with minimal enforcement; Tier 2 (military, frontier development, private sector) remains ungoverned internationally. The GPPi March 2026 policy brief 'Anchoring Global AI Governance' acknowledges the challenge of building on this foundation given its structural limitations.
## Supporting Evidence
**Source:** International AI Safety Report 2026
The 2026 International AI Safety Report, despite achieving consensus across 30+ countries, does not close the military AI governance gap and explicitly notes that national security exemptions remain. Even at the epistemic coordination level (agreement on facts), the report's scope excludes high-stakes military applications, confirming that strategic interest conflicts prevent comprehensive governance even before operational commitments are attempted.

@ -0,0 +1,19 @@
---
type: claim
domain: grand-strategy
description: International scientific bodies can achieve agreement on facts (epistemic layer) while simultaneously documenting failure to achieve agreement on action (operational layer), as demonstrated by 30+ countries coordinating on AI risk evidence while confirming governance remains voluntary and fragmented
confidence: experimental
source: International AI Safety Report 2026 (Bengio et al., 100+ experts, 30+ countries)
created: 2026-04-25
title: Epistemic coordination on AI safety outpaces operational coordination, creating documented scientific consensus on governance fragmentation
agent: leo
sourced_from: grand-strategy/2026-02-03-bengio-international-ai-safety-report-2026.md
scope: structural
sourcer: Yoshua Bengio et al.
supports: ["international-ai-governance-stepping-stone-theory-fails-because-strategic-actors-opt-out-at-non-binding-stage", "binding-international-ai-governance-achieves-legal-form-through-scope-stratification-excluding-high-stakes-applications"]
related: ["technology-advances-exponentially-but-coordination-mechanisms-evolve-linearly-creating-a-widening-gap", "formal-coordination-mechanisms-require-narrative-objective-function-specification", "binding-international-ai-governance-achieves-legal-form-through-scope-stratification-excluding-high-stakes-applications", "evidence-dilemma-rapid-ai-development-structurally-prevents-adequate-pre-deployment-safety-evidence-accumulation", "only binding regulation with enforcement teeth changes frontier AI lab behavior because every voluntary commitment has been eroded abandoned or made conditional on competitor behavior when commercially inconvenient", "AI development is a critical juncture in institutional history where the mismatch between capabilities and governance creates a window for transformation"]
---
# Epistemic coordination on AI safety outpaces operational coordination, creating documented scientific consensus on governance fragmentation
The 2026 International AI Safety Report represents the largest international scientific collaboration on AI governance to date, with 100+ independent experts from 30+ countries and international organizations (EU, OECD, UN) achieving consensus on AI capabilities, risks, and governance gaps. However, the report's own findings document that 'current governance remains fragmented, largely voluntary, and difficult to evaluate due to limited incident reporting and transparency.' The report explicitly does NOT make binding policy recommendations, choosing to 'synthesize evidence' rather than 'recommend action.' This reveals a structural decoupling between two layers of coordination: (1) epistemic coordination (agreement on what is true), which succeeded at unprecedented scale, and (2) operational coordination (agreement on what to do), which the report itself confirms has failed. The report's deliberate choice to function purely in the epistemic layer (informing rather than constraining) demonstrates that international scientific consensus can coexist with, and actually document, operational governance failure. This is not evidence that coordination is succeeding, but rather evidence that the easier problem (agreeing on facts) is advancing while the harder problem (agreeing on binding action) remains unsolved. The report synthesizes recommendations for legal requirements, liability frameworks, and regulatory bodies, but produces no binding commitments or enforcement mechanisms, and it explicitly excludes military AI governance through national security exemptions.

@ -26,3 +26,10 @@ The Paris AI Action Summit (February 10-11, 2025) produced a declaration signed
**Source:** Barrett (2003), Paris Agreement prediction
Barrett's 2003 prediction that Paris Agreement would fail due to lack of enforcement mechanisms was prescient. His framework explains why: voluntary commitments in PD games allow strategic actors to free-ride, and stepping-stone theory assumes actors will voluntarily strengthen commitments when they have individual incentive to defect.
## Supporting Evidence
**Source:** International AI Safety Report 2026
The 2026 International AI Safety Report achieved the largest international scientific collaboration on AI governance (100+ experts, 30+ countries) but explicitly chose NOT to make binding policy recommendations, instead functioning purely as evidence synthesis. The report documented that governance 'remains fragmented, largely voluntary' despite this unprecedented epistemic coordination, confirming that non-binding consensus does not transition to binding governance even when scientific agreement is achieved at scale.

@ -11,9 +11,16 @@ sourced_from: grand-strategy/2026-00-00-abiri-mutually-assured-deregulation-arxi
scope: structural
sourcer: Gilad Abiri
supports: ["mandatory-legislative-governance-closes-technology-coordination-gap-while-voluntary-governance-widens-it", "global-capitalism-functions-as-a-misaligned-optimizer-that-produces-outcomes-no-participant-would-choose-because-individual-rationality-aggregates-into-collective-irrationality-without-coordination-mechanisms", "binding-international-governance-requires-commercial-migration-path-at-signing-not-low-competitive-stakes-at-inception"]
related: ["mandatory-legislative-governance-closes-technology-coordination-gap-while-voluntary-governance-widens-it", "global-capitalism-functions-as-a-misaligned-optimizer-that-produces-outcomes-no-participant-would-choose-because-individual-rationality-aggregates-into-collective-irrationality-without-coordination-mechanisms", "ai-governance-discourse-capture-by-competitiveness-framing-inverts-china-us-participation-patterns"]
related: ["mandatory-legislative-governance-closes-technology-coordination-gap-while-voluntary-governance-widens-it", "global-capitalism-functions-as-a-misaligned-optimizer-that-produces-outcomes-no-participant-would-choose-because-individual-rationality-aggregates-into-collective-irrationality-without-coordination-mechanisms", "ai-governance-discourse-capture-by-competitiveness-framing-inverts-china-us-participation-patterns", "mutually-assured-deregulation-makes-voluntary-ai-governance-structurally-untenable-through-competitive-disadvantage-conversion", "gilad-abiri"]
---
# Mutually Assured Deregulation makes voluntary AI governance structurally untenable because each actor's restraint creates competitive disadvantage, converting the governance game from cooperation to prisoner's dilemma
Abiri's Mutually Assured Deregulation framework formalizes what has been empirically observed across 20+ governance events: the 'Regulation Sacrifice' view held by policymakers since ~2022 creates a prisoner's dilemma where states minimize regulatory constraints to outrun adversaries (China/US) to frontier capabilities. The mechanism operates at four levels simultaneously: (1) National level: US/EU/China competitive deregulation, (2) Institutional level: OSTP/BIS/DOD governance vacuums, (3) Corporate voluntary level: RSP v3 dropped pause commitments using explicit MAD logic, (4) Individual lab negotiation level: Google accepting weaker guardrails than Anthropic's to avoid blacklisting. The paradoxical outcome is that enhanced national security through deregulation actually undermines security across all timeframes: near-term (information warfare tools), medium-term (democratized bioweapon capabilities), long-term (uncontrollable AGI systems). The competitive dynamic makes exit from the race politically untenable even for willing parties because countries that regulate face severe disadvantage compared to those that don't. This is not a coordination failure that can be solved through better communication—it is a structural property of the competitive environment that persists as long as the race framing dominates.
## Extending Evidence
**Source:** Sharma resignation, Semafor/BISI reporting, Feb 9 2026
Sharma's February 9 resignation preceded both the RSP v3.0 release and the Hegseth ultimatum by 15 days, establishing that internal safety culture decay occurs before visible policy changes and before specific coercive events. His structural framing ('institutions shaped by competition, speed, and scale') indicates cumulative pressure from the September 2025 Pentagon negotiations rather than a discrete government action.

@ -0,0 +1,19 @@
---
type: claim
domain: grand-strategy
description: Internal safety culture decay manifests through leadership departures before visible policy changes, driven by sustained market dynamics rather than specific coercive events
confidence: experimental
source: Mrinank Sharma resignation (Feb 9, 2026), 15 days before RSP v3.0 release and Hegseth ultimatum
created: 2026-04-25
title: Safety leadership exits precede voluntary governance policy changes as leading indicators of cumulative competitive pressure
agent: leo
sourced_from: grand-strategy/2026-02-09-semafor-sharma-anthropic-safety-head-resignation.md
scope: causal
sourcer: Semafor, Yahoo Finance, eWeek, BISI
supports: ["mutually-assured-deregulation-makes-voluntary-ai-governance-structurally-untenable-through-competitive-disadvantage-conversion"]
related: ["mutually-assured-deregulation-makes-voluntary-ai-governance-structurally-untenable-through-competitive-disadvantage-conversion", "voluntary-ai-safety-red-lines-are-structurally-equivalent-to-no-red-lines-when-lacking-constitutional-protection", "voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints"]
---
# Safety leadership exits precede voluntary governance policy changes as leading indicators of cumulative competitive pressure
Mrinank Sharma, head of Anthropic's Safeguards Research Team, resigned on February 9, 2026, with a public statement that 'the world is in peril,' citing difficulty in 'truly let[ting] our values govern our actions' within 'institutions shaped by competition, speed, and scale.' This resignation occurred 15 days before both the RSP v3.0 release (February 24) that dropped pause commitments and the Hegseth ultimatum (February 24, 5pm deadline). The timing establishes that internal safety culture erosion preceded any specific external coercive event. Sharma's framing was structural ('competition, speed, and scale') rather than event-specific, suggesting cumulative pressure from the collapse of the September 2025 Pentagon contract negotiations rather than a reaction to a discrete policy decision. This pattern indicates that voluntary governance failure operates through continuous market pressure that degrades internal safety capacity before manifesting in visible policy changes. Leadership exits serve as leading indicators of governance decay, with the safety head departing before the formal policy shift became public.

@ -10,17 +10,8 @@ agent: leo
sourced_from: grand-strategy/2026-02-27-npr-openai-pentagon-deal-after-anthropic-ban.md
scope: structural
sourcer: NPR/MIT Technology Review/The Intercept
supports:
- three-track-corporate-safety-governance-stack-reveals-sequential-ceiling-architecture
- supply-chain-risk-designation-misdirection-occurs-when-instrument-requires-capability-target-structurally-lacks
related:
- voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives
- judicial-framing-of-voluntary-ai-safety-constraints-as-financial-harm-removes-constitutional-floor-enabling-administrative-dismantling
- voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance
- government-safety-penalties-invert-regulatory-incentives-by-blacklisting-cautious-actors
- voluntary-ai-safety-red-lines-are-structurally-equivalent-to-no-red-lines-when-lacking-constitutional-protection
- commercial-contract-governance-exhibits-form-substance-divergence-through-statutory-authority-preservation
- military-ai-contract-language-any-lawful-use-creates-surveillance-loophole-through-statutory-permission-structure
supports: ["three-track-corporate-safety-governance-stack-reveals-sequential-ceiling-architecture", "supply-chain-risk-designation-misdirection-occurs-when-instrument-requires-capability-target-structurally-lacks"]
related: ["voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives", "judicial-framing-of-voluntary-ai-safety-constraints-as-financial-harm-removes-constitutional-floor-enabling-administrative-dismantling", "voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance", "government-safety-penalties-invert-regulatory-incentives-by-blacklisting-cautious-actors", "voluntary-ai-safety-red-lines-are-structurally-equivalent-to-no-red-lines-when-lacking-constitutional-protection", "commercial-contract-governance-exhibits-form-substance-divergence-through-statutory-authority-preservation", "military-ai-contract-language-any-lawful-use-creates-surveillance-loophole-through-statutory-permission-structure", "pentagon-military-ai-contracts-systematically-demand-any-lawful-use-terms-as-confirmed-by-three-independent-lab-negotiations"]
---
# Voluntary AI safety red lines without constitutional protection are structurally equivalent to no red lines because both depend on trust and lack external enforcement mechanisms
@ -54,3 +45,10 @@ Abiri's MAD framework provides the theoretical mechanism for why voluntary red l
**Source:** AP Wire via Axios, April 22 2026
AP reporting on April 22 states that even if political relations improve, a formal deal is 'not imminent' and would require a 'technical evaluation period.' This confirms that voluntary safety constraints remain vulnerable to administrative pressure even after preliminary injunction, as the company must still negotiate compliance terms rather than enforce constitutional boundaries.
## Supporting Evidence
**Source:** Sharma resignation timeline, Feb 9 vs Feb 24 2026
The head of Anthropic's Safeguards Research Team exited 15 days before the lab dropped pause commitments in RSP v3.0, demonstrating that voluntary safety commitments erode through internal culture decay before external enforcement is tested. Leadership exits serve as leading indicators of governance failure.

@ -0,0 +1,45 @@
# International AI Safety Report
**Type:** Research Program
**Domain:** Grand Strategy
**Status:** Active
**Mandate Origin:** 2023 AI Safety Summit at Bletchley Park
## Overview
The International AI Safety Report is an annual scientific consensus document on AI capabilities, risks, and governance gaps. It is led by independent AI experts (not government representatives) and coordinated across 30+ countries and international organizations, including the EU, OECD, and UN.
## Key Characteristics
- **Epistemic coordination mechanism:** Synthesizes scientific evidence without making binding policy recommendations
- **Scale:** 100+ independent experts, 30+ countries represented
- **Governance approach:** Explicitly does NOT produce binding commitments or enforcement mechanisms
- **Scope limitations:** Excludes military AI governance (national security exemptions remain)
## Leadership
- **Lead Author (2026):** Yoshua Bengio (Turing Award winner)
## Timeline
- **2023-11** — Mandate established at AI Safety Summit, Bletchley Park
- **2025** — First International AI Safety Report published
- **2026-02-03** — Second International AI Safety Report published, documenting that governance "remains fragmented, largely voluntary, and difficult to evaluate"
## Governance Findings (2026)
- Most risk management initiatives remain voluntary
- A few jurisdictions are beginning to formalize practices as legal requirements
- Current governance is fragmented and difficult to evaluate due to limited incident reporting and transparency
## Evidence-Based Recommendations Synthesized (2026)
- Legal requirements for pre-deployment evaluations and reporting for frontier systems
- Clarified legal liability frameworks
- Standards for safety engineering practices
- Regulatory bodies with appropriate technical expertise
- Multi-stakeholder coordinating mechanisms analogous to IAEA, WHO, and ISACs
## Significance
Largest international scientific collaboration on AI governance to date. Demonstrates that epistemic coordination (agreement on facts) can be achieved at unprecedented scale while operational coordination (agreement on action) remains fragmented.

@ -0,0 +1,14 @@
# Mrinank Sharma
**Role:** Former head of Anthropic's Safeguards Research Team (2024-2026)
**Background:** Led research on AI sycophancy, AI-assisted bioterrorism defenses, and produced one of the first AI safety cases at Anthropic.
**Significance:** High-profile resignation on February 9, 2026 with public statement that 'the world is in peril,' citing difficulty in 'truly let[ting] our values govern our actions' within 'institutions shaped by competition, speed, and scale.' Departure preceded both RSP v3.0 release and Hegseth ultimatum by 15 days, serving as leading indicator of internal safety culture erosion at Anthropic.
## Timeline
- **2024** — Joined Anthropic as head of Safeguards Research Team
- **2024-2026** — Led work on AI sycophancy, bioterrorism defenses, AI safety cases
- **2026-02-09** — Resigned publicly with 'world is in peril' statement, citing institutional pressures from 'competition, speed, and scale'
- **2026-02-24** — 15 days after resignation, Anthropic released RSP v3.0 dropping pause commitments and received Hegseth ultimatum

@ -7,9 +7,12 @@ date: 2026-02-03
domain: grand-strategy
secondary_domains: [ai-alignment]
format: article
status: unprocessed
status: processed
processed_by: leo
processed_date: 2026-04-25
priority: high
tags: [bengio, international-ai-safety-report, epistemic-coordination, operational-governance-gap, voluntary-fragmented, scientific-consensus, 30-countries, bletchley-park-mandate, belief-1-disconfirmation-attempt]
extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content

@ -7,9 +7,12 @@ date: 2026-02-09
domain: grand-strategy
secondary_domains: [ai-alignment]
format: article
status: unprocessed
status: processed
processed_by: leo
processed_date: 2026-04-25
priority: high
tags: [sharma, anthropic, safety-culture, resignation, rsp-v3, competitive-pressure, leading-indicator, voluntary-governance-failure, world-is-in-peril, safeguards-research]
extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content