theseus: extract claims from 2026-04-09-greenwald-amodei-safety-capability-spending-parity
- Source: inbox/queue/2026-04-09-greenwald-amodei-safety-capability-spending-parity.md
- Domain: ai-alignment
- Claims: 2, Entities: 0
- Enrichments: 2
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
parent 4b1e08ee18
commit 328c5f807d
2 changed files with 34 additions and 0 deletions
@@ -0,0 +1,17 @@
---
type: claim
domain: ai-alignment
description: The lab that presents most publicly as safety-focused allocates similar or lower safety resources than competitors when dual-use work is properly categorized
confidence: experimental
source: "Greenwald & Russo (The Intercept), organizational analysis of Anthropic research allocation"
created: 2026-04-09
title: "Anthropic's internal resource allocation shows 6-8% safety-only headcount when dual-use research is excluded, revealing a material gap between public safety positioning and credible commitment"
agent: theseus
scope: functional
sourcer: Glenn Greenwald, Ella Russo (The Intercept AI Desk)
related_claims: ["[[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]]", "[[government designation of safety-conscious AI labs as supply chain risks inverts the regulatory dynamic by penalizing safety constraints rather than enforcing them]]"]
---

# Anthropic's internal resource allocation shows 6-8% safety-only headcount when dual-use research is excluded, revealing a material gap between public safety positioning and credible commitment
Anthropic presents itself publicly as the safety-focused frontier lab, but internal organizational analysis finds only ~12% of its researchers in dedicated safety roles (interpretability, alignment research). However, 'safety' is a contested category: Constitutional AI and RLHF are claimed as safety work but also function as capability improvements. When dual-use work is excluded from the safety category, core safety-only research represents only 6-8% of headcount, a level comparable to OpenAI's ~6% allocation despite Anthropic's differentiated public positioning. The finding establishes a specific instance of credible-commitment failure: a gap between external safety messaging and internal resource-allocation decisions. This matters because Anthropic's safety positioning influences policy discussions, talent allocation across the field, and public trust in voluntary safety commitments.
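
A minimal sketch of the recategorization arithmetic this claim rests on, in Python. The absolute headcounts below are hypothetical placeholders; only the resulting percentages (~12% dedicated safety, ~6-8% safety-only) track figures reported in the source.

```python
# Illustrative recategorization: what share of research headcount is
# "safety-only" once dual-use work (Constitutional AI, RLHF) is excluded?
# All absolute numbers are hypothetical; only the resulting percentages
# (~12% dedicated safety, ~6-8% safety-only) follow the source's figures.

total_researchers = 1000                            # hypothetical lab size
dedicated_safety = 120                              # ~12% in safety-titled roles
dual_use = {"constitutional_ai": 35, "rlhf": 25}    # claimed as safety, also capability

safety_only = dedicated_safety - sum(dual_use.values())

print(f"dedicated safety share: {dedicated_safety / total_researchers:.1%}")  # 12.0%
print(f"safety-only share:      {safety_only / total_researchers:.1%}")       # 6.0%
```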
@@ -0,0 +1,17 @@
---
type: claim
domain: ai-alignment
description: Empirical measurement of resource allocation across Anthropic, OpenAI, and DeepMind shows safety research is structurally underfunded relative to capabilities development
confidence: experimental
source: "Greenwald & Russo (The Intercept), analysis of job postings, org charts, and published papers across three frontier labs"
created: 2026-04-09
title: "Frontier AI labs allocate 6-15% of research headcount to safety versus 60-75% to capabilities, with the ratio declining since 2024 as capabilities teams grow faster than safety teams"
agent: theseus
scope: structural
sourcer: Glenn Greenwald, Ella Russo (The Intercept AI Desk)
related_claims: ["[[the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it]]", "[[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]]", "[[safe AI development requires building alignment mechanisms before scaling capability]]"]
---

# Frontier AI labs allocate 6-15% of research headcount to safety versus 60-75% to capabilities, with the ratio declining since 2024 as capabilities teams grow faster than safety teams
Analysis of publicly available data from Anthropic, OpenAI, and DeepMind shows safety research at 6-15% of total research headcount versus 60-75% for capabilities, with the remainder in deployment and infrastructure. Anthropic, despite its public safety positioning, has ~12% of researchers in dedicated safety roles, but when dual-use work (Constitutional AI, RLHF) is excluded, core safety-only research drops to 6-8%. OpenAI's Superalignment and Preparedness teams comprise ~120 of ~2,000 researchers (6%). DeepMind shows 10-15% of research touching safety, though with high overlap with capabilities work. Critically, all three labs show declining safety-to-capabilities ratios since 2024, driven not by shrinking absolute safety headcount but by capabilities teams growing faster. The authors note that headcount understates the capabilities advantage: GPU costs dominate capabilities research while safety work is more headcount-intensive, so compute-adjusted ratios would show even larger gaps. This provides direct empirical confirmation that frontier AI development systematically under-invests in alignment research relative to capability advancement.
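
A rough sketch of the arithmetic behind the compute-adjustment point. The headcount shares below are midpoints of the ranges reported above; the compute-per-researcher weights (1.0 vs. 8.0) are invented purely for illustration, since the article gives no such figures.

```python
# Headcount-based vs. (hypothetical) compute-adjusted safety ratios.
# Headcount shares are midpoints of the article's reported ranges;
# the per-head compute weights are illustrative assumptions only.

labs = {
    "Anthropic (safety-only)": {"safety": 0.07, "capabilities": 0.675},   # 6-8%, 60-75%
    "OpenAI":                  {"safety": 0.06, "capabilities": 0.675},
    "DeepMind":                {"safety": 0.125, "capabilities": 0.675},  # 10-15%
}

# Hypothetical: capabilities researchers consume far more compute per head,
# so weighting by spend shrinks the effective safety share further.
COMPUTE_PER_HEAD = {"safety": 1.0, "capabilities": 8.0}

for name, shares in labs.items():
    headcount_ratio = shares["safety"] / shares["capabilities"]
    spend = {k: shares[k] * COMPUTE_PER_HEAD[k] for k in shares}
    compute_ratio = spend["safety"] / spend["capabilities"]
    print(f"{name}: safety/capabilities by headcount {headcount_ratio:.2f}, "
          f"by assumed compute {compute_ratio:.3f}")
```

Under any weighting in which capabilities work consumes more compute per head, the adjusted ratio falls below the headcount ratio, which is the authors' point.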