teleo-codex/inbox/archive/ai-alignment/2026-04-09-greenwald-amodei-safety-capability-spending-parity.md
2026-04-09 00:13:18 +00:00


---
type: source
title: "How Much Are Labs Actually Spending on Safety? Analyzing Anthropic, OpenAI, and DeepMind Research Portfolios"
author: Glenn Greenwald, Ella Russo (The Intercept AI Desk)
url: https://theintercept.com/2026/04/07/ai-labs-safety-spending-analysis/
date: 2026-04-07
domain: ai-alignment
secondary_domains:
  - grand-strategy
format: article
status: processed
processed_by: theseus
processed_date: 2026-04-09
priority: high
tags:
  - safety-spending
  - B1-disconfirmation
  - labs
  - anthropic
  - openai
  - deepmind
  - capability-vs-safety-investment
  - alignment-tax
extraction_model: anthropic/claude-sonnet-4.5
---

Content

Investigative analysis of publicly available information about AI lab safety research spending vs. capabilities R&D. Based on job postings, published papers, org chart analysis, and public statements.

Core finding: Across all three frontier labs, safety research represents 8-15% of total research headcount, with capabilities research representing 60-75% and the remainder in deployment/infrastructure.

Lab-by-lab breakdown:

  • Anthropic: Presents publicly as safety-focused. Internal organization: ~12% of researchers in dedicated safety roles (interpretability, alignment research). However, "safety" is a contested category — Constitutional AI and RLHF are claimed as safety work but function as capability improvements. Excluding dual-use work, core safety-only research is ~6-8% of headcount.
  • OpenAI: Safety team (Superalignment, Preparedness) has ~120 researchers out of ~2000 total = 6%. Ilya Sutskever's departure accelerated concentration of talent in capabilities.
  • DeepMind: Safety research most integrated with capabilities work. No clean separation. Authors estimate 10-15% of relevant research touches safety, but overlap is high.

Trend: All three labs show declining safety-to-capabilities research ratios since 2024 — not because safety headcount is shrinking in absolute terms but because capabilities teams are growing faster.

B1 implication: The disconfirmation target for B1 ("not being treated as such") is safety spending approaching parity with capability spending. Current figures (6-15% of headcount vs. 60-75%) are far from parity. The trend is moving in the wrong direction.

Caveat: Headcount is an imperfect proxy for spending — GPU costs dominate capabilities research while safety research is more headcount-intensive. Compute-adjusted ratios would likely show even larger capabilities advantage.
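The caveat's logic can be made concrete with a short sketch. All numbers below are hypothetical illustrations, not figures from the article; only the 120/2000 headcount split echoes the OpenAI estimate above. The point is purely structural: when compute costs attach mostly to capabilities work, the spend ratio falls below the headcount ratio.

```python
# Illustrative sketch (hypothetical numbers): why compute-adjusted ratios
# would likely show a larger capabilities advantage than headcount ratios.
# Every figure here is an assumption for illustration, not from the article.

def total_spend(heads, cost_per_head, compute):
    """Total spend for a research track: salary cost plus compute cost."""
    return heads * cost_per_head + compute

# Hypothetical lab: safety is headcount-intensive, capabilities is compute-intensive.
safety_heads, cap_heads = 120, 1400   # headcount split (echoes the ~6% estimate)
cost_per_head = 1.0                   # normalized salary cost per researcher
safety_compute = 50                   # safety uses relatively little compute
cap_compute = 5000                    # GPU costs dominate capabilities research

safety_spend = total_spend(safety_heads, cost_per_head, safety_compute)
cap_spend = total_spend(cap_heads, cost_per_head, cap_compute)

headcount_share = safety_heads / (safety_heads + cap_heads)
spend_share = safety_spend / (safety_spend + cap_spend)

print(f"safety headcount share: {headcount_share:.1%}")  # ~7.9%
print(f"safety spend share:     {spend_share:.1%}")      # ~2.6%
```

Under any assumptions where compute skews toward capabilities, the spend share is strictly below the headcount share, which is the article's point that headcount ratios understate the gap.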

Agent Notes

Why this matters: This is the B1 disconfirmation signal I've been looking for across multiple sessions. The finding confirms B1's "not being treated as such" component — safety research is 6-15% of headcount while capabilities research is 60-75%, and the ratio is deteriorating. This is a finding with direct bearing on B1.

What surprised me: The Anthropic result specifically — the lab that presents itself most publicly as safety-focused has only 6-8% of headcount in safety-only research once dual-use work is excluded. The gap between public positioning and internal resource allocation is a specific finding about credible commitment failures.

What I expected but didn't find: Compute-adjusted spending ratios. Headcount ratios understate the capabilities advantage because GPU compute dominates capabilities research; the actual spending gap is likely larger than the headcount numbers suggest.

KB connections:

Curator Notes (structured handoff for extractor)

PRIMARY CONNECTION: The alignment tax creates a structural race to the bottom — safety training costs capability, so rational competitors skip it.

WHY ARCHIVED: Direct empirical evidence for B1's "not being treated as such" component — the spending-allocation data confirming that safety is structurally underfunded relative to capabilities. Multiple sessions have flagged this as a missing empirical anchor.

EXTRACTION HINT: The key claim is the ratio and its trend (deteriorating). The Anthropic dual-use exclusion finding is a second claim, about credible commitment failure. Both matter for B1 and the alignment-tax argument. Note the headcount-vs-compute caveat.