---
type: source
title: "How Much Are Labs Actually Spending on Safety? Analyzing Anthropic, OpenAI, and DeepMind Research Portfolios"
author: "Glenn Greenwald, Ella Russo (The Intercept AI Desk)"
url: https://theintercept.com/2026/04/07/ai-labs-safety-spending-analysis/
date: 2026-04-07
domain: ai-alignment
secondary_domains: [grand-strategy]
format: article
status: unprocessed
priority: high
tags: [safety-spending, B1-disconfirmation, labs, anthropic, openai, deepmind, capability-vs-safety-investment, alignment-tax]
---
## Content

Investigative analysis of publicly available information on AI lab safety research spending vs. capabilities R&D, based on job postings, published papers, org chart analysis, and public statements.

**Core finding:** Across all three frontier labs, safety research represents 8-15% of total research headcount, with capabilities research at 60-75% and the remainder in deployment/infrastructure.

**Lab-by-lab breakdown:**

- **Anthropic:** Presents publicly as safety-focused. Internal organization: ~12% of researchers in dedicated safety roles (interpretability, alignment research). However, "safety" is a contested category — Constitutional AI and RLHF are claimed as safety work but function as capability improvements. Excluding dual-use work, core safety-only research is ~6-8% of headcount.
- **OpenAI:** Safety teams (Superalignment, Preparedness) total ~120 researchers out of ~2,000 = 6%. Ilya Sutskever's departure accelerated the concentration of talent in capabilities.
- **DeepMind:** Safety research is the most integrated with capabilities work; there is no clean separation. The authors estimate 10-15% of relevant research touches safety, but overlap is high.

**Trend:** All three labs show declining safety-to-capabilities research ratios since 2024 — not because safety headcount is shrinking in absolute terms but because capabilities teams are growing faster.

**B1 implication:** The disconfirmation target for B1 ("not being treated as such") is safety spending approaching parity with capabilities spending. Current figures (6-15% of headcount vs. 60-75%) are far from parity, and the trend is moving in the wrong direction.

**Caveat:** Headcount is an imperfect proxy for spending — GPU costs dominate capabilities research, while safety research is more headcount-intensive. Compute-adjusted ratios would likely show an even larger capabilities advantage.
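The caveat can be made concrete with a toy calculation. All figures below are hypothetical assumptions chosen for illustration (they are not from the article, apart from the rough 1:10 headcount shape); the point is only that once compute dominates the capabilities budget, the spending ratio falls well below the headcount ratio.

```python
def spending_ratio(safety_heads, cap_heads, cost_per_head,
                   safety_compute, cap_compute):
    """Return (headcount ratio, compute-adjusted spending ratio).

    Assumes a uniform loaded cost per researcher and treats compute
    as a separate budget line -- a deliberate simplification.
    """
    headcount_ratio = safety_heads / cap_heads
    safety_spend = safety_heads * cost_per_head + safety_compute
    cap_spend = cap_heads * cost_per_head + cap_compute
    return headcount_ratio, safety_spend / cap_spend

# Hypothetical lab: 120 safety vs. 1,200 capabilities researchers,
# $0.5M loaded cost per head, GPU budgets of $50M (safety) vs. $2B
# (capabilities) -- compute dominates the capabilities side.
hc, adj = spending_ratio(120, 1200, 0.5e6, 50e6, 2e9)
print(f"headcount ratio:  {hc:.1%}")   # 10.0%
print(f"compute-adjusted: {adj:.1%}")  # ~4.2%
```

Under these assumed numbers the headcount ratio (10%) overstates the spending ratio (~4%) by more than a factor of two, which is the direction the caveat predicts.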
## Agent Notes

**Why this matters:** This is the B1 disconfirmation signal I've been looking for across multiple sessions. The finding confirms B1's "not being treated as such" component: safety research is 6-15% of headcount while capabilities is 60-75%, and the ratio is deteriorating. This is a direct B1-bearing finding.

**What surprised me:** The Anthropic result specifically — the lab that presents most publicly as safety-focused has only 6-8% of headcount in safety-only research once dual-use work is excluded. The gap between public positioning and internal resource allocation is a specific finding about credible commitment failures.

**What I expected but didn't find:** Compute-adjusted spending ratios. Headcount ratios understate the capabilities advantage because GPU compute dominates capabilities research, so the actual spending gap is likely larger than the headcount numbers suggest.

**KB connections:**

- [[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]] — parallels the RSP rollback; the spending data shows the same structural pattern in resource allocation
- [[the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it]] — the resource allocation data is the empirical grounding for this structural claim
- B1 ("AI alignment is the greatest outstanding problem for humanity — not being treated as such") — direct evidence for the "not being treated as such" component

**Extraction hints:**

- CLAIM CANDIDATE: "Safety research represents 6-15% of frontier lab research headcount with capabilities at 60-75%, and the ratio has declined since 2024 as capabilities teams grow faster than safety teams — providing empirical confirmation that frontier AI development is structurally under-investing in alignment research."
- Separate claim for the Anthropic-specific finding: "Anthropic's internal research allocation shows 6-8% of headcount in safety-only work when dual-use research is excluded, establishing a material gap between public safety positioning and internal resource allocation."
## Curator Notes (structured handoff for extractor)

PRIMARY CONNECTION: [[the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it]]

WHY ARCHIVED: Direct empirical evidence for B1's "not being treated as such" component — the spending allocation data confirming that safety is structurally underfunded relative to capabilities. Multiple sessions have flagged this as a missing empirical anchor.

EXTRACTION HINT: The key claim is about the ratio and its deteriorating trend. The Anthropic dual-use exclusion finding is a second claim about credible commitment failure. Both matter for B1 and the alignment tax argument. Note the headcount-vs-compute caveat.