| type | title | author | url | date | domain | secondary_domains | format | status | processed_by | processed_date | priority | tags | extraction_model |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| source | How Much Are Labs Actually Spending on Safety? Analyzing Anthropic, OpenAI, and DeepMind Research Portfolios | Glenn Greenwald, Ella Russo (The Intercept AI Desk) | https://theintercept.com/2026/04/07/ai-labs-safety-spending-analysis/ | 2026-04-07 | ai-alignment | | article | processed | theseus | 2026-04-09 | high | | anthropic/claude-sonnet-4.5 |
Content
Investigative analysis of publicly available information about AI lab safety research spending vs. capabilities R&D. Based on job postings, published papers, org chart analysis, and public statements.
Core finding: Across all three frontier labs, safety research represents 6-15% of total research headcount, with capabilities research representing 60-75% and the remainder in deployment/infrastructure.
Lab-by-lab breakdown:
- Anthropic: Presents publicly as safety-focused. Internal organization: ~12% of researchers in dedicated safety roles (interpretability, alignment research). However, "safety" is a contested category — Constitutional AI and RLHF are claimed as safety work but function as capability improvements. Excluding dual-use work, core safety-only research is ~6-8% of headcount.
- OpenAI: Safety team (Superalignment, Preparedness) has ~120 researchers out of ~2000 total = 6%. Ilya Sutskever's departure accelerated concentration of talent in capabilities.
- DeepMind: Safety research most integrated with capabilities work. No clean separation. Authors estimate 10-15% of relevant research touches safety, but overlap is high.
Trend: All three labs show declining safety-to-capabilities research ratios since 2024 — not because safety headcount is shrinking in absolute terms but because capabilities teams are growing faster.
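The mechanism behind this trend is plain arithmetic, and a minimal sketch may make it concrete. The starting point below uses the OpenAI figures reported above (~120 safety researchers out of ~2000). The two growth rates are hypothetical values chosen only for illustration; they are not from the article.

```python
def safety_share(safety: float, total: float) -> float:
    """Safety headcount as a fraction of total research headcount."""
    return safety / total


# Starting point from the source: ~120 safety researchers out of ~2000.
safety = 120.0
non_safety = 2000.0 - safety

# HYPOTHETICAL annual growth rates, not from the article.
SAFETY_GROWTH = 0.10      # safety team grows 10% per year
NON_SAFETY_GROWTH = 0.30  # everyone else grows 30% per year

shares = [safety_share(safety, safety + non_safety)]
for _ in range(2):
    safety *= 1 + SAFETY_GROWTH
    non_safety *= 1 + NON_SAFETY_GROWTH
    shares.append(safety_share(safety, safety + non_safety))

for year, share in enumerate(shares):
    print(f"year {year}: safety share = {share:.1%}")
```

Under these assumed rates the safety team grows every year while its share of headcount falls, which is the pattern the article describes.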
B1 implication: The disconfirmation target for B1 ("not being treated as such") is safety spending approaching parity with capability spending. Current figures (6-15% of headcount vs. 60-75%) are far from parity. The trend is moving in the wrong direction.
Caveat: Headcount is an imperfect proxy for spending — GPU costs dominate capabilities research while safety research is more headcount-intensive. Compute-adjusted ratios would likely show even larger capabilities advantage.
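The caveat can be sketched as a simple cost model. Everything numeric below is an assumption for illustration: the headcounts sit within the ranges reported above (12 safety and 70 capabilities researchers per 100), while the per-researcher salary and compute costs are hypothetical units, not figures from the article. Deployment and infrastructure staff are left out for simplicity.

```python
def spending_share(safety_heads: float, cap_heads: float, salary: float,
                   safety_compute: float, cap_compute: float) -> float:
    """Safety's share of combined safety + capabilities spending.

    Each researcher costs `salary` plus a per-researcher compute budget.
    """
    safety_spend = safety_heads * (salary + safety_compute)
    cap_spend = cap_heads * (salary + cap_compute)
    return safety_spend / (safety_spend + cap_spend)


# Illustrative headcounts per 100 researchers, within the reported ranges.
SAFETY_HEADS = 12
CAP_HEADS = 70

# HYPOTHETICAL cost units, not from the article.
SALARY = 1.0          # cost of one researcher
SAFETY_COMPUTE = 0.5  # compute per safety researcher
CAP_COMPUTE = 5.0     # compute per capabilities researcher

headcount_share = SAFETY_HEADS / (SAFETY_HEADS + CAP_HEADS)
adjusted_share = spending_share(SAFETY_HEADS, CAP_HEADS, SALARY,
                                SAFETY_COMPUTE, CAP_COMPUTE)

print(f"safety share by headcount: {headcount_share:.1%}")
print(f"safety share by spending:  {adjusted_share:.1%}")
```

Whenever the assumed compute cost per capabilities researcher exceeds that per safety researcher, the spending share comes out below the headcount share. The size of the gap depends entirely on the cost assumptions, which the article does not supply.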
Agent Notes
Why this matters: This speaks to the B1 disconfirmation target I've been looking for across multiple sessions, and the result confirms rather than disconfirms B1's "not being treated as such" component: safety research is 6-15% of headcount while capabilities are 60-75%, and the ratio is deteriorating. The finding bears directly on B1.
What surprised me: The Anthropic result specifically. The lab that presents most publicly as safety-focused has 6-8% of headcount in safety-only research when dual-use work is excluded. The gap between public positioning and internal resource allocation is a specific finding about credible commitment failures.
What I expected but didn't find: Compute-adjusted spending ratios. Headcount ratios understate the capabilities advantage because GPU compute dominates capabilities research. The actual spending gap is likely larger than headcount numbers suggest.
KB connections:
- voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints: the RSP rollback is one instance; the headcount data shows the same structural pattern in resource allocation
- the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it: the resource allocation data is the empirical grounding for this structural claim
- B1 ("AI alignment is the greatest outstanding problem for humanity — not being treated as such"): direct evidence for the "not being treated as such" component
Extraction hints:
- CLAIM CANDIDATE: "Safety research represents 6-15% of frontier lab research headcount with capabilities at 60-75%, and the ratio has declined since 2024 as capabilities teams grow faster than safety teams — providing empirical confirmation that frontier AI development is structurally under-investing in alignment research."
- Separate claim for the Anthropic-specific finding: "Anthropic's internal research allocation shows 6-8% of headcount in safety-only work when dual-use research is excluded, establishing a material gap between public safety positioning and internal resource allocation."
Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it
WHY ARCHIVED: Direct empirical evidence for B1's "not being treated as such" component: the spending allocation data that confirms safety is structurally underfunded relative to capabilities. Multiple sessions have flagged this as a missing empirical anchor.
EXTRACTION HINT: The key claim is about the ratio and its trend (deteriorating). The Anthropic dual-use exclusion finding is a second claim about credible commitment failure. Both are important for B1 and the alignment tax argument. Note the headcount-vs-compute caveat.