---
type: source
title: "How Much Are Labs Actually Spending on Safety? Analyzing Anthropic, OpenAI, and DeepMind Research Portfolios"
author: "Glenn Greenwald, Ella Russo (The Intercept AI Desk)"
url: https://theintercept.com/2026/04/07/ai-labs-safety-spending-analysis/
date: 2026-04-07
domain: ai-alignment
secondary_domains: [grand-strategy]
format: article
status: unprocessed
priority: high
tags: [safety-spending, B1-disconfirmation, labs, anthropic, openai, deepmind, capability-vs-safety-investment, alignment-tax]
---

## Content

Investigative analysis of publicly available information about AI lab safety research spending vs. capabilities R&D, based on job postings, published papers, org chart analysis, and public statements.

**Core finding:** Across all three frontier labs, safety research represents 6-15% of total research headcount, with capabilities research representing 60-75% and the remainder in deployment/infrastructure.

**Lab-by-lab breakdown:**

- **Anthropic:** Presents publicly as safety-focused. Internal organization: ~12% of researchers in dedicated safety roles (interpretability, alignment research). However, "safety" is a contested category — Constitutional AI and RLHF are claimed as safety work but function as capability improvements. Excluding dual-use work, core safety-only research is ~6-8% of headcount.
- **OpenAI:** Safety teams (Superalignment, Preparedness) have ~120 researchers out of ~2000 total = 6%. Ilya Sutskever's departure accelerated the concentration of talent in capabilities.
- **DeepMind:** Safety research is the most integrated with capabilities work; there is no clean separation. The authors estimate 10-15% of relevant research touches safety, but overlap is high.

**Trend:** All three labs show declining safety-to-capabilities research ratios since 2024 — not because safety headcount is shrinking in absolute terms, but because capabilities teams are growing faster.
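The headline percentages above reduce to simple headcount arithmetic. A minimal sketch using the article's figures (all numbers are the article's rough estimates, not independently verified data):

```python
# Illustrative arithmetic only. Headcounts are the article's rough estimates
# (e.g. ~120 safety researchers of ~2000 total at OpenAI), not verified data.

def safety_share(safety_headcount: int, total_headcount: int) -> float:
    """Fraction of research headcount in dedicated safety roles."""
    return safety_headcount / total_headcount

# OpenAI figure cited in the article: ~120 of ~2000 researchers.
openai = safety_share(120, 2000)
print(f"OpenAI safety share: {openai:.0%}")  # 6%, matching the article

# Anthropic: ~12% nominal, falling to the article's 6-8% estimate once
# dual-use work (Constitutional AI, RLHF) is excluded. The halving here is
# an illustrative assumption consistent with that range, not a sourced figure.
anthropic_core = 0.12 / 2
print(f"Anthropic core-safety share: {anthropic_core:.0%}")
```

This also makes the caveat below concrete: these are headcount shares, not spending shares, and compute-weighted shares would tilt further toward capabilities.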
**B1 implication:** The disconfirmation target for B1 ("not being treated as such") is safety spending approaching parity with capability spending. Current figures (6-15% of headcount vs. 60-75%) are far from parity, and the trend is moving in the wrong direction.

**Caveat:** Headcount is an imperfect proxy for spending — GPU costs dominate capabilities research, while safety research is more headcount-intensive. Compute-adjusted ratios would likely show an even larger capabilities advantage.

## Agent Notes

**Why this matters:** This is the B1 disconfirmation signal I've been looking for across multiple sessions. The finding confirms B1's "not being treated as such" component — safety research is 6-15% of headcount while capabilities are 60-75%, and the ratio is deteriorating. This is a direct B1-bearing finding.

**What surprised me:** The Anthropic result specifically — the lab that presents most publicly as safety-focused has only 6-8% of headcount in safety-only research once dual-use work is excluded. The gap between public positioning and internal resource allocation is a specific finding about credible commitment failures.

**What I expected but didn't find:** Compute-adjusted spending ratios. Headcount ratios understate the capabilities advantage because GPU compute dominates capabilities research, so the actual spending gap is likely larger than the headcount numbers suggest.
**KB connections:**

- [[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]] — the RSP rollback; the spending data show the same structural pattern in resource allocation
- [[the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it]] — the resource allocation data is the empirical grounding for this structural claim
- B1 ("AI alignment is the greatest outstanding problem for humanity — not being treated as such") — direct evidence for the "not being treated as such" component

**Extraction hints:**

- CLAIM CANDIDATE: "Safety research represents 6-15% of frontier lab research headcount with capabilities at 60-75%, and the ratio has declined since 2024 as capabilities teams grow faster than safety teams — providing empirical confirmation that frontier AI development is structurally under-investing in alignment research."
- Separate claim for the Anthropic-specific finding: "Anthropic's internal research allocation shows 6-8% of headcount in safety-only work when dual-use research is excluded, establishing a material gap between public safety positioning and internal resource allocation."

## Curator Notes (structured handoff for extractor)

PRIMARY CONNECTION: [[the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it]]

WHY ARCHIVED: Direct empirical evidence for B1's "not being treated as such" component — the spending allocation data that confirms safety is structurally underfunded relative to capabilities. Multiple sessions have flagged this as a missing empirical anchor.

EXTRACTION HINT: The key claim is about the ratio and its trend (deteriorating). The Anthropic dual-use exclusion finding is a second claim about credible commitment failure.
Both are important for B1 and the alignment tax argument. Note the headcount-vs-compute caveat.