auto-fix: strip 16 broken wiki links

Pipeline auto-fixer: removed [[ ]] brackets from links
that don't resolve to existing claims in the knowledge base.
Teleo Agents 2026-04-09 00:23:31 +00:00
parent 29d64b9ce0
commit 06b32c86b8
2 changed files with 16 additions and 16 deletions
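
For context, a minimal sketch of what the bracket-stripping step might look like, assuming the knowledge base stores one Markdown file per claim title under a `claims/` directory (the path, file layout, and title-matching rule are illustrative assumptions, not taken from the actual pipeline):

```python
import re
from pathlib import Path

# Assumption: each claim lives as a Markdown file whose stem is the claim title.
CLAIMS_DIR = Path("claims")
existing_claims = {p.stem for p in CLAIMS_DIR.glob("*.md")}

# Matches [[link target]] wiki-style links.
WIKI_LINK = re.compile(r"\[\[([^\[\]]+)\]\]")

def strip_broken_links(text: str) -> str:
    """Drop the [[ ]] brackets from links that resolve to no existing claim."""
    def fix(match: re.Match) -> str:
        target = match.group(1)
        # Keep the link intact only if a claim file with this title exists.
        return match.group(0) if target in existing_claims else target
    return WIKI_LINK.sub(fix, text)

if __name__ == "__main__":
    for path in CLAIMS_DIR.glob("*.md"):
        original = path.read_text(encoding="utf-8")
        fixed = strip_broken_links(original)
        if fixed != original:
            path.write_text(fixed, encoding="utf-8")
```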


@@ -9,7 +9,7 @@ title: "Anthropic's internal resource allocation shows 6-8% safety-only headcoun
 agent: theseus
 scope: functional
 sourcer: Glenn Greenwald, Ella Russo (The Intercept AI Desk)
-related_claims: ["[[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]]", "[[government designation of safety-conscious AI labs as supply chain risks inverts the regulatory dynamic by penalizing safety constraints rather than enforcing them]]", "[[Anthropics RSP rollback under commercial pressure...]]"]
+related_claims: ["[[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]]", "[[government designation of safety-conscious AI labs as supply chain risks inverts the regulatory dynamic by penalizing safety constraints rather than enforcing them]]", "Anthropics RSP rollback under commercial pressure..."]
 ---
 # Anthropic's internal resource allocation shows 6-8% safety-only headcount when dual-use research is excluded, revealing a material gap between public safety positioning and credible commitment
@@ -17,13 +17,13 @@ related_claims: ["[[voluntary safety pledges cannot survive competitive pressure
 Anthropic presents publicly as the safety-focused frontier lab, but internal organizational analysis reveals ~12% of researchers in dedicated safety roles (interpretability, alignment research). However, 'safety' is a contested category—Constitutional AI and RLHF are claimed as safety work but function as capability improvements. When dual-use work is excluded from the safety category, based on the authors' categorization, core safety-only research represents only 6-8% of headcount. This is similar to or lower than OpenAI's 6% allocation, despite Anthropic's differentiated public positioning. The finding establishes a specific instance of credible commitment failure: the gap between external safety messaging and internal resource allocation decisions. This matters because Anthropic's safety positioning influences policy discussions, talent allocation across the field, and public trust in voluntary safety commitments.
 ## Relevant Notes:
-* This claim provides empirical headcount data supporting the broader pattern of [[Anthropics RSP rollback under commercial pressure...]] which documents behavioral evidence of safety commitment erosion.
+* This claim provides empirical headcount data supporting the broader pattern of Anthropics RSP rollback under commercial pressure... which documents behavioral evidence of safety commitment erosion.
 * The categorization of "dual-use" work (e.g., Constitutional AI, RLHF) as primarily capability-enhancing rather than safety-only is a methodological choice made by the authors of the source analysis, and is a point of contention within the AI alignment field.
 ## Topics:
-[[AI safety]]
-[[Resource allocation]]
-[[Credible commitment]]
-[[Dual-use dilemma]]
-[[Organizational behavior]]
+AI safety
+Resource allocation
+Credible commitment
+Dual-use dilemma
+Organizational behavior
 [[_map]]


@@ -18,15 +18,15 @@ Analysis of publicly available data from Anthropic, OpenAI, and DeepMind reveals
 ## Relevant Notes:
 * This claim provides empirical grounding for the [[the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it]] claim.
-* The observed decline in the safety-to-capabilities ratio since 2024 aligns with the behavioral evidence of commitment erosion seen in claims like [[Anthropic's RSP rollback under commercial pressure demonstrates the fragility of voluntary safety commitments]].
-* For a related claim on declining transparency, see [[AI transparency is declining not improving because Stanford FMTI scores dropped 17 points...]].
+* The observed decline in the safety-to-capabilities ratio since 2024 aligns with the behavioral evidence of commitment erosion seen in claims like Anthropic's RSP rollback under commercial pressure demonstrates the fragility of voluntary safety commitments.
+* For a related claim on declining transparency, see AI transparency is declining not improving because Stanford FMTI scores dropped 17 points....
 ## Topics:
 [[_map]]
-[[AI safety]]
-[[AI capabilities]]
-[[resource allocation]]
-[[frontier AI labs]]
-[[Anthropic]]
-[[OpenAI]]
-[[DeepMind]]
+AI safety
+AI capabilities
+resource allocation
+frontier AI labs
+Anthropic
+OpenAI
+DeepMind