---
type: claim
domain: ai-alignment
description: "Anthropic abandoned its binding Responsible Scaling Policy in February 2026, replacing it with a nonbinding framework — the strongest real-world evidence that voluntary safety commitments are structurally unstable"
confidence: likely
source: "CNN, Fortune, Anthropic announcements (Feb 2026); theseus AI industry landscape research (Mar 2026)"
created: 2026-03-16
---

# Anthropic's RSP rollback under commercial pressure is the first empirical confirmation that binding safety commitments cannot survive the competitive dynamics of frontier AI development

In February 2026, Anthropic — the lab most associated with AI safety — abandoned its binding Responsible Scaling Policy (RSP) in favor of a nonbinding safety framework. This occurred during the same month the company raised $30B at a $380B valuation and reported $19B annualized revenue, with 10x year-over-year growth sustained for three consecutive years.

The timing is the evidence. The RSP was not rolled back because Anthropic's leadership stopped believing in safety: CEO Dario Amodei publicly told 60 Minutes that AI "should be more heavily regulated" and said he was "deeply uncomfortable with these decisions being made by a few companies." The rollback occurred because the competitive landscape made binding commitments structurally costly:

- OpenAI raised $110B in the same month, with GPT-5.2 crossing 90% on ARC-AGI-1 Verified
- xAI raised $20B in January 2026, with 1M+ H100 GPUs and no comparable safety commitments
- Anthropic's own enterprise market share (40%, surpassing OpenAI) depended on capability parity

This is not a story about Anthropic's leadership failing. It is the empirical confirmation of [[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]]. The prediction in that claim — that unilateral safety commitments are structurally punished — is exactly what happened. Anthropic's binding RSP was the strongest voluntary safety commitment any frontier lab had made, and it lasted roughly two years before competitive dynamics forced its relaxation.

The alignment implication is structural: if the most safety-motivated lab, with the most commercially successful safety brand, cannot maintain binding safety commitments, then voluntary self-regulation is not a viable alignment strategy. This strengthens the case for coordination-based approaches — [[AI alignment is a coordination problem not a technical problem]] — because the failure mode is not that safety is technically impossible but that unilateral safety is economically unsustainable.

---

Relevant Notes:

- [[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]] — the RSP rollback is the empirical confirmation
- [[AI alignment is a coordination problem not a technical problem]] — voluntary commitments fail; coordination mechanisms might not
- [[the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it]] — the RSP was the most visible alignment tax; it proved too expensive
- [[safe AI development requires building alignment mechanisms before scaling capability]] — Anthropic's trajectory shows scaling won the race

Topics:

- [[_map]]