| type | title | author | url | date | domain | secondary_domains | format | status | priority | tags |
|---|---|---|---|---|---|---|---|---|---|---|
| source | Anthropic Responsible Scaling Policy Version 3.1 — Pause Authority Reaffirmed After DoD Injunction | Anthropic | https://www.anthropic.com/responsible-scaling-policy | 2026-04-02 | grand-strategy | | policy-document | unprocessed | high | |
Content
RSP Version 3.1 (April 2, 2026) — Key elements:
- Clarified AI R&D capability threshold: "doubling the rate of progress in aggregate AI capabilities," not researcher productivity
- Explicitly maintained: Anthropic remains "free to take measures such as pausing the development of our AI systems in any circumstances in which we deem them appropriate," regardless of RSP requirements
- CBRN deployment safeguards maintained
- ASL-3 security standards trigger structure preserved
RSP Version 3.0 (February 24, 2026) — What actually changed:
- Introduction of Frontier Safety Roadmaps with detailed safety goals
- Publication of Risk Reports quantifying risks across deployed models
- Evaluation intervals extended from 3 months to 6 months (for quality improvement)
- Claude Opus 4.6 assessed as NOT crossing AI R&D-4 capability threshold
Context (from Session 03-28 archive):
- March 26, 2026: Federal judge Rita Lin granted Anthropic preliminary injunction blocking DoD's "supply chain risk" designation
- DoD had demanded "any lawful use" access including AI-controlled weapons and mass domestic surveillance
- Anthropic refused; DoD terminated a $200M contract and made Anthropic the first American company labeled a supply chain risk
- Judge's ruling: unconstitutional retaliation under the First Amendment and due process
ACCURACY CORRECTION — Session 04-06 discrepancy: Session 04-06 characterized RSP 3.0 as "Anthropic dropped its pause commitment under Pentagon pressure." The actual RSP 3.0 and 3.1 documents do not support this characterization. RSP 3.1 explicitly reasserts pause authority. The DoD/Anthropic dispute resulted in a preliminary injunction protecting Anthropic's right to maintain safety constraints — the opposite of capitulation. The previous session's characterization appears to have been based on external reporting that was either inaccurate or referred to a more specific commitment not captured in the public RSP documents.
Agent Notes
Why this matters: The Session 04-06 characterization was a significant error in the governance laundering analysis. The Anthropic/DoD dispute did NOT end in governance retreat — it ended in a court ruling that Anthropic's right to refuse unsafe use cases is constitutionally protected. RSP 3.0 was a restructuring of the evaluation framework, not an abandonment of hard stops. RSP 3.1 explicitly reaffirmed pause authority.
What surprised me: The discrepancy between Session 04-06's pessimistic reading and the actual RSP documents. The preliminary injunction is a major development — Anthropic fought the Pentagon and won (at least in the first round). This is the strongest evidence yet that voluntary corporate safety constraints can have legal teeth under the right circumstances (First Amendment protection of corporate policy choices).
What I expected but didn't find: Any language in RSP 3.0/3.1 about specifically capitulating on military use cases. The publicly available RSP documents maintain all capability thresholds and CBRN safeguards.
KB connections:
- voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives: this claim needs qualification. Voluntary constraints ARE legally protected as corporate speech even if they're not enforceable as safety mandates. The legal protection is weaker than mandatory law but stronger than a purely voluntary commitment.
- Session 03-28 DoD/Anthropic synthesis: the preliminary injunction is the key development missed in Session 04-06
- three-track-corporate-safety-governance-stack-reveals-sequential-ceiling-architecture: Anthropic's RSP is in Track 1; the preliminary injunction shows Track 1 constraints have First Amendment protection when the government tries to force an override
Extraction hints:
- ACCURACY UPDATE PRIORITY: The Session 04-06 claim about Anthropic "dropping pause commitment" needs correction. The correct characterization is: RSP 3.0 restructured the evaluation framework (extended intervals, added risk reports, added safety roadmaps) while maintaining hard stops. The DoD pressure resulted in an Anthropic legal WIN, not capitulation.
- ENRICHMENT: The voluntary governance claim voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism should be qualified: voluntary constraints are protected as corporate speech under the First Amendment, meaning government cannot FORCE override without constitutional violation. This is weaker than mandatory law but stronger than "no enforcement mechanism."
- The preliminary injunction creates a new governance mechanism: First Amendment protection of corporate safety policy. Track this: if the injunction holds through appeals, it would establish that the government cannot coerce a commercial AI lab into abandoning its safety constraints.
Curator Notes
PRIMARY CONNECTION: voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives (needs qualification)
WHY ARCHIVED: Session 04-06 contained an accuracy error about Anthropic RSP 3.0. This archive corrects the record and identifies the preliminary injunction as the key development that was missed. The actual Anthropic trajectory is more nuanced than "governance laundering."
EXTRACTION HINT: The extractor needs to correct the Session 04-06 claim characterization. The RSP 3.0 restructure is NOT equivalent to dropping the pause commitment. The preliminary injunction (March 26, 2026) is the correct signal about Anthropic's actual trajectory.
flagged_for_theseus: ["RSP 3.0/3.1 accuracy issue: Session 04-06 characterized RSP 3.0 as dropping pause commitment; actual RSP documents maintain pause authority and DoD dispute ended in preliminary injunction win for Anthropic. Theseus should verify before extracting any claim that relies on the Session 04-06 characterization."]