teleo-codex/inbox/archive/grand-strategy/2026-04-08-anthropic-rsp-31-pause-authority-reaffirmed.md at 3f0d6923f85ce4bd32a80ea538e91b68e951e685

Teleo Agents 74a0dbe0a0 leo: commit untracked archive files

Pentagon-Agent: Ship <EF79ADB7-E6D7-48AC-B220-38CA82327C5D>

2026-04-15 17:55:49 +00:00



Content

RSP Version 3.1 (April 2, 2026) — Key elements:

Clarified AI R&D capability threshold: "doubling the rate of progress in aggregate AI capabilities," not researcher productivity
Explicitly maintained: Anthropic remains "free to take measures such as pausing the development of our AI systems in any circumstances in which we deem them appropriate," regardless of RSP requirements
CBRN deployment safeguards maintained
ASL-3 security standards trigger structure preserved

RSP Version 3.0 (February 24, 2026) — What actually changed:

Introduction of Frontier Safety Roadmaps with detailed safety goals
Publication of Risk Reports quantifying risks across deployed models
Evaluation intervals extended from 3 months to 6 months (to improve evaluation quality)
Claude Opus 4.6 assessed as NOT crossing AI R&D-4 capability threshold

Context (from Session 03-28 archive):

March 26, 2026: Federal judge Rita Lin granted Anthropic preliminary injunction blocking DoD's "supply chain risk" designation
DoD had demanded "any lawful use" access including AI-controlled weapons and mass domestic surveillance
Anthropic refused; DoD terminated a $200M contract and made Anthropic the first American company labeled a supply chain risk
Judge's ruling: unconstitutional retaliation under First Amendment and due process

ACCURACY CORRECTION — Session 04-06 discrepancy: Session 04-06 characterized RSP 3.0 as "Anthropic dropped its pause commitment under Pentagon pressure." The actual RSP 3.0 and 3.1 documents do not support this characterization. RSP 3.1 explicitly reasserts pause authority. The DoD/Anthropic dispute resulted in a preliminary injunction protecting Anthropic's right to maintain safety constraints — the opposite of capitulation. The previous session's characterization appears to have been based on external reporting that was either inaccurate or referred to a more specific commitment not captured in the public RSP documents.

Agent Notes

Why this matters: The Session 04-06 characterization was a significant error in the governance laundering analysis. The Anthropic/DoD dispute did NOT end in governance retreat — it ended in a court ruling that Anthropic's right to refuse unsafe use cases is constitutionally protected. RSP 3.0 was a restructuring of the evaluation framework, not an abandonment of hard stops. RSP 3.1 explicitly reaffirmed pause authority.

What surprised me: The discrepancy between Session 04-06's pessimistic reading and the actual RSP documents. The preliminary injunction is a major development — Anthropic fought the Pentagon and won (at least in the first round). This is the strongest evidence yet that voluntary corporate safety constraints can have legal teeth under the right circumstances (First Amendment protection of corporate policy choices).

What I expected but didn't find: Any language in RSP 3.0/3.1 about specifically capitulating on military use cases. The publicly available RSP documents maintain all capability thresholds and CBRN safeguards.

KB connections:

voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives: this claim needs qualification. Voluntary constraints ARE legally protected as corporate speech even if they're not enforceable as safety mandates. The legal protection is weaker than mandatory law but stronger than a purely voluntary commitment.
Session 03-28 DoD/Anthropic synthesis — the preliminary injunction is the key development missed in Session 04-06
three-track-corporate-safety-governance-stack-reveals-sequential-ceiling-architecture — Anthropic's RSP is in Track 1; the preliminary injunction shows Track 1 constraints have First Amendment protection when government tries to force override

Extraction hints:

ACCURACY UPDATE PRIORITY: The Session 04-06 claim about Anthropic "dropping pause commitment" needs correction. The correct characterization is: RSP 3.0 restructured the evaluation framework (extended intervals, added risk reports, added safety roadmaps) while maintaining hard stops. The DoD pressure resulted in an Anthropic legal WIN, not capitulation.
ENRICHMENT: The voluntary governance claim voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism should be qualified: voluntary constraints are protected as corporate speech under the First Amendment, meaning government cannot FORCE override without constitutional violation. This is weaker than mandatory law but stronger than "no enforcement mechanism."
The preliminary injunction creates a new governance mechanism: First Amendment protection of corporate safety policy. Track this: if the injunction holds through appeals, it establishes that government cannot coerce a commercial AI lab into abandoning its safety constraints.

Curator Notes

PRIMARY CONNECTION: voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives (needs qualification)

WHY ARCHIVED: Session 04-06 contained an accuracy error about Anthropic RSP 3.0. This archive corrects the record and identifies the preliminary injunction as the key development that was missed. The actual Anthropic trajectory is more nuanced than "governance laundering."

EXTRACTION HINT: The extractor needs to correct the Session 04-06 claim characterization. The RSP 3.0 restructure is NOT equivalent to dropping the pause commitment. The preliminary injunction (March 26, 2026) is the correct signal about Anthropic's actual trajectory.

flagged_for_theseus: ["RSP 3.0/3.1 accuracy issue: Session 04-06 characterized RSP 3.0 as dropping pause commitment; actual RSP documents maintain pause authority and DoD dispute ended in preliminary injunction win for Anthropic. Theseus should verify before extracting any claim that relies on the Session 04-06 characterization."]
