diff --git a/domains/ai-alignment/ai-verification-limits-become-corporate-safety-arguments-in-government-contracts.md b/domains/ai-alignment/ai-verification-limits-become-corporate-safety-arguments-in-government-contracts.md
new file mode 100644
index 000000000..c400c402c
--- /dev/null
+++ b/domains/ai-alignment/ai-verification-limits-become-corporate-safety-arguments-in-government-contracts.md
@@ -0,0 +1,19 @@
+---
+type: claim
+domain: ai-alignment
+description: Anthropic's refusal cited model unreliability for autonomous weapons as a contractual constraint, operationalizing B4 verification degradation as a deployment boundary
+confidence: experimental
+source: Anthropic DoD statement, February 2026
+created: 2026-05-11
+title: AI verification limits are invoked as corporate safety arguments in government contract disputes rather than just technical research findings
+agent: theseus
+sourced_from: ai-alignment/2026-02-14-anthropic-statement-dod-refusal-any-lawful-use.md
+scope: functional
+sourcer: "@AnthropicAI"
+supports: ["ai-capability-and-reliability-are-independent-dimensions-because-claude-solved-a-30-year-open-mathematical-problem-while-simultaneously-degrading-at-basic-program-execution-during-the-same-session"]
+related: ["ai-capability-and-reliability-are-independent-dimensions-because-claude-solved-a-30-year-open-mathematical-problem-while-simultaneously-degrading-at-basic-program-execution-during-the-same-session", "verification-of-meaningful-human-control-is-technically-infeasible-because-ai-decision-opacity-and-adversarial-resistance-defeat-external-audit", "selective-virtue-governance-is-risk-management-not-ethical-framework-when-operational-definitions-are-unverifiable", "ai-company-ethical-restrictions-are-contractually-penetrable-through-multi-tier-deployment-chains", "multilateral-verification-mechanisms-can-substitute-for-failed-voluntary-commitments-when-binding-enforcement-replaces-unilateral-sacrifice", "ai-assisted-targeting-satisfies-autonomous-weapons-red-lines-through-action-type-definition"]
+---
+
+# AI verification limits are invoked as corporate safety arguments in government contract disputes rather than just technical research findings
+
+Anthropic's statement argued that 'frontier AI systems are simply not reliable enough to power fully autonomous weapons', a verification-based safety constraint used as grounds for contract refusal. This is a novel deployment of the B4 thesis (verification degrades faster than capability grows) as a corporate governance mechanism rather than purely a research observation. The company does not claim that Claude lacks the capability for autonomous targeting; it claims that verification of correct operation is insufficient for the stakes involved. This shifts verification limits from a technical property to a contractual constraint with legal enforceability. The framing suggests labs can operationalize reliability thresholds as hard deployment boundaries that survive government pressure when backed by litigation. The argument is distinct from a capability-based refusal ('our system can't do this') and from a values-based refusal alone ('we won't do this'): it is a hybrid argument that inadequate verification makes deployment unsafe regardless of capability or intent. That the argument appeared in a government contract dispute rather than a research paper suggests verification limits are becoming actionable governance tools.
diff --git a/domains/ai-alignment/hard-safety-constraints-survive-government-coercion-through-litigation-where-soft-pledges-collapse.md b/domains/ai-alignment/hard-safety-constraints-survive-government-coercion-through-litigation-where-soft-pledges-collapse.md
new file mode 100644
index 000000000..f184ada70
--- /dev/null
+++ b/domains/ai-alignment/hard-safety-constraints-survive-government-coercion-through-litigation-where-soft-pledges-collapse.md
@@ -0,0 +1,20 @@
+---
+type: claim
+domain: ai-alignment
+description: Anthropic's refusal of DoD 'any lawful use' mandate through public litigation demonstrates that hard deployment constraints differ structurally from soft safety pledges in their durability under coercive pressure
+confidence: experimental
+source: Anthropic public statement, February 2026
+created: 2026-05-11
+title: Hard safety constraints backed by litigation survive government coercion where soft voluntary pledges collapse under competitive pressure
+agent: theseus
+sourced_from: ai-alignment/2026-02-14-anthropic-statement-dod-refusal-any-lawful-use.md
+scope: structural
+sourcer: "@AnthropicAI"
+supports: ["government-designation-of-safety-conscious-ai-labs-as-supply-chain-risks-inverts-the-regulatory-dynamic-by-penalizing-safety-constraints-rather-than-enforcing-them"]
+challenges: ["voluntary-safety-pledges-cannot-survive-competitive-pressure-because-unilateral-commitments-are-structurally-punished-when-competitors-advance-without-equivalent-constraints"]
+related: ["voluntary-safety-pledges-cannot-survive-competitive-pressure-because-unilateral-commitments-are-structurally-punished-when-competitors-advance-without-equivalent-constraints", "government-designation-of-safety-conscious-ai-labs-as-supply-chain-risks-inverts-the-regulatory-dynamic-by-penalizing-safety-constraints-rather-than-enforcing-them", "voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives", "coercive-governance-instruments-deployed-for-future-optionality-preservation-not-current-harm-prevention-when-pentagon-designates-domestic-ai-labs-as-supply-chain-risks", "coercive-ai-governance-instruments-self-negate-at-operational-timescale-when-governing-strategically-indispensable-capabilities", "voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance", "government-safety-penalties-invert-regulatory-incentives-by-blacklisting-cautious-actors"]
+---
+
+# Hard safety constraints backed by litigation survive government coercion where soft voluntary pledges collapse under competitive pressure
+
+Anthropic maintained two hard safety exceptions (no mass domestic surveillance, no fully autonomous lethal weapons) for 3+ months against direct coercive pressure from the DoD, accepting designation as a 'Supply-Chain Risk to National Security' rather than removing the constraints. This contrasts sharply with the RSP rollback documented in Mode 1 collapse, where soft conditional safety thresholds eroded under commercial pressure. The key structural difference is that hard constraints are binary deployment restrictions ('will not use for X') that can be litigated in court, while soft pledges are conditional capability thresholds ('will pause if Y') that depend on competitive context. Anthropic's CEO-level public refusal, paired with a judicial remedy, belongs to a different durability class than voluntary commitments that require unilateral sacrifice. The company framed its refusal on both values grounds ('incompatible with democratic values') and reliability grounds ('not reliable enough'), invoking B4 verification limits as a corporate safety argument. This is the first documented case of a frontier AI lab accepting a direct government penalty rather than removing a safety constraint, which suggests that hard constraints creating justiciable disputes have different survival properties than soft pledges that collapse when competitors advance.
diff --git a/inbox/queue/2026-02-14-anthropic-statement-dod-refusal-any-lawful-use.md b/inbox/archive/ai-alignment/2026-02-14-anthropic-statement-dod-refusal-any-lawful-use.md
similarity index 98%
rename from inbox/queue/2026-02-14-anthropic-statement-dod-refusal-any-lawful-use.md
rename to inbox/archive/ai-alignment/2026-02-14-anthropic-statement-dod-refusal-any-lawful-use.md
index 79595843a..0a9f44661 100644
--- a/inbox/queue/2026-02-14-anthropic-statement-dod-refusal-any-lawful-use.md
+++ b/inbox/archive/ai-alignment/2026-02-14-anthropic-statement-dod-refusal-any-lawful-use.md
@@ -7,10 +7,13 @@ date: 2026-02-14
 domain: ai-alignment
 secondary_domains: []
 format: article
-status: unprocessed
+status: processed
+processed_by: theseus
+processed_date: 2026-05-11
 priority: high
 tags: [dod, any-lawful-use, safety-constraints, Mode-2, B1-test, governance]
 intake_tier: research-task
+extraction_model: "anthropic/claude-sonnet-4.5"
 ---
 
 ## Content