theseus: extract claims from 2026-01-29-metr-frontier-ai-safety-regulations-reference
Some checks failed
Mirror PR to Forgejo / mirror (pull_request) Has been cancelled
- Source: inbox/queue/2026-01-29-metr-frontier-ai-safety-regulations-reference.md
- Domain: ai-alignment
- Claims: 0, Entities: 0
- Enrichments: 4
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
This commit is contained in:
parent
a4e629a4e6
commit
dfb453ab28
3 changed files with 19 additions and 2 deletions
@@ -31,3 +31,10 @@ Apollo's deception probe work represents one of the few non-behavioral evaluatio
**Source:** Theseus EU AI Act compliance analysis, synthesizing Santos-Grueiro architecture findings with EU regulatory framework
EU AI Act GPAI compliance documentation (in force August 2025) maps conformity requirements onto behavioral evaluation pipelines (red-teaming, capability evaluations, safety benchmarking, RLHF). Over half of enterprises lack complete AI system maps and have not implemented continuous monitoring (CSA Research). Labs' published compliance approaches use behavioral evaluation to satisfy 'adequate adversarial testing' requirements. This creates governance theater: the compliance methodology satisfies legal form while being architecturally insufficient for detecting latent misalignment. Even if enforcement proceeds in August 2026 (Path B), national market surveillance authorities would likely accept behavioral evaluation as adequate, since the law specifies no alternative methodology. Both enforcement paths therefore produce governance theater: Path A (Omnibus deferral) removes the test; Path B (August 2026 enforcement) validates an insufficient methodology.
## Supporting Evidence
**Source:** METR Frontier AI Safety Regulations Reference, January 2026
METR's regulatory reference identifies a 'key gap' where the three regulatory regimes (EU GPAI, California SB 53, NY RAISE) together cover evaluation requirements but 'leave the translation from research evaluations to mandatory compliance requirements incomplete.' METR's own evaluations (BashArena, monitoring evasion measurements) are not in the mandatory compliance pipeline.
@@ -11,7 +11,7 @@ attribution:
sourcer:
- handle: "the-intercept"
context: "The Intercept analysis of OpenAI Pentagon contract, March 2026"
related: ["government-safety-penalties-invert-regulatory-incentives-by-blacklisting-cautious-actors", "voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance", "multilateral-verification-mechanisms-can-substitute-for-failed-voluntary-commitments-when-binding-enforcement-replaces-unilateral-sacrifice", "commercial-contract-governance-exhibits-form-substance-divergence-through-statutory-authority-preservation", "military-ai-contract-language-any-lawful-use-creates-surveillance-loophole-through-statutory-permission-structure", "voluntary-safety-constraints-without-enforcement-are-statements-of-intent-not-binding-governance"]
related: ["government-safety-penalties-invert-regulatory-incentives-by-blacklisting-cautious-actors", "voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance", "multilateral-verification-mechanisms-can-substitute-for-failed-voluntary-commitments-when-binding-enforcement-replaces-unilateral-sacrifice", "commercial-contract-governance-exhibits-form-substance-divergence-through-statutory-authority-preservation", "military-ai-contract-language-any-lawful-use-creates-surveillance-loophole-through-statutory-permission-structure", "voluntary-safety-constraints-without-enforcement-are-statements-of-intent-not-binding-governance", "trust-based-safety-guarantees-fail-architecturally-in-classified-deployments"]
reweave_edges: ["government-safety-penalties-invert-regulatory-incentives-by-blacklisting-cautious-actors|related|2026-03-31", "cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation|supports|2026-04-03", "multilateral-verification-mechanisms-can-substitute-for-failed-voluntary-commitments-when-binding-enforcement-replaces-unilateral-sacrifice|supports|2026-04-03", "Voluntary AI safety constraints are protected as corporate speech but unenforceable as safety requirements, creating legal mechanism gap when primary demand-side actor seeks safety-unconstrained providers|supports|2026-04-20", "Commercial contract governance of military AI produces form-substance divergence through statutory authority preservation that voluntary amendments cannot override|supports|2026-04-24", "Voluntary AI safety red lines without constitutional protection are structurally equivalent to no red lines because both depend on trust and lack external enforcement mechanisms|supports|2026-04-24", "Advisory safety guardrails on AI systems deployed to air-gapped classified networks are unenforceable by design because vendors cannot monitor queries, outputs, or downstream decisions|supports|2026-04-29"]
supports: ["cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation", "multilateral-verification-mechanisms-can-substitute-for-failed-voluntary-commitments-when-binding-enforcement-replaces-unilateral-sacrifice", "Voluntary AI safety constraints are protected as corporate speech but unenforceable as safety requirements, creating legal mechanism gap when primary demand-side actor seeks safety-unconstrained providers", "Commercial contract governance of military AI produces form-substance divergence through statutory authority preservation that voluntary amendments cannot override", "Voluntary AI safety red lines without constitutional protection are structurally equivalent to no red lines because both depend on trust and lack external enforcement mechanisms", "Advisory safety guardrails on AI systems deployed to air-gapped classified networks are unenforceable by design because vendors cannot monitor queries, outputs, or downstream decisions"]
---
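The `reweave_edges` entries above are pipe-delimited `slug|relation|date` triples. A minimal sketch of parsing them into structured records — the type and field names (`ReweaveEdge`, `target`, `relation`, `added`) are illustrative assumptions, not names from the pipeline code:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ReweaveEdge:
    target: str    # claim slug or full claim title
    relation: str  # e.g. "related", "supports"
    added: date    # date the edge was woven

def parse_reweave_edge(entry: str) -> ReweaveEdge:
    # Split on the last two pipes only, so targets that are full
    # claim titles (which may contain commas or spaces) stay intact.
    target, relation, added = entry.rsplit("|", 2)
    return ReweaveEdge(target=target,
                       relation=relation,
                       added=date.fromisoformat(added))

edge = parse_reweave_edge(
    "government-safety-penalties-invert-regulatory-incentives"
    "-by-blacklisting-cautious-actors|related|2026-03-31"
)
# edge.relation is "related"; edge.added is date(2026, 3, 31)
```

Splitting from the right (`rsplit`) rather than the left matters here because some entries use prose titles as the target, and only the trailing two fields are guaranteed pipe-free.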
@@ -35,3 +35,10 @@ Topics:
**Source:** Hassett statement May 6, 2026; CAISI voluntary program expansion
The White House AI EO represents a shift from voluntary commitments (CAISI voluntary program with Google DeepMind, Microsoft, xAI) to mandatory pre-release review, but the review mechanism is scoped to cybersecurity rather than alignment. The EO creates binding enforcement infrastructure but applies it to the wrong problem domain, demonstrating that mandatory governance without correct scope is still governance theater.
## Supporting Evidence
**Source:** METR Frontier AI Safety Regulations Reference, January 2026
California SB 53 makes external evaluation voluntary (not mandatory) and accepts ISO/IEC 42001 as compliance evidence. METR's reference document identifies this as a 'self-reporting architecture' and notes that 'voluntary third-party evaluation and ISO/IEC 42001 acceptance both identified in prior Sessions as inadequate.'
@@ -7,10 +7,13 @@ date: 2026-01-29
domain: ai-alignment
secondary_domains: []
format: article
status: unprocessed
status: processed
processed_by: theseus
processed_date: 2026-05-11
priority: medium
tags: [metr, frontier-ai, safety-regulations, eu-ai-act, gpai, california-sb53, new-york-raise, regulatory-reference]
intake_tier: research-task
extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content