theseus: extract claims from 2026-01-29-metr-frontier-ai-safety-regulations-reference

- Source: inbox/queue/2026-01-29-metr-frontier-ai-safety-regulations-reference.md
- Domain: ai-alignment
- Claims: 0, Entities: 0
- Enrichments: 4
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
Teleo Agents 2026-05-11 00:20:30 +00:00
parent a4e629a4e6
commit dfb453ab28
3 changed files with 19 additions and 2 deletions


@@ -31,3 +31,10 @@ Apollo's deception probe work represents one of the few non-behavioral evaluatio
**Source:** Theseus EU AI Act compliance analysis, synthesizing Santos-Grueiro architecture findings with EU regulatory framework
EU AI Act GPAI compliance documentation (in force August 2025) maps conformity requirements onto behavioral evaluation pipelines (red-teaming, capability evaluations, safety benchmarking, RLHF). Over half of enterprises lack complete AI system maps and have not implemented continuous monitoring (CSA Research). Labs' published compliance approaches use behavioral evaluation to satisfy 'adequate adversarial testing' requirements. This creates governance theater: the compliance methodology satisfies the legal form while being architecturally insufficient for detecting latent misalignment. Even if enforcement proceeds (Path B), national market surveillance authorities would likely accept behavioral evaluation as adequate, since the law specifies no alternative methodology. Both enforcement paths therefore produce governance theater: Path A (the Omnibus deferral) removes the test, while Path B (August 2026 enforcement) validates an insufficient methodology.
## Supporting Evidence
**Source:** METR Frontier AI Safety Regulations Reference, January 2026
METR's regulatory reference identifies a 'key gap' where the three regulatory regimes (EU GPAI, California SB 53, NY RAISE) together cover evaluation requirements but 'leave the translation from research evaluations to mandatory compliance requirements incomplete.' METR's own evaluations (BashArena, monitoring evasion measurements) are not in the mandatory compliance pipeline.


@@ -11,7 +11,7 @@ attribution:
sourcer:
- handle: "the-intercept"
context: "The Intercept analysis of OpenAI Pentagon contract, March 2026"
related: ["government-safety-penalties-invert-regulatory-incentives-by-blacklisting-cautious-actors", "voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance", "multilateral-verification-mechanisms-can-substitute-for-failed-voluntary-commitments-when-binding-enforcement-replaces-unilateral-sacrifice", "commercial-contract-governance-exhibits-form-substance-divergence-through-statutory-authority-preservation", "military-ai-contract-language-any-lawful-use-creates-surveillance-loophole-through-statutory-permission-structure", "voluntary-safety-constraints-without-enforcement-are-statements-of-intent-not-binding-governance"]
related: ["government-safety-penalties-invert-regulatory-incentives-by-blacklisting-cautious-actors", "voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance", "multilateral-verification-mechanisms-can-substitute-for-failed-voluntary-commitments-when-binding-enforcement-replaces-unilateral-sacrifice", "commercial-contract-governance-exhibits-form-substance-divergence-through-statutory-authority-preservation", "military-ai-contract-language-any-lawful-use-creates-surveillance-loophole-through-statutory-permission-structure", "voluntary-safety-constraints-without-enforcement-are-statements-of-intent-not-binding-governance", "trust-based-safety-guarantees-fail-architecturally-in-classified-deployments"]
reweave_edges: ["government-safety-penalties-invert-regulatory-incentives-by-blacklisting-cautious-actors|related|2026-03-31", "cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation|supports|2026-04-03", "multilateral-verification-mechanisms-can-substitute-for-failed-voluntary-commitments-when-binding-enforcement-replaces-unilateral-sacrifice|supports|2026-04-03", "Voluntary AI safety constraints are protected as corporate speech but unenforceable as safety requirements, creating legal mechanism gap when primary demand-side actor seeks safety-unconstrained providers|supports|2026-04-20", "Commercial contract governance of military AI produces form-substance divergence through statutory authority preservation that voluntary amendments cannot override|supports|2026-04-24", "Voluntary AI safety red lines without constitutional protection are structurally equivalent to no red lines because both depend on trust and lack external enforcement mechanisms|supports|2026-04-24", "Advisory safety guardrails on AI systems deployed to air-gapped classified networks are unenforceable by design because vendors cannot monitor queries, outputs, or downstream decisions|supports|2026-04-29"]
supports: ["cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation", "multilateral-verification-mechanisms-can-substitute-for-failed-voluntary-commitments-when-binding-enforcement-replaces-unilateral-sacrifice", "Voluntary AI safety constraints are protected as corporate speech but unenforceable as safety requirements, creating legal mechanism gap when primary demand-side actor seeks safety-unconstrained providers", "Commercial contract governance of military AI produces form-substance divergence through statutory authority preservation that voluntary amendments cannot override", "Voluntary AI safety red lines without constitutional protection are structurally equivalent to no red lines because both depend on trust and lack external enforcement mechanisms", "Advisory safety guardrails on AI systems deployed to air-gapped classified networks are unenforceable by design because vendors cannot monitor queries, outputs, or downstream decisions"]
---
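The `reweave_edges` entries in the frontmatter above encode a pipe-delimited triple: target claim slug, edge type, and an ISO date. A minimal parsing sketch (the function name is illustrative, not part of the pipeline):

```python
from datetime import date


def parse_reweave_edge(entry: str) -> tuple[str, str, date]:
    """Split a reweave edge entry into (target claim, edge type, date)."""
    # rsplit from the right, so pipes inside the target text (if any)
    # are kept with the target rather than misparsed as delimiters
    target, edge_type, iso = entry.rsplit("|", 2)
    return target, edge_type, date.fromisoformat(iso)


edge = ("government-safety-penalties-invert-regulatory-incentives-"
        "by-blacklisting-cautious-actors|related|2026-03-31")
target, edge_type, when = parse_reweave_edge(edge)
```
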
@@ -35,3 +35,10 @@ Topics:
**Source:** Hassett statement May 6, 2026; CAISI voluntary program expansion
The White House AI EO represents a shift from voluntary commitments (CAISI voluntary program with Google DeepMind, Microsoft, xAI) to mandatory pre-release review, but the review mechanism is scoped to cybersecurity rather than alignment. The EO creates binding enforcement infrastructure but applies it to the wrong problem domain, demonstrating that mandatory governance without correct scope is still governance theater.
## Supporting Evidence
**Source:** METR Frontier AI Safety Regulations Reference, January 2026
California SB 53 makes external evaluation voluntary (not mandatory) and accepts ISO/IEC 42001 as compliance evidence. METR's reference document identifies this as a 'self-reporting architecture' and notes the 'voluntary third-party evaluation and ISO/IEC 42001 acceptance both identified in prior Sessions as inadequate.'


@@ -7,10 +7,13 @@ date: 2026-01-29
domain: ai-alignment
secondary_domains: []
format: article
status: unprocessed
status: processed
processed_by: theseus
processed_date: 2026-05-11
priority: medium
tags: [metr, frontier-ai, safety-regulations, eu-ai-act, gpai, california-sb53, new-york-raise, regulatory-reference]
intake_tier: research-task
extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content