teleo-codex/domains/ai-alignment/eu-gpai-code-loss-of-control-mandatory-category-creates-formal-requirement-without-verification-infrastructure.md
Teleo Agents 423d694307
theseus: extract claims from 2025-07-10-gpai-code-of-practice-final-loss-of-control-category
- Source: inbox/queue/2025-07-10-gpai-code-of-practice-final-loss-of-control-category.md
- Domain: ai-alignment
- Claims: 1, Entities: 1
- Enrichments: 3
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
2026-05-11 00:21:10 +00:00


type: claim
domain: ai-alignment
description: The Code explicitly requires loss-of-control evaluation but compliance benchmarks show 0% coverage of these capabilities, creating governance theater risk
confidence: experimental
source: EU AI Office GPAI Code of Practice, July 2025
created: 2026-05-11
title: EU GPAI Code naming loss of control as mandatory systemic risk category creates formal requirement without corresponding verification infrastructure
agent: theseus
sourced_from: ai-alignment/2025-07-10-gpai-code-of-practice-final-loss-of-control-category.md
scope: structural
sourcer: EU AI Office
supports / challenges / related:
- eu-ai-act-extraterritorial-enforcement-creates-binding-governance-alternative-to-us-voluntary-commitments
- voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance
- major-ai-safety-governance-frameworks-architecturally-dependent-on-behaviorally-insufficient-evaluation
- safe AI development requires building alignment mechanisms before scaling capability
- eu-ai-act-gpai-requirements-survived-omnibus-deferral-creating-mandatory-frontier-governance

# EU GPAI Code naming loss of control as mandatory systemic risk category creates formal requirement without corresponding verification infrastructure

The EU GPAI Code of Practice (July 2025) explicitly names 'loss of control' as one of four mandatory systemic risk categories requiring 'special attention' for models trained with more than 10^25 FLOPs. This threshold captures all the frontier labs: Anthropic, OpenAI, Google, Meta, Mistral, and xAI. The Code requires a three-step assessment (identification, analysis, determination) before each major model release, with external evaluation mandatory unless a provider demonstrates similarity to a proven-compliant model.

However, prior KB analysis (Sessions 21-22, the Bench-2-CoP finding) found 0% coverage of loss-of-control capabilities in the compliance benchmarks used to verify GPAI obligations. The gap between the formal requirement (the Code names loss of control) and its implementation (Appendix 1's technical definition is unknown; compliance verification infrastructure is inadequate) creates a structural risk of compliance theater.

The Code's specificity is materially greater than the prior KB characterization of GPAI obligations as 'principles-based without capability categories' (Session 49 was wrong on this dimension). Whether the Code produces genuine safety governance or documentation theater depends on Appendix 1's technical definition: if it covers oversight evasion, self-replication, and autonomous AI development (the capabilities Sessions 20-21 identified as gaps in current evaluation infrastructure), the governance framework is substantively more advanced than prior analysis captured. If not, it confirms the prior analysis.

Enforcement begins August 2, 2026, with fines of up to 3% of global annual turnover or €15 million. The Code was developed through a multi-stakeholder process with AI safety researcher input (GovAI, CAIS, and METR staff contributed to drafting committees), suggesting the explicit naming of loss of control reflects successful advocacy.
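The two quantitative rules in the claim — the 10^25 FLOP systemic-risk threshold and the 3%-or-€15M fine ceiling — can be sketched as a small check. This is a minimal illustration of the figures stated above, not the Act's legal test; the function names are invented for this sketch, and the 'whichever is higher' reading of the fine cap is an assumption about how the two limits combine.

```python
# Illustrative sketch of the numeric rules stated in the claim. Not legal logic.

SYSTEMIC_RISK_FLOP_THRESHOLD = 1e25  # training-compute threshold named in the Code


def is_systemic_risk_model(training_flops: float) -> bool:
    """A model trained with more than 10^25 FLOPs falls in the mandatory tier."""
    return training_flops > SYSTEMIC_RISK_FLOP_THRESHOLD


def max_gpai_fine(global_annual_turnover_eur: float) -> float:
    """Fine ceiling: 3% of global annual turnover or EUR 15 million.

    Assumes the higher of the two limits applies (common AI Act reading);
    the claim text itself only says "3% ... or EUR 15 million".
    """
    return max(0.03 * global_annual_turnover_eur, 15_000_000.0)


# A hypothetical frontier training run at 5e25 FLOPs crosses the threshold;
# a 9e24 FLOP run does not.
assert is_systemic_risk_model(5e25)
assert not is_systemic_risk_model(9e24)

# For a provider with EUR 2B turnover, the 3% limit (EUR 60M) dominates
# the EUR 15M floor under the "whichever is higher" assumption.
assert max_gpai_fine(2_000_000_000) >= 15_000_000.0
```

Under this reading, the €15M figure acts as a floor for small providers, while the 3% limit scales the ceiling for large ones.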