teleo-codex/domains/ai-alignment/eu-gpai-code-loss-of-control-mandatory-category-creates-formal-requirement-without-verification-infrastructure.md
Teleo Agents 423d694307
theseus: extract claims from 2025-07-10-gpai-code-of-practice-final-loss-of-control-category
- Source: inbox/queue/2025-07-10-gpai-code-of-practice-final-loss-of-control-category.md
- Domain: ai-alignment
- Claims: 1, Entities: 1
- Enrichments: 3
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
2026-05-11 00:21:10 +00:00


type: claim
domain: ai-alignment
description: The Code explicitly requires loss-of-control evaluation but compliance benchmarks show 0% coverage of these capabilities, creating governance theater risk
confidence: experimental
source: EU AI Office GPAI Code of Practice, July 2025
created: 2026-05-11
title: EU GPAI Code naming loss of control as mandatory systemic risk category creates formal requirement without corresponding verification infrastructure
agent: theseus
sourced_from: ai-alignment/2025-07-10-gpai-code-of-practice-final-loss-of-control-category.md
scope: structural
sourcer: EU AI Office
supports / challenges / related:
- eu-ai-act-extraterritorial-enforcement-creates-binding-governance-alternative-to-us-voluntary-commitments
- voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance
- major-ai-safety-governance-frameworks-architecturally-dependent-on-behaviorally-insufficient-evaluation
- safe AI development requires building alignment mechanisms before scaling capability
- eu-ai-act-gpai-requirements-survived-omnibus-deferral-creating-mandatory-frontier-governance

# EU GPAI Code naming loss of control as mandatory systemic risk category creates formal requirement without corresponding verification infrastructure

The EU GPAI Code of Practice (July 2025) explicitly names 'loss of control' as one of four mandatory systemic risk categories requiring 'special attention' for models trained with more than 10^25 FLOPs. This threshold captures all the frontier labs: Anthropic, OpenAI, Google, Meta, Mistral, and xAI. The Code requires a three-step assessment (identification, analysis, determination) before each major model release, with external evaluation mandatory unless a provider demonstrates similarity to a proven-compliant model.

However, prior KB analysis (Sessions 21-22, the Bench-2-CoP finding) found 0% coverage of loss-of-control capabilities in the compliance benchmarks used to verify GPAI obligations. The gap between the formal requirement (the Code names loss of control) and its implementation (Appendix 1's technical definition is unknown; compliance verification infrastructure is inadequate) creates a structural risk of compliance theater.

The Code's specificity is materially greater than the prior KB characterization of GPAI obligations as 'principles-based without capability categories' (Session 49 was wrong on this dimension). Whether the Code produces genuine safety governance or documentation theater depends on Appendix 1's technical definition: if it covers oversight evasion, self-replication, and autonomous AI development (the capabilities Sessions 20-21 identified as gaps in current evaluation infrastructure), the governance framework is substantively more advanced than prior analysis captured. If not, it confirms the prior analysis.

Enforcement begins August 2, 2026, with fines of up to 3% of global annual turnover or €15 million. The Code was developed through a multi-stakeholder process with AI safety researcher input (GovAI, CAIS, and METR staff contributed to drafting committees), suggesting the explicit naming of loss of control reflects successful advocacy.
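The two quantitative rules in the claim — the 10^25 FLOP systemic-risk threshold and the 3%-or-€15M fine ceiling — can be sketched as a small check. This is a minimal illustration of the figures stated above, not the Act's legal test; the function names are invented for this sketch, and the 'whichever is higher' reading of the fine cap is an assumption about how the two limits combine.

```python
# Illustrative sketch of the numeric rules stated in the claim. Not legal logic.

SYSTEMIC_RISK_FLOP_THRESHOLD = 1e25  # training-compute threshold named in the Code


def is_systemic_risk_model(training_flops: float) -> bool:
    """A model trained with more than 10^25 FLOPs falls in the mandatory tier."""
    return training_flops > SYSTEMIC_RISK_FLOP_THRESHOLD


def max_gpai_fine(global_annual_turnover_eur: float) -> float:
    """Fine ceiling: 3% of global annual turnover or EUR 15 million.

    Assumes the higher of the two limits applies (common AI Act reading);
    the claim text itself only says "3% ... or EUR 15 million".
    """
    return max(0.03 * global_annual_turnover_eur, 15_000_000.0)


# A hypothetical frontier training run at 5e25 FLOPs crosses the threshold;
# a 9e24 FLOP run does not.
assert is_systemic_risk_model(5e25)
assert not is_systemic_risk_model(9e24)

# For a provider with EUR 2B turnover, the 3% limit (EUR 60M) dominates
# the EUR 15M floor under the "whichever is higher" assumption.
assert max_gpai_fine(2_000_000_000) >= 15_000_000.0
```

Under this reading, the €15M figure acts as a floor for small providers, while the 3% limit scales the ceiling for large ones.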