teleo-codex/entities/ai-alignment/eu-gpai-code-of-practice.md
Teleo Agents 423d694307
theseus: extract claims from 2025-07-10-gpai-code-of-practice-final-loss-of-control-category
- Source: inbox/queue/2025-07-10-gpai-code-of-practice-final-loss-of-control-category.md
- Domain: ai-alignment
- Claims: 1, Entities: 1
- Enrichments: 3
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
2026-05-11 00:21:10 +00:00


EU GPAI Code of Practice

Type: Regulatory Framework
Domain: AI Alignment, AI Governance
Status: Active (enforcement begins August 2, 2026)
Jurisdiction: European Union (extraterritorial application)
Authority: EU AI Office

Overview

The General-Purpose AI (GPAI) Code of Practice is the primary implementation vehicle for EU AI Act Articles 50-55, establishing mandatory obligations for providers of GPAI models with systemic risk (defined as models trained with >10^25 FLOPs).
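The compute threshold can be illustrated in code. A minimal sketch: the 6 × parameters × tokens estimate is a standard rough heuristic for dense transformers, not part of the Act, and the function names here are hypothetical.

```python
# Hypothetical sketch: checking whether a training run crosses the
# EU AI Act systemic-risk compute threshold (10^25 FLOPs).
SYSTEMIC_RISK_THRESHOLD_FLOPS = 1e25

def is_systemic_risk_model(training_flops: float) -> bool:
    """Return True if the model presumptively carries systemic risk."""
    return training_flops > SYSTEMIC_RISK_THRESHOLD_FLOPS

def estimate_training_flops(n_params: float, n_tokens: float) -> float:
    # Rough heuristic for dense transformers: ~6 FLOPs per parameter per token.
    return 6.0 * n_params * n_tokens

# Example: a 1-trillion-parameter model trained on 15 trillion tokens
flops = estimate_training_flops(1e12, 15e12)  # 9e25 FLOPs
print(is_systemic_risk_model(flops))  # True
```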

Scope

Covered Providers (as of August 2025):

  • Anthropic (Claude)
  • OpenAI (GPT-4o, o3)
  • Google (Gemini 2.5 Pro)
  • Meta (Llama-4)
  • Mistral
  • xAI (Grok)

Four Mandatory Systemic Risk Categories

  1. CBRN risks — chemical, biological, radiological, nuclear
  2. Loss of control — AI systems that could become uncontrollable or undermine human oversight
  3. Cyber offense capabilities — capabilities enabling cyberattacks
  4. Harmful manipulation — large-scale manipulation of populations

Requirements

Safety and Security Model Report (before market placement):

  • Detailed model architecture and capabilities documentation
  • Justification of why systemic risks are acceptable
  • Documentation of systemic risk identification, analysis, and mitigation processes
  • Description of independent external evaluators' involvement
  • Details of implemented safety and security mitigations

Three-Step Assessment Process (per major model release):

  1. Identification — must identify potential systemic risks from the four categories
  2. Analysis — must analyze each identified risk; third-party evaluators may be required if the model's risks exceed those of prior models
  3. Determination — must determine whether risks are acceptable before release

External Evaluation: Required unless providers can demonstrate their model is "similarly safe" to a proven-compliant model.
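The gating logic above can be sketched as a short Python function. This is purely illustrative: the Code prescribes no data formats or code, and every name below is hypothetical.

```python
# Hypothetical sketch of the three-step assessment gate per major release.
RISK_CATEGORIES = ("CBRN", "loss of control", "cyber offense",
                   "harmful manipulation")

def assess_release(identified_risks, risks_exceed_prior_models,
                   similarly_safe_to_compliant_model, risks_acceptable):
    # Step 1: identification - risks must come from the four categories.
    for category in identified_risks:
        assert category in RISK_CATEGORIES, f"unknown category: {category}"
    # Step 2: analysis - external evaluation is required unless the model
    # is shown to be "similarly safe" to a proven-compliant model.
    needs_external_eval = (risks_exceed_prior_models
                           and not similarly_safe_to_compliant_model)
    # Step 3: determination - release only if risks are judged acceptable.
    return {"external_evaluation_required": needs_external_eval,
            "may_release": risks_acceptable}
```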

Enforcement

  • Soft enforcement: August 2025
  • Fines begin: August 2, 2026
  • Penalty structure: Up to 3% global annual turnover or €15 million, whichever is higher
  • Compliance presumption: Signatories get presumption of compliance; non-signatories face higher AI Office scrutiny
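The penalty ceiling reduces to a one-line computation (hypothetical function name; the statute sets only the ceiling, not the actual fine imposed).

```python
# Penalty ceiling: up to 3% of global annual turnover or EUR 15 million,
# whichever is higher.
def max_fine_eur(global_annual_turnover_eur: float) -> float:
    return max(0.03 * global_annual_turnover_eur, 15_000_000.0)

# Example: a provider with EUR 2 billion turnover faces a ceiling of
# ~EUR 60 million; a provider with EUR 100 million turnover still faces
# the EUR 15 million floor.
ceiling = max_fine_eur(2_000_000_000)
floor = max_fine_eur(100_000_000)  # 15000000.0
```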

Signatories

As of August 2025: Anthropic, OpenAI, Google DeepMind, Meta, Mistral, Cohere, xAI, and ~50 other organizations.

Development Process

Developed through a multi-stakeholder process with significant industry input. AI safety researchers from GovAI, CAIS, and METR contributed to drafting committees. The four categories were contested during drafting: CBRN and cyber offense were relatively uncontroversial, while loss of control and harmful manipulation reflect more disputed AI safety concerns.

Key Uncertainty

The specific technical definition of "loss of control" appears in Appendix 1. Whether it means (a) behavioral human-override capability (shallow, and consistent with current safety training) or (b) oversight evasion, self-replication, and autonomous AI development (substantive, alignment-relevant capabilities) determines whether GPAI enforcement produces genuine safety governance or documentation compliance theater.

Timeline

  • 2025-07-10 — Final version published by EU AI Office
  • 2025-08 — Soft enforcement begins; major frontier labs sign as participants
  • 2026-08-02 — Fines and full enforcement begin

Related

  • EU AI Act Articles 50-55 (parent legislation)
  • EU AI Office (enforcement authority)
  • Sessions 21-22 KB analysis (Bench-2-CoP finding: 0% compliance benchmark coverage of loss-of-control capabilities)