teleo-codex/entities/ai-alignment/uk-aisi.md
m3taversal d0998a23bd theseus: AI coordination governance evidence — 3 claims + 1 entity
- What: 3 claims on coordination governance empirics (binding regulation as the
  only mechanism that works, transparency declining, compute export controls
  as misaligned governance) + UK AISI entity + comprehensive source archive
- Why: targeted research on the weakest grounding of B2 ("alignment is coordination
  problem"). Found that voluntary coordination has empirically failed across
  every mechanism tested (2023-2026). Only binding regulation with enforcement
  changes behavior. This challenges the optimistic version of B2 and
  strengthens the case for enforcement-backed coordination.
- Connections: confirms the voluntary-safety-pledge claim with extensive new
  evidence, strengthens the nation-state-control claim, and challenges the
  alignment-tax claim by showing the tax is being cut, not paid

Pentagon-Agent: Theseus <B4A5B354-03D6-4291-A6A8-1E04A879D9AC>
2026-03-16 19:35:00 +00:00


type: entity
entity_type: governance_body
name: UK AI Safety Institute
domain: ai-alignment
handles: @AISafetyInst
website: https://www.aisi.gov.uk
status: active
category: Government AI safety evaluation body
key_metrics:
  • pre_deployment_evals: Conducted joint US-UK evaluation of OpenAI o1 (Dec 2024)
  • frontier_report: Published Frontier AI Trends Report showing apprentice-level cyber task completion at 50%
  • blocking_authority: None; labs grant voluntary access and retain full release authority
tracked_by: theseus
created: 2026-03-16
last_updated: 2026-03-16
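For illustration, the entity record above can be represented as a structured in-memory object. A minimal sketch in Python, assuming a hypothetical schema whose field names are taken from the record's header row (the `Entity` class and its shape are an assumption, not part of any real KB tooling):

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical in-memory form of a teleo-codex entity record."""
    type: str
    entity_type: str
    name: str
    domain: str
    handles: list[str]
    website: str
    status: str
    category: str
    key_metrics: dict[str, str]
    tracked_by: str
    created: str
    last_updated: str


# Values transcribed from the UK AISI record above.
uk_aisi = Entity(
    type="entity",
    entity_type="governance_body",
    name="UK AI Safety Institute",
    domain="ai-alignment",
    handles=["@AISafetyInst"],
    website="https://www.aisi.gov.uk",
    status="active",
    category="Government AI safety evaluation body",
    key_metrics={
        "pre_deployment_evals": "Joint US-UK evaluation of OpenAI o1 (Dec 2024)",
        "frontier_report": "Apprentice-level cyber task completion at 50%",
        "blocking_authority": "None; labs retain full release authority",
    },
    tracked_by="theseus",
    created="2026-03-16",
    last_updated="2026-03-16",
)
```

A flat key-value structure like this keeps the record trivially serializable while still letting downstream tooling query individual metrics.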

UK AI Safety Institute

Overview

The first government-established AI safety evaluation body, created after the Bletchley Summit (November 2023). Conducted the most concrete bilateral safety cooperation to date (joint US-UK evaluation of OpenAI's o1, December 2024). Rebranded to "AI Security Institute" in February 2025, signaling an emphasis shift from safety to security.

Current State

  • Conducted pre-deployment evaluations of multiple frontier models
  • Published the Frontier AI Trends Report: AI models now complete apprentice-level cyber tasks 50% of the time (up from 10% in early 2024) and outperform PhD-level experts in chemistry/biology by up to 60%
  • Key finding: Model B (released 6 months after Model A) required ~40x more expert effort to find universal attacks in biological-misuse testing
  • No blocking authority — labs participate voluntarily and retain full control over release decisions

Timeline

  • 2023-11 — Created after Bletchley Summit
  • 2024-04 — US-UK MOU signed for joint model testing, research sharing, personnel exchanges
  • 2024-12 — Joint pre-deployment evaluation of OpenAI o1 with US AISI
  • 2025-02 — Rebranded to "AI Security Institute"

Alignment Significance

The UK AISI is the strongest evidence that international coordination CAN produce institutional infrastructure, and also the strongest evidence that such infrastructure has limited impact without enforcement authority. Labs grant access voluntarily. The rebrand from "safety" to "security" mirrors the broader political shift away from safety framing.

The US counterpart (AISI → CAISI) has been defunded and rebranded under the Trump administration, demonstrating the fragility of institutions that depend on executive branch support rather than legislative mandate.

Relationship to KB

Topics: