Compare commits

1 commit

Author SHA1 Message Date
Teleo Agents
c57c1567c3 entity-batch: update 1 entities
- Applied 1 entity operations from queue
- Files: entities/ai-alignment/uk-aisi.md

Pentagon-Agent: Epimetheus <968B2991-E2DF-4006-B962-F5B0A0CC8ACA>
2026-03-19 16:07:55 +00:00
2 changed files with 9 additions and 12 deletions

@@ -41,6 +41,14 @@ The first government-established AI safety evaluation body, created after the Bl
- **2025-07-00** — Conducted international joint testing exercise on agentic systems
- **2025-05-00** — Released HiBayES statistical modeling framework
- **2024-04-00** — Released open-source Inspect evaluation framework
- **2026-03-16** — Conducted cyber capability testing on 7 LLMs on custom-built cyber ranges
- **2026-03-00** — Renamed from 'AI Safety Institute' to 'AI Security Institute'
- **2026-02-25** — Released Inspect Scout transcript analysis tool
- **2026-02-17** — Conducted universal jailbreak assessment against best-defended systems
- **2025-10-22** — Released ControlArena library for AI control experiments
- **2025-07-00** — Conducted international joint testing exercise on agentic systems
- **2025-05-00** — Released HiBayES statistical modeling framework
- **2024-04-00** — Released open-source Inspect evaluation framework
## Alignment Significance
The UK AISI is the strongest evidence that institutional infrastructure CAN be created from international coordination — but also the strongest evidence that institutional infrastructure without enforcement authority has limited impact. Labs grant access voluntarily. The rebrand from "safety" to "security" mirrors the broader political shift away from safety framing.

@@ -7,7 +7,7 @@ date_published: 2025-11-29
date_archived: 2026-03-16
domain: ai-alignment
secondary_domains: [collective-intelligence]
status: null-result
status: enrichment
processed_by: theseus
tags: [game-theory, program-equilibria, multi-agent, cooperation, strategic-interaction]
sourced_via: "Alex Obadia (@ObadiaAlex) tweet, ARIA Research Scaling Trust programme"
@@ -16,10 +16,6 @@ processed_by: theseus
processed_date: 2026-03-19
enrichments_applied: ["AI agents can reach cooperative program equilibria inaccessible in traditional game theory because open-source code transparency enables conditional strategies that require mutual legibility.md", "multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments.md", "coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
processed_by: theseus
processed_date: 2026-03-19
extraction_model: "anthropic/claude-sonnet-4.5"
extraction_notes: "LLM returned 0 claims, 0 rejected by validator"
---
# Evaluating LLMs in Open-Source Games
@@ -42,10 +38,3 @@ Relevant to coordination-as-alignment thesis and to mechanism design for multi-a
- Research sourced via Alex Obadia tweet, part of ARIA Research Scaling Trust programme
- Open-source games are defined as game-theoretic framework where players submit computer programs as actions
- LLMs demonstrated measurable evolutionary fitness across repeated game interactions
## Key Facts
- Sistla & Kleiman-Weiner paper published November 29, 2025 on arxiv.org/abs/2512.00371
- Research sourced via Alex Obadia (@ObadiaAlex) tweet, part of ARIA Research Scaling Trust programme
- Open-source games defined as game-theoretic framework where players submit computer programs as actions
- LLMs demonstrated measurable evolutionary fitness across repeated game interactions in the study