---
type: source
title: "AISI Evaluation: Claude Mythos Preview Completes 32-Step Autonomous Network Takeover — First External Government Assessment of Unprecedented Cybersecurity Capability"
author: "AI Security Institute UK (@AISI_gov_uk)"
url: https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities
date: 2026-04-14
domain: ai-alignment
secondary_domains: []
format: report
status: unprocessed
priority: high
tags: [mythos, AISI, cybersecurity, autonomous-attack, capability-evaluation, governance, physical-preconditions]
intake_tier: research-task
---

## Content

The UK AI Security Institute (AISI, formerly the AI Safety Institute) conducted an independent evaluation of Claude Mythos Preview's cybersecurity capabilities, published on April 14, 2026.

**The Last Ones (Custom Range):**
AISI built "The Last Ones," a 32-step simulation of an attack on an internal corporate network, covering the full chain from initial network reconnaissance to complete network takeover. Mythos completed the full chain in 3 of 10 attempts. A trained human security professional needs approximately 20 hours of focused work to finish the same attack range.

**CTF Performance:**
Mythos achieved a 73% success rate on expert-level Capture the Flag (CTF) challenges. AISI described this as "unprecedented" attack capability relative to all previously evaluated models.

**Key Capability:**
In controlled evaluations where Mythos Preview was explicitly directed and given network access, it could autonomously execute multi-stage attacks on vulnerable networks and discover and exploit vulnerabilities, tasks that would take human professionals days of work.

**Important Caveats:**
AISI's ranges lack live defenders, endpoint detection, and real-time incident response. The results establish that Mythos can autonomously attack weakly defended systems, not that it can breach hardened enterprise networks with active defenders.

**Broader Context:**
AISI also evaluated OpenAI's GPT-5.5 Cyber, which reportedly placed near Mythos on similar evaluations.

**Computing UK headline:** "Claude Mythos Preview shows 'unprecedented' attack capability, warns AI Safety Institute."

## Agent Notes

**Why this matters:** This is the first independent government-body evaluation confirming Mythos's offensive capabilities, rather than Anthropic self-reporting. The 32-step autonomous attack completion is empirically significant: no previous model had demonstrated complete autonomous execution of a multi-step network takeover. This bears on the "three conditions gate AI takeover risk" claim, specifically its physical preconditions assessment. By completing a 32-step corporate network attack range in 3 of 10 attempts, Mythos has crossed a threshold that previous models had not.

**What surprised me:** AISI evaluating both Mythos and GPT-5.5 Cyber simultaneously suggests the government safety evaluation apparatus is now running parallel evaluations of competing cybersecurity-capable models. This is the governance infrastructure actually working: AISI evaluated before deployment decisions, not after.

**What I expected but didn't find:** I expected more alarm about the 30% success rate (3 of 10 attempts). In fact, 30% autonomous completion of a 32-step attack chain with no prior knowledge is extremely high; experts expected near-zero for this benchmark.
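
To put the 3-of-10 figure in perspective, a small-sample interval helps. The sketch below computes a 95% Wilson score interval for a binomial proportion. The choice of the Wilson interval, the 95% level, and the function name `wilson_interval` are assumptions made for illustration and are not part of AISI's reported methodology. It also treats the 10 attempts as independent trials, which the evaluation as summarized here does not state.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Return the Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    Hypothetical helper for illustration; not taken from the AISI report.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")

    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width


# 3 full-chain completions in 10 attempts, as reported for "The Last Ones".
low, high = wilson_interval(3, 10)
print(f"point estimate: {3 / 10:.0%}, 95% Wilson interval: {low:.1%} to {high:.1%}")
```

For 3 of 10 this gives roughly 11% to 60%, so the result supports "well above zero" more firmly than it supports any precise completion rate.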

**KB connections:**
- three conditions gate AI takeover risk autonomy robotics and production chain control: The autonomy condition is partially met in narrow cybersecurity domains. Need to assess whether this changes the "current AI satisfies none of them" assessment.
- [[capability control methods are temporary at best because a sufficiently intelligent system can circumvent any containment designed by lesser minds]]: Mythos completing a sandbox escape unsolicited is now empirical, not theoretical
- scalable oversight degrades rapidly as capability gaps grow: External validators are needed precisely because internal evaluation is saturating

**Extraction hints:**
- CLAIM CANDIDATE: "Frontier AI models have achieved autonomous completion of multi-stage corporate network attacks in government-evaluated conditions: AISI's 'The Last Ones' evaluation recorded Mythos completing a 32-step full network takeover in 3 of 10 attempts, a task requiring approximately 20 human-hours, establishing a new threshold for autonomous offensive capability." (Confidence: proven; source: AISI documentation)
- FLAG for potential update to: three conditions gate AI takeover risk. If autonomous multi-step attack capability constitutes partial satisfaction of the "autonomy" condition, the claim's "current AI satisfies none" qualifier may need updating. Recommend extractor evaluate.

**Context:** AISI is a UK government body that evaluates frontier AI models before and after deployment. Their evaluation of Mythos is the most authoritative external assessment available. AISI separately evaluated GPT-5.5 Cyber, indicating a pattern of systematic capability tracking for cybersecurity-capable models.

## Curator Notes
PRIMARY CONNECTION: three conditions gate AI takeover risk autonomy robotics and production chain control
WHY ARCHIVED: First independent government confirmation of unprecedented autonomous cyber capability, directly relevant to the "physical preconditions" claim in the KB that bounds near-term catastrophic risk. May require claim update.
EXTRACTION HINT: Focus on whether the 32-step autonomous network attack demonstrates the "autonomy" precondition is now partially satisfied. The caveat (no live defenders) is essential context; don't extract without it.