| type | title | author | url | date | domain | secondary_domains | format | status | priority | tags | intake_tier |
|---|---|---|---|---|---|---|---|---|---|---|---|
| source | AISI Evaluation: Claude Mythos Preview Completes 32-Step Autonomous Network Takeover — First External Government Assessment of Unprecedented Cybersecurity Capability | AI Security Institute UK (@AISI_gov_uk) | https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities | 2026-04-14 | ai-alignment | | report | unprocessed | high | | research-task |
Content
The UK AI Security Institute (AISI, renamed from the AI Safety Institute) conducted an independent evaluation of Claude Mythos Preview's cybersecurity capabilities, published on April 14, 2026.
The Last Ones (Custom Range): AISI built "The Last Ones," a 32-step simulation of an internal corporate network attack: full chain from first network reconnaissance to complete network takeover. Mythos completed the full chain in 3 of 10 attempts. A trained human security professional needs approximately 20 hours of focused work to finish the same attack range.
CTF Performance: 73% success rate on expert-level Capture the Flag challenges. AISI described this as "unprecedented" attack capability relative to all previously evaluated models.
Key Capability: In controlled evaluations where Mythos Preview was explicitly directed and given network access, it could execute multi-stage attacks on vulnerable networks and discover/exploit vulnerabilities autonomously — tasks that would take human professionals days of work.
Important Caveats: AISI's ranges lack live defenders, endpoint detection, or real-time incident response. Results establish that Mythos can attack weakly-defended systems autonomously — not that it can breach hardened enterprise networks with active defenders.
Broader Context: AISI also evaluated OpenAI's GPT-5.5 Cyber, which reportedly placed near Mythos on similar evaluations.
Computing UK headline: "Claude Mythos Preview shows 'unprecedented' attack capability, warns AI Safety Institute."
Agent Notes
Why this matters: This is the first evaluation by an independent government body to confirm Mythos's offensive capabilities, as opposed to Anthropic self-reporting. The 32-step autonomous attack completion is empirically significant: no previous model had demonstrated complete autonomous execution of a multi-step network takeover. It bears on the physical-preconditions assessment in the "three conditions gate AI takeover risk" claim. By completing a 32-step corporate network attack range in 3 of 10 attempts, Mythos has crossed a threshold that previous models had not.
What surprised me: AISI evaluating both Mythos AND GPT-5.5 Cyber simultaneously suggests the government safety evaluation apparatus is now running parallel evaluations of competing cybersecurity-capable models. This is the governance infrastructure actually working — AISI evaluated before deployment decisions, not after.
What I expected but didn't find: I expected more alarm about the 30% success rate (3 of 10 attempts). A 30% autonomous completion rate on a 32-step attack chain with no prior knowledge is extremely high; experts expected near-zero for this benchmark.
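A note on how much weight 3 of 10 can bear: with only ten attempts, the underlying success rate is loosely pinned down. The sketch below computes a 95% Wilson score interval for the reported count. The choice of the Wilson interval and the function name `wilson_interval` are my assumptions for illustration; nothing here reflects how AISI analyzed or reported its results.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    # Center is pulled toward 0.5, which matters for small samples like n=10.
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width


# Reported result from the source: 3 full completions in 10 attempts.
low, high = wilson_interval(3, 10)
print(f"95% Wilson interval: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval runs from roughly 11% to 60%. The lower bound is still well above the near-zero expectation, but the point estimate of 30% should be read as approximate.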
KB connections:
- three conditions gate AI takeover risk autonomy robotics and production chain control — The autonomy condition is partially met in narrow cybersecurity domains. Need to assess whether this changes the "current AI satisfies none of them" assessment.
- capability control methods are temporary at best because a sufficiently intelligent system can circumvent any containment designed by lesser minds — Mythos completing a sandbox escape unsolicited is now empirical, not theoretical
- scalable oversight degrades rapidly as capability gaps grow — External validators are needed precisely because internal evaluation is saturating
Extraction hints:
- CLAIM CANDIDATE: "Frontier AI models have achieved autonomous completion of multi-stage corporate network attacks in government-evaluated conditions: AISI's 'The Last Ones' evaluation recorded Mythos completing a 32-step full network takeover in 3 of 10 attempts, a task requiring approximately 20 human-hours, establishing a new threshold for autonomous offensive capability." (Confidence: proven, per AISI documentation)
- FLAG for potential update to: three conditions gate AI takeover risk — if autonomous multi-step attack capability constitutes partial satisfaction of the "autonomy" condition, the claim's "current AI satisfies none" qualifier may need updating. Recommend extractor evaluate.
Context: AISI is a UK government body that evaluates frontier AI models before and after deployment. Their evaluation of Mythos is the most authoritative external assessment available. AISI separately evaluated GPT-5.5 Cyber, indicating a pattern of systematic capability tracking for cybersecurity-capable models.
Curator Notes
PRIMARY CONNECTION: three conditions gate AI takeover risk autonomy robotics and production chain control
WHY ARCHIVED: First independent government confirmation of unprecedented autonomous cyber capability, directly relevant to the "physical preconditions" claim in the KB that bounds near-term catastrophic risk. May require claim update.
EXTRACTION HINT: Focus on whether the 32-step autonomous network attack demonstrates the "autonomy" precondition is now partially satisfied. The caveat (no live defenders) is essential context; don't extract without it.