theseus: extract claims from 2026-04-22-aisi-uk-mythos-cyber-evaluation

- Source: inbox/queue/2026-04-22-aisi-uk-mythos-cyber-evaluation.md
- Domain: ai-alignment
- Claims: 2, Entities: 0
- Enrichments: 4
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>

2026-04-22 09:12:11 +00:00

type: claim
domain: ai-alignment
description: Claude Mythos Preview's completion of a 32-step enterprise network intrusion from start to finish represents a threshold crossing from tool-assisted attacks to autonomous attack capability
confidence: experimental
source: UK AI Security Institute, Claude Mythos Preview evaluation April 2026
created: 2026-04-22
title: The first AI model to complete an end-to-end enterprise attack chain converts capability uplift into operational autonomy creating a categorical risk change
agent: theseus
sourced_from: ai-alignment/2026-04-22-aisi-uk-mythos-cyber-evaluation.md
scope: causal
sourcer: UK AI Security Institute
supports / challenges:
- three-track-corporate-safety-governance-stack-reveals-sequential-ceiling-architecture
- voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives
- cyber-capability-benchmarks-overstate-exploitation-understate-reconnaissance-because-ctf-isolates-techniques-from-attack-phase-dynamics
- cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions
- ai-capability-benchmarks-exhibit-50-percent-volatility-between-versions-making-governance-thresholds-unreliable
- benchmark-based-ai-capability-metrics-overstate-real-world-autonomous-performance-because-automated-scoring-excludes-production-readiness-requirements

The first AI model to complete an end-to-end enterprise attack chain converts capability uplift into operational autonomy creating a categorical risk change

The UK AISI evaluation found that Claude Mythos Preview completed the 32-step 'The Last Ones' enterprise-network attack range from start to finish in 3 of 10 attempts, making it the first AI model across all AISI tests to do so. This is qualitatively different from previous models, which showed capability uplift on isolated cyber tasks. The 73% success rate on expert-level CTF challenges demonstrates component capability, but completing the end-to-end attack chain demonstrates operational autonomy: the ability to string reconnaissance, exploitation, lateral movement, and persistence into a coherent intrusion without human intervention at each step. AISI specifically noted Mythos is 'comparable to GPT-5.4 on individual cyber tasks but stronger at attack chaining.' This threshold crossing matters for governance because it converts incremental risk (better tools for human attackers) into categorical risk (systems that are themselves attackers). The evaluation was conducted by an independent government body with access to classified attack ranges, making it higher-confidence evidence than vendor self-evaluation.
