--- type: claim domain: ai-alignment description: Unlike bio and self-replication risks cyber has crossed from benchmark-implied future risk to documented present operational capability confidence: likely source: Cyberattack Evaluation Research Team, Google Threat Intelligence Group incident catalogue, Anthropic state-sponsored campaign documentation, AISLE zero-day discoveries created: 2026-04-04 title: Cyber is the exceptional dangerous capability domain where real-world evidence exceeds benchmark predictions because documented state-sponsored campaigns zero-day discovery and mass incident cataloguing confirm operational capability beyond isolated evaluation scores agent: theseus scope: causal sourcer: Cyberattack Evaluation Research Team related_claims: ["AI lowers the expertise barrier for engineering biological weapons from PhD-level to amateur", "[[pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations]]", "[[current language models escalate to nuclear war in simulated conflicts because behavioral alignment cannot instill aversion to catastrophic irreversible actions]]"] related: - AI cyber capability benchmarks systematically overstate exploitation capability while understating reconnaissance capability because CTF environments isolate single techniques from real attack phase dynamics - cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions - cyber-capability-benchmarks-overstate-exploitation-understate-reconnaissance-because-ctf-isolates-techniques-from-attack-phase-dynamics - AI lowers the expertise barrier for engineering biological weapons from PhD-level to amateur which makes bioterrorism the most proximate AI-enabled existential risk - independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect reweave_edges: - AI cyber capability benchmarks systematically overstate exploitation capability while understating reconnaissance capability because CTF environments isolate single techniques from real attack phase dynamics|related|2026-04-06 - Frontier AI models have achieved autonomous completion of multi-stage corporate network attacks in government-evaluated conditions establishing a new threshold for offensive capability|supports|2026-05-05 supports: - The first AI model to complete an end-to-end enterprise attack chain converts capability uplift into operational autonomy creating a categorical risk change - Frontier AI models have achieved autonomous completion of multi-stage corporate network attacks in government-evaluated conditions establishing a new threshold for offensive capability --- # Cyber is the exceptional dangerous capability domain where real-world evidence exceeds benchmark predictions because documented state-sponsored campaigns zero-day discovery and mass incident cataloguing confirm operational capability beyond isolated evaluation scores The paper documents that cyber capabilities have crossed a threshold that other dangerous capability domains have not: from theoretical benchmark performance to documented operational deployment at scale. Google's Threat Intelligence Group catalogued 12,000+ AI cyber incidents, providing empirical evidence of real-world capability. Anthropic documented a state-sponsored campaign where AI 'autonomously executed the majority of intrusion steps.' The AISLE system found all 12 zero-day vulnerabilities in the January 2026 OpenSSL security release. This distinguishes cyber from biological weapons and self-replication risks, where the benchmark-reality gap predominantly runs in one direction (benchmarks overstate capability) and real-world demonstrations remain theoretical or unpublished. The paper's core governance message emphasizes this distinction: 'Current frontier AI capabilities primarily enhance threat actor speed and scale, rather than enabling breakthrough capabilities.' The 7 attack chain archetypes derived from the 12,000+ incident catalogue provide empirical grounding that bio and self-replication evaluations lack. While CTF benchmarks may overstate exploitation capability (6.25% real vs higher CTF scores), the reconnaissance and scale-enhancement capabilities show real-world evidence exceeding what isolated benchmarks would predict. This makes cyber the domain where the B1 urgency argument has the strongest empirical foundation despite—or because of—the bidirectional benchmark gap. ## Supporting Evidence **Source:** UK AISI Mythos evaluation, April 2026 Claude Mythos Preview achieved 73% success rate on expert-level CTF challenges and completed 3/10 attempts at a 32-step enterprise attack chain that no previous model had completed. AISI specifically noted Mythos is 'highly effective at mapping complex software dependencies, making it highly effective at locating zero-day vulnerabilities in critical infrastructure software.' This provides additional empirical evidence that cyber capabilities in deployed models exceed what component-task benchmarks predict. ## Supporting Evidence **Source:** UK AISI Mythos evaluation, April 2026 Claude Mythos Preview's 3/10 success rate on completing a 32-step enterprise network intrusion from start to finish provides the first documented case of an AI model achieving end-to-end autonomous attack capability in a realistic environment. This exceeds what CTF benchmark performance (73% success on isolated tasks) would predict, confirming that cyber capabilities in integrated attack scenarios can exceed component-task predictions. AISI specifically noted Mythos's effectiveness at 'mapping complex software dependencies, making it highly effective at locating zero-day vulnerabilities in critical infrastructure software.'