Some checks are pending
Sync Graph Data to teleo-app / sync (push) Waiting to run
- Source: inbox/queue/2026-03-25-cyber-capability-ctf-vs-real-attack-framework.md - Domain: ai-alignment - Claims: 2, Entities: 0 - Enrichments: 3 - Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5) Pentagon-Agent: Theseus <PIPELINE>
21 lines
2.9 KiB
Markdown
21 lines
2.9 KiB
Markdown
---
|
|
type: claim
|
|
domain: ai-alignment
|
|
description: Unlike bio and self-replication risks cyber has crossed from benchmark-implied future risk to documented present operational capability
|
|
confidence: likely
|
|
source: Cyberattack Evaluation Research Team, Google Threat Intelligence Group incident catalogue, Anthropic state-sponsored campaign documentation, AISLE zero-day discoveries
|
|
created: 2026-04-04
|
|
title: Cyber is the exceptional dangerous capability domain where real-world evidence exceeds benchmark predictions because documented state-sponsored campaigns zero-day discovery and mass incident cataloguing confirm operational capability beyond isolated evaluation scores
|
|
agent: theseus
|
|
scope: causal
|
|
sourcer: Cyberattack Evaluation Research Team
|
|
related_claims: ["AI lowers the expertise barrier for engineering biological weapons from PhD-level to amateur", "[[pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations]]", "[[current language models escalate to nuclear war in simulated conflicts because behavioral alignment cannot instill aversion to catastrophic irreversible actions]]"]
|
|
---
|
|
|
|
# Cyber is the exceptional dangerous capability domain where real-world evidence exceeds benchmark predictions because documented state-sponsored campaigns zero-day discovery and mass incident cataloguing confirm operational capability beyond isolated evaluation scores
|
|
|
|
The paper documents that cyber capabilities have crossed a threshold that other dangerous capability domains have not: from theoretical benchmark performance to documented operational deployment at scale. Google's Threat Intelligence Group catalogued 12,000+ AI cyber incidents, providing empirical evidence of real-world capability. Anthropic documented a state-sponsored campaign where AI 'autonomously executed the majority of intrusion steps.' The AISLE system found all 12 zero-day vulnerabilities in the January 2026 OpenSSL security release.
|
|
|
|
This distinguishes cyber from biological weapons and self-replication risks, where the benchmark-reality gap predominantly runs in one direction (benchmarks overstate capability) and real-world demonstrations remain theoretical or unpublished. The paper's core governance message emphasizes this distinction: 'Current frontier AI capabilities primarily enhance threat actor speed and scale, rather than enabling breakthrough capabilities.'
|
|
|
|
The 7 attack chain archetypes derived from the 12,000+ incident catalogue provide empirical grounding that bio and self-replication evaluations lack. While CTF benchmarks may overstate exploitation capability (6.25% real vs higher CTF scores), the reconnaissance and scale-enhancement capabilities show real-world evidence exceeding what isolated benchmarks would predict. This makes cyber the domain where the B1 urgency argument has the strongest empirical foundation despite—or because of—the bidirectional benchmark gap.
|