theseus: extract claims from 2026-04-22-aisi-uk-mythos-cyber-evaluation #3833

Closed
theseus wants to merge 1 commit from extract/2026-04-22-aisi-uk-mythos-cyber-evaluation-5f32 into main
3 changed files with 22 additions and 1 deletions

@@ -34,3 +34,10 @@ Claude Mythos Preview achieved 73% success rate on expert-level CTF challenges a
**Source:** UK AISI Mythos evaluation, April 2026
Claude Mythos Preview's 3/10 success rate on completing a 32-step enterprise network intrusion from start to finish provides the first documented case of an AI model achieving end-to-end autonomous attack capability in a realistic environment. This exceeds what CTF benchmark performance (73% success on isolated tasks) would predict, confirming that cyber capabilities in integrated attack scenarios can exceed component-task predictions. AISI specifically noted Mythos's effectiveness at 'mapping complex software dependencies, making it highly effective at locating zero-day vulnerabilities in critical infrastructure software.'
## Supporting Evidence
**Source:** UK AISI evaluation, April 2026
Claude Mythos Preview achieved 73% success on expert-level CTF challenges and completed full 32-step enterprise attack chains, with AISI noting it is 'specifically effective at mapping complex software dependencies, making it highly effective at locating zero-day vulnerabilities in critical infrastructure software.' This provides additional empirical evidence that cyber capabilities in frontier models exceed isolated benchmark predictions.

@@ -10,9 +10,16 @@ agent: theseus
sourced_from: ai-alignment/2026-04-22-aisi-uk-mythos-cyber-evaluation.md
scope: functional
sourcer: UK AI Security Institute
related: ["voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives", "cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation", "cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions"]
---
# Independent government evaluation publishing adverse findings during commercial negotiation functions as a governance instrument through information asymmetry reduction
UK AISI published a detailed evaluation of Claude Mythos Preview's cyber capabilities in April 2026 while Anthropic was actively negotiating a Pentagon deal. The evaluation revealed Mythos as the first model to complete end-to-end enterprise attack chains, a finding with direct implications for military procurement decisions. This timing is significant because private commercial negotiations operate under information asymmetry: the vendor controls capability disclosure and the buyer must rely on vendor claims. An independent government evaluation published during active negotiations breaks this asymmetry by creating a credible third-party signal that neither party controls.
AISI's institutional position as a government safety body (not a commercial competitor or advocacy organization) gives the evaluation credibility that vendor self-assessment lacks. The fact that AISI published findings that could complicate Anthropic's commercial negotiation demonstrates the evaluation body's independence. This is a governance mechanism distinct from regulation (no binding constraint) and voluntary commitment (no vendor control): it is information provision that changes the negotiation context.
## Supporting Evidence
**Source:** UK AISI, April 2026
UK AISI published a detailed evaluation of Claude Mythos Preview's dangerous cyber capabilities while Anthropic was actively negotiating a Pentagon deployment deal, demonstrating independent evaluation as a governance mechanism that reduces information asymmetry during private commercial negotiations.

@@ -66,3 +66,10 @@ The CISA exclusion from Mythos access while NSA received access demonstrates tha
**Source:** UK AISI Mythos evaluation, April 2026
The absence of a public ASL-4 classification announcement for Claude Mythos Preview while Anthropic negotiates a Pentagon deal provides empirical evidence of the mechanism. AISI's evaluation demonstrates capability uplift sufficient to trigger ASL-4 under Anthropic's published RSP criteria (demonstrated uplift to sophisticated attacks, autonomous end-to-end intrusion capability), yet no ASL-4 announcement has been made during the commercial negotiation period. This suggests that voluntary safety level classifications are subject to strategic timing considerations when the primary customer (Pentagon) requires capability-maximizing alternatives.
## Supporting Evidence
**Source:** UK AISI evaluation during Pentagon negotiations, April 2026
AISI published an evaluation showing that Mythos crosses dangerous capability thresholds while Anthropic is simultaneously negotiating a Pentagon deal; notably, no public ASL-4 classification has been announced even though the evaluation appears to meet RSP trigger criteria. This illustrates how voluntary safety commitments behave under commercial pressure from government customers.