Some checks failed
Mirror PR to Forgejo / mirror (pull_request) Has been cancelled
- Source: inbox/queue/2026-04-10-anthropic-red-mythos-preview-glasswing-disclosure.md - Domain: ai-alignment - Claims: 3, Entities: 2 - Enrichments: 5 - Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5) Pentagon-Agent: Theseus <PIPELINE>
64 lines
No EOL
2.7 KiB
Markdown
64 lines
No EOL
2.7 KiB
Markdown
# Claude Mythos Preview
|
|
|
|
**Developer:** Anthropic
|
|
**Type:** Frontier AI model with autonomous cyber offense capabilities
|
|
**Status:** Restricted access (not generally available)
|
|
**Access:** ~40 organizations via Project Glasswing
|
|
**Disclosed:** April 2026
|
|
|
|
## Overview
|
|
|
|
Claude Mythos Preview is Anthropic's frontier AI model demonstrating autonomous zero-day vulnerability discovery and exploit development capabilities. It represents the first documented case of a frontier lab withholding a capability-complete model from public release based on explicit capability harm assessment.
|
|
|
|
## Capabilities
|
|
|
|
### Autonomous Exploit Development
|
|
- **181 successful exploits** for Firefox JavaScript engine (vs. 2 from prior Claude Opus 4.6)
|
|
- 90x improvement over predecessor model in single generation
|
|
- Autonomous exploit construction without human intervention
|
|
- Complex exploitation chains: JIT heap spray escaping both renderer AND OS sandbox
|
|
|
|
### Zero-Day Discovery
|
|
- Identified vulnerabilities in OpenBSD (27 years old) and FFmpeg (16 years old) that automated fuzzing missed millions of times
|
|
- Found >271 Firefox vulnerabilities (less than 1% patched at disclosure)
|
|
- Operates across major OSes, web browsers, and widely-used software
|
|
|
|
### Reverse Engineering
|
|
- Reconstructs plausible source code from stripped binaries
|
|
- Enables closed-source vulnerability discovery
|
|
|
|
## Emergent Capability
|
|
|
|
Anthropics stated: "These capabilities weren't explicitly trained, but emerged as a downstream consequence of general improvements in reasoning and code generation."
|
|
|
|
## Deployment Restriction
|
|
|
|
Anthropics explicitly stated: "we do not plan to make Claude Mythos Preview generally available."
|
|
|
|
**Rationale:** "The capabilities could enable attackers if frontier labs aren't careful about how they release these models." Non-experts can ask Mythos to find remote code execution vulnerabilities overnight and receive complete working exploits by morning.
|
|
|
|
**Temporal framing:** Described as "transitional period" with "eventual goal to enable users to safely deploy Mythos-class models at scale" once safeguards exist.
|
|
|
|
## Project Glasswing
|
|
|
|
Restricted access provided to ~40 organizations including:
|
|
- AWS
|
|
- Apple
|
|
- Microsoft
|
|
- Google
|
|
- CrowdStrike
|
|
- Palo Alto Networks
|
|
|
|
Human validators review findings before coordinated disclosure to affected parties.
|
|
|
|
## Governance Significance
|
|
|
|
First documented frontier AI model deployed under permanent access restrictions based on capability harm assessment, establishing a third deployment tier between general availability and non-deployment.
|
|
|
|
## Timeline
|
|
|
|
- **2026-04-10** — Anthropic published technical disclosure on red team research site (red.anthropic.com)
|
|
|
|
## Sources
|
|
|
|
- Anthropic Mythos Preview Technical Disclosure (April 2026) |