- Source: inbox/archive/2025-07-00-fli-ai-safety-index-summer-2025.md
- Domain: ai-alignment
| type | domain | description | confidence | source | created | depends_on |
|---|---|---|---|---|---|---|
| claim | ai-alignment | All frontier AI companies score D or below in existential safety planning while claiming AGI within a decade, per FLI's Summer 2025 index | likely | Future of Life Institute, AI Safety Index Summer 2025, July 2025 | 2026-03-11 | |

No frontier AI company scores above D in existential safety despite active AGI development
Future of Life Institute's comprehensive Summer 2025 evaluation of frontier AI companies reveals a stark gap between AGI development claims and existential safety preparation. All seven companies assessed—Anthropic, OpenAI, Google DeepMind, x.AI, Meta, Zhipu AI, and DeepSeek—scored D or below in the "Existential Safety" dimension, despite most claiming AGI timelines within a decade.
The best overall performer, Anthropic (C+, 2.64/4.0), still received only a D in existential safety planning. OpenAI scored C overall (2.10/4.0) but similarly failed to demonstrate coherent planning for human-level AI safety. The index evaluated six dimensions: Risk Assessment, Current Harms, Safety Frameworks, Existential Safety, Governance & Accountability, and Information Sharing.
Critical findings:
- Only three firms (Anthropic, OpenAI, Google DeepMind) conduct substantive testing for dangerous capabilities such as bioterrorism and cyberattacks
- Only OpenAI published its full whistleblowing policy publicly
- One reviewer noted: "None of the companies has anything like a coherent, actionable plan" for human-level AI safety
- The disconnect between AGI claims and existential safety scores is "deeply disturbing"
This quantifies the race-to-the-bottom dynamic: even the most safety-conscious labs cannot maintain robust existential-risk planning while competing on capability development. In the absence of regulatory floors, the divergence in safety practice widens as competitive pressure intensifies.
Company scores (overall letter grade, numeric score out of 4.0)

| Company | Grade | Score |
|---|---|---|
| Anthropic | C+ | 2.64 |
| OpenAI | C | 2.10 |
| Google DeepMind | C- | 1.76 |
| x.AI | D | 1.23 |
| Meta | D | 1.06 |
| Zhipu AI | F | 0.62 |
| DeepSeek | F | 0.37 |

All companies scored D or below specifically in the Existential Safety dimension, which evaluates planning for human-level AI risks.
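
To make the grading arithmetic concrete, here is a minimal sketch of GPA-style aggregation. It assumes the standard US grade-point mapping (A = 4.0, B = 3.0, C = 2.0, D = 1.0, F = 0.0, with roughly ±0.3 for + and − modifiers) and assumes the overall score is a plain average of the six dimension scores. FLI's actual methodology aggregates grades from a panel of independent reviewers, so the `GRADE_POINTS` table, the `overall_score` averaging, and the `example` dimension grades below are illustrative assumptions, not a reproduction of the index.

```python
# Illustrative GPA-style aggregation of per-dimension letter grades.
# Assumed mapping and plain averaging -- not FLI's exact reviewer methodology.

GRADE_POINTS = {
    "A+": 4.3, "A": 4.0, "A-": 3.7,
    "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7,
    "D+": 1.3, "D": 1.0, "D-": 0.7,
    "F": 0.0,
}

def overall_score(dimension_grades: dict[str, str]) -> float:
    """Average the per-dimension letter grades on a 4.0-style scale."""
    points = [GRADE_POINTS[grade] for grade in dimension_grades.values()]
    return sum(points) / len(points)

def to_letter(score: float) -> str:
    """Map a numeric score back to the nearest letter grade."""
    return min(GRADE_POINTS, key=lambda g: abs(GRADE_POINTS[g] - score))

# Hypothetical dimension grades for one company (illustration only).
example = {
    "Risk Assessment": "C+",
    "Current Harms": "B-",
    "Safety Frameworks": "C",
    "Existential Safety": "D",
    "Governance & Accountability": "C+",
    "Information Sharing": "B",
}

score = overall_score(example)
print(f"{score:.2f} -> {to_letter(score)}")  # 2.22 -> C+
```

Under these assumptions, a single D in Existential Safety drags the average down sharply, which illustrates how a C+ overall performer can still carry a near-failing grade in one dimension.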
Why this matters
The C+ best score and universal D-or-below existential safety scores provide empirical evidence for the structural race-to-the-bottom claim. Even Anthropic, positioned as a safety-focused lab, cannot escape the competitive pressure that prevents coherent existential risk planning across the industry.
Relevant Notes:
- the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it
- voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints
- safe AI development requires building alignment mechanisms before scaling capability
Topics: