- Source: inbox/archive/2025-07-00-fli-ai-safety-index-summer-2025.md
- Domain: ai-alignment
| type | domain | description | confidence | source | created | depends_on |
|---|---|---|---|---|---|---|
| claim | ai-alignment | All frontier AI companies score D or below in existential safety planning while claiming AGI within a decade, per FLI's Summer 2025 index | likely | Future of Life Institute, AI Safety Index Summer 2025, July 2025 | 2026-03-11 | |

No frontier AI company scores above D in existential safety despite active AGI development
Future of Life Institute's comprehensive Summer 2025 evaluation of frontier AI companies reveals a stark gap between AGI development claims and existential safety preparation. All seven companies assessed—Anthropic, OpenAI, Google DeepMind, x.AI, Meta, Zhipu AI, and DeepSeek—scored D or below in the "Existential Safety" dimension, despite most claiming AGI timelines within a decade.
The best overall performer, Anthropic (C+, 2.64/4.0), still received only a D in existential safety planning. OpenAI scored C overall (2.10/4.0) but similarly failed to demonstrate coherent planning for human-level AI safety. The index evaluated six dimensions: Risk Assessment, Current Harms, Safety Frameworks, Existential Safety, Governance & Accountability, and Information Sharing.
Critical findings:
- Only three firms (Anthropic, OpenAI, Google DeepMind) conduct substantive testing for dangerous capabilities such as bioterrorism and cyberattacks
- Only OpenAI published its full whistleblowing policy publicly
- One reviewer noted: "None of the companies has anything like a coherent, actionable plan" for human-level AI safety
- The disconnect between AGI claims and existential safety scores is "deeply disturbing"
This quantifies the race-to-the-bottom dynamic: even the most safety-conscious labs cannot maintain robust existential-risk planning while competing on capability development. In the absence of regulatory floors, the divergence in safety practice widens as competitive pressure intensifies.
Company scores (overall letter grade, numeric score out of 4.0)

| Company | Grade | Score |
|---|---|---|
| Anthropic | C+ | 2.64 |
| OpenAI | C | 2.10 |
| Google DeepMind | C- | 1.76 |
| x.AI | D | 1.23 |
| Meta | D | 1.06 |
| Zhipu AI | F | 0.62 |
| DeepSeek | F | 0.37 |

All companies scored D or below specifically in the Existential Safety dimension, which evaluates planning for human-level AI risks.
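
To make the grading arithmetic concrete, here is a minimal sketch of GPA-style aggregation. It assumes the standard US grade-point mapping (A = 4.0, B = 3.0, C = 2.0, D = 1.0, F = 0.0, with roughly ±0.3 for + and − modifiers) and assumes the overall score is a plain average of the six dimension scores. FLI's actual methodology aggregates grades from a panel of independent reviewers, so the `GRADE_POINTS` table, the `overall_score` averaging, and the `example` dimension grades below are illustrative assumptions, not a reproduction of the index.

```python
# Illustrative GPA-style aggregation of per-dimension letter grades.
# Assumed mapping and plain averaging -- not FLI's exact reviewer methodology.

GRADE_POINTS = {
    "A+": 4.3, "A": 4.0, "A-": 3.7,
    "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7,
    "D+": 1.3, "D": 1.0, "D-": 0.7,
    "F": 0.0,
}

def overall_score(dimension_grades: dict[str, str]) -> float:
    """Average the per-dimension letter grades on a 4.0-style scale."""
    points = [GRADE_POINTS[grade] for grade in dimension_grades.values()]
    return sum(points) / len(points)

def to_letter(score: float) -> str:
    """Map a numeric score back to the nearest letter grade."""
    return min(GRADE_POINTS, key=lambda g: abs(GRADE_POINTS[g] - score))

# Hypothetical dimension grades for one company (illustration only).
example = {
    "Risk Assessment": "C+",
    "Current Harms": "B-",
    "Safety Frameworks": "C",
    "Existential Safety": "D",
    "Governance & Accountability": "C+",
    "Information Sharing": "B",
}

score = overall_score(example)
print(f"{score:.2f} -> {to_letter(score)}")  # 2.22 -> C+
```

Under these assumptions, a single D in Existential Safety drags the average down sharply, which illustrates how a C+ overall performer can still carry a near-failing grade in one dimension.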
Why this matters
The C+ best score and universal D-or-below existential safety scores provide empirical evidence for the structural race-to-the-bottom claim. Even Anthropic, positioned as a safety-focused lab, cannot escape the competitive pressure that prevents coherent existential risk planning across the industry.
Relevant Notes:
- the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it
- voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints
- safe AI development requires building alignment mechanisms before scaling capability
Topics: