- Source: inbox/archive/2025-07-00-fli-ai-safety-index-summer-2025.md
- Domain: ai-alignment
| type | domain | description | confidence | source | created |
|---|---|---|---|---|---|
| claim | ai-alignment | FLI's Summer 2025 index shows all frontier AI labs score D or below in existential safety planning while claiming AGI within a decade | likely | Future of Life Institute, AI Safety Index Summer 2025 (2025-07-01) | 2026-03-11 |
No frontier AI company scores above D in existential safety despite active AGI development
Future of Life Institute's comprehensive evaluation of frontier AI companies across six safety dimensions reveals a critical gap: every company scored D or below in existential safety planning, despite most claiming they will achieve AGI within a decade. The best performer, Anthropic, achieved only a C+ overall (2.64/4.0) and a D in existential safety. The index evaluated seven companies:

| company | grade | score (/4.0) |
|---|---|---|
| Anthropic | C+ | 2.64 |
| OpenAI | C | 2.10 |
| Google DeepMind | C- | 1.76 |
| x.AI | D | 1.23 |
| Meta | D | 1.06 |
| Zhipu AI | F | 0.62 |
| DeepSeek | F | 0.37 |
The six evaluation dimensions were: Risk Assessment (dangerous capability testing), Current Harms (safety benchmarks and robustness), Safety Frameworks (risk management processes), Existential Safety (planning for human-level AI), Governance & Accountability (whistleblowing and oversight), and Information Sharing (transparency on specs and risks).
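To make the scoring arithmetic concrete, here is a minimal sketch of a per-company scorecard, assuming (as the 2.64/4.0 figure suggests) that the overall score aggregates six dimension scores on a 0-4.0 scale. FLI's actual weighting and its mapping from points to letter grades are not specified in this note, so the unweighted mean below is an assumption.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Scorecard:
    """One company's grades across the index's six dimensions (0-4.0 each)."""
    company: str
    risk_assessment: float            # dangerous capability testing
    current_harms: float              # safety benchmarks and robustness
    safety_frameworks: float          # risk management processes
    existential_safety: float         # planning for human-level AI
    governance_accountability: float  # whistleblowing and oversight
    information_sharing: float        # transparency on specs and risks

    def overall(self) -> float:
        # Assumption: unweighted mean. FLI's actual aggregation may weight
        # dimensions differently or round before assigning a letter grade.
        return round(mean([
            self.risk_assessment,
            self.current_harms,
            self.safety_frameworks,
            self.existential_safety,
            self.governance_accountability,
            self.information_sharing,
        ]), 2)
```

Under this assumed averaging, an overall of 2.64 like Anthropic's is compatible with roughly B-level grades in five dimensions alongside a D (about 1.0) in existential safety, which is exactly the pattern the index reports.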
Critical findings:
- Only three firms (Anthropic, OpenAI, Google DeepMind) conduct substantive testing for dangerous capabilities such as assistance with bioterrorism and cyberattacks
- OpenAI is the only company to have published its whistleblowing policy in full
- FLI reviewers noted "none of the companies has anything like a coherent, actionable plan" for human-level AI safety
- The disconnect between AGI claims and existential safety scores was described as "deeply disturbing"
The methodology was peer-reviewed and based on publicly available information plus email correspondence with developers. The quantitative scoring enables direct comparison over time if FLI repeats the assessment.
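As a sketch of that over-time comparison, assuming FLI keeps the same 0-4.0 scale across rounds: the Summer 2025 scores below are the overall figures reported above, while `next_round` is a purely hypothetical placeholder used only to show the delta computation.

```python
# Summer 2025 overall scores as reported in the index.
summer_2025 = {
    "Anthropic": 2.64, "OpenAI": 2.10, "Google DeepMind": 1.76,
    "x.AI": 1.23, "Meta": 1.06, "Zhipu AI": 0.62, "DeepSeek": 0.37,
}

def score_deltas(earlier: dict[str, float], later: dict[str, float]) -> dict[str, float]:
    """Per-company change in overall score between two assessment rounds."""
    return {c: round(later[c] - earlier[c], 2) for c in earlier if c in later}

# Hypothetical future round (placeholder values, not real data):
next_round = {"Anthropic": 2.80, "OpenAI": 2.30}
print(score_deltas(summer_2025, next_round))  # {'Anthropic': 0.16, 'OpenAI': 0.2}
```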
Evidence
This claim is supported by FLI's systematic evaluation across standardized criteria, providing the first quantitative company-level comparison of AI safety practices. The universal failure to score above D in existential safety, combined with active AGI development programs, provides empirical evidence for the gap between safety rhetoric and practice.
Relevant Notes:
- the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it — this index provides company-level quantification
- voluntary safety pledges cannot survive competitive pressure — confirmed by best company scoring only C+
- safe AI development requires building alignment mechanisms before scaling capability — violated by every company assessed
- no research group is building alignment through collective intelligence infrastructure — index doesn't evaluate this dimension
Topics: