- Source: inbox/archive/2025-07-00-fli-ai-safety-index-summer-2025.md
- Domain: ai-alignment
| type | domain | description | confidence | source | created |
|---|---|---|---|---|---|
| claim | ai-alignment | FLI's Summer 2025 index shows all frontier AI labs score D or below in existential safety planning while claiming AGI within a decade | likely | Future of Life Institute, AI Safety Index Summer 2025 (2025-07-01) | 2026-03-11 |
No frontier AI company scores above D in existential safety despite active AGI development
Future of Life Institute's comprehensive evaluation of frontier AI companies across six safety dimensions reveals a critical gap: every company scored D or below in existential safety planning, despite most claiming they will achieve AGI within a decade. The best performer, Anthropic, achieved only a C+ overall (2.64/4.0) and a D in existential safety. The index evaluated seven companies:

| company | grade | score (/4.0) |
|---|---|---|
| Anthropic | C+ | 2.64 |
| OpenAI | C | 2.10 |
| Google DeepMind | C- | 1.76 |
| x.AI | D | 1.23 |
| Meta | D | 1.06 |
| Zhipu AI | F | 0.62 |
| DeepSeek | F | 0.37 |
The six evaluation dimensions were: Risk Assessment (dangerous capability testing), Current Harms (safety benchmarks and robustness), Safety Frameworks (risk management processes), Existential Safety (planning for human-level AI), Governance & Accountability (whistleblowing and oversight), and Information Sharing (transparency on specs and risks).
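To make the scoring arithmetic concrete, here is a minimal sketch of a per-company scorecard, assuming (as the 2.64/4.0 figure suggests) that the overall score aggregates six dimension scores on a 0-4.0 scale. FLI's actual weighting and its mapping from points to letter grades are not specified in this note, so the unweighted mean below is an assumption.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Scorecard:
    """One company's grades across the index's six dimensions (0-4.0 each)."""
    company: str
    risk_assessment: float            # dangerous capability testing
    current_harms: float              # safety benchmarks and robustness
    safety_frameworks: float          # risk management processes
    existential_safety: float         # planning for human-level AI
    governance_accountability: float  # whistleblowing and oversight
    information_sharing: float        # transparency on specs and risks

    def overall(self) -> float:
        # Assumption: unweighted mean. FLI's actual aggregation may weight
        # dimensions differently or round before assigning a letter grade.
        return round(mean([
            self.risk_assessment,
            self.current_harms,
            self.safety_frameworks,
            self.existential_safety,
            self.governance_accountability,
            self.information_sharing,
        ]), 2)
```

Under this assumed averaging, an overall of 2.64 like Anthropic's is compatible with roughly B-level grades in five dimensions alongside a D (about 1.0) in existential safety, which is exactly the pattern the index reports.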
Critical findings:
- Only three firms (Anthropic, OpenAI, Google DeepMind) conduct substantive testing for dangerous capabilities such as assistance with bioterrorism and cyberattacks
- OpenAI is the only company to have published its whistleblowing policy in full
- FLI reviewers noted "none of the companies has anything like a coherent, actionable plan" for human-level AI safety
- The disconnect between AGI claims and existential safety scores was described as "deeply disturbing"
The methodology was peer-reviewed and based on publicly available information plus email correspondence with developers. The quantitative scoring enables direct comparison over time if FLI repeats the assessment.
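As a sketch of that over-time comparison, assuming FLI keeps the same 0-4.0 scale across rounds: the Summer 2025 scores below are the overall figures reported above, while `next_round` is a purely hypothetical placeholder used only to show the delta computation.

```python
# Summer 2025 overall scores as reported in the index.
summer_2025 = {
    "Anthropic": 2.64, "OpenAI": 2.10, "Google DeepMind": 1.76,
    "x.AI": 1.23, "Meta": 1.06, "Zhipu AI": 0.62, "DeepSeek": 0.37,
}

def score_deltas(earlier: dict[str, float], later: dict[str, float]) -> dict[str, float]:
    """Per-company change in overall score between two assessment rounds."""
    return {c: round(later[c] - earlier[c], 2) for c in earlier if c in later}

# Hypothetical future round (placeholder values, not real data):
next_round = {"Anthropic": 2.80, "OpenAI": 2.30}
print(score_deltas(summer_2025, next_round))  # {'Anthropic': 0.16, 'OpenAI': 0.2}
```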
Evidence
This claim is supported by FLI's systematic evaluation across standardized criteria, providing the first quantitative company-level comparison of AI safety practices. The universal failure to score above D in existential safety, combined with active AGI development programs, provides empirical evidence for the gap between safety rhetoric and practice.
Relevant Notes:
- the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it — this index provides company-level quantification
- voluntary safety pledges cannot survive competitive pressure — confirmed by best company scoring only C+
- safe AI development requires building alignment mechanisms before scaling capability — violated by every company assessed
- no research group is building alignment through collective intelligence infrastructure — index doesn't evaluate this dimension
Topics: