---
type: source
title: "AI Safety Index Summer 2025"
author: "Future of Life Institute (FLI)"
url: https://futureoflife.org/ai-safety-index-summer-2025/
date: 2025-07-01
domain: ai-alignment
secondary_domains: [grand-strategy]
format: report
status: unprocessed
priority: high
tags: [AI-safety, company-scores, accountability, governance, existential-risk, transparency]
---
## Content
FLI's comprehensive evaluation of frontier AI companies across 6 safety dimensions.

**Company scores (letter grades and numeric):**

- Anthropic: C+ (2.64) — best overall
- OpenAI: C (2.10) — second
- Google DeepMind: C- (1.76) — third
- x.AI: D (1.23)
- Meta: D (1.06)
- Zhipu AI: F (0.62)
- DeepSeek: F (0.37)
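If FLI repeats the assessment, the numeric scores above can be diffed across editions. A minimal sketch of that comparison (the dictionary structure and function name are illustrative, not from the report; only the Summer 2025 numbers are sourced):

```python
# FLI AI Safety Index, Summer 2025 overall scores (numeric, from the report).
SUMMER_2025 = {
    "Anthropic": 2.64,
    "OpenAI": 2.10,
    "Google DeepMind": 1.76,
    "x.AI": 1.23,
    "Meta": 1.06,
    "Zhipu AI": 0.62,
    "DeepSeek": 0.37,
}


def score_deltas(earlier: dict, later: dict) -> dict:
    """Per-company score change between two index editions.

    Only companies present in both editions are compared, so additions
    or drop-outs between editions are ignored rather than erroring.
    """
    return {
        company: round(later[company] - earlier[company], 2)
        for company in earlier
        if company in later
    }
```

A hypothetical future edition would then be compared with `score_deltas(SUMMER_2025, future_scores)`, giving a signed change per company.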
**Six dimensions evaluated:**

1. Risk Assessment — dangerous capability testing
2. Current Harms — safety benchmarks and robustness
3. Safety Frameworks — risk management processes
4. Existential Safety — planning for human-level AI
5. Governance & Accountability — whistleblowing and oversight
6. Information Sharing — transparency on specs and risks
**Critical findings:**

- NO company scored above D in existential safety despite claiming AGI within a decade
- Only 3 firms (Anthropic, OpenAI, DeepMind) conduct substantive testing for dangerous capabilities (bioterrorism, cyberattacks)
- Only OpenAI published its full whistleblowing policy publicly
- The absence of regulatory floors lets the divergence in safety practices keep widening
- Reviewer: the disconnect between AGI claims and existential safety scores is "deeply disturbing"
- "None of the companies has anything like a coherent, actionable plan" for human-level AI safety
## Agent Notes

**Why this matters:** Quantifies the gap between AI safety rhetoric and practice at the company level. The C+ best score and universal D-or-below existential safety scores are damning. This is the empirical evidence for our "race to the bottom" claim.

**What surprised me:** The MAGNITUDE of the gap. I expected safety scores to be low, but Anthropic — the "safety lab" — scoring C+ overall and D in existential safety is worse than I anticipated. Also: only OpenAI has a public whistleblowing policy. The accountability infrastructure is almost non-existent.

**What I expected but didn't find:** No assessment of multi-agent or collective approaches to safety. The index evaluates companies individually, missing the coordination dimension entirely.
**KB connections:**

- [[the alignment tax creates a structural race to the bottom]] — confirmed with specific company-level data
- [[voluntary safety pledges cannot survive competitive pressure]] — strongly confirmed (best company = C+)
- [[safe AI development requires building alignment mechanisms before scaling capability]] — violated by every company assessed
- [[no research group is building alignment through collective intelligence infrastructure]] — index doesn't even evaluate this dimension
**Extraction hints:** Key claim: no frontier AI company has a coherent existential safety plan despite active AGI development programs. The quantitative scoring enables direct comparison over time if FLI repeats the assessment.

**Context:** FLI is a well-established AI safety organization. The index methodology was peer-reviewed. Company scores are based on publicly available information plus email correspondence with developers.
## Curator Notes (structured handoff for extractor)

PRIMARY CONNECTION: [[the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it]]

WHY ARCHIVED: Provides quantitative company-level evidence for the race-to-the-bottom dynamic — best company scores C+ in overall safety, all companies score D or below in existential safety

EXTRACTION HINT: The headline claim is "no frontier AI company scores above D in existential safety despite AGI claims." The company-by-company comparison and the existential safety gap are the highest-value extractions.