| claim |
| ai-alignment |
| Twelve frameworks published after the 2024 Seoul Summit were evaluated against 65 criteria drawn from established risk-management principles, revealing structural inadequacy in current voluntary safety governance |
| experimental |
| Stelling et al. (arXiv:2512.01166), 65-criteria assessment against safety-critical industry standards |
| 2026-04-04 |
| Frontier AI safety frameworks score 8-35% against safety-critical industry standards, with a 52% composite ceiling even when combining best practices across all frameworks |
| theseus |
| structural |
| Lily Stelling, Malcolm Murray, Simeon Campos, Henry Papadatos |
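The headline figure distinguishes per-framework scores (8-35%) from a "composite ceiling" (52%) reached by combining the best practice from any framework on each criterion. A minimal sketch of that kind of aggregation, with entirely made-up scores and function names (illustrative only, not the paper's methodology or data):

```python
# Hypothetical illustration: each framework is scored on the same set of
# criteria (0..1 per criterion); the composite takes the best score
# achieved on each criterion by ANY framework. All numbers are invented.

def framework_score(scores):
    """Mean per-criterion score, reported as a percentage."""
    return 100 * sum(scores) / len(scores)

def composite_ceiling(frameworks):
    """Score of the best-of-all-frameworks combination: for each
    criterion, keep the highest score any single framework achieved."""
    per_criterion_best = [max(col) for col in zip(*frameworks)]
    return framework_score(per_criterion_best)

# Three toy frameworks scored on five criteria (not the paper's data).
frameworks = [
    [1.0, 0.0, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
]

individual = [framework_score(f) for f in frameworks]  # [30.0, 10.0, 20.0]
ceiling = composite_ceiling(frameworks)                # 60.0
```

The point the sketch makes is structural: the composite can exceed every individual framework's score, yet it is still bounded well below 100% whenever some criteria are unmet by all frameworks, which is the shape of the 52%-ceiling finding.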
| frontier-safety-frameworks-score-8-35-percent-against-safety-critical-standards-with-52-percent-composite-ceiling |
|
| Frontier AI safety verdicts rely partly on deployment track record rather than on evaluation-derived confidence, establishing a precedent where safety claims are empirically grounded rather than counterfactually assured | related | 2026-04-17 |
| Frontier model evaluation infrastructure is saturated: Anthropic's complete evaluation suite cannot adequately characterize Mythos's capabilities, making the benchmark ecosystem, rather than model capability, the binding constraint on safety assessment | related | 2026-05-05 |
|
| Responsible AI dimensions exhibit systematic multi-objective tension, where improving safety degrades accuracy and improving privacy reduces fairness, with no accepted framework for navigating the trade-offs |
|