teleo-codex/inbox/queue/2024-12-00-uuk-mitigations-gpai-systemic-risks-76-experts.md
2026-03-19 00:18:37 +00:00

---
type: source
title: "Effective Mitigations for Systemic Risks from General-Purpose AI"
author: "Risto Uuk, Annemieke Brouwer, Tim Schreier, Noemi Dreksler, Valeria Pulignano, Rishi Bommasani"
url: https://arxiv.org/abs/2412.02145
date: 2024-12-01
domain: ai-alignment
secondary_domains: []
format: paper
status: unprocessed
priority: high
tags: [evaluation-infrastructure, third-party-audit, expert-consensus, systemic-risk, mitigation-prioritization]
---
## Content
A 78-page paper evaluating 27 mitigation measures identified through a literature review and assessed by 76 specialists across domains: AI safety, critical infrastructure, democratic processes, CBRN (chemical, biological, radiological, nuclear) risks, and discrimination/bias.

**Top three priority mitigations by expert consensus (>60% agreement across all risk domains, appeared in >40% of experts' preferred combinations):**

1. **Safety incident reports and security information sharing**
2. **Third-party pre-deployment model audits**
3. **Pre-deployment risk assessments**

**Guiding principles identified:** "External scrutiny, proactive evaluation and transparency are key principles for effective mitigation of systemic risks."

**Scope:** Systemic risks from general-purpose AI systems, spanning critical infrastructure, democratic processes, CBRN, and discrimination/bias across society.
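
The selection rule behind the top-three list (more than 60% agreement across all risk domains, and presence in more than 40% of experts' preferred combinations) can be sketched as a simple filter. This is a minimal illustration, not the paper's analysis code. It assumes "across all risk domains" means the threshold is cleared in each domain separately, and the class name, function name, domain keys, and every number below are hypothetical placeholders rather than survey results.

```python
from dataclasses import dataclass

# Thresholds taken from the summary above; everything else is illustrative.
AGREEMENT_THRESHOLD = 0.60    # agreement must exceed this in every risk domain
COMBINATION_THRESHOLD = 0.40  # share of experts' preferred combinations


@dataclass
class Mitigation:
    name: str
    agreement_by_domain: dict[str, float]  # fraction of experts agreeing, per domain
    combination_share: float               # fraction of preferred combinations including it


def is_priority(m: Mitigation) -> bool:
    """Return True if the mitigation clears both thresholds (strictly greater than)."""
    clears_every_domain = all(
        share > AGREEMENT_THRESHOLD for share in m.agreement_by_domain.values()
    )
    return clears_every_domain and m.combination_share > COMBINATION_THRESHOLD


# Hypothetical placeholder values, NOT figures from the paper.
candidates = [
    Mitigation("mitigation_a", {"cbrn": 0.70, "democratic_processes": 0.65}, 0.45),
    Mitigation("mitigation_b", {"cbrn": 0.80, "democratic_processes": 0.55}, 0.50),
    Mitigation("mitigation_c", {"cbrn": 0.75, "democratic_processes": 0.70}, 0.30),
]

priority = [m.name for m in candidates if is_priority(m)]
print(priority)  # ['mitigation_a']
```

With these placeholder inputs, `mitigation_b` is excluded because one domain falls below the agreement threshold, and `mitigation_c` because it appears in too few preferred combinations. Both conditions must hold for a measure to count as a priority.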
## Agent Notes
**Why this matters:** This is the strongest evidence for expert consensus on evaluation priorities. The 76 specialists, drawn from multiple risk domains, converge on third-party pre-deployment audits as a top-3 mitigation (>60% agreement across all risk domains). This is not a fringe position; it is the consensus of the field's experts on what is most effective. Yet it is not what is happening. The gap between expert consensus and actual practice is itself evidence for B1.

**What surprised me:** The breadth of domain expertise (AI safety + critical infrastructure + CBRN + democratic processes + discrimination) makes this very hard to dismiss as a single-domain concern. When biosecurity experts, AI safety researchers, and democracy defenders all agree on the same top-3 list, that is a strong signal.

**What I expected but didn't find:** Any evidence that labs are implementing these top-3 mitigations at scale. The paper identifies what's needed, not what's happening.

**KB connections:**
- [[safe AI development requires building alignment mechanisms before scaling capability]] — the expert consensus defines what "building alignment mechanisms" should include; it's not happening
- [[technology advances exponentially but coordination mechanisms evolve linearly creating a widening gap]] — 76 experts identify the top priorities in 2024; in 2026, they're still not mandatory. Coordination mechanism evolution is lagging.
- [[voluntary safety pledges cannot survive competitive pressure]] — third-party pre-deployment audits are a top-three expert priority; labs like Anthropic dropped even weaker voluntary commitments

**Extraction hints:**
- Strong support for a claim: "76 cross-domain safety experts identify third-party pre-deployment audits as one of the top three priority mitigations for general-purpose AI systemic risks, but no mandatory requirement for such audits applies to major AI labs"
- The "external scrutiny, proactive evaluation and transparency" principle trio is quotable

**Context:** December 2024. The breadth of expert involvement (not just AI safety but also CBRN, critical infrastructure, and democratic processes) signals that the evaluation infrastructure gap is recognized across the governance community, not just among AI safety specialists.
## Curator Notes
PRIMARY CONNECTION: [[safe AI development requires building alignment mechanisms before scaling capability]] — expert consensus defines what "alignment mechanisms" means in practice; third-party audits rank among the top three

WHY ARCHIVED: Provides expert consensus evidence for the evaluation infrastructure gap. The convergence of 76 specialists from multiple risk domains on third-party audits as a top-3 priority is the strongest available evidence that this is the right priority.

EXTRACTION HINT: Focus on the top-3 mitigation list and the "external scrutiny, proactive evaluation and transparency" principle. These are the specific expert consensus claims worth extracting as evidence for why the current voluntary-collaborative model is insufficient.