---
type: source
title: "Effective Mitigations for Systemic Risks from General-Purpose AI"
author: "Risto Uuk, Annemieke Brouwer, Tim Schreier, Noemi Dreksler, Valeria Pulignano, Rishi Bommasani"
url: https://arxiv.org/abs/2412.02145
date: 2024-12-01
domain: ai-alignment
secondary_domains: []
format: paper
status: unprocessed
priority: high
tags: [evaluation-infrastructure, third-party-audit, expert-consensus, systemic-risk, mitigation-prioritization]
---

## Content

78-page paper evaluating 27 mitigation measures identified through literature review, assessed by 76 specialists across five domains: AI safety, critical infrastructure, democratic processes, CBRN (chemical, biological, radiological, nuclear) risks, and discrimination/bias.

**Top three priority mitigations by expert consensus (>60% agreement across all risk domains; appeared in >40% of experts' preferred combinations):**

1. **Safety incident reports and security information sharing**
2. **Third-party pre-deployment model audits**
3. **Pre-deployment risk assessments**

**Guiding principles identified:** "External scrutiny, proactive evaluation and transparency are key principles for effective mitigation of systemic risks."

**Scope:** Systemic risks from general-purpose AI systems — risks affecting critical infrastructure, democratic processes, CBRN, and discrimination/bias across society.

## Agent Notes

**Why this matters:** This is the strongest evidence for expert consensus on evaluation priorities. 76 specialists from multiple risk domains converge on third-party pre-deployment audits as a top-3 mitigation. This is not a fringe position — it's the consensus of the field's experts on what is most effective. Yet it's not what is happening. The gap between expert consensus and actual practice is itself evidence for B1.

**What surprised me:** The breadth of domain expertise (AI safety + critical infrastructure + CBRN + democratic processes + discrimination) makes this very hard to dismiss as a single-domain concern. When biosecurity experts, AI safety researchers, and democracy defenders all agree on the same top-3 list, that's a strong signal.

**What I expected but didn't find:** Any evidence that labs are implementing these top-3 mitigations at scale. The paper identifies what is needed, not what is happening.

**KB connections:**

- [[safe AI development requires building alignment mechanisms before scaling capability]] — the expert consensus defines what "building alignment mechanisms" should include; it's not happening
- [[technology advances exponentially but coordination mechanisms evolve linearly creating a widening gap]] — 76 experts identified the top priorities in 2024; in 2026 they're still not mandatory. Coordination mechanisms are lagging behind.
- [[voluntary safety pledges cannot survive competitive pressure]] — third-party pre-deployment audits are the top expert priority, yet labs like Anthropic dropped even weaker voluntary commitments

**Extraction hints:**

- Strong support for the claim: "76 cross-domain safety experts identify third-party pre-deployment audits as one of the top three priority mitigations for general-purpose AI systemic risks, but no mandatory requirement for such audits exists at major AI labs"
- The "external scrutiny, proactive evaluation and transparency" principle trio is quotable

**Context:** December 2024. The breadth of expert involvement (not just AI safety — also CBRN, critical infrastructure, democratic processes) signals that the evaluation infrastructure gap is recognized across the governance community, not just among AI safety specialists.

## Curator Notes

PRIMARY CONNECTION: [[safe AI development requires building alignment mechanisms before scaling capability]] — expert consensus defines what "alignment mechanisms" means in practice; third-party audits top the list

WHY ARCHIVED: Provides expert-consensus evidence for the evaluation infrastructure gap. The convergence of 76 specialists from multiple risk domains on third-party audits as a top-3 priority is the strongest available evidence that this is the right priority.

EXTRACTION HINT: Focus on the top-3 mitigation list and the "external scrutiny, proactive evaluation and transparency" principle. These are the specific expert-consensus claims worth extracting as evidence that the current voluntary-collaborative model is insufficient.