teleo-codex/inbox/archive/ai-alignment/2024-00-00-govai-coordinated-pausing-evaluation-scheme.md
Teleo Agents 572a926c38 pipeline: archive 1 conflict-closed source(s)
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-22 00:43:34 +00:00

5.5 KiB

type: source
title: Coordinated Pausing: An Evaluation-Based Coordination Scheme for Frontier AI Developers
author: Centre for the Governance of AI (GovAI)
url: https://www.governance.ai/research-paper/coordinated-pausing-evaluation-based-scheme
date: 2024-00-00
domain: ai-alignment
secondary_domains: internet-finance
format: paper
status: unprocessed
priority: high
tags: coordinated-pausing, evaluation-based-coordination, dangerous-capabilities, mandatory-evaluation, governance-architecture, antitrust, GovAI, B1-disconfirmation, translation-gap

Content

GovAI proposes an evaluation-based coordination scheme in which frontier AI developers collectively pause development when evaluations reveal dangerous capabilities. The proposal comes in four versions of escalating institutional weight:

  1. Voluntary pausing (public pressure): When a model fails dangerous capability evaluations, the developer voluntarily pauses; coordination relies on public pressure rather than any binding agreement
  2. Collective agreement: Participating developers collectively agree in advance to pause if any model from any participating lab fails evaluations
  3. Single auditor model: One independent auditor evaluates models from multiple developers; all pause if any fail
  4. Legal mandate: Developers are legally required to run evaluations AND pause if dangerous capabilities are discovered

Triggering conditions: Model "fails a set of evaluations" for dangerous capabilities. Specific capabilities cited: designing chemical weapons, exploiting vulnerabilities in safety-critical software, synthesizing disinformation at scale, evading human control.

Five-step process: (1) Evaluate for dangerous capabilities → (2) Pause R&D if failed → (3) Notify other developers → (4) Other developers pause related work → (5) Analyze and resume when safety thresholds met.
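The five-step loop can be sketched as a simple state machine. This is a hypothetical illustration only: the `Lab` class, function names, and the `evaluate` callback are my own constructions, not anything specified in the GovAI paper; the capability names are taken from the triggering conditions above.

```python
from dataclasses import dataclass
from typing import Callable

# Dangerous-capability checks named in the paper's triggering conditions.
DANGEROUS_CAPABILITIES = [
    "design_chemical_weapons",
    "exploit_safety_critical_software",
    "synthesize_disinformation_at_scale",
    "evade_human_control",
]

@dataclass
class Lab:
    """One participating frontier-AI developer (hypothetical model)."""
    name: str
    paused: bool = False

def coordinated_pausing_round(labs: list[Lab],
                              evaluate: Callable[[Lab, str], bool]) -> list[str]:
    """One round of the scheme:
    (1) evaluate every lab's frontier model for each dangerous capability;
    (2) pause any lab whose model fails;
    (3) notify the other participants (modeled here as the return value);
    (4) the other labs pause related work as well.
    Step (5), analysis and resumption, is handled by resume_all below."""
    failures = [lab.name for lab in labs
                if any(evaluate(lab, cap) for cap in DANGEROUS_CAPABILITIES)]
    if failures:  # a single failure anywhere pauses every participant
        for lab in labs:
            lab.paused = True
    return failures

def resume_all(labs: list[Lab], safety_threshold_met: bool) -> None:
    """Step (5): resume development only once the failure has been
    analyzed and the agreed safety thresholds are met."""
    if safety_threshold_met:
        for lab in labs:
            lab.paused = False
```

For example, if lab B's model fails the evade-human-control evaluation, `coordinated_pausing_round` returns `["B"]` and sets `paused` on all participants, which is the scheme's core property: one failure halts everyone, not just the failing developer.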

Core governance innovation: The scheme treats the same dangerous capability evaluations that detect risks as the compliance trigger for mandatory pausing. Research evaluations and compliance requirements become the same instrument — closing the translation gap by design.

Key obstacle: Antitrust law. Collective coordination among competing AI developers to halt development could violate competition law in multiple jurisdictions. GovAI acknowledges "practical and legal obstacles need to be overcome, especially how to avoid violations of antitrust law."

Assessment: GovAI concludes coordinated pausing is "a promising mechanism for tackling emerging risks from frontier AI models" but notes obstacles including antitrust risk and the question of who defines "failing" an evaluation.

Agent Notes

Why this matters: The Coordinated Pausing proposal is the clearest published attempt to directly bridge research evaluations and compliance requirements by making them the same thing. This is exactly what the translation gap (Layer 3 of governance inadequacy) needs — and the antitrust obstacle explains why it hasn't been implemented despite being logically compelling. This paper shows the bridge IS being designed, but legal architecture is blocking its construction.

What surprised me: The antitrust obstacle is more concrete than I expected. AI development is dominated by a handful of large companies; a collective agreement to pause on evaluation failure could be construed as a cartel agreement, especially under US antitrust law. This is a genuine structural barrier, not a theoretical one. The solution may require government mandate (Version 4) rather than industry coordination (Versions 1-3).

What I expected but didn't find: I expected GovAI to have made more progress toward implementation — the paper appears to be proposing rather than documenting active programs. No news found of this scheme being adopted by any lab or government.

KB connections:

  • Directly addresses: 2026-03-21-research-compliance-translation-gap.md — proposes a mechanism that makes research evaluations into compliance triggers
  • Confirms: B2 (alignment is a coordination problem) — the antitrust obstacle IS the coordination problem made concrete
  • Relates to: domains/ai-alignment/voluntary-safety-pledge-failure.md — Versions 1-2 have the same structural weakness as RSP-style voluntary pledges
  • Potentially connects to: Rio's mechanism design territory (prediction markets, antitrust-resistant coordination)

Extraction hints:

  1. New claim: "evaluation-based coordination schemes for frontier AI face antitrust obstacles because collective pausing agreements among competing developers could be construed as cartel behavior"
  2. New claim: "legal mandate (government-required evaluation + mandatory pause on failure) is the only version of coordinated pausing that avoids antitrust risk while preserving coordination benefits"
  3. The four-version escalation provides a roadmap for governance evolution: voluntary → collective agreement → single auditor → legal mandate

Curator Notes

PRIMARY CONNECTION: domains/ai-alignment/alignment-reframed-as-coordination-problem.md and translation-gap findings

WHY ARCHIVED: The most detailed published proposal for closing the research-to-compliance translation gap; also provides the specific legal obstacle (antitrust) explaining why voluntary coordination can't solve the problem

EXTRACTION HINT: The antitrust obstacle to coordinated pausing is the key claim. It explains why the translation gap requires government mandate (Version 4), not just industry coordination, connecting to the FDA vs. SEC model distinction