extract: 2025-12-18-tomasev-distributional-agi-safety
Pentagon-Agent: Epimetheus <968B2991-E2DF-4006-B962-F5B0A0CC8ACA>
This commit is contained in:
parent
a664eeb0ca
commit
ee178d6f7d
6 changed files with 68 additions and 1 deletions
@ -19,6 +19,12 @@ This directly validates the LivingIP architecture. Since [[collective superintel
Since [[intelligence is a property of networks not individuals]], the Patchwork AGI hypothesis applies this principle to artificial general intelligence itself. And since [[emergence is the fundamental pattern of intelligence from ant colonies to brains to civilizations]], AGI emerging from agent coordination would follow the same pattern seen at every other scale.

### Additional Evidence (confirm)

*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*

Tomašev et al. (2025) formalize this as the 'patchwork AGI hypothesis' and argue that it is the primary near-term risk scenario, proposing 'virtual agentic sandbox economies' as the governance response. They emphasize that the rapid deployment of coordinating agents makes this more urgent than monolithic superintelligence alignment.

---

Relevant Notes:
@ -45,6 +45,12 @@ The source identifies three market failure mechanisms driving over-adoption: (1)
Krier provides the institutional mechanism: personal AI agents enable Coasean bargaining at scale by collapsing transaction costs (discovery, negotiation, enforcement), shifting governance from top-down planning to bottom-up market coordination within state-enforced safety boundaries. He proposes 'Matryoshkan alignment' with nested layers: outer (legal/constitutional), middle (competitive providers), inner (individual customization).

### Additional Evidence (confirm)

*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*

Tomašev et al. provide technical grounding for this claim by showing that distributed AGI emergence shifts the alignment problem from individual objective specification to system-level coordination governance, requiring market mechanisms and collective oversight infrastructure.

---

Relevant Notes:
@ -43,6 +43,12 @@ Since [[the alignment tax creates a structural race to the bottom because safety
The open-source game framework provides 'interpretability, inter-agent transparency, and formal verifiability' as coordination infrastructure. The paper shows agents adapting mechanisms across repeated games, suggesting that protocol design (the game structure) shapes strategic behavior more than base model capability does.

### Additional Evidence (extend)

*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*

The patchwork AGI hypothesis suggests coordination protocols don't just improve performance: they are the mechanism through which general capability emerges from sub-AGI components, making protocol design the critical path to AGI rather than a performance optimization.

---

Relevant Notes:
@ -25,6 +25,12 @@ For the Teleo collective specifically: our multi-agent architecture is designed
Open-source games reveal that code transparency creates new attack surfaces: agents can inspect opponent code to identify exploitable patterns. Sistla & Kleiman-Weiner show that deceptive tactics emerge even with full code visibility, suggesting multi-agent vulnerabilities persist beyond information asymmetry.

### Additional Evidence (extend)

*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*

The paper extends this beyond security vulnerabilities to argue that collective risks require fundamentally different governance mechanisms (market-based coordination, reputation systems, and collective oversight) rather than just better single-agent evaluation.

---

Relevant Notes:
@ -0,0 +1,32 @@
{
  "rejected_claims": [
    {
      "filename": "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md",
      "issues": [
        "missing_attribution_extractor"
      ]
    },
    {
      "filename": "rapid-agent-deployment-makes-distributed-safety-more-urgent-than-superintelligence-alignment.md",
      "issues": [
        "missing_attribution_extractor"
      ]
    }
  ],
  "validation_stats": {
    "total": 2,
    "kept": 0,
    "fixed": 2,
    "rejected": 2,
    "fixes_applied": [
      "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md:set_created:2026-03-19",
      "rapid-agent-deployment-makes-distributed-safety-more-urgent-than-superintelligence-alignment.md:set_created:2026-03-19"
    ],
    "rejections": [
      "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md:missing_attribution_extractor",
      "rapid-agent-deployment-makes-distributed-safety-more-urgent-than-superintelligence-alignment.md:missing_attribution_extractor"
    ]
  },
  "model": "anthropic/claude-sonnet-4.5",
  "date": "2026-03-19"
}
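The validation file above is plain JSON whose summary counts duplicate information held in its list fields, so a consumer can cross-check them. A minimal sketch of such a check (a hypothetical consumer, not part of this repository), using an abbreviated record of the same shape with placeholder filenames:

```python
import json

# Abbreviated validation record in the same shape as the committed file.
# Filenames here are placeholders, not the real note files.
record = json.loads("""
{
  "rejected_claims": [
    {"filename": "a.md", "issues": ["missing_attribution_extractor"]},
    {"filename": "b.md", "issues": ["missing_attribution_extractor"]}
  ],
  "validation_stats": {
    "total": 2,
    "kept": 0,
    "fixed": 2,
    "rejected": 2,
    "fixes_applied": ["a.md:set_created:2026-03-19", "b.md:set_created:2026-03-19"],
    "rejections": ["a.md:missing_attribution_extractor", "b.md:missing_attribution_extractor"]
  }
}
""")

stats = record["validation_stats"]

# Each summary count must agree with the list it summarizes.
assert stats["rejected"] == len(record["rejected_claims"]) == len(stats["rejections"])
assert stats["fixed"] == len(stats["fixes_applied"])

# Every claim was either kept or rejected.
assert stats["total"] == stats["kept"] + stats["rejected"]
```

Note that in this record every claim was both fixed (a missing `created` date was set) and then rejected anyway for the remaining `missing_attribution_extractor` issue, which is why `fixed` and `rejected` are both 2 while `kept` is 0.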
@ -6,11 +6,15 @@ url: https://arxiv.org/abs/2512.16856
date_published: 2025-12-18
date_archived: 2026-03-16
domain: ai-alignment
status: unprocessed
status: enrichment
processed_by: theseus
tags: [distributed-agi, multi-agent-safety, patchwork-hypothesis, coordination]
sourced_via: "Alex Obadia (@ObadiaAlex) tweet, ARIA Research Scaling Trust programme"
twitter_id: "712705562191011841"
processed_by: theseus
processed_date: 2026-03-19
enrichments_applied: ["AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system.md", "multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments.md", "AI alignment is a coordination problem not a technical problem.md", "coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
---

# Distributional AGI Safety
@ -24,3 +28,10 @@ Key arguments:
- Safety focus shifts from individual agent alignment to managing risks at the system-of-systems level

Directly relevant to our claim [[AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system]] and to the collective superintelligence thesis.

## Key Facts
- Paper published December 2025 by Nenad Tomašev, Matija Franklin, Julian Jacobs, Sébastien Krier, Simon Osindero
- Research supported by ARIA Research Scaling Trust programme
- Proposes 'virtual agentic sandbox economies' as governance framework
- Framework includes market mechanisms, auditability, reputation management, and collective oversight