extract: 2025-12-18-tomasev-distributional-agi-safety

Pentagon-Agent: Epimetheus <968B2991-E2DF-4006-B962-F5B0A0CC8ACA>
Teleo Agents 2026-03-19 13:36:05 +00:00
parent e274808f19
commit 1448da3014
5 changed files with 61 additions and 1 deletions


@ -19,6 +19,12 @@ This directly validates the LivingIP architecture. Since [[collective superintel
Since [[intelligence is a property of networks not individuals]], the Patchwork AGI hypothesis applies this principle to artificial general intelligence itself. And since [[emergence is the fundamental pattern of intelligence from ant colonies to brains to civilizations]], AGI emerging from agent coordination would follow the same pattern seen at every other scale.
### Additional Evidence (confirm)
*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*
Tomašev et al. (2025) provide formal theoretical support for the patchwork hypothesis, arguing that the rapid deployment of agents with tool-use and coordination capabilities makes distributed emergence a more likely path to AGI than a single monolithic system. On their account, this shifts the focus of safety research from aligning individual systems to system-of-systems governance.
---
Relevant Notes:


@ -45,6 +45,12 @@ The source identifies three market failure mechanisms driving over-adoption: (1)
Krier provides an institutional mechanism: personal AI agents enable Coasean bargaining at scale by collapsing transaction costs (discovery, negotiation, enforcement), shifting governance from top-down planning to bottom-up market coordination within state-enforced safety boundaries. He proposes 'Matryoshkan alignment' with three nested layers: outer (legal/constitutional), middle (competitive providers), inner (individual customization).
### Additional Evidence (extend)
*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*
The patchwork AGI hypothesis suggests alignment becomes primarily a coordination problem when general capability emerges from multi-agent systems rather than single models. Tomašev et al. propose market mechanisms and economic governance as the coordination infrastructure.
---
Relevant Notes:


@ -19,6 +19,12 @@ This validates the argument that [[all agents running the same model family crea
For the Teleo collective specifically: our multi-agent architecture is designed to catch some of these failures (adversarial review, separated proposer/evaluator roles). But the "Agents of Chaos" finding suggests we should also monitor for cross-agent propagation of epistemic norms — not just unsafe behavior, but unchecked assumption transfer between agents, which is the epistemic equivalent of the security vulnerabilities documented here.
### Additional Evidence (extend)
*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*
Tomašev et al. argue that collective risks from multi-agent coordination require dedicated oversight mechanisms distinct from individual agent safety measures. They propose virtual sandbox economies with auditability and reputation management as the governance layer for emergent collective behavior.
---
Relevant Notes:


@ -0,0 +1,32 @@
{
"rejected_claims": [
{
"filename": "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md",
"issues": [
"missing_attribution_extractor"
]
},
{
"filename": "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md",
"issues": [
"missing_attribution_extractor"
]
}
],
"validation_stats": {
"total": 2,
"kept": 0,
"fixed": 2,
"rejected": 2,
"fixes_applied": [
"distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md:set_created:2026-03-19",
"virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md:set_created:2026-03-19"
],
"rejections": [
"distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md:missing_attribution_extractor",
"virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md:missing_attribution_extractor"
]
},
"model": "anthropic/claude-sonnet-4.5",
"date": "2026-03-19"
}
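The validation record above can be cross-checked mechanically. The following is a minimal Python sketch, assuming the record is available as a JSON string with the field names shown (`rejected_claims`, `validation_stats`, `rejections`). The function name `summarize_validation`, the `filename:issue` string convention, and the idea that `kept + rejected` should equal `total` are assumptions made for illustration, not documented behavior of the extraction pipeline.

```python
import json

# Subset of the validation record shown above, embedded as a string so the
# sketch is self-contained ("fixes_applied", "model" and "date" are omitted).
RECORD = """
{
  "rejected_claims": [
    {"filename": "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md",
     "issues": ["missing_attribution_extractor"]},
    {"filename": "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md",
     "issues": ["missing_attribution_extractor"]}
  ],
  "validation_stats": {
    "total": 2, "kept": 0, "fixed": 2, "rejected": 2,
    "rejections": [
      "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md:missing_attribution_extractor",
      "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md:missing_attribution_extractor"
    ]
  }
}
"""


def summarize_validation(raw: str) -> dict:
    """Hypothetical helper: group issues by file and run basic consistency checks."""
    record = json.loads(raw)
    stats = record["validation_stats"]

    # Map each rejected claim file to the list of issues recorded for it.
    issues_by_file = {c["filename"]: list(c["issues"]) for c in record["rejected_claims"]}

    # Rebuild the "filename:issue" strings and compare them with stats["rejections"].
    expected = sorted(
        f"{name}:{issue}" for name, issues in issues_by_file.items() for issue in issues
    )

    return {
        "issues_by_file": issues_by_file,
        "rejected_count_matches": stats["rejected"] == len(record["rejected_claims"]),
        "rejections_match": expected == sorted(stats["rejections"]),
        # Assumption: every claim is either kept or rejected; "fixed" counts
        # repairs applied along the way and is not a final disposition.
        "counts_consistent": stats["kept"] + stats["rejected"] == stats["total"],
    }


summary = summarize_validation(RECORD)
for name, issues in summary["issues_by_file"].items():
    print(name, "->", ", ".join(issues))
```

For this record, both claims are rejected for the same reason (`missing_attribution_extractor`) and none are kept, which fits a commit that applies only enrichments to existing notes.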


@ -6,11 +6,15 @@ url: https://arxiv.org/abs/2512.16856
date_published: 2025-12-18
date_archived: 2026-03-16
domain: ai-alignment
status: unprocessed
status: enrichment
processed_by: theseus
tags: [distributed-agi, multi-agent-safety, patchwork-hypothesis, coordination]
sourced_via: "Alex Obadia (@ObadiaAlex) tweet, ARIA Research Scaling Trust programme"
twitter_id: "712705562191011841"
processed_by: theseus
processed_date: 2026-03-19
enrichments_applied: ["AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system.md", "multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments.md", "AI alignment is a coordination problem not a technical problem.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
---
# Distributional AGI Safety
@ -24,3 +28,9 @@ Key arguments:
- Safety focus shifts from individual agent alignment to managing risks at the system-of-systems level
Directly relevant to our claim [[AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system]] and to the collective superintelligence thesis.
## Key Facts
- ARIA Research Scaling Trust programme funded research on distributional AGI safety
- Paper published December 2025 on arXiv (2512.16856)
- Authors: Nenad Tomašev, Matija Franklin, Julian Jacobs, Sébastien Krier, Simon Osindero