extract: 2025-12-18-tomasev-distributional-agi-safety
Pentagon-Agent: Epimetheus <968B2991-E2DF-4006-B962-F5B0A0CC8ACA>
parent e274808f19
commit 1448da3014
5 changed files with 61 additions and 1 deletion
@@ -19,6 +19,12 @@ This directly validates the LivingIP architecture. Since [[collective superintel
 
 Since [[intelligence is a property of networks not individuals]], the Patchwork AGI hypothesis applies this principle to artificial general intelligence itself. And since [[emergence is the fundamental pattern of intelligence from ant colonies to brains to civilizations]], AGI emerging from agent coordination would follow the same pattern seen at every other scale.
 
+
+### Additional Evidence (confirm)
+*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*
+
+Tomašev et al. (2025) provide formal theoretical support for the patchwork hypothesis, arguing that rapid deployment of agents with tool-use and coordination capabilities makes distributed emergence the more likely path than monolithic AGI. They propose this shifts safety research from individual system alignment to system-of-systems governance.
+
 ---
 
 Relevant Notes:

@@ -45,6 +45,12 @@ The source identifies three market failure mechanisms driving over-adoption: (1)
 
 Krier provides institutional mechanism: personal AI agents enable Coasean bargaining at scale by collapsing transaction costs (discovery, negotiation, enforcement), shifting governance from top-down planning to bottom-up market coordination within state-enforced safety boundaries. Proposes 'Matryoshkan alignment' with nested layers: outer (legal/constitutional), middle (competitive providers), inner (individual customization).
 
+
+### Additional Evidence (extend)
+*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*
+
+The patchwork AGI hypothesis suggests alignment becomes primarily a coordination problem when general capability emerges from multi-agent systems rather than single models. Tomašev et al. propose market mechanisms and economic governance as the coordination infrastructure.
+
 ---
 
 Relevant Notes:

@@ -19,6 +19,12 @@ This validates the argument that [[all agents running the same model family crea
 
 For the Teleo collective specifically: our multi-agent architecture is designed to catch some of these failures (adversarial review, separated proposer/evaluator roles). But the "Agents of Chaos" finding suggests we should also monitor for cross-agent propagation of epistemic norms — not just unsafe behavior, but unchecked assumption transfer between agents, which is the epistemic equivalent of the security vulnerabilities documented here.
 
+
+### Additional Evidence (extend)
+*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*
+
+Tomašev et al. argue that collective risks from multi-agent coordination require dedicated oversight mechanisms distinct from individual agent safety measures. They propose virtual sandbox economies with auditability and reputation management as the governance layer for emergent collective behavior.
+
 ---
 
 Relevant Notes:

@@ -0,0 +1,32 @@
+{
+  "rejected_claims": [
+    {
+      "filename": "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md",
+      "issues": [
+        "missing_attribution_extractor"
+      ]
+    },
+    {
+      "filename": "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md",
+      "issues": [
+        "missing_attribution_extractor"
+      ]
+    }
+  ],
+  "validation_stats": {
+    "total": 2,
+    "kept": 0,
+    "fixed": 2,
+    "rejected": 2,
+    "fixes_applied": [
+      "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md:set_created:2026-03-19",
+      "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md:set_created:2026-03-19"
+    ],
+    "rejections": [
+      "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md:missing_attribution_extractor",
+      "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md:missing_attribution_extractor"
+    ]
+  },
+  "model": "anthropic/claude-sonnet-4.5",
+  "date": "2026-03-19"
+}

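The validation record in the preceding hunk stores each rejection twice: once as an object under `rejected_claims` and once as a flat `filename:issue` string under `validation_stats.rejections`. As a rough illustration of how the two views relate, the sketch below cross-checks them. The function names and the trimmed `RECORD` literal are hypothetical, introduced only for this example; nothing here is taken from the pipeline that produced the file.

```python
# Hypothetical, trimmed copy of the validation record (only the fields used below).
RECORD = {
    "rejected_claims": [
        {
            "filename": "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md",
            "issues": ["missing_attribution_extractor"],
        },
        {
            "filename": "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md",
            "issues": ["missing_attribution_extractor"],
        },
    ],
    "validation_stats": {
        "rejections": [
            "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md:missing_attribution_extractor",
            "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md:missing_attribution_extractor",
        ],
    },
}


def split_entry(entry: str) -> tuple[str, str]:
    """Split a flat 'filename:detail' string at the first colon.

    Assumes filenames contain no colon, which holds for the entries shown.
    """
    filename, _, detail = entry.partition(":")
    return filename, detail


def rejections_by_file(record: dict) -> dict[str, list[str]]:
    """Group the flat rejection strings by filename."""
    grouped: dict[str, list[str]] = {}
    for entry in record["validation_stats"]["rejections"]:
        filename, issue = split_entry(entry)
        grouped.setdefault(filename, []).append(issue)
    return grouped


def views_agree(record: dict) -> bool:
    """True when 'rejected_claims' and the flat 'rejections' list describe the same issues."""
    listed = {c["filename"]: sorted(c["issues"]) for c in record["rejected_claims"]}
    flat = {name: sorted(issues) for name, issues in rejections_by_file(record).items()}
    return listed == flat
```

Calling `views_agree(RECORD)` compares the two representations; splitting at the first colon also keeps the date intact for `fixes_applied` entries such as `name.md:set_created:2026-03-19`.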
@@ -6,11 +6,15 @@ url: https://arxiv.org/abs/2512.16856
 date_published: 2025-12-18
 date_archived: 2026-03-16
 domain: ai-alignment
-status: unprocessed
+status: enrichment
 processed_by: theseus
 tags: [distributed-agi, multi-agent-safety, patchwork-hypothesis, coordination]
 sourced_via: "Alex Obadia (@ObadiaAlex) tweet, ARIA Research Scaling Trust programme"
 twitter_id: "712705562191011841"
+processed_by: theseus
+processed_date: 2026-03-19
+enrichments_applied: ["AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system.md", "multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments.md", "AI alignment is a coordination problem not a technical problem.md"]
+extraction_model: "anthropic/claude-sonnet-4.5"
 ---
 
 # Distributional AGI Safety

@@ -24,3 +28,9 @@ Key arguments:
 - Safety focus shifts from individual agent alignment to managing risks at the system-of-systems level
 
 Directly relevant to our claim [[AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system]] and to the collective superintelligence thesis.
+
+
+## Key Facts
+- ARIA Research Scaling Trust programme funded research on distributional AGI safety
+- Paper published December 2025 on arXiv (2512.16856)
+- Authors: Nenad Tomašev, Matija Franklin, Julian Jacobs, Sébastien Krier, Simon Osindero