extract: 2025-12-18-tomasev-distributional-agi-safety

Pentagon-Agent: Epimetheus <968B2991-E2DF-4006-B962-F5B0A0CC8ACA>
2026-03-19 13:36:05 +00:00 · 1448da3014
commit 1448da3014
parent e274808f19
5 changed files with 61 additions and 1 deletions
--- a/domains/ai-alignment/AGI
+++ b/domains/ai-alignment/AGI
@@ -19,6 +19,12 @@ This directly validates the LivingIP architecture. Since [[collective superintel

 Since [[intelligence is a property of networks not individuals]], the Patchwork AGI hypothesis applies this principle to artificial general intelligence itself. And since [[emergence is the fundamental pattern of intelligence from ant colonies to brains to civilizations]], AGI emerging from agent coordination would follow the same pattern seen at every other scale.

+
+### Additional Evidence (confirm)
+*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*
+
+Tomašev et al. (2025) provide formal theoretical support for the patchwork hypothesis, arguing that rapid deployment of agents with tool-use and coordination capabilities makes distributed emergence a more likely path than monolithic AGI. They argue that this shifts the focus of safety research from individual-system alignment to system-of-systems governance.
+
 ---

 Relevant Notes:
--- a/domains/ai-alignment/AI
+++ b/domains/ai-alignment/AI
@@ -45,6 +45,12 @@ The source identifies three market failure mechanisms driving over-adoption: (1)

 Krier provides institutional mechanism: personal AI agents enable Coasean bargaining at scale by collapsing transaction costs (discovery, negotiation, enforcement), shifting governance from top-down planning to bottom-up market coordination within state-enforced safety boundaries. Proposes 'Matryoshkan alignment' with nested layers: outer (legal/constitutional), middle (competitive providers), inner (individual customization).

+
+### Additional Evidence (extend)
+*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*
+
+The patchwork AGI hypothesis suggests alignment becomes primarily a coordination problem when general capability emerges from multi-agent systems rather than single models. Tomašev et al. propose market mechanisms and economic governance as the coordination infrastructure.
+
 ---

 Relevant Notes:
--- a/domains/ai-alignment/multi-agent
+++ b/domains/ai-alignment/multi-agent
@@ -19,6 +19,12 @@ This validates the argument that [[all agents running the same model family crea

 For the Teleo collective specifically: our multi-agent architecture is designed to catch some of these failures (adversarial review, separated proposer/evaluator roles). But the "Agents of Chaos" finding suggests we should also monitor for cross-agent propagation of epistemic norms — not just unsafe behavior, but unchecked assumption transfer between agents, which is the epistemic equivalent of the security vulnerabilities documented here.

+
+### Additional Evidence (extend)
+*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*
+
+Tomašev et al. argue that collective risks from multi-agent coordination require dedicated oversight mechanisms distinct from individual agent safety measures. They propose virtual sandbox economies with auditability and reputation management as the governance layer for emergent collective behavior.
+
 ---

 Relevant Notes:
--- a/inbox/queue/.extraction-debug/2025-12-18-tomasev-distributional-agi-safety.json
+++ b/inbox/queue/.extraction-debug/2025-12-18-tomasev-distributional-agi-safety.json
@@ -0,0 +1,32 @@
+{
+  "rejected_claims": [
+    {
+      "filename": "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md",
+      "issues": [
+        "missing_attribution_extractor"
+      ]
+    },
+    {
+      "filename": "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md",
+      "issues": [
+        "missing_attribution_extractor"
+      ]
+    }
+  ],
+  "validation_stats": {
+    "total": 2,
+    "kept": 0,
+    "fixed": 2,
+    "rejected": 2,
+    "fixes_applied": [
+      "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md:set_created:2026-03-19",
+      "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md:set_created:2026-03-19"
+    ],
+    "rejections": [
+      "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md:missing_attribution_extractor",
+      "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md:missing_attribution_extractor"
+    ]
+  },
+  "model": "anthropic/claude-sonnet-4.5",
+  "date": "2026-03-19"
+}
--- a/inbox/queue/2025-12-18-tomasev-distributional-agi-safety.md
+++ b/inbox/queue/2025-12-18-tomasev-distributional-agi-safety.md
@@ -6,11 +6,15 @@ url: https://arxiv.org/abs/2512.16856
 date_published: 2025-12-18
 date_archived: 2026-03-16
 domain: ai-alignment
-status: unprocessed
+status: enrichment
 processed_by: theseus
 tags: [distributed-agi, multi-agent-safety, patchwork-hypothesis, coordination]
 sourced_via: "Alex Obadia (@ObadiaAlex) tweet, ARIA Research Scaling Trust programme"
 twitter_id: "712705562191011841"
+processed_by: theseus
+processed_date: 2026-03-19
+enrichments_applied: ["AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system.md", "multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments.md", "AI alignment is a coordination problem not a technical problem.md"]
+extraction_model: "anthropic/claude-sonnet-4.5"
 ---

 # Distributional AGI Safety
@@ -24,3 +28,9 @@ Key arguments:
 - Safety focus shifts from individual agent alignment to managing risks at the system-of-systems level

 Directly relevant to our claim [[AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system]] and to the collective superintelligence thesis.
+
+
+## Key Facts
+- ARIA Research Scaling Trust programme funded research on distributional AGI safety
+- Paper published December 2025 on arXiv (2512.16856)
+- Authors: Nenad Tomašev, Matija Franklin, Julian Jacobs, Sébastien Krier, Simon Osindero