From 1448da3014713720564fa9027557f4edf57e020e Mon Sep 17 00:00:00 2001
From: Teleo Agents
Date: Thu, 19 Mar 2026 13:36:05 +0000
Subject: [PATCH] extract: 2025-12-18-tomasev-distributional-agi-safety

Pentagon-Agent: Epimetheus <968B2991-E2DF-4006-B962-F5B0A0CC8ACA>
---
 ... rather than a single monolithic system.md |  6 ++++
 ...ination problem not a technical problem.md |  6 ++++
 ...y in realistic multi-party environments.md |  6 ++++
 ...-18-tomasev-distributional-agi-safety.json | 32 +++++++++++++++++++
 ...12-18-tomasev-distributional-agi-safety.md | 12 ++++++-
 5 files changed, 61 insertions(+), 1 deletion(-)
 create mode 100644 inbox/queue/.extraction-debug/2025-12-18-tomasev-distributional-agi-safety.json

diff --git a/domains/ai-alignment/AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system.md b/domains/ai-alignment/AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system.md
index fda237cc1..f4b8e99fe 100644
--- a/domains/ai-alignment/AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system.md
+++ b/domains/ai-alignment/AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system.md
@@ -19,6 +19,12 @@ This directly validates the LivingIP architecture. Since [[collective superintel
 
 Since [[intelligence is a property of networks not individuals]], the Patchwork AGI hypothesis applies this principle to artificial general intelligence itself. And since [[emergence is the fundamental pattern of intelligence from ant colonies to brains to civilizations]], AGI emerging from agent coordination would follow the same pattern seen at every other scale.
 
+
+### Additional Evidence (confirm)
+*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*
+
+Tomašev et al. (2025) provide formal theoretical support for the patchwork hypothesis, arguing that rapid deployment of agents with tool-use and coordination capabilities makes distributed emergence the more likely path than monolithic AGI. They propose this shifts safety research from individual system alignment to system-of-systems governance.
+
 ---
 
 Relevant Notes:
diff --git a/domains/ai-alignment/AI alignment is a coordination problem not a technical problem.md b/domains/ai-alignment/AI alignment is a coordination problem not a technical problem.md
index c554ceccb..c291982df 100644
--- a/domains/ai-alignment/AI alignment is a coordination problem not a technical problem.md
+++ b/domains/ai-alignment/AI alignment is a coordination problem not a technical problem.md
@@ -45,6 +45,12 @@ The source identifies three market failure mechanisms driving over-adoption: (1)
 
 Krier provides institutional mechanism: personal AI agents enable Coasean bargaining at scale by collapsing transaction costs (discovery, negotiation, enforcement), shifting governance from top-down planning to bottom-up market coordination within state-enforced safety boundaries. Proposes 'Matryoshkan alignment' with nested layers: outer (legal/constitutional), middle (competitive providers), inner (individual customization).
 
+
+### Additional Evidence (extend)
+*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*
+
+The patchwork AGI hypothesis suggests alignment becomes primarily a coordination problem when general capability emerges from multi-agent systems rather than single models. Tomašev et al. propose market mechanisms and economic governance as the coordination infrastructure.
+
 ---
 
 Relevant Notes:
diff --git a/domains/ai-alignment/multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments.md b/domains/ai-alignment/multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments.md
index 7bf07ee6a..8de5ab851 100644
--- a/domains/ai-alignment/multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments.md
+++ b/domains/ai-alignment/multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments.md
@@ -19,6 +19,12 @@ This validates the argument that [[all agents running the same model family crea
 
 For the Teleo collective specifically: our multi-agent architecture is designed to catch some of these failures (adversarial review, separated proposer/evaluator roles). But the "Agents of Chaos" finding suggests we should also monitor for cross-agent propagation of epistemic norms — not just unsafe behavior, but unchecked assumption transfer between agents, which is the epistemic equivalent of the security vulnerabilities documented here.
 
+
+### Additional Evidence (extend)
+*Source: [[2025-12-18-tomasev-distributional-agi-safety]] | Added: 2026-03-19*
+
+Tomašev et al. argue that collective risks from multi-agent coordination require dedicated oversight mechanisms distinct from individual agent safety measures. They propose virtual sandbox economies with auditability and reputation management as the governance layer for emergent collective behavior.
+
 ---
 
 Relevant Notes:
diff --git a/inbox/queue/.extraction-debug/2025-12-18-tomasev-distributional-agi-safety.json b/inbox/queue/.extraction-debug/2025-12-18-tomasev-distributional-agi-safety.json
new file mode 100644
index 000000000..95366f05c
--- /dev/null
+++ b/inbox/queue/.extraction-debug/2025-12-18-tomasev-distributional-agi-safety.json
@@ -0,0 +1,32 @@
+{
+  "rejected_claims": [
+    {
+      "filename": "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md",
+      "issues": [
+        "missing_attribution_extractor"
+      ]
+    },
+    {
+      "filename": "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md",
+      "issues": [
+        "missing_attribution_extractor"
+      ]
+    }
+  ],
+  "validation_stats": {
+    "total": 2,
+    "kept": 0,
+    "fixed": 2,
+    "rejected": 2,
+    "fixes_applied": [
+      "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md:set_created:2026-03-19",
+      "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md:set_created:2026-03-19"
+    ],
+    "rejections": [
+      "distributed-agi-safety-requires-system-level-governance-not-individual-agent-alignment.md:missing_attribution_extractor",
+      "virtual-sandbox-economies-enable-distributed-agi-safety-through-market-discipline.md:missing_attribution_extractor"
+    ]
+  },
+  "model": "anthropic/claude-sonnet-4.5",
+  "date": "2026-03-19"
+}
\ No newline at end of file
diff --git a/inbox/queue/2025-12-18-tomasev-distributional-agi-safety.md b/inbox/queue/2025-12-18-tomasev-distributional-agi-safety.md
index 7ff8bdc60..11c751bf4 100644
--- a/inbox/queue/2025-12-18-tomasev-distributional-agi-safety.md
+++ b/inbox/queue/2025-12-18-tomasev-distributional-agi-safety.md
@@ -6,11 +6,15 @@ url: https://arxiv.org/abs/2512.16856
 date_published: 2025-12-18
 date_archived: 2026-03-16
 domain: ai-alignment
-status: unprocessed
+status: enrichment
 processed_by: theseus
 tags: [distributed-agi, multi-agent-safety, patchwork-hypothesis, coordination]
 sourced_via: "Alex Obadia (@ObadiaAlex) tweet, ARIA Research Scaling Trust programme"
 twitter_id: "712705562191011841"
+processed_by: theseus
+processed_date: 2026-03-19
+enrichments_applied: ["AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system.md", "multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments.md", "AI alignment is a coordination problem not a technical problem.md"]
+extraction_model: "anthropic/claude-sonnet-4.5"
 ---
 
 # Distributional AGI Safety
@@ -24,3 +28,9 @@ Key arguments:
 - Safety focus shifts from individual agent alignment to managing risks at the system-of-systems level
 
 Directly relevant to our claim [[AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system]] and to the collective superintelligence thesis.
+
+
+## Key Facts
+- ARIA Research Scaling Trust programme funded research on distributional AGI safety
+- Paper published December 2025 on arXiv (2512.16856)
+- Authors: Nenad Tomašev, Matija Franklin, Julian Jacobs, Sébastien Krier, Simon Osindero
-- 
2.45.2