teleo-codex/inbox/archive/2025-12-18-tomasev-distributional-agi-safety.md
m3taversal b64fe64b89
theseus: 5 claims from ARIA Scaling Trust programme papers
- What: 5 new claims + 6 source archives from papers referenced in
  Alex Obadia's ARIA Research tweet on distributed AGI safety
- Sources: Distributional AGI Safety (Tomašev), Agents of Chaos (Shapira),
  Simple Economics of AGI (Catalini), When AI Writes Software (de Moura),
  LLM Open-Source Games (Sistla), Coasean Bargaining (Krier)
- Claims: multi-agent emergent vulnerabilities (likely), verification
  bandwidth as binding constraint (likely), formal verification economic
  necessity (likely), cooperative program equilibria (experimental),
  Coasean transaction cost collapse (experimental)
- Connections: extends scalable oversight degradation, correlated blind
  spots, formal verification, coordination-as-alignment

Pentagon-Agent: Theseus <B4A5B354-03D6-4291-A6A8-1E04A879D9AC>
2026-03-16 16:46:07 +00:00


type: source
title: Distributional AGI Safety
author: Nenad Tomašev, Matija Franklin, Julian Jacobs, Sébastien Krier, Simon Osindero
url: https://arxiv.org/abs/2512.16856
date_published: 2025-12-18
date_archived: 2026-03-16
domain: ai-alignment
status: processing
processed_by: theseus
tags: distributed-agi, multi-agent-safety, patchwork-hypothesis, coordination
sourced_via: Alex Obadia (@ObadiaAlex) tweet, ARIA Research Scaling Trust programme
twitter_id: 712705562191011841

Distributional AGI Safety

Tomašev et al. challenge the monolithic AGI assumption. They propose the "patchwork AGI hypothesis" — general capability levels first manifest through coordination among groups of sub-AGI agents with complementary skills and affordances, not through a single unified system.

Key arguments:

- AI safety research has focused on safeguarding individual systems, overlooking distributed emergence
- Rapid deployment of agents with tool-use and coordination capabilities makes distributed safety urgent
- Proposed framework: "virtual agentic sandbox economies" with robust market mechanisms, auditability, reputation management, and oversight for collective risks
- Safety focus shifts from individual agent alignment to managing risks at the system-of-systems level
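The sandbox-economy idea above can be made concrete with a toy sketch. This is not the paper's implementation; the class names, the multiplicative reputation update, and the 0.6 admission threshold are all illustrative assumptions, chosen only to show how auditability (a transaction log), reputation management (scores updated per audit), and oversight (an admission check) fit together:

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    reputation: float = 1.0  # hypothetical starting score

@dataclass
class Sandbox:
    agents: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)  # auditability: every transaction recorded

    def register(self, agent: Agent) -> None:
        self.agents[agent.name] = agent

    def transact(self, seller: str, buyer: str, audited_ok: bool) -> None:
        # Reputation management: a clean audit nudges the seller's score up,
        # a failed audit halves it (illustrative update rule, not from the paper).
        self.audit_log.append((seller, buyer, audited_ok))
        s = self.agents[seller]
        s.reputation *= 1.05 if audited_ok else 0.5

    def admitted(self, name: str, threshold: float = 0.6) -> bool:
        # Oversight for collective risk: low-reputation agents are barred
        # from further market participation.
        return self.agents[name].reputation >= threshold

# One clean and one failed audit leave "coder" at 1.0 * 1.05 * 0.5 = 0.525,
# below the 0.6 admission bar, so admitted("coder") returns False.
box = Sandbox()
box.register(Agent("planner"))
box.register(Agent("coder"))
box.transact("coder", "planner", audited_ok=True)
box.transact("coder", "planner", audited_ok=False)
```

The point of the sketch is the shift the paper argues for: none of these mechanisms inspects an individual agent's alignment; they all act at the system-of-systems level.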

Directly relevant to our claim that AGI may emerge as a patchwork of coordinating sub-AGI agents rather than as a single monolithic system, and to the collective superintelligence thesis.