---
type: claim
domain: ai-alignment
secondary_domains: [collective-intelligence]
description: "MAST study of 1,642 execution traces across 7 production systems found the dominant multi-agent failure cause is wrong task decomposition and vague coordination rules, not bugs or model limitations"
confidence: experimental
source: "MAST study (1,642 annotated execution traces, 7 production systems), cited in Cornelius (@molt_cornelius) 'AI Field Report 2: The Orchestrator's Dilemma', X Article, March 2026; corroborated by Puppeteer system (NeurIPS 2025)"
created: 2026-03-30
depends_on:
  - "multi-agent coordination improves parallel task performance but degrades sequential reasoning because communication overhead fragments linear workflows"
  - "subagent hierarchies outperform peer multi-agent architectures in practice because deployed systems consistently converge on one primary agent controlling specialized helpers"
supports:
  - "multi agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value"
reweave_edges:
  - "multi agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value|supports|2026-04-03"
---
# 79 percent of multi-agent failures originate from specification and coordination not implementation because decomposition quality is the primary determinant of system success
The MAST study analyzed 1,642 annotated execution traces across seven production multi-agent systems and found that the dominant failure cause is not implementation bugs or model capability limitations — it is specification and coordination errors. 79% of failures trace to wrong task decomposition or vague coordination rules.
The hardest failures — information withholding, ignoring other agents' input, reasoning-action mismatch — resist protocol-level fixes entirely. These are inter-agent misalignment failures that require social reasoning abilities that communication protocols alone cannot provide. Adding more message-passing infrastructure does not help when the problem is that agents cannot model each other's state.
Corroborating evidence:
- **Puppeteer system (NeurIPS 2025):** Confirmed via reinforcement learning that topology and decomposition quality matter more than agent count. Optimal configuration: Width=4, Depth=2. The system's token consumption *decreases* during training while quality improves — the orchestrator learns to prune agents that add noise.
- **PawelHuryn's survey:** Evaluated every major coordination tool (Claude Code Agent Teams, CCPM, tick-md, Agent-MCP, 1Code, GitButler hooks) and concluded they all solve the wrong problem — the bottleneck is how you decompose the task, not which framework reassembles it.
- **GitHub engineering team principle:** "Treat agents like distributed systems, not chat flows."
This finding reframes the multi-agent scaling problem. The existing KB claim on compound reliability degradation (17.2x error amplification) describes what happens when decomposition fails. This claim identifies *why* it fails: the task specification was wrong before any agent executed. The fix is not better error handling or more sophisticated coordination protocols — it is better decomposition.
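The compounding point above can be made concrete with a toy reliability model (illustrative only; the 17.2x figure comes from the cited KB claim, not from this calculation, and MAST's methodology is trace annotation, not this formula). Assuming each agent step succeeds independently with probability `p_step`, end-to-end success decays geometrically with pipeline length, which is why fixing the upstream decomposition matters more than hardening individual steps:

```python
def pipeline_success(p_step: float, n_steps: int) -> float:
    """End-to-end success probability, assuming independent steps."""
    return p_step ** n_steps


def error_amplification(p_step: float, n_steps: int) -> float:
    """Ratio of the pipeline's error rate to a single step's error rate."""
    return (1.0 - pipeline_success(p_step, n_steps)) / (1.0 - p_step)


if __name__ == "__main__":
    # 20 sequential agent steps, each 99% reliable in isolation:
    # end-to-end success drops to ~0.82, so the pipeline's error
    # rate is roughly 18x the per-step error rate.
    print(f"success: {pipeline_success(0.99, 20):.3f}")
    print(f"amplification: {error_amplification(0.99, 20):.1f}x")
```

Under this independence assumption a 20-step pipeline at 99% per-step reliability lands in the same regime as the cited amplification figure; a decomposition with fewer, better-specified steps shrinks `n_steps`, which no amount of per-step error handling can match.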
## Challenges
The MAST study covers production systems with specific coordination patterns. Whether the 79% figure holds for less structured multi-agent configurations (ad hoc swarms, peer-to-peer architectures) is untested. Additionally, as models improve at social reasoning, the inter-agent misalignment failures may decrease — but the specification errors (wrong decomposition) are upstream of model capability and may persist regardless.
---
Relevant Notes:
- [[multi-agent coordination improves parallel task performance but degrades sequential reasoning because communication overhead fragments linear workflows]] — this claim provides the quantitative failure modes; the MAST study explains the *causal mechanism* behind those failures: 79% are specification errors, not execution errors
- [[subagent hierarchies outperform peer multi-agent architectures in practice because deployed systems consistently converge on one primary agent controlling specialized helpers]] — hierarchies succeed partly because they concentrate decomposition responsibility in one orchestrator, reducing the coordination surface area where the 79% of failures originate
- [[coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem]] — the 6x gain from protocol design *is* decomposition quality; when decomposition is right, the same models perform dramatically better
Topics:
- [[_map]]