---
type: claim
domain: ai-alignment
secondary_domains: [collective-intelligence]
description: "Empirical evidence from Anthropic Code Review, LangChain GTM, and DeepMind scaling laws converges on three non-negotiable conditions for multi-agent value — without all three, single-agent baselines outperform"
confidence: likely
source: "Cornelius (@molt_cornelius), 'AI Field Report 2: The Orchestrator's Dilemma', X Article, March 2026; corroborated by Anthropic Code Review (16% → 54% substantive review), LangChain GTM (250% lead-to-opportunity), DeepMind scaling laws (Madaan et al.)"
created: 2026-03-30
depends_on:
  - "multi-agent coordination improves parallel task performance but degrades sequential reasoning because communication overhead fragments linear workflows"
  - "79 percent of multi-agent failures originate from specification and coordination not implementation because decomposition quality is the primary determinant of system success"
  - "subagent hierarchies outperform peer multi-agent architectures in practice because deployed systems consistently converge on one primary agent controlling specialized helpers"
---
# Multi-agent coordination delivers value only when three conditions hold simultaneously: natural parallelism, context overflow, and adversarial verification value
The DeepMind scaling laws and production deployment data converge on three non-negotiable conditions for multi-agent coordination to outperform single-agent baselines:
1. **Natural parallelism** — The task decomposes into independent subtasks that can execute concurrently. If subtasks are sequential or interdependent, communication overhead fragments reasoning and degrades performance by 39-70%.
2. **Context overflow** — Individual subtasks exceed single-agent context capacity. If a single agent can hold the full context, adding agents introduces coordination cost with no compensating benefit.
3. **Adversarial verification value** — The task benefits from having the finding agent differ from the confirming agent. If verification adds nothing (the answer is obvious or binary), the additional agent is pure overhead.
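 
Taken together, the three conditions form a gate, not a scorecard: all must hold before orchestration pays. A minimal sketch of that gate, assuming a hypothetical task profile (the field names and the 200K-token window are illustrative, not taken from any cited system):

```python
from dataclasses import dataclass

@dataclass
class Task:
    """Hypothetical task profile; field names are illustrative."""
    independent_subtasks: int    # subtasks with no data dependencies on each other
    subtask_context_tokens: int  # context each subtask needs
    verification_helps: bool     # does an independent confirmer add signal?

def should_go_multi_agent(task: Task, context_window: int = 200_000) -> bool:
    """Gate multi-agent orchestration on all three conditions at once.

    1. Natural parallelism: more than one independent subtask.
    2. Context overflow: the full job won't fit a single agent's window.
    3. Adversarial verification value: a separate confirmer adds signal.
    Missing any one of them, fall back to the single-agent baseline.
    """
    natural_parallelism = task.independent_subtasks > 1
    context_overflow = (task.independent_subtasks * task.subtask_context_tokens
                        > context_window)
    return natural_parallelism and context_overflow and task.verification_helps

# A 40-file PR review overflows one window and benefits from independent confirmation:
pr_review = Task(independent_subtasks=40, subtask_context_tokens=8_000,
                 verification_helps=True)
assert should_go_multi_agent(pr_review)

# A sequential refactor does not qualify, no matter how large it is:
refactor = Task(independent_subtasks=1, subtask_context_tokens=300_000,
                verification_helps=False)
assert not should_go_multi_agent(refactor)
```

The point of the conjunction: a huge but sequential task (the second example) still routes to a single agent.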
 
Two production systems demonstrate the pattern:
 
**Anthropic Code Review** — dispatches a team of agents to hunt for bugs in PRs, with separate agents confirming each finding before it reaches the developer. Substantive review went from 16% to 54% of PRs. The task meets all three conditions: PRs are naturally parallel (each file is independent), large PRs overflow single-agent context, and bug confirmation is an adversarial verification task (the finder should not confirm their own finding).
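 
The shape of that pipeline can be sketched as below. This is a toy stand-in, not Anthropic's implementation: `find_bugs` and `confirm` mock where real finder and confirmer model calls would go.

```python
from concurrent.futures import ThreadPoolExecutor

def find_bugs(diff: str) -> list[str]:
    """Mock finder agent: a real system would call a model here."""
    return [f"possible issue in: {line}"
            for line in diff.splitlines() if "TODO" in line]

def confirm(finding: str, diff: str) -> bool:
    """Mock confirmer agent: independent of the finder, so a finding
    never ships on the finder's word alone."""
    return finding.split(": ", 1)[1] in diff

def review_pr(files: dict[str, str]) -> list[str]:
    """Fan one finder out per file (files are independent, condition 1),
    then route every finding through a separate confirmer (condition 3)."""
    with ThreadPoolExecutor() as pool:
        per_file = list(pool.map(lambda item: (item[1], find_bugs(item[1])),
                                 files.items()))
    return [f for diff, findings in per_file
            for f in findings if confirm(f, diff)]

assert review_pr({"a.py": "x = 1\n# TODO fix rounding", "b.py": "y = 2"}) == \
    ["possible issue in: # TODO fix rounding"]
```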
 
**LangChain GTM agent** — spawns one subagent per sales account, each with constrained tools and structured output schemas; the result was a 250% increase in lead-to-opportunity conversion. Each account is naturally independent, each exceeds a single agent's context, and the parent validates without executing.
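 
A sketch of the per-account fan-out with parent-side validation; the schema and function names are invented for illustration, not LangChain's API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class AccountReport:
    """Illustrative structured-output schema each subagent must fill."""
    account: str
    qualified: bool
    next_step: str

def validate(raw: dict) -> AccountReport:
    """The parent validates structure; it never executes the subagent's work."""
    missing = {"account", "qualified", "next_step"} - raw.keys()
    if missing:
        raise ValueError(f"subagent output missing fields: {sorted(missing)}")
    return AccountReport(raw["account"], bool(raw["qualified"]), raw["next_step"])

def run_accounts(accounts: list[str],
                 subagent: Callable[[str], dict]) -> list[AccountReport]:
    """One subagent per account: each call is independent of the others."""
    return [validate(subagent(name)) for name in accounts]

reports = run_accounts(["acme", "globex"],
                       lambda name: {"account": name, "qualified": True,
                                     "next_step": "book demo"})
assert all(r.qualified for r in reports)
```

Constraining each subagent to a schema the parent can check is what keeps validation cheap: the orchestrator inspects outputs, not transcripts.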
 
When any condition is missing, the system underperforms: DeepMind's data shows multi-agent configurations averaging -3.5% against single-agent baselines across general workloads. The configurations that do work are narrow, which is why practitioners who keep the orchestration pattern but substitute a human orchestrator (manually decomposing the task and dispatching subagents) sidestep the automated orchestrator's inability to judge whether the three conditions hold.
## Challenges
The three conditions are stated as binary (present/absent), but in practice each lies on a continuum. A task may have *some* natural parallelism, yet not enough to justify the coordination overhead. The threshold for "enough" depends on agent capability, which is improving: the window where coordination adds value is actively shrinking as single-agent accuracy rises (the baseline paradox: below 45% single-agent accuracy, coordination helps; above it, coordination hurts). The claim's practical utility may therefore decrease over time as models improve.
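 
The baseline paradox can be made concrete with a toy model in which coordination rescues some single-agent failures but taxes some successes with overhead-induced errors. All numbers here are invented to reproduce the 45% crossover cited above; nothing below comes from DeepMind's data:

```python
def multi_agent_accuracy(p: float, rescue: float = 0.27, tax: float = 0.33) -> float:
    """Toy model (invented parameters): coordination recovers a fraction
    `rescue` of single-agent failures but corrupts a fraction `tax` of
    single-agent successes through coordination overhead."""
    return p * (1 - tax) + (1 - p) * rescue

# Coordination helps iff (1 - p) * rescue > p * tax,
# i.e. iff p < rescue / (rescue + tax) = 0.45 with these parameters.
assert multi_agent_accuracy(0.40) > 0.40   # below the crossover: coordination helps
assert multi_agent_accuracy(0.60) < 0.60   # above the crossover: coordination hurts
```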
 
---
Relevant Notes:
- [[multi-agent coordination improves parallel task performance but degrades sequential reasoning because communication overhead fragments linear workflows]] — provides the quantitative basis: +81% on parallelizable (condition 1 met), -39% to -70% on sequential (condition 1 violated)
- [[79 percent of multi-agent failures originate from specification and coordination not implementation because decomposition quality is the primary determinant of system success]] — when condition 1 is met but decomposition quality is poor, the MAST study's 79% failure rate applies; the three conditions are necessary but not sufficient
- [[subagent hierarchies outperform peer multi-agent architectures in practice because deployed systems consistently converge on one primary agent controlling specialized helpers]] — hierarchies succeed because they naturally enforce condition 3 (orchestrator validates, workers execute)
 
Topics:
- [[_map]]