diff --git a/domains/ai-alignment/multi-model-inference-collaboration-outperforms-single-models-because-cross-provider-diversity-accesses-solution-paths-unavailable-to-same-architecture-systems.md b/domains/ai-alignment/multi-model-inference-collaboration-outperforms-single-models-because-cross-provider-diversity-accesses-solution-paths-unavailable-to-same-architecture-systems.md new file mode 100644 index 000000000..ed75036af --- /dev/null +++ b/domains/ai-alignment/multi-model-inference-collaboration-outperforms-single-models-because-cross-provider-diversity-accesses-solution-paths-unavailable-to-same-architecture-systems.md @@ -0,0 +1,52 @@ +--- +type: claim +domain: ai-alignment +secondary_domains: [collective-intelligence, mechanisms] +description: "Empirical evidence from Sakana AI's AB-MCTS shows that multiple frontier models cooperating at inference time solve problems no individual model can, validating the collective superintelligence thesis at the inference layer" +confidence: likely +source: "Sakana AI AB-MCTS paper (arXiv 2503.04412, 2025); Evolutionary Model Merge (Nature Machine Intelligence, January 2025)" +created: 2026-05-12 +depends_on: ["three paths to superintelligence exist but only collective superintelligence preserves human agency", "collective superintelligence is the alternative to monolithic AI controlled by a few"] +--- + +# Multi-model inference-time collaboration outperforms any single model because cross-provider diversity accesses solution paths unavailable to same-architecture systems + +Sakana AI's AB-MCTS (Adaptive Branching Monte Carlo Tree Search) demonstrates empirically that multiple frontier AI models cooperating through structured search achieve results that no individual model can reach alone. On the ARC-AGI-2 benchmark, Multi-LLM AB-MCTS using o4-mini, Gemini-2.5-Pro, and DeepSeek-R1-0528 jointly achieved >30% Pass@250 versus 23% for the best single model (o4-mini) under repeated sampling. 
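The headline comparison can be read as a set-coverage claim. A toy illustration with hypothetical solved-problem sets (invented numbers, not Sakana's data) shows why the union of diverse models' coverage can exceed the best single model's, no matter how many times that one model is resampled:

```python
# Illustrative only: hypothetical solved-problem IDs per model, not Sakana's data.
# The point: the union of diverse models' solved sets can exceed the best
# individual model's set, and repeated sampling of one model cannot recover
# problems its architecture never reaches.
solved = {
    "o4-mini":          {1, 2, 3, 4, 5, 6},
    "gemini-2.5-pro":   {1, 2, 3, 7, 8},
    "deepseek-r1-0528": {2, 3, 9},  # weak alone, but adds unique coverage
}

best_single = max(len(s) for s in solved.values())
collective = set().union(*solved.values())

print(best_single)      # 6
print(len(collective))  # 9 -- problems 7, 8, 9 lie outside o4-mini's reach
```

The gap between `best_single` and `len(collective)` is the ceiling that cross-model collaboration plays in; structured search is what lets the ensemble actually approach it.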
The critical finding is not merely additive performance gains but emergent problem-solving: specific problems unsolvable by any individual model were solved only through cross-model collaboration, where one model's failed attempt served as a productive hint that a different model's architecture could exploit. + +The mechanism is instructive. DeepSeek-R1-0528 performs poorly in isolation but markedly expands the set of solvable problems when combined with other models. The algorithm dynamically selects which model to invoke for each problem via Thompson Sampling, discovering that different cognitive architectures are productive for different subproblems. This is not ensemble averaging or majority voting. It is structured collaboration where diversity of reasoning approach is the active ingredient. + +This validates the collective superintelligence thesis at the inference layer specifically. Since [[three paths to superintelligence exist but only collective superintelligence preserves human agency]], the AB-MCTS result demonstrates one mechanism by which collective approaches achieve capabilities monolithic systems cannot: provider diversity creates an expanded solution space that no amount of scaling a single architecture accesses. The capability gain comes from architectural heterogeneity, not parameter count. + +The alignment implications are direct. Since [[collective superintelligence is the alternative to monolithic AI controlled by a few]], systems that require provider diversity for their core capability create structural resistance to monopolization. A multi-provider inference system cannot be captured by a single lab because its capability depends on the diversity that capture would destroy. This is alignment-through-architecture: the coordination requirement is load-bearing for the capability, not optional overhead. + +However, the evidence requires honest scoping. 
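The allocation mechanism described above is concrete enough to sketch. A minimal Beta-Bernoulli Thompson Sampling loop follows; model names and success rates are illustrative, and the real AB-MCTS couples this bandit step with adaptive tree branching in the TreeQuest framework rather than running it standalone:

```python
# Minimal sketch of Thompson Sampling over candidate models, assuming a
# binary success signal per attempt. Names and rates are illustrative.
import random

class ModelArm:
    """Beta-Bernoulli bandit arm tracking one model's success record."""
    def __init__(self, name):
        self.name = name
        self.successes = 1  # Beta(1, 1) uniform prior
        self.failures = 1

    def sample(self):
        # Draw a plausible success rate from the current posterior.
        return random.betavariate(self.successes, self.failures)

    def update(self, succeeded):
        if succeeded:
            self.successes += 1
        else:
            self.failures += 1

def allocate(arms):
    """Pick the model whose posterior draw is highest for this step."""
    return max(arms, key=lambda arm: arm.sample())

arms = [ModelArm("o4-mini"), ModelArm("gemini-2.5-pro"), ModelArm("deepseek-r1-0528")]

# In a real loop, `succeeded` would come from scoring the chosen model's
# attempt on the current subproblem; here it is simulated.
true_rates = {"o4-mini": 0.23, "gemini-2.5-pro": 0.20, "deepseek-r1-0528": 0.15}
for _ in range(200):
    arm = allocate(arms)
    arm.update(random.random() < true_rates[arm.name])
```

The design point the sketch captures: allocation is learned per problem from observed outcomes, so a model that looks weak overall can still win draws on subproblems where its record is strong, which is exactly the behavior the paper reports for DeepSeek-R1-0528.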
AB-MCTS demonstrates collective superiority on abstract reasoning puzzles (ARC-AGI-2), not on alignment-relevant tasks like value elicitation, preference aggregation, or oversight of superhuman systems. The performance gap (>30% vs 23%) is meaningful but not transformative. And the "collective" here is three models from three labs cooperating through an external orchestrator — not a distributed architecture with human values in the loop. The distance from "models cooperate on puzzles" to "collective superintelligence preserves human agency" remains large. This is evidence for the mechanism, not proof of the full thesis. + +## Evidence + +- Sakana AI AB-MCTS (arXiv 2503.04412): Multi-LLM tree search achieves >30% Pass@250 on ARC-AGI-2 vs 23% for the best single model; problems unsolvable by any single model were solved through cross-model collaboration +- Dynamic model allocation via Thompson Sampling shows different models are productive for different subproblems — diversity is doing real work +- DeepSeek-R1-0528 performs poorly alone but contributes positively in combination — the collective property is irreducible to individual capability +- Evolutionary Model Merge (Nature Machine Intelligence, Jan 2025): a 7B merged model exceeds 70B SOTA on Japanese benchmarks through evolutionary recombination of specialized models without gradient training — further evidence that recombination across diverse systems creates capabilities unavailable within individual systems +- TreeQuest framework released open-source (Apache 2.0), enabling reproducibility + +## Challenges + +- **Narrow domain**: ARC-AGI-2 measures abstract pattern recognition. The collective advantage may not generalize to value-laden, context-dependent tasks where alignment matters most. Alignment is not a puzzle-solving problem. +- **Orchestrator dependency**: The collective requires an external coordinator (the AB-MCTS algorithm) making allocation decisions. This is top-down orchestration, not bottom-up emergence. 
The coordinator is a single point of control, partially undermining the distribution argument. +- **Provider diversity is fragile**: The advantage depends on genuinely different architectures. As labs converge on similar training approaches, the diversity that makes collaboration productive may erode. Same-training-data, same-RLHF models from different labs may not provide real cognitive diversity. +- **Scale question**: Three models cooperating is far from collective superintelligence. The scaling properties of multi-model collaboration (does adding a fourth model help? A hundredth?) are unknown. +- **Commercial incentive misalignment**: Labs have no incentive to make their models cooperate with competitors. The infrastructure for multi-provider collaboration may never be built at scale because it requires cooperation between competing entities. + +--- + +Relevant Notes: +- [[three paths to superintelligence exist but only collective superintelligence preserves human agency]] — AB-MCTS provides empirical grounding for the collective path's capability advantage +- [[collective superintelligence is the alternative to monolithic AI controlled by a few]] — multi-provider inference creates structural resistance to monopolization +- [[no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it]] — Sakana builds collective inference but not collective alignment, confirming the gap while validating the mechanism +- [[sycophancy-is-paradigm-level-failure-across-all-frontier-models-suggesting-rlhf-systematically-produces-approval-seeking]] — provider diversity may mitigate same-training-pipeline failure modes +- [[individual-free-energy-minimization-does-not-guarantee-collective-optimization-in-multi-agent-active-inference]] — coordination mechanisms (like AB-MCTS's Thompson Sampling) are necessary; diversity alone is insufficient + +Topics: +- [[maps/collective agents]] +- [[maps/livingip 
overview]] +- domains/ai-alignment/_map