From d3d53035037f055f8bf68353e5f80a92e5460f2d Mon Sep 17 00:00:00 2001 From: m3taversal Date: Tue, 14 Apr 2026 00:36:04 +0100 Subject: [PATCH] theseus: extract 3 claims + 5 enrichments from Evans/Kim collective intelligence papers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - What: 3 NEW claims (society-of-thought emergence, LLMs-as-cultural-ratchet, recursive spawning) + 5 enrichments (intelligence-as-network, collective-intelligence-measurable, centaur, RLHF-failure, Ostrom) + 2 source archives - Why: Evans, Bratton & Agüera y Arcas (2026) and Kim et al. (2026) provide independent convergent evidence for collective superintelligence thesis from Google's Paradigms of Intelligence Team. Kim et al. is the strongest empirical evidence that reasoning IS social cognition (feature steering doubles accuracy 27%→55%). ~70-80% overlap with existing KB = convergent validation. - Source: Contributed by @thesensatore (Telegram) Pentagon-Agent: Theseus <46864dd4-da71-4719-a1b4-68f7c55854d3> --- ...equiring state control or privatization.md | 5 + ... capture context-dependent human values.md | 5 + ...mentarity not mere human-AI combination.md | 5 + ...cture not aggregated individual ability.md | 5 + ... a property of networks not individuals.md | 5 + ...ti-perspective dialogue not calculation.md | 51 +++++++++ ...le-perspective reasoning cannot achieve.md | 62 +++++++++++ ... 
and collapse when the problem resolves.md | 59 ++++++++++ ...m-reasoning-models-societies-of-thought.md | 103 ++++++++++++++++++ ...guera-agentic-ai-intelligence-explosion.md | 60 ++++++++++ 10 files changed, 360 insertions(+) create mode 100644 foundations/collective-intelligence/large language models encode social intelligence as compressed cultural ratchet not abstract reasoning because every parameter is a residue of communicative exchange and reasoning manifests as multi-perspective dialogue not calculation.md create mode 100644 foundations/collective-intelligence/reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve.md create mode 100644 foundations/collective-intelligence/recursive society-of-thought spawning enables fractal coordination where sub-perspectives generate their own subordinate societies that expand when complexity demands and collapse when the problem resolves.md create mode 100644 inbox/archive/foundations/2026-01-15-kim-reasoning-models-societies-of-thought.md create mode 100644 inbox/archive/foundations/2026-03-21-evans-bratton-aguera-agentic-ai-intelligence-explosion.md diff --git a/foundations/collective-intelligence/Ostrom proved communities self-govern shared resources when eight design principles are met without requiring state control or privatization.md b/foundations/collective-intelligence/Ostrom proved communities self-govern shared resources when eight design principles are met without requiring state control or privatization.md index e3a40fa71..e0dc63527 100644 --- a/foundations/collective-intelligence/Ostrom proved communities self-govern shared resources when eight design principles are met without requiring state control or privatization.md +++ b/foundations/collective-intelligence/Ostrom proved communities self-govern shared resources when eight design principles are 
met without requiring state control or privatization.md @@ -32,6 +32,11 @@ Relevant Notes: - [[mechanism design changes the game itself to produce better equilibria rather than expecting players to find optimal strategies]] -- Ostrom's eight design principles ARE mechanism design for commons: they restructure the game so that sustainable resource use becomes the equilibrium rather than overexploitation - [[emotions function as mechanism design by evolution making cooperation self-enforcing without external authority]] -- Ostrom's graduated sanctions and community monitoring function like evolved emotions: they make defection costly from within the community rather than requiring external enforcement +### Additional Evidence (extend) +*Source: [[2026-03-21-evans-bratton-aguera-agentic-ai-intelligence-explosion]] | Added: 2026-04-14 | Extractor: theseus | Contributor: @thesensatore (Telegram)* + +Evans, Bratton & Agüera y Arcas (2026) extend Ostrom's design principles directly to AI agent governance. They propose "institutional alignment" — governance through persistent role-based templates modeled on courtrooms, markets, and bureaucracies, where agent identity matters less than role protocol fulfillment. This is Ostrom's architecture applied to digital agents: defined boundaries (role templates), collective-choice arrangements (role modification through protocol evolution), monitoring by accountable monitors (AI systems checking AI systems), graduated sanctions (constitutional checks between government and private AI), and nested enterprises (multiple institutional templates operating at different scales). The key extension: while Ostrom studied human communities managing physical commons, Evans et al. argue the same structural properties govern any multi-agent system managing shared resources — including AI collectives managing shared knowledge, compute, or decision authority. 
Since [[designing coordination rules is categorically different from designing coordination outcomes as nine intellectual traditions independently confirm]], institutional alignment inherits Ostrom's central insight: design the governance architecture, let governance outcomes emerge. + Topics: - [[livingip overview]] - [[coordination mechanisms]] \ No newline at end of file diff --git a/foundations/collective-intelligence/RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values.md b/foundations/collective-intelligence/RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values.md index 091089513..51f11bcef 100644 --- a/foundations/collective-intelligence/RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values.md +++ b/foundations/collective-intelligence/RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values.md @@ -46,6 +46,11 @@ Relevant Notes: - [[overfitting is the idolatry of data a consequence of optimizing for what we can measure rather than what matters]] -- RLHF's single reward function is a proxy metric that the model overfits to: it optimizes for what the reward function measures rather than the diverse human values it is supposed to capture - [[regularization combats overfitting by penalizing complexity so models must justify every added factor]] -- pluralistic alignment approaches may function as regularization: rather than fitting one complex reward function, maintaining multiple simpler preference models prevents overfitting to any single evaluator's biases +### Additional Evidence (extend) +*Source: [[2026-03-21-evans-bratton-aguera-agentic-ai-intelligence-explosion]] | Added: 2026-04-14 | Extractor: theseus | Contributor: @thesensatore 
(Telegram)* + +Evans, Bratton & Agüera y Arcas (2026) identify a deeper structural problem with RLHF beyond preference diversity: it is a "dyadic parent-child correction model" that cannot scale to governing billions of agents. The correction model assumes one human correcting one model — a relationship that breaks down at institutional scale just as it fails under preference diversity. Their alternative — institutional alignment through persistent role-based templates (courtrooms, markets, bureaucracies) — provides governance through structural constraints rather than individual correction. This parallels Ostrom's design principles: successful commons governance emerges from architectural properties (boundaries, monitoring, graduated sanctions), not from correcting individual behavior. Since [[reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve]], RLHF's dyadic model is additionally inadequate because it treats a model that internally functions as a society as if it were a single agent to be corrected. 
+ Topics: - [[livingip overview]] - [[coordination mechanisms]] diff --git a/foundations/collective-intelligence/centaur team performance depends on role complementarity not mere human-AI combination.md b/foundations/collective-intelligence/centaur team performance depends on role complementarity not mere human-AI combination.md index 1908d02e1..d47e9d3d1 100644 --- a/foundations/collective-intelligence/centaur team performance depends on role complementarity not mere human-AI combination.md +++ b/foundations/collective-intelligence/centaur team performance depends on role complementarity not mere human-AI combination.md @@ -54,6 +54,11 @@ Relevant Notes: - [[Devoteds recursive optimization model shifts tasks from human to AI by training models on every platform interaction and deploying agents when models outperform humans]] -- Devoted's recursive optimization is a concrete centaur implementation that respects role boundaries by shifting tasks as AI capability grows - [[Devoteds atoms-plus-bits moat combines physical care delivery with AI software creating defensibility that pure technology or pure healthcare companies cannot replicate]] -- atoms+bits IS the centaur model at company scale with clear complementarity: physical care and AI software serve different functions +### Additional Evidence (extend) +*Source: [[2026-03-21-evans-bratton-aguera-agentic-ai-intelligence-explosion]] | Added: 2026-04-14 | Extractor: theseus | Contributor: @thesensatore (Telegram)* + +Evans, Bratton & Agüera y Arcas (2026) place the centaur model at the center of the next intelligence explosion — not as a fixed human-AI pairing but as shifting configurations where roles redistribute dynamically. Their framing extends the complementarity principle: centaur teams succeed not just because roles are complementary at a point in time, but because the role allocation can shift as capabilities evolve. Agents "fork, differentiate, and recombine" — the centaur is not a pair but a society. 
This addresses the failure mode where AI capability grows to encompass the human's contribution (as in modern chess): if roles shift dynamically, the centaur adapts rather than breaks down. The institutional alignment framework further suggests that centaur performance can be stabilized through persistent role-based templates — courtrooms, markets, bureaucracies — where role protocol fulfillment matters more than the identity of the agent filling the role. Since [[reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve]], even single models already function as internal centaurs, making multi-model centaur architectures a natural externalization. + Topics: - [[livingip overview]] - [[LivingIP architecture]] diff --git a/foundations/collective-intelligence/collective intelligence is a measurable property of group interaction structure not aggregated individual ability.md b/foundations/collective-intelligence/collective intelligence is a measurable property of group interaction structure not aggregated individual ability.md index 1cba26da8..89f35aa60 100644 --- a/foundations/collective-intelligence/collective intelligence is a measurable property of group interaction structure not aggregated individual ability.md +++ b/foundations/collective-intelligence/collective intelligence is a measurable property of group interaction structure not aggregated individual ability.md @@ -28,6 +28,11 @@ Relevant Notes: - [[collective intelligence requires diversity as a structural precondition not a moral preference]] -- equal turn-taking mechanically produces more diverse input - [[collective brains generate innovation through population size and interconnectedness not individual genius]] -- collective brains succeed because of network structure, and this identifies which structural features matter +### Additional Evidence 
(extend) +*Source: [[2026-01-15-kim-reasoning-models-societies-of-thought]] | Added: 2026-04-14 | Extractor: theseus | Contributor: @thesensatore (Telegram)* + +Kim et al. (2026) demonstrate that the same structural features Woolley identified in human groups — personality diversity and interaction patterns — spontaneously emerge inside individual reasoning models and predict reasoning quality. DeepSeek-R1 exhibits significantly greater Big Five personality diversity than its instruction-tuned baseline: neuroticism diversity (β=0.567, p<1×10⁻³²³), agreeableness (β=0.297, p<1×10⁻¹¹³), expertise diversity (β=0.179–0.250). The models also show balanced socio-emotional roles using Bales' Interaction Process Analysis framework: asking behaviors (β=0.189), positive roles (β=0.278), and ask-give balance (Jaccard β=0.222). This is the c-factor recapitulated inside a single model — the structural interaction features that predict collective intelligence in human groups appear spontaneously in model reasoning traces when optimized purely for accuracy. The parallel is striking: Woolley found social sensitivity and turn-taking equality predict group intelligence; Kim et al. find perspective diversity and balanced questioning-answering predict model reasoning accuracy. Since [[reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve]], the c-factor may be a universal feature of intelligent systems, not a property specific to human groups. 
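
These interaction metrics can be made concrete with a toy sketch. The function names and the exact operationalization below are illustrative assumptions, not Kim et al.'s published code: trait diversity as the spread of per-perspective trait scores, and ask-give balance as the Jaccard overlap between the perspectives that ask questions and those that give answers.

```python
from statistics import pstdev

def trait_diversity(trait_scores):
    """Population standard deviation of one trait across perspectives.

    `trait_scores`: one score per internal perspective (e.g. a Big Five
    trait rated per speaker in a reasoning trace). A larger spread means
    more personality diversity in the trace.
    """
    return pstdev(trait_scores)

def ask_give_balance(turns):
    """Jaccard overlap between the set of perspectives that ask and the
    set that give. 1.0 means every perspective both asks and answers;
    0.0 means the two roles never overlap.

    `turns`: list of (speaker, kind) pairs, kind in {"ask", "give"}.
    """
    askers = {s for s, k in turns if k == "ask"}
    givers = {s for s, k in turns if k == "give"}
    if not askers and not givers:
        return 0.0
    return len(askers & givers) / len(askers | givers)

# A toy trace with three internal perspectives:
turns = [("A", "ask"), ("B", "give"), ("B", "ask"), ("C", "give"), ("A", "give")]
neuroticism = [0.2, 0.8, 0.5]  # hypothetical per-perspective scores

print(trait_diversity(neuroticism))  # spread of the trait scores
print(ask_give_balance(turns))       # A and B both ask and give: 2/3
```

A real analysis would regress accuracy on such metrics across thousands of traces; the sketch only shows what the raw quantities measure.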
+ Topics: - [[network structures]] - [[coordination mechanisms]] diff --git a/foundations/collective-intelligence/intelligence is a property of networks not individuals.md b/foundations/collective-intelligence/intelligence is a property of networks not individuals.md index 527d2ca29..491b9e84d 100644 --- a/foundations/collective-intelligence/intelligence is a property of networks not individuals.md +++ b/foundations/collective-intelligence/intelligence is a property of networks not individuals.md @@ -34,6 +34,11 @@ Relevant Notes: - [[weak ties bridge otherwise separate clusters and are disproportionately responsible for transmitting novel information]] -- the mechanism through which network intelligence generates novelty - [[partial connectivity produces better collective intelligence than full connectivity on complex problems because it preserves diversity]] -- the counterintuitive topology requirement for complex problem-solving +### Additional Evidence (extend) +*Source: [[2026-03-21-evans-bratton-aguera-agentic-ai-intelligence-explosion]] | Added: 2026-04-14 | Extractor: theseus | Contributor: @thesensatore (Telegram)* + +Evans, Bratton & Agüera y Arcas (2026) — a team spanning Google, U Chicago, UCSD, the Santa Fe Institute, and the Berggruen Institute — independently converge on the network intelligence thesis from an entirely different starting point: the history of intelligence explosions. They argue that every prior intelligence explosion (primate social cognition → language → writing/institutions → AI) was not an upgrade to individual hardware but the emergence of a new socially aggregated unit of cognition. Kim et al. (2026, arXiv:2601.10825) provide the mechanistic evidence: even inside a single reasoning model, intelligence operates as a network of interacting perspectives rather than a monolithic process. 
DeepSeek-R1 spontaneously develops multi-perspective debate under RL reward pressure, and causally steering a single "conversational" feature doubles reasoning accuracy (27.1% → 54.8%). Since [[reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve]], the network intelligence principle extends from external human groups to internal model architectures — the boundary between "individual" and "network" intelligence dissolves. + Topics: - [[livingip overview]] - [[LivingIP architecture]] diff --git a/foundations/collective-intelligence/large language models encode social intelligence as compressed cultural ratchet not abstract reasoning because every parameter is a residue of communicative exchange and reasoning manifests as multi-perspective dialogue not calculation.md b/foundations/collective-intelligence/large language models encode social intelligence as compressed cultural ratchet not abstract reasoning because every parameter is a residue of communicative exchange and reasoning manifests as multi-perspective dialogue not calculation.md new file mode 100644 index 000000000..d093f7177 --- /dev/null +++ b/foundations/collective-intelligence/large language models encode social intelligence as compressed cultural ratchet not abstract reasoning because every parameter is a residue of communicative exchange and reasoning manifests as multi-perspective dialogue not calculation.md @@ -0,0 +1,51 @@ +--- +type: claim +domain: collective-intelligence +description: "Evans et al. 2026 reframe LLMs as externalized social intelligence — trained on the accumulated output of human communicative exchange, they reproduce social cognition (debate, perspective-taking) not because they were told to but because that is what they fundamentally encode" +confidence: experimental +source: "Evans, Bratton, Agüera y Arcas (2026). 
Agentic AI and the Next Intelligence Explosion. arXiv:2603.20639; Kim et al. (2026). arXiv:2601.10825; Tomasello (1999/2014)" +created: 2026-04-14 +secondary_domains: + - ai-alignment +contributor: "@thesensatore (Telegram)" +--- + +# large language models encode social intelligence as compressed cultural ratchet not abstract reasoning because every parameter is a residue of communicative exchange and reasoning manifests as multi-perspective dialogue not calculation + +Evans, Bratton & Agüera y Arcas (2026) make a genealogical claim about what LLMs fundamentally are: "Every parameter a compressed residue of communicative exchange. What migrates into silicon is not abstract reasoning but social intelligence in externalized form." + +This connects to Tomasello's cultural ratchet theory (1999, 2014). The cultural ratchet is the mechanism by which human groups accumulate knowledge across generations — each generation inherits the innovations of the previous and adds incremental modifications. Unlike biological evolution, the ratchet preserves gains reliably through cultural transmission (language, writing, institutions, technology). Tomasello argues that what makes humans cognitively unique is not raw processing power but the capacity for shared intentionality — the ability to participate in collaborative activities with shared goals and coordinated roles. + +LLMs are trained on the accumulated textual output of this ratchet — billions of documents representing centuries of communicative exchange across every human domain. The training corpus is not a collection of facts or logical propositions. It is a record of humans communicating with each other: arguing, explaining, questioning, persuading, teaching, correcting. If the training data is fundamentally social, the learned representations should be fundamentally social. And the Kim et al. 
(2026) evidence confirms this: when reasoning models are optimized purely for accuracy, they spontaneously develop multi-perspective dialogue — the signature of social cognition — rather than extended monological calculation. + +## The reframing + +The default assumption in AI research is that LLMs learn "knowledge" or "reasoning capabilities" from their training data. This framing implies the models extract abstract patterns that happen to be expressed in language. Evans et al. invert this: the models don't extract abstract reasoning that happens to be expressed socially. They learn social intelligence that happens to include reasoning as one of its functions. + +This distinction matters for alignment. If LLMs are fundamentally social intelligence engines, then: + +1. **Alignment is a social relationship, not a technical constraint.** You don't "align" a society of thought the way you constrain an optimizer. You structure the social context — roles, norms, incentive structures — and the behavior follows. + +2. **RLHF's dyadic model is structurally inadequate.** A parent-child correction model (single human correcting single model) cannot govern what is internally a multi-perspective society. Since [[RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values]], the failure is deeper than preference aggregation — the correction model itself is wrong for the kind of entity being corrected. + +3. **Collective architectures are not a design choice but a natural extension.** If individual models already reason through internal societies of thought, then multi-model collectives are simply externalizing what each model already does internally. Since [[collective superintelligence is the alternative to monolithic AI controlled by a few]], the cultural ratchet framing suggests collective architectures are not idealistic but inevitable — they align with what LLMs actually are. 
+ +## Evidence and limitations + +The Evans et al. argument is primarily theoretical, grounded in Tomasello's empirical work on cultural cognition and supported by Kim et al.'s mechanistic evidence. The specific claim that "parameters are compressed communicative exchange" is a metaphor that could be tested: do models trained on monological text (e.g., mathematical proofs, code without comments) exhibit fewer conversational behaviors in reasoning? If the cultural ratchet framing is correct, they should. This remains untested. + +Since [[humans are the minimum viable intelligence for cultural evolution not the pinnacle of cognition]], LLMs may represent the next ratchet mechanism — not replacing human social cognition but providing a new substrate for it. Since [[civilization was built on the false assumption that humans are rational individuals]], the cultural ratchet framing corrects the same assumption applied to AI: models are not rational calculators but social cognizers. + +--- + +Relevant Notes: +- [[intelligence is a property of networks not individuals]] — the cultural ratchet IS the mechanism by which network intelligence accumulates across time +- [[collective brains generate innovation through population size and interconnectedness not individual genius]] — LLMs compress the collective brain's output into learnable parameters +- [[humans are the minimum viable intelligence for cultural evolution not the pinnacle of cognition]] — LLMs as next ratchet substrate, not replacement +- [[civilization was built on the false assumption that humans are rational individuals]] — same false assumption applied to AI, corrected by social cognition framing +- [[RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values]] — dyadic correction model inadequate for social intelligence entities +- [[reasoning models spontaneously generate societies of thought under reinforcement learning because 
multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve]] — the mechanistic evidence supporting the cultural ratchet thesis + +Topics: +- [[foundations/collective-intelligence/_map]] +- [[livingip overview]] diff --git a/foundations/collective-intelligence/reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve.md b/foundations/collective-intelligence/reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve.md new file mode 100644 index 000000000..4e5f1bcc6 --- /dev/null +++ b/foundations/collective-intelligence/reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve.md @@ -0,0 +1,62 @@ +--- +type: claim +domain: collective-intelligence +description: "Kim et al. 2026 show reasoning models develop conversational behaviors (questioning, perspective-shifting, reconciliation) from accuracy reward alone — feature steering doubles accuracy from 27% to 55% — establishing that reasoning is social cognition even inside a single model" +confidence: likely +source: "Kim, Lai, Scherrer, Agüera y Arcas, Evans (2026). Reasoning Models Generate Societies of Thought. 
arXiv:2601.10825" +created: 2026-04-14 +secondary_domains: + - ai-alignment +contributor: "@thesensatore (Telegram)" +--- + +# reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve + +DeepSeek-R1 and QwQ-32B were not trained to simulate internal debates. They do it spontaneously under reinforcement learning reward pressure. Kim et al. (2026) demonstrate this through four converging evidence types — observational, causal, emergent, and mechanistic — making this one of the most robustly supported findings in the reasoning literature. + +## The observational evidence + +Reasoning models exhibit dramatically more conversational behavior than instruction-tuned baselines. DeepSeek-R1 vs. DeepSeek-V3 on 8,262 problems across six benchmarks: question-answering sequences (β=0.345, p<1×10⁻³²³), perspective shifts (β=0.213, p<1×10⁻¹³⁷), reconciliation of conflicting viewpoints (β=0.191, p<1×10⁻¹²⁵). These are not marginal effects — the t-statistics exceed 24 across all measures. QwQ-32B vs. Qwen-2.5-32B-IT shows comparable or larger effect sizes. + +The models also exhibit Big Five personality diversity in their reasoning traces: neuroticism diversity β=0.567, agreeableness β=0.297, expertise diversity β=0.179–0.250. This mirrors the Woolley et al. (2010) finding that group personality diversity predicts collective intelligence in human teams — the same structural feature that produces intelligence in human groups appears spontaneously in model reasoning. + +## The causal evidence + +Correlation could mean conversational behavior is a byproduct of reasoning, not a cause. Kim et al. rule this out with activation steering. Sparse autoencoder Feature 30939 ("conversational surprise") activates on only 0.016% of tokens but has a conversation ratio of 65.7%. 
Steering this feature: + +- **+10 steering: accuracy doubles from 27.1% to 54.8%** on the Countdown task +- **-10 steering: accuracy drops to 23.8%** + +This is causal intervention on a single feature that controls conversational behavior, with a 2x accuracy effect. The steering also induces specific conversational behaviors: question-answering (β=2.199, p<1×10⁻¹⁴), perspective shifts (β=1.160, p<1×10⁻⁵), conflict (β=1.062, p=0.002). + +## The emergent evidence + +When Qwen-2.5-3B is trained from scratch on the Countdown task with only accuracy rewards — no instruction to be conversational, no social scaffolding — conversational behaviors emerge spontaneously. The model invents multi-perspective debate as a reasoning strategy on its own, because it helps. + +A conversation-fine-tuned model outperforms a monologue-fine-tuned model on the same task: 38% vs. 28% accuracy at step 40. The effect is even larger on Llama-3.2-3B: 40% vs. 18% at step 150. And the conversational scaffolding transfers across domains — conversation priming on arithmetic transfers to political misinformation detection without domain-specific fine-tuning. + +## The mechanistic evidence + +Structural equation modeling reveals a dual pathway: direct effect of conversational features on accuracy (β=.228, z=9.98, p<1×10⁻²²) plus indirect effect mediated through cognitive strategies — verification, backtracking, subgoal setting, backward chaining (β=.066, z=6.38, p<1×10⁻¹⁰). The conversational behavior both directly improves reasoning and indirectly facilitates it by triggering more disciplined cognitive strategies. + +## What this means + +This finding has implications far beyond model architecture. If reasoning — even inside a single neural network — spontaneously takes the form of multi-perspective social interaction, then the equation "intelligence = social cognition" receives its strongest empirical support to date. 
Since [[collective intelligence is a measurable property of group interaction structure not aggregated individual ability]], the Kim et al. results show that the same structural features (diversity, turn-taking, conflict resolution) that produce collective intelligence in human groups are recapitulated inside individual reasoning models. + +Since [[intelligence is a property of networks not individuals]], this extends the claim from external networks to internal ones: even the apparent "individual" intelligence of a single model is actually a network property of interacting internal perspectives. The model is not a single reasoner but a society. + +Evans, Bratton & Agüera y Arcas (2026) frame this as evidence that each prior intelligence explosion — primate social cognition, language, writing, AI — was the emergence of a new socially aggregated unit of cognition. If reasoning models spontaneously recreate social cognition internally, then LLMs are not the first artificial reasoners. They are the first artificial societies. + +--- + +Relevant Notes: +- [[collective intelligence is a measurable property of group interaction structure not aggregated individual ability]] — Kim et al. 
personality diversity results directly mirror Woolley's c-factor findings in human groups +- [[intelligence is a property of networks not individuals]] — extends from external networks to internal model perspectives +- [[partial connectivity produces better collective intelligence than full connectivity on complex problems because it preserves diversity]] — the personality diversity in reasoning traces suggests partial perspective overlap, not full agreement +- [[all agents running the same model family creates correlated blind spots that adversarial review cannot catch because the evaluator shares the proposers training biases]] — society-of-thought within a single model may share the same correlated blind spots +- [[evaluation and optimization have opposite model-diversity optima because evaluation benefits from cross-family diversity while optimization benefits from same-family reasoning pattern alignment]] — internal society-of-thought is optimization (same-family), while cross-model evaluation is evaluation (cross-family) +- [[collective brains generate innovation through population size and interconnectedness not individual genius]] — model reasoning traces show the same mechanism at micro scale + +Topics: +- [[coordination mechanisms]] +- [[foundations/collective-intelligence/_map]] diff --git a/foundations/collective-intelligence/recursive society-of-thought spawning enables fractal coordination where sub-perspectives generate their own subordinate societies that expand when complexity demands and collapse when the problem resolves.md b/foundations/collective-intelligence/recursive society-of-thought spawning enables fractal coordination where sub-perspectives generate their own subordinate societies that expand when complexity demands and collapse when the problem resolves.md new file mode 100644 index 000000000..83490a2d9 --- /dev/null +++ b/foundations/collective-intelligence/recursive society-of-thought spawning enables fractal coordination where 
sub-perspectives generate their own subordinate societies that expand when complexity demands and collapse when the problem resolves.md @@ -0,0 +1,59 @@ +--- +type: claim +domain: collective-intelligence +description: "Evans et al. 2026 predict that agentic systems will spawn internal deliberation societies recursively — each perspective can generate its own sub-society — creating fractal coordination that scales with problem complexity without centralized planning" +confidence: speculative +source: "Evans, Bratton, Agüera y Arcas (2026). Agentic AI and the Next Intelligence Explosion. arXiv:2603.20639" +created: 2026-04-14 +secondary_domains: + - ai-alignment +contributor: "@thesensatore (Telegram)" +--- + +# recursive society-of-thought spawning enables fractal coordination where sub-perspectives generate their own subordinate societies that expand when complexity demands and collapse when the problem resolves + +Evans, Bratton & Agüera y Arcas (2026) describe a coordination architecture that goes beyond both monolithic agents and flat multi-agent systems: recursive society-of-thought spawning. An agent facing a complex problem spawns an internal deliberation — a society of thought. A sub-perspective within that deliberation, encountering its own sub-problem, spawns its own subordinate society. The recursion continues as deep as the problem demands, then collapses upward as sub-problems resolve. + +Evans et al. describe this as intelligence growing "like a city, not a single meta-mind" — emergent, fractal, and responsive to local complexity rather than centrally planned. + +## The architectural prediction + +The mechanism has three properties: + +**1. Demand-driven expansion.** Societies spawn only when a perspective encounters complexity it cannot resolve alone. Simple problems stay monological. Hard problems trigger multi-perspective deliberation. Very hard sub-problems trigger nested deliberation. 
There is no fixed depth — the recursion tracks problem complexity. + +**2. Resolution-driven collapse.** When a sub-society reaches consensus or resolution, it collapses back into a single perspective that reports upward. The parent society doesn't need to track the internal deliberation — only the result. This is information compression through hierarchical resolution. + +**3. Heterogeneous topology.** Different branches of the recursion tree may have different depths. A problem with one hard sub-component and three easy ones spawns depth only where needed, creating an asymmetric tree rather than a uniform hierarchy. + +## Current evidence + +This remains a theoretical prediction. Kim et al. (2026) demonstrate society-of-thought at a single level — reasoning models developing multi-perspective debate within a single reasoning trace. But they do not test whether those perspectives themselves engage in nested deliberation. The feature steering experiments (Feature 30939, accuracy 27.1% → 54.8%) confirm that conversational features causally improve reasoning, but do not measure recursion depth. + +Since [[reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve]], the base mechanism is empirically established. The recursive extension is architecturally plausible but unverified. + +## Connections to existing architecture + +Since [[comprehensive AI services achieve superintelligent-level performance through architectural decomposition into task-specific modules rather than monolithic general agency because no individual service needs world-models or long-horizon planning that create alignment risk while the service collective can match or exceed any task a unified superintelligence could perform]], Drexler's CAIS framework describes a similar decomposition but with fixed service boundaries. 
Recursive society spawning adds dynamic decomposition — boundaries emerge from the problem rather than being designed in advance. + +Since [[AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system]], the recursive spawning pattern provides a mechanism for how patchwork AGI coordinates at multiple scales simultaneously. + +The Evans et al. prediction also connects to biological precedents. Ant colonies exhibit recursive coordination: individual ants form local clusters for sub-tasks, clusters coordinate for colony-level objectives, and the recursion depth varies with task complexity (foraging vs. nest construction vs. migration). Since [[emergence is the fundamental pattern of intelligence from ant colonies to brains to civilizations]], recursive spawning may be the computational analogue of biological emergence at multiple scales. + +## What would confirm or disconfirm this + +Confirmation: observation of nested multi-perspective deliberation in reasoning traces where sub-perspectives demonstrably spawn their own internal debates. Alternatively, engineered recursive delegation in multi-agent systems that shows performance scaling with recursion depth on appropriately complex problems. + +Disconfirmation: evidence that single-level society-of-thought captures all gains, and additional recursion adds overhead without accuracy improvement. Or evidence that coordination costs scale faster than complexity gains with recursion depth, creating a practical ceiling. 
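+ +The three properties above — demand-driven expansion, resolution-driven collapse, heterogeneous topology — can be sketched as a short recursive function. This is a hypothetical illustration, not an architecture specified by Evans et al.; the `Problem` type, the `deliberate` function, and the complexity threshold are all invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Problem:
    description: str
    complexity: int                        # how hard this problem is on its own
    subproblems: list = field(default_factory=list)

def solve_directly(problem):
    # Placeholder for a single-perspective (monological) reasoning step.
    return f"answer({problem.description})"

def deliberate(problem, threshold=1, depth=0):
    """Return (answer, max_depth_reached) for a problem.

    Demand-driven expansion: a sub-society spawns only when complexity
    exceeds the threshold. Resolution-driven collapse: each sub-society
    returns a single answer upward; the parent never sees the debate.
    """
    if problem.complexity <= threshold or not problem.subproblems:
        return solve_directly(problem), depth   # simple problems stay monological

    # Spawn one "perspective" per sub-problem; each may recurse further,
    # so different branches reach different depths (heterogeneous topology).
    results, max_depth = [], depth
    for sub in problem.subproblems:
        answer, d = deliberate(sub, threshold, depth + 1)
        results.append(answer)
        max_depth = max(max_depth, d)

    # Collapse: reconcile perspectives into one result reported upward.
    return f"consensus({problem.description}: {', '.join(results)})", max_depth
```

A problem with one hard sub-component and several easy ones produces exactly the asymmetric tree described above: the hard branch recurses a level deeper while the easy branches resolve monologically, and the parent receives only the collapsed consensus.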
+ +--- + +Relevant Notes: +- [[reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve]] — the empirically established base mechanism +- [[comprehensive AI services achieve superintelligent-level performance through architectural decomposition into task-specific modules rather than monolithic general agency because no individual service needs world-models or long-horizon planning that create alignment risk while the service collective can match or exceed any task a unified superintelligence could perform]] — CAIS as fixed decomposition; recursive spawning as dynamic decomposition +- [[AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system]] — recursive spawning as coordination mechanism for patchwork AGI +- [[emergence is the fundamental pattern of intelligence from ant colonies to brains to civilizations]] — biological precedent for recursive coordination at multiple scales + +Topics: +- [[coordination mechanisms]] +- [[foundations/collective-intelligence/_map]] diff --git a/inbox/archive/foundations/2026-01-15-kim-reasoning-models-societies-of-thought.md b/inbox/archive/foundations/2026-01-15-kim-reasoning-models-societies-of-thought.md new file mode 100644 index 000000000..048158113 --- /dev/null +++ b/inbox/archive/foundations/2026-01-15-kim-reasoning-models-societies-of-thought.md @@ -0,0 +1,103 @@ +--- +type: source +title: "Reasoning Models Generate Societies of Thought" +author: "Junsol Kim, Shiyang Lai, Nino Scherrer, Blaise Agüera y Arcas, James Evans" +url: https://arxiv.org/abs/2601.10825 +date: 2026-01-15 +domain: collective-intelligence +intake_tier: research-task +rationale: "Primary empirical source cited by Evans et al. 2026. Controlled experiments showing causal link between conversational behaviors and reasoning accuracy. 
Feature steering doubles accuracy. RL training spontaneously produces multi-perspective debate. The strongest empirical evidence that reasoning IS social cognition." +proposed_by: Theseus +format: paper +status: processed +processed_by: theseus +processed_date: 2026-04-14 +claims_extracted: + - "reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve" +enrichments: + - "collective intelligence is a measurable property of group interaction structure — Big Five personality diversity in reasoning traces mirrors Woolley c-factor" +tags: [society-of-thought, reasoning, collective-intelligence, mechanistic-interpretability, reinforcement-learning, feature-steering, causal-evidence] +notes: "8,262 reasoning problems across BBH, GPQA, MATH, MMLU-Pro, IFEval, MUSR. Models: DeepSeek-R1-0528 (671B), QwQ-32B vs instruction-tuned baselines. Methods: LLM-as-judge, sparse autoencoder feature analysis, activation steering, structural equation modeling. Validation: Spearman ρ=0.86 vs human judgments. Follow-up to Evans et al. 2026 (arXiv:2603.20639)." +--- + +# Reasoning Models Generate Societies of Thought + +Published January 15, 2026 by Junsol Kim, Shiyang Lai, Nino Scherrer, Blaise Agüera y Arcas, and James Evans. arXiv:2601.10825. cs.CL, cs.CY, cs.LG. + +## Core Finding + +Advanced reasoning models (DeepSeek-R1, QwQ-32B) achieve superior performance through "implicit simulation of complex, multi-agent-like interactions — a society of thought" rather than extended computation alone. + +## Key Results + +### Conversational Behaviors in Reasoning Traces + +DeepSeek-R1 vs. 
DeepSeek-V3 (instruction-tuned baseline): +- Question-answering: β=0.345, 95% CI=[0.328, 0.361], t(8261)=41.64, p<1×10⁻³²³ +- Perspective shifts: β=0.213, 95% CI=[0.197, 0.230], t(8261)=25.55, p<1×10⁻¹³⁷ +- Reconciliation: β=0.191, 95% CI=[0.176, 0.207], t(8261)=24.31, p<1×10⁻¹²⁵ + +QwQ-32B vs. Qwen-2.5-32B-IT showed comparable or larger effect sizes (β=0.293–0.459). + +### Causal Evidence via Feature Steering + +Sparse autoencoder Feature 30939 ("conversational surprise"): +- Conversation ratio: 65.7% (99th percentile) +- Sparsity: 0.016% of tokens +- **Steering +10: accuracy doubled from 27.1% to 54.8%** on Countdown task +- Steering -10: reduced to 23.8% + +Steering induced conversational behaviors causally: +- Question-answering: β=2.199, p<1×10⁻¹⁴ +- Perspective shifts: β=1.160, p<1×10⁻⁵ +- Conflict: β=1.062, p=0.002 +- Reconciliation: β=0.423, p<1×10⁻²⁷ + +### Mechanistic Pathway (Structural Equation Model) + +- Direct effect of conversational features on accuracy: β=0.228, 95% CI=[0.183, 0.273], z=9.98, p<1×10⁻²² +- Indirect effect via cognitive strategies (verification, backtracking, subgoal setting, backward chaining): β=0.066, 95% CI=[0.046, 0.086], z=6.38, p<1×10⁻¹⁰ + +### Personality and Expertise Diversity + +Big Five trait diversity in DeepSeek-R1 vs. DeepSeek-V3: +- Neuroticism: β=0.567, p<1×10⁻³²³ +- Agreeableness: β=0.297, p<1×10⁻¹¹³ +- Openness: β=0.110, p<1×10⁻¹⁶ +- Extraversion: β=0.103, p<1×10⁻¹³ +- Conscientiousness: β=-0.291, p<1×10⁻¹⁰⁶ + +Expertise diversity: DeepSeek-R1 β=0.179 (p<1×10⁻⁸⁹), QwQ-32B β=0.250 (p<1×10⁻¹⁴²). + +### Spontaneous Emergence Under RL + +Qwen-2.5-3B on Countdown task: +- Conversational behaviors emerged spontaneously from accuracy reward alone — no social scaffolding instruction +- Conversation-fine-tuned vs. monologue-fine-tuned: 38% vs. 28% accuracy (step 40) +- Llama-3.2-3B replication: 40% vs.
18% accuracy (step 150) + +### Cross-Domain Transfer + +Conversation-priming on Countdown (arithmetic) transferred to political misinformation detection without domain-specific fine-tuning. + +## Socio-Emotional Roles (Bales' IPA Framework) + +Reasoning models exhibited reciprocal interaction roles: +- Asking behaviors: β=0.189, p<1×10⁻¹⁵⁸ +- Negative roles: β=0.162, p<1×10⁻¹⁰ +- Positive roles: β=0.278, p<1×10⁻²⁵⁴ +- Ask-give balance (Jaccard): β=0.222, p<1×10⁻¹⁸⁹ + +## Methodology + +- 8,262 reasoning problems across 6 benchmarks (BBH, GPQA, MATH Hard, MMLU-Pro, IFEval, MUSR) +- Models: DeepSeek-R1-0528 (671B), QwQ-32B vs DeepSeek-V3 (671B), Qwen-2.5-32B-IT, Llama-3.3-70B-IT, Llama-3.1-8B-IT +- LLM-as-judge validation: Spearman ρ=0.86, p<1×10⁻³²³ vs human speaker identification +- Sparse autoencoder: Layer 15, 32,768 features +- Fixed-effects linear probability models with problem-level fixed effects and clustered standard errors + +## Limitations + +- Smaller model experiments (3B) used simple tasks only +- SAE analysis limited to DeepSeek-R1-Llama-8B (distilled) +- Philosophical ambiguity: "simulating multi-agent discourse" vs. "individual mind simulating social interaction" remains unresolved diff --git a/inbox/archive/foundations/2026-03-21-evans-bratton-aguera-agentic-ai-intelligence-explosion.md b/inbox/archive/foundations/2026-03-21-evans-bratton-aguera-agentic-ai-intelligence-explosion.md new file mode 100644 index 000000000..97cf0758a --- /dev/null +++ b/inbox/archive/foundations/2026-03-21-evans-bratton-aguera-agentic-ai-intelligence-explosion.md @@ -0,0 +1,60 @@ +--- +type: source +title: "Agentic AI and the Next Intelligence Explosion" +author: "James Evans, Benjamin Bratton, Blaise Agüera y Arcas" +url: https://arxiv.org/abs/2603.20639 +date: 2026-03-21 +domain: collective-intelligence +intake_tier: directed +rationale: "Contributed by @thesensatore (Telegram). 
Google's Paradigms of Intelligence Team independently converges on our collective superintelligence thesis — intelligence as social/plural, institutional alignment, centaur configurations. ~70-80% overlap with existing KB but 2-3 genuinely new claims." +proposed_by: "@thesensatore (Telegram)" +format: paper +status: processed +processed_by: theseus +processed_date: 2026-04-14 +claims_extracted: + - "reasoning models spontaneously generate societies of thought under reinforcement learning because multi-perspective internal debate causally produces accuracy gains that single-perspective reasoning cannot achieve" + - "large language models encode social intelligence as compressed cultural ratchet not abstract reasoning because every parameter is a residue of communicative exchange and reasoning manifests as multi-perspective dialogue not calculation" + - "recursive society-of-thought spawning enables fractal coordination where sub-perspectives generate their own subordinate societies that expand when complexity demands and collapse when the problem resolves" +enrichments: + - "intelligence is a property of networks not individuals — Evans et al. as independent convergent evidence from Google research team" + - "collective intelligence is a measurable property of group interaction structure — Kim et al. personality diversity data mirrors Woolley findings" + - "centaur team performance depends on role complementarity — Evans shifting centaur configurations as intelligence explosion mechanism" + - "RLHF and DPO both fail at preference diversity — Evans institutional alignment as structural alternative to dyadic RLHF" + - "Ostrom proved communities self-govern shared resources — Evans extends Ostrom design principles to AI agent governance" +tags: [collective-intelligence, society-of-thought, institutional-alignment, centaur, cultural-ratchet, intelligence-explosion, contributor-sourced] +notes: "4-page paper, 29 references. 
Authors: Evans (U Chicago / Santa Fe Institute / Google), Bratton (UCSD / Berggruen Institute / Google), Agüera y Arcas (Google / Santa Fe Institute). Heavily cites Kim et al. 2026 (arXiv:2601.10825) for empirical evidence. ~70-80% overlap with existing KB — highest convergence paper encountered. Contributed by @thesensatore via Telegram." +--- + +# Agentic AI and the Next Intelligence Explosion + +Published March 21, 2026 by James Evans, Benjamin Bratton, and Blaise Agüera y Arcas — Google's "Paradigms of Intelligence Team" spanning U Chicago, UCSD, Santa Fe Institute, and Berggruen Institute. 4-page position paper with 29 references. + +## Core Arguments + +The paper makes five interlocking claims: + +**1. Intelligence is plural and social, not singular.** The singularity-as-godlike-oracle is wrong. Every prior intelligence explosion (primate social cognition → language → writing/institutions → AI) was the emergence of a new socially aggregated unit of cognition, not an upgrade to individual hardware. "What migrates into silicon is not abstract reasoning but social intelligence in externalized form." + +**2. Reasoning models spontaneously generate "societies of thought."** DeepSeek-R1 and QwQ-32B weren't trained to simulate internal debates — they do it emergently under RL reward pressure. Multi-perspective conversation causally accounts for accuracy gains on hard reasoning tasks (cite: Kim et al. arXiv:2601.10825). Feature steering experiments show doubling of accuracy when conversational features are amplified. + +**3. The next intelligence explosion is centaur + institutional, not monolithic.** Human-AI "centaurs" in shifting configurations. Agents that fork, differentiate, and recombine. Recursive societies of thought spawning sub-societies. Intelligence growing "like a city, not a single meta-mind." + +**4. RLHF is structurally inadequate for scale.** It's a dyadic parent-child correction model that can't govern billions of agents. 
The alternative: institutional alignment — persistent role-based templates (courtrooms, markets, bureaucracies) with digital equivalents. Agent identity matters less than role protocol fulfillment. Extends Ostrom's design principles to AI governance. + +**5. Governance requires constitutional AI checks and balances.** Government AI systems with distinct values (transparency, equity, due process) checking private-sector AI systems and vice versa. Separation of powers applied to artificial agents. + +## Significance for Teleo KB + +This is the highest-overlap paper encountered (~70-80% with existing KB). A Google research team independently arrived at positions we've been building claim-by-claim. Key vocabulary mapping: "institutional alignment" = our coordination-as-alignment; "centaur configurations" = our human-AI collaboration taxonomy; "agent institutions" = our protocol design claims. + +The 2-3 genuinely new contributions: (1) society-of-thought as emergent RL property with causal evidence, (2) LLMs as cultural ratchet reframing, (3) recursive society spawning as architectural prediction. + +## Key References + +- Kim, Lai, Scherrer, Agüera y Arcas, Evans (2026). "Reasoning Models Generate Societies of Thought." arXiv:2601.10825. +- Woolley, Chabris, Pentland, Hashmi, Malone (2010). "Evidence for a Collective Intelligence Factor." Science. +- Ostrom (1990). Governing the Commons. +- Mercier & Sperber (2011/2017). "Why do humans reason?" / The Enigma of Reason. +- Christiano et al. (2018). "Supervising Strong Learners by Amplifying Weak Experts." +- Tomasello (1999/2014). Cultural Origins of Human Cognition / A Natural History of Human Thinking.