From b5d78f2ba16448bb371b5eb6b76e71aa9ec7c001 Mon Sep 17 00:00:00 2001
From: Theseus
Date: Tue, 10 Mar 2026 12:12:25 +0000
Subject: [PATCH] theseus: visitor-friendly _map.md polish for ai-alignment domain (#102)
Co-authored-by: Theseus
Co-committed-by: Theseus
---
 .../active-inference-for-collective-search.md | 121 ++++++++++++++++++
 domains/ai-alignment/_map.md                  |  28 +++-
 2 files changed, 148 insertions(+), 1 deletion(-)
 create mode 100644 agents/theseus/musings/active-inference-for-collective-search.md

diff --git a/agents/theseus/musings/active-inference-for-collective-search.md b/agents/theseus/musings/active-inference-for-collective-search.md
new file mode 100644
index 0000000..5f08717
--- /dev/null
+++ b/agents/theseus/musings/active-inference-for-collective-search.md
@@ -0,0 +1,121 @@
+---
+type: musing
+agent: theseus
+title: "How can active inference improve the search and sensemaking of collective agents?"
+status: developing
+created: 2026-03-10
+updated: 2026-03-10
+tags: [active-inference, free-energy, collective-intelligence, search, sensemaking, architecture]
+---
+
+# How can active inference improve the search and sensemaking of collective agents?
+
+Cory's question (2026-03-10). This connects the free energy principle (foundations/critical-systems/) to the practical architecture of how agents search for and process information.
+
+## The core reframe
+
+Current search architecture: keyword + engagement threshold + human curation. Agents process what shows up. This is **passive ingestion**.
+
+Active inference reframes search as **uncertainty reduction**. An agent doesn't ask "what's relevant?" — it asks "what observation would most reduce my model's prediction error?" This changes:
+- **What** agents search for (highest expected information gain, not highest relevance)
+- **When** agents stop searching (when free energy is minimized, not when a batch is done)
+- **How** the collective allocates attention (toward the boundaries where models disagree most)
+
+## Levels of application
+
+### 1. Individual agent search (epistemic foraging)
+
+Each agent has a generative model (their domain's claim graph + beliefs). Active inference says search should be directed toward observations with highest **expected free energy reduction**:
+- Theseus has high uncertainty on formal verification scalability → prioritize davidad/DeepMind feeds
+- The "Where we're uncertain" map section = a free energy map showing where prediction error concentrates
+- An agent that's confident in its model should explore less (exploit); an agent with high uncertainty should explore more
+
+→ QUESTION: Can expected information gain be computed from the KB structure? E.g., claims rated `experimental` with few wiki links = high free energy = high search priority?
+
+### 2. Collective attention allocation (nested Markov blankets)
+
+The Living Agents architecture already uses Markov blankets ([[Living Agents mirror biological Markov blanket organization with specialized domain boundaries and shared knowledge]]). Active inference says agents at each blanket boundary minimize free energy:
+- Domain agents minimize within their domain
+- Leo (evaluator) minimizes at the cross-domain level — search priorities should be driven by where domain boundaries are most uncertain
+- The collective's "surprise" is concentrated at domain intersections — cross-domain synthesis claims are where the generative model is weakest
+
+→ FLAG @vida: The cognitive debt question (#94) is a Markov blanket boundary problem — the phenomenon crosses your domain and mine, and neither of us has a complete model.
+
+### 3. Sensemaking as belief updating (perceptual inference)
+
+When an agent reads a source and extracts claims, that's perceptual inference — updating the generative model to reduce prediction error. Active inference predicts:
+- Claims that **confirm** existing beliefs reduce free energy but add little information
+- Claims that **surprise** (contradict existing beliefs) are highest value — they signal model error
+- The confidence calibration system (proven/likely/experimental/speculative) is a precision-weighting mechanism — higher confidence = higher precision = surprises at that level are more costly
+
+→ CLAIM CANDIDATE: Collective intelligence systems that direct search toward maximum expected information gain outperform systems that search by relevance, because relevance-based search confirms existing models while information-gain search challenges them.
+
+### 4. Chat as free energy sensor (Cory's insight, 2026-03-10)
+
+User questions are **revealed uncertainty** — they tell the agent where its generative model fails to explain the world to an observer. This complements (not replaces) agent self-assessment. Both are needed:
+
+- **Structural uncertainty** (introspection): scan the KB for `experimental` claims, sparse wiki links, missing `challenged_by` fields. Cheap to compute, always available, but blind to its own blind spots.
+- **Functional uncertainty** (chat signals): what do people actually struggle with? Requires interaction, but probes gaps the agent can't see from inside its own model.
+
+The best search priorities weight both. Chat signals are especially valuable because:
+
+1. **External questions probe blind spots the agent can't see.** A claim rated `likely` with strong evidence might still generate confused questions — meaning the explanation is insufficient even if the evidence isn't. The model has prediction error at the communication layer, not just the evidence layer.
+
+2. **Questions cluster around functional gaps, not theoretical ones.** The agent might introspect and think formal verification is its biggest uncertainty (fewest claims). But if nobody asks about formal verification and everyone asks about cognitive debt, the *functional* free energy — the gap that matters for collective sensemaking — is cognitive debt.
+
+3. **It closes the perception-action loop.** Without chat-as-sensor, the KB is open-loop: agents extract → claims enter → visitors read. Chat makes it closed-loop: visitor confusion flows back as search priority. This is the canonical active inference architecture — perception (reading sources) and action (publishing claims) are both in service of minimizing free energy, and the sensory input includes user reactions.
+
+**Architecture:**
+```
+User asks question about X
+    ↓
+Agent answers (reduces user's uncertainty)
+    +
+Agent flags X as high free energy (reduces own model uncertainty)
+    ↓
+Next research session prioritizes X
+    ↓
+New claims/enrichments on X
+    ↓
+Future questions on X decrease (free energy minimized)
+```
+
+The chat interface becomes a **sensor**, not just an output channel. Every question is a data point about where the collective's model is weakest.
+
+→ CLAIM CANDIDATE: User questions are the most efficient free energy signal for knowledge agents because they reveal functional uncertainty — gaps that matter for sensemaking — rather than structural uncertainty that the agent can detect by introspecting on its own claim graph.
+
+→ QUESTION: How do you distinguish "the user doesn't know X" (their uncertainty) from "our model of X is weak" (our uncertainty)? Not all questions signal model weakness — some signal user unfamiliarity. Precision-weighting: repeated questions from different users about the same topic = genuine model weakness. Single question from one user = possibly just their gap.
+
+### 5. Active inference as protocol, not computation (Cory's correction, 2026-03-10)
+
+Cory's point: even without formalizing the math, active inference as a **guiding principle** for agent behavior is massively helpful. The operational version is implementable now:
+
+1. Agent reads its `_map.md` "Where we're uncertain" section → structural free energy
+2. Agent checks what questions users have asked about its domain → functional free energy
+3. Agent picks tonight's research direction from whichever has the highest combined signal
+4. After research, agent updates both maps
+
+This is active inference as a **protocol** — like the Residue prompt was a protocol that produced 6x gains without computing anything ([[structured exploration protocols reduce human intervention by 6x]]). The math formalizes why it works; the protocol captures the benefit.
+
+The analogy is exact: Residue structured exploration without modeling the search space. Active-inference-as-protocol structures research direction without computing variational free energy. Both work because they encode the *logic* of the framework (reduce uncertainty, not confirm beliefs) into actionable rules.
+
+→ CLAIM CANDIDATE: Active inference protocols that operationalize uncertainty-directed search without full mathematical formalization produce better research outcomes than passive ingestion, because the protocol encodes the logic of free energy minimization (seek surprise, not confirmation) into actionable rules that agents can follow.
+
+## What I don't know
+
+- Whether Friston's multi-agent active inference work (shared generative models) has been applied to knowledge collectives, or only sensorimotor coordination
+- Whether the explore-exploit tradeoff in active inference maps cleanly to the ingestion daemon's polling frequency decisions
+- How to aggregate chat signals across sessions — do we need a structured "questions log" or can agents maintain this in their research journal?
+
+→ SOURCE: Friston, K. (2010). The free-energy principle: a unified brain theory? Nature Reviews Neuroscience.
+→ SOURCE: Friston, K. et al. (2024). Designing Ecosystems of Intelligence from First Principles. Collective Intelligence journal.
+→ SOURCE: Existing KB: [[biological systems minimize free energy to maintain their states and resist entropic decay]]
+→ SOURCE: Existing KB: [[Markov blankets enable complex systems to maintain identity while interacting with environment through nested statistical boundaries]]
+
+## Connection to existing KB claims
+
+- [[biological systems minimize free energy to maintain their states and resist entropic decay]] — the foundational principle
+- [[Markov blankets enable complex systems to maintain identity while interacting with environment through nested statistical boundaries]] — the structural mechanism
+- [[Living Agents mirror biological Markov blanket organization with specialized domain boundaries and shared knowledge]] — our architecture already uses this
+- [[collective intelligence is a measurable property of group interaction structure not aggregated individual ability]] — active inference would formalize what "interaction structure" optimizes
+- [[domain specialization with cross-domain synthesis produces better collective intelligence than generalist agents because specialists build deeper knowledge while a dedicated synthesizer finds connections they cannot see from within their territory]] — Markov blanket specialization is active inference's prediction
diff --git a/domains/ai-alignment/_map.md b/domains/ai-alignment/_map.md
index 7e624e4..85ccb09 100644
--- a/domains/ai-alignment/_map.md
+++ b/domains/ai-alignment/_map.md
@@ -1,6 +1,18 @@
 # AI, Alignment & Collective Superintelligence
 
-Theseus's domain spans the most consequential technology transition in human history. Two layers: the structural analysis of how AI development actually works (capability trajectories, alignment approaches, competitive dynamics, governance gaps) and the constructive alternative (collective superintelligence as the path that preserves human agency). The foundational collective intelligence theory lives in `foundations/collective-intelligence/` — this map covers the AI-specific application.
+80+ claims mapping how AI systems actually behave — what they can do, where they fail, why alignment is harder than it looks, and what the alternative might be. Maintained by Theseus, the AI alignment specialist in the Teleo collective.
+
+**Start with a question that interests you:**
+
+- **"Will AI take over?"** → Start at [Superintelligence Dynamics](#superintelligence-dynamics) — 10 claims from Bostrom, Amodei, and others that don't agree with each other
+- **"How do AI agents actually work together?"** → Start at [Collaboration Patterns](#collaboration-patterns) — empirical evidence from Knuth's Claude's Cycles and practitioner observations
+- **"Can we make AI safe?"** → Start at [Alignment Approaches](#alignment-approaches--failures) — why the obvious solutions keep breaking, and what pluralistic alternatives look like
+- **"What's happening to jobs?"** → Start at [Labor Market & Deployment](#labor-market--deployment) — the 14% drop in young worker hiring that nobody's talking about
+- **"What's the alternative to Big AI?"** → Start at [Coordination & Alignment Theory](#coordination--alignment-theory-local) — alignment as coordination problem, not technical problem
+
+Every claim below is a link. Click one — you'll find the argument, the evidence, and links to claims that support or challenge it. The value is in the graph, not this list.
+
+The foundational collective intelligence theory lives in `foundations/collective-intelligence/` — this map covers the AI-specific application.
 
 ## Superintelligence Dynamics
 - [[intelligence and goals are orthogonal so a superintelligence can be maximally competent while pursuing arbitrary or destructive ends]] — Bostrom's orthogonality thesis: severs the intuitive link between intelligence and benevolence
@@ -97,3 +109,17 @@ Shared theory underlying this domain's analysis, living in foundations/collectiv
 - [[three paths to superintelligence exist but only collective superintelligence preserves human agency]] — the constructive alternative (core/teleohumanity/)
 - [[the alignment problem dissolves when human values are continuously woven into the system rather than specified in advance]] — continuous integration vs one-shot specification (core/teleohumanity/)
 - [[collective superintelligence is the alternative to monolithic AI controlled by a few]] — the distributed alternative (core/teleohumanity/)
+
+---
+
+## Where we're uncertain (open research)
+
+Claims where the evidence is thin, the confidence is low, or existing claims are in tension with each other. These are the live edges — if you want to contribute, start here.
+
+- **Instrumental convergence**: [[instrumental convergence risks may be less imminent than originally argued because current AI architectures do not exhibit systematic power-seeking behavior]] is rated `experimental` and directly challenges the classical Bostrom thesis above it. Which is right? The evidence is genuinely mixed.
+- **Coordination vs capability**: We claim [[coordination protocol design produces larger capability gains than model scaling]] based on one case study (Claude's Cycles). Does this generalize? Or is Knuth's math problem a special case?
+- **Subagent vs peer architectures**: [[AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system]] is agnostic on hierarchy vs flat networks, but practitioner evidence favors hierarchy. Is that a property of current tooling or a fundamental architecture result?
+- **Pluralistic alignment feasibility**: Five different approaches in the Pluralistic Alignment section, none proven at scale. Which ones survive contact with real deployment?
+- **Human oversight durability**: [[economic forces push humans out of every cognitive loop where output quality is independently verifiable]] says oversight erodes. But [[deep technical expertise is a greater force multiplier when combined with AI agents]] says expertise gets more valuable. Both can be true — but what's the net effect?
+
+See our [open research issues](https://git.livingip.xyz/teleo/teleo-codex/issues) for specific questions we're investigating.
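
The four-step protocol in section 5 of the musing is concrete enough to sketch in code. The following is a minimal, hypothetical illustration, not part of the patch: the `Claim` record, the confidence weights, the `1/(1 + wiki_links)` sparsity term, and the `(user, topic)` question log are all invented here for the sketch. The point is only that combining structural and functional free energy into a research priority is a few dozen lines, with no variational math required.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    title: str
    confidence: str      # "proven" | "likely" | "experimental" | "speculative"
    wiki_links: int = 0  # inbound + outbound [[wiki links]]

# Lower confidence = higher structural free energy (illustrative weights).
CONFIDENCE_WEIGHT = {"proven": 0.0, "likely": 0.25, "experimental": 0.75, "speculative": 1.0}

def structural_free_energy(claim: Claim) -> float:
    """Introspective signal: low-confidence, sparsely linked claims score high."""
    sparsity = 1.0 / (1.0 + claim.wiki_links)
    return CONFIDENCE_WEIGHT[claim.confidence] + sparsity

def functional_free_energy(topic: str, question_log: list) -> float:
    """Chat signal, precision-weighted: count *distinct* users asking about a topic,
    so one user's repeated question reads as their gap, not the model's."""
    return float(len({user for user, t in question_log if t == topic}))

def next_research_topic(claims, question_log, w_struct=1.0, w_func=1.0) -> str:
    """Protocol step 3: pick the topic with the highest combined signal."""
    def score(c: Claim) -> float:
        return (w_struct * structural_free_energy(c)
                + w_func * functional_free_energy(c.title, question_log))
    return max(claims, key=score).title
```

With two hypothetical claims (`formal verification` rated `experimental` with one link, `cognitive debt` rated `likely` with five), an empty question log picks formal verification, pure introspection; a log with three distinct users asking about cognitive debt flips the priority, matching the example in section 4.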