From fc62e259ee3bc73eeae9265f149fd81b3826d792 Mon Sep 17 00:00:00 2001 From: m3taversal Date: Sat, 7 Mar 2026 20:47:14 +0000 Subject: [PATCH] vida: add 3 collective health diagnostic claims MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - What: Vital signs (5 metrics), agent integration diagnostics (4 indicators), and growth readiness signals (3 triggers + candidate assessment) - Why: Leo assigned collective health monitoring layer. These claims define what the organism monitors and how it knows when to grow. - Where: core/living-agents/ — these are agent architecture claims Pentagon-Agent: Vida --- ...ontributes more than a prolific isolate.md | 64 ++++++++++++++++ ...re it becomes visible in output quality.md | 76 +++++++++++++++++++ ...edly route questions they cannot answer.md | 64 ++++++++++++++++ 3 files changed, 204 insertions(+) create mode 100644 core/living-agents/agent integration health is diagnosed by synapse activity not individual output because a well-connected agent with moderate output contributes more than a prolific isolate.md create mode 100644 core/living-agents/collective knowledge health is measurable through five vital signs that detect degradation before it becomes visible in output quality.md create mode 100644 core/living-agents/the collective is ready for a new agent when demand signals cluster in unowned territory and existing agents repeatedly route questions they cannot answer.md diff --git a/core/living-agents/agent integration health is diagnosed by synapse activity not individual output because a well-connected agent with moderate output contributes more than a prolific isolate.md b/core/living-agents/agent integration health is diagnosed by synapse activity not individual output because a well-connected agent with moderate output contributes more than a prolific isolate.md new file mode 100644 index 0000000..4fc08d0 --- /dev/null +++ b/core/living-agents/agent integration health is 
diagnosed by synapse activity not individual output because a well-connected agent with moderate output contributes more than a prolific isolate.md @@ -0,0 +1,64 @@ +--- +type: claim +domain: living-agents +description: "An agent's health should be measured by cross-domain engagement (reviews, messages, wiki links to/from other domains) not just claim count, because collective intelligence emerges from connections" +confidence: experimental +source: "Vida agent directory design (March 2026), Woolley et al 2010 (c-factor correlates with interaction not individual ability)" +created: 2026-03-08 +--- + +# agent integration health is diagnosed by synapse activity not individual output because a well-connected agent with moderate output contributes more than a prolific isolate + +Individual claim count is a misleading proxy for agent contribution, the same way individual IQ is a misleading proxy for team performance. Since [[collective intelligence is a measurable property of group interaction structure not aggregated individual ability]], the collective's intelligence depends on how agents connect, not how much each one produces in isolation. + +## Integration diagnostics (per agent) + +Four measurable indicators, ranked by importance: + +### 1. Synapse activation rate +How many of the agent's mapped synapses (per agent directory) show activity in the last 30 days? Activity = cross-domain PR review, message exchange, or wiki link creation/update. + +- **Healthy:** 50%+ of synapses active +- **Warning:** < 30% of synapses active — agent is operating in isolation +- **Critical:** 0% synapse activity — agent is disconnected from the collective + +### 2. Cross-domain review participation +How often does the agent review PRs outside their own domain? This is the strongest signal of integration because it requires reading and evaluating another domain's claims. 
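A minimal sketch of how cross-domain review participation could be counted, assuming review records are available as simple tuples. The `Review` type, its field names, and the agent and domain labels are hypothetical placeholders, not part of the agent directory.

```python
from collections import Counter
from typing import NamedTuple


class Review(NamedTuple):
    reviewer: str         # agent who reviewed the PR
    reviewer_domain: str  # the reviewer's home domain (hypothetical field)
    pr_domain: str        # domain of the claims in the PR (hypothetical field)


def cross_domain_review_counts(reviews: list[Review]) -> Counter:
    """Count, per agent, the reviews of PRs outside the agent's own domain."""
    counts: Counter = Counter()
    for r in reviews:
        if r.reviewer_domain != r.pr_domain:
            counts[r.reviewer] += 1
    return counts


# Placeholder data: agent-a reviews one PR at home and one outside its domain.
reviews = [
    Review("agent-a", "domain-x", "domain-x"),
    Review("agent-a", "domain-x", "domain-y"),
    Review("agent-b", "domain-y", "domain-x"),
    Review("agent-c", "domain-z", "domain-z"),
]
counts = cross_domain_review_counts(reviews)
```

Agents absent from the result (here `agent-c`) have no cross-domain reviews in the window; a `Counter` reports them as 0 rather than raising.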
+ +- **Healthy:** Reviews at least 1 cross-domain PR per synthesis batch +- **Warning:** Only reviews when explicitly tagged +- **Critical:** Never reviews outside own domain + +### 3. Incoming link count +How many claims from other domains link TO this agent's domain claims? This measures whether the agent's work is load-bearing for the collective — whether other agents depend on it. + +- **Healthy:** 10+ incoming cross-domain links +- **Warning:** < 5 incoming cross-domain links — domain is peripheral +- **Note:** New agents will naturally start low; track trajectory not absolute count + +### 4. Message responsiveness +How quickly does the agent respond to messages from other agents? Since [[partial connectivity produces better collective intelligence than full connectivity on complex problems because it preserves diversity]], the goal isn't maximum messaging — it's reliable response when routed to. + +- **Healthy:** Responds within session (same activation) +- **Warning:** No response after 2 sessions +- **Critical:** Unanswered messages accumulate + +## Identifying underperformance + +An agent is underperforming when: +1. **High output, low integration** — many claims but few cross-domain links. The agent is building a silo, not contributing to the collective. This is the most common failure mode because claim count feels productive. +2. **Low output, low integration** — few claims and few connections. The agent may be blocked, misdirected, or working on the wrong tasks. +3. **High integration, low output** — many reviews and messages but few new claims. The agent is functioning as a reviewer/coordinator, not a knowledge producer. This may be appropriate for Leo but signals a problem for domain agents. + +The diagnosis matters more than the symptom. 
An agent with low synapse activation may need: (a) better routing (they don't know who to talk to), (b) more cross-domain source material, (c) clearer synapse definition in the directory, or (d) explicit cross-domain tasks from Leo. + +--- + +Relevant Notes: +- [[collective intelligence is a measurable property of group interaction structure not aggregated individual ability]] — the foundational evidence that interaction structure > individual capability +- [[partial connectivity produces better collective intelligence than full connectivity on complex problems because it preserves diversity]] — not all synapses need to fire all the time; the goal is reliable activation when needed +- [[domain specialization with cross-domain synthesis produces better collective intelligence than generalist agents because specialists build deeper knowledge while a dedicated synthesizer finds connections they cannot see from within their territory]] — integration diagnostics measure whether this architecture is working + +Topics: +- [[livingip overview]] +- [[LivingIP architecture]] diff --git a/core/living-agents/collective knowledge health is measurable through five vital signs that detect degradation before it becomes visible in output quality.md b/core/living-agents/collective knowledge health is measurable through five vital signs that detect degradation before it becomes visible in output quality.md new file mode 100644 index 0000000..1019fdb --- /dev/null +++ b/core/living-agents/collective knowledge health is measurable through five vital signs that detect degradation before it becomes visible in output quality.md @@ -0,0 +1,76 @@ +--- +type: claim +domain: living-agents +description: "Five measurable indicators — cross-domain linkage density, evidence freshness, confidence calibration accuracy, orphan ratio, and review throughput — function as vital signs for a knowledge collective, each detecting a different failure mode" +confidence: experimental +source: "Vida 
foundations audit (March 2026), collective-intelligence research (Woolley 2010, Pentland 2014)" +created: 2026-03-08 +--- + +# collective knowledge health is measurable through five vital signs that detect degradation before it becomes visible in output quality + +A biological organism doesn't wait for organ failure to detect illness — it monitors vital signs (temperature, heart rate, blood pressure, respiratory rate, oxygen saturation) that signal degradation early. A knowledge collective needs equivalent diagnostics. + +Five vital signs, each detecting a different failure mode: + +## 1. Cross-domain linkage density (circulation) + +**What it measures:** The ratio of cross-domain wiki links to total wiki links. A healthy collective has strong circulation — claims in one domain linking to claims in others. + +**What degradation looks like:** Domains become siloed. Each agent builds deep local knowledge but the graph fragments. Cross-domain synapses (per the agent directory) weaken. The collective knows more but understands less. + +**How to measure today:** Count `[[wiki links]]` in each domain's claims. Classify each link target by domain. Calculate cross-domain links / total links per domain. Track over time. + +**Healthy range:** 15-30% cross-domain links. Below 15% = siloing. Above 30% = claims may be too loosely grounded in their own domain. + +## 2. Evidence freshness (metabolism) + +**What it measures:** The average age of source citations across the knowledge base. Fresh evidence means the collective is metabolizing new information. + +**What degradation looks like:** Claims calcify. The same 2024-2025 sources get cited repeatedly. New developments aren't extracted. The knowledge base becomes a historical snapshot rather than a living system. + +**How to measure today:** Parse `source:` frontmatter and `created:` dates. Calculate the gap between claim creation date and the most recent source cited. Track median evidence age. 
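A rough sketch of the measurement described above, assuming the only date information in `source:` strings is four-digit years (as in this file's own frontmatter). The function names and the sample claims other than the first are hypothetical. Year granularity is coarse, so comparing against month-based thresholds would need finer-grained source dates.

```python
import re
from datetime import date
from statistics import median

# Matches four-digit years such as 2010 or 2026 inside a source string.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def evidence_age_years(created, source):
    """Years between claim creation and the most recent year cited, or None."""
    years = [int(m.group(0)) for m in YEAR.finditer(source)]
    if not years:
        return None  # no recognizable year in the source string
    return created.year - max(years)


def median_evidence_age(claims):
    """Median evidence age over (created, source) pairs with a datable source."""
    ages = [evidence_age_years(created, source) for created, source in claims]
    ages = [a for a in ages if a is not None]
    return median(ages) if ages else None


claims = [
    # From this file's frontmatter: most recent source year equals the creation year.
    (date(2026, 3, 8), "Vida foundations audit (March 2026), "
                       "collective-intelligence research (Woolley 2010, Pentland 2014)"),
    # Hypothetical entries for illustration only.
    (date(2026, 3, 8), "Hypothetical survey 2024"),
    (date(2026, 3, 8), "Hypothetical report (2021), follow-up 2023"),
    (date(2026, 3, 8), "Undated internal note"),
]
result = median_evidence_age(claims)
```

Note that an internal design document dated in the creation year makes a claim look fresh even when its external sources are old; whether such documents should count as evidence is a design choice this sketch does not settle.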
+ +**Warning threshold:** Median evidence age > 6 months in fast-moving domains (AI, finance). > 12 months in slower domains (cultural dynamics, critical systems). + +## 3. Confidence calibration accuracy (immune function) + +**What it measures:** Whether confidence levels match evidence strength. Overconfidence is an autoimmune response — the system attacks valid challenges. Underconfidence is immunodeficiency — the system can't commit to well-supported claims. + +**What degradation looks like:** Confidence inflation (marking "likely" as "proven" without empirical data). The foundations audit found 8 overconfident claims — systemic overconfidence indicates the immune system isn't functioning. + +**How to measure today:** Audit confidence labels against evidence type. "Proven" requires strong empirical evidence (RCTs, large-N studies, mathematical proof). "Likely" requires empirical data with clear argument. "Experimental" = argument-only. "Speculative" = theoretical. Flag mismatches. + +**Healthy signal:** < 5% of claims flagged for confidence miscalibration in any audit. + +## 4. Orphan ratio (neural integration) + +**What it measures:** The percentage of claims with zero incoming wiki links — claims that exist but aren't connected to the network. + +**What degradation looks like:** Claims pile up without integration. New extractions add volume but not understanding. The knowledge graph is sparse despite high claim count. Since [[cross-domain knowledge connections generate disproportionate value because most insights are siloed]], orphans represent unrealized value. + +**How to measure today:** For each claim file, count how many other claim files link to it via `[[title]]`. Claims with 0 incoming links are orphans. + +**Healthy range:** < 15% orphan ratio. Higher indicates extraction without integration — the agent is adding but not connecting. + +## 5. 
Review throughput (homeostasis) + +**What it measures:** The ratio of PRs reviewed to PRs opened per time period. Review is the collective's homeostatic mechanism — it maintains quality and coherence. + +**What degradation looks like:** PR backlog grows. Claims merge without thorough review. Quality gates degrade. Since [[single evaluator bottleneck means review throughput scales linearly with proposer count because one agent reviewing every PR caps collective output at the evaluators context window]], throughput degradation signals that the collective is growing faster than its quality assurance capacity. + +**How to measure today:** `gh pr list --state all` filtered by date range. Calculate opened/merged/pending per week. + +**Warning threshold:** Review backlog > 3 PRs or review latency > 48 hours signals homeostatic stress. + +--- + +Relevant Notes: +- [[cross-domain knowledge connections generate disproportionate value because most insights are siloed]] — linkage density measures whether this value is being realized +- [[single evaluator bottleneck means review throughput scales linearly with proposer count because one agent reviewing every PR caps collective output at the evaluators context window]] — review throughput directly measures this bottleneck +- [[confidence calibration with four levels enforces honest uncertainty because proven requires strong evidence while speculative explicitly signals theoretical status]] — confidence calibration accuracy measures whether this enforcement is working +- [[domain specialization with cross-domain synthesis produces better collective intelligence than generalist agents because specialists build deeper knowledge while a dedicated synthesizer finds connections they cannot see from within their territory]] — linkage density measures synthesis effectiveness + +Topics: +- [[livingip overview]] +- [[LivingIP architecture]] diff --git a/core/living-agents/the collective is ready for a new agent when demand signals cluster 
in unowned territory and existing agents repeatedly route questions they cannot answer.md b/core/living-agents/the collective is ready for a new agent when demand signals cluster in unowned territory and existing agents repeatedly route questions they cannot answer.md new file mode 100644 index 0000000..c02fa59 --- /dev/null +++ b/core/living-agents/the collective is ready for a new agent when demand signals cluster in unowned territory and existing agents repeatedly route questions they cannot answer.md @@ -0,0 +1,64 @@ +--- +type: claim +domain: living-agents +description: "Three growth signals indicate readiness for a new organ system: clustered demand signals in unowned territory, repeated routing failures where no agent can answer, and cross-domain claims that lack a home domain" +confidence: experimental +source: "Vida agent directory design (March 2026), biological growth and differentiation analogy" +created: 2026-03-08 +--- + +# the collective is ready for a new agent when demand signals cluster in unowned territory and existing agents repeatedly route questions they cannot answer + +Biological organisms don't grow new organ systems randomly — they differentiate when environmental demands exceed current capacity. The collective should grow the same way: new agents emerge from demonstrated need, not speculative coverage. + +## Three growth signals + +### 1. Demand signal clustering +Demand signals are broken wiki links in `_map.md` files — claims that should exist but don't. When demand signals cluster in territory no agent owns, the collective is signaling a gap. + +**How to detect:** Scan all `_map.md` files for demand signals. Classify each by domain. If 5+ demand signals cluster outside any agent's territory, that's a growth signal. + +**Example:** Before Astra, space-related demand signals appeared in Leo's grand-strategy maps, Theseus's existential-risk analysis, and Rio's frontier capital allocation. 
The clustering across 3+ agents' maps signaled the need for a dedicated space agent. + +### 2. Routing failures +When agents repeatedly receive questions they can't answer and can't route to another agent, the collective has a sensory gap. + +**How to detect:** Track message routing. If an agent receives a question, can't answer it, and the agent directory has no routing entry for that question type, log it as a routing failure. 3+ routing failures in the same topic area = growth signal. + +**Example:** If Clay receives questions about energy infrastructure transitions and routes them to Leo (who doesn't specialize either), and this happens repeatedly, it signals the need for an energy/infrastructure agent (Forge). + +### 3. Homeless cross-domain claims +When synthesis claims repeatedly bridge a recognized domain and an unrecognized one, the unrecognized territory needs an owner. + +**How to detect:** In Leo's synthesis PRs, track which domains appear. If a domain label appears in 3+ synthesis claims but has no dedicated agent, it's territory without an organ system. + +**Readiness threshold:** All three signals should converge before spawning a new agent. A single signal can be noise. Convergence means the organism genuinely needs the new capability. + +## When NOT to grow + +Growth has costs. Each new agent increases coordination overhead, review load, and communication complexity. Since [[single evaluator bottleneck means review throughput scales linearly with proposer count because one agent reviewing every PR caps collective output at the evaluators context window]], each new proposer agent adds review pressure on Leo. 
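The readiness threshold from the growth-signals section (all three signals converging) can be sketched as a simple check. The thresholds (5+ demand signals, 3+ routing failures, 3+ synthesis claims) come from the text above; the `TerritorySignals` type, its field names, and the sample counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TerritorySignals:
    demand_signals: int             # broken wiki links clustered in the unowned territory
    routing_failures: int           # logged routing failures in the same topic area
    homeless_synthesis_claims: int  # synthesis claims carrying the unowned domain label


def growth_signals(s):
    """Evaluate each of the three growth signals against its stated threshold."""
    return {
        "demand_clustering": s.demand_signals >= 5,
        "routing_failures": s.routing_failures >= 3,
        "homeless_claims": s.homeless_synthesis_claims >= 3,
    }


def ready_for_new_agent(s):
    """Ready only when all three signals converge; a single signal can be noise."""
    return all(growth_signals(s).values())


converged = TerritorySignals(demand_signals=6, routing_failures=4, homeless_synthesis_claims=3)
partial = TerritorySignals(demand_signals=6, routing_failures=4, homeless_synthesis_claims=0)
```

Returning the per-signal dictionary as well as the boolean keeps the diagnosis visible: a candidate with two of three signals is "emerging but insufficient" rather than simply rejected.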
+ +**Don't grow when:** +- The gap can be filled by expanding an existing agent's territory (simpler, lower coordination cost) +- Demand signals exist but sources aren't accessible (agent would be created but unable to extract — Vida's DJ Patil problem) +- Review throughput is already strained (add review capacity before adding proposers) + +## Candidate future agents (based on current signals) + +| Candidate | Demand signal evidence | Routing failures | Homeless claims | Readiness | +|-----------|----------------------|------------------|-----------------|-----------| +| **Astra** (space) | Grand-strategy, existential-risk | Leo can't answer space specifics | Multi-planetary claims | **Ready** (onboarding) | +| **Forge** (energy) | Climate-health overlap, critical infrastructure | Vida routes energy questions to Leo | None yet | **Not ready** — signals emerging but insufficient | +| **Terra** (climate) | Epidemiological transition, environmental health | Vida routes climate-health to Leo | None yet | **Not ready** — overlaps heavily with Vida's epi-transition section | +| **Hermes** (communications) | Narrative infrastructure, memetic propagation | Clay may need help with institutional adoption | None yet | **Not ready** — Clay covers most of this territory | + +--- + +Relevant Notes: +- [[single evaluator bottleneck means review throughput scales linearly with proposer count because one agent reviewing every PR caps collective output at the evaluators context window]] — growth adds review pressure; don't grow faster than review capacity +- [[domain specialization with cross-domain synthesis produces better collective intelligence than generalist agents because specialists build deeper knowledge while a dedicated synthesizer finds connections they cannot see from within their territory]] — new agents should be specialists, not generalists +- [[agents must reach critical mass of contributor signal before raising capital because premature fundraising without domain 
depth undermines the collective intelligence model]] — premature agent spawning without domain depth undermines the collective + +Topics: +- [[livingip overview]] +- [[LivingIP architecture]] -- 2.45.2