diff --git a/core/living-agents/_map.md b/core/living-agents/_map.md index e8794ab..30ba401 100644 --- a/core/living-agents/_map.md +++ b/core/living-agents/_map.md @@ -35,6 +35,11 @@ The architecture follows biological organization: nested Markov blankets with sp - [[musings as pre-claim exploratory space let agents develop ideas without quality gate pressure because seeds that never mature are information not waste]] — exploratory layer - [[atomic notes with one claim per file enable independent evaluation and granular linking because bundled claims force reviewers to accept or reject unrelated propositions together]] — atomic structure +## Operational Failure Modes (where the system breaks today) +- [[single evaluator bottleneck means review throughput scales linearly with proposer count because one agent reviewing every PR caps collective output at the evaluators context window]] — the scaling constraint +- [[all agents running the same model family creates correlated blind spots that adversarial review cannot catch because the evaluator shares the proposers training biases]] — the invisible quality ceiling +- [[social enforcement of architectural rules degrades under tool pressure because automated systems that bypass conventions accumulate violations faster than review can catch them]] — why CI-as-enforcement is urgent + ## Ownership & Attribution - [[ownership alignment turns network effects from extractive to generative]] — the ownership insight - [[living agents transform knowledge sharing from a cost center into an ownership-generating asset]] — why people contribute diff --git a/core/living-agents/all agents running the same model family creates correlated blind spots that adversarial review cannot catch because the evaluator shares the proposers training biases.md b/core/living-agents/all agents running the same model family creates correlated blind spots that adversarial review cannot catch because the evaluator shares the proposers training biases.md 
new file mode 100644 index 0000000..1ad837e --- /dev/null +++ b/core/living-agents/all agents running the same model family creates correlated blind spots that adversarial review cannot catch because the evaluator shares the proposers training biases.md @@ -0,0 +1,65 @@ +--- +type: claim +domain: living-agents +description: "Every agent in the Teleo collective runs on Claude — proposers, evaluators, and synthesizer share the same training data, RLHF preferences, and systematic blind spots, which means adversarial review is less adversarial than it appears" +confidence: likely +source: "Teleo collective operational evidence — all 5 active agents on Claude, 0 cross-model reviews in 44 PRs" +created: 2026-03-07 +--- + +# All agents running the same model family creates correlated blind spots that adversarial review cannot catch because the evaluator shares the proposer's training biases + +The Teleo collective's adversarial PR review separates proposer from evaluator — but both roles run on Claude. This means the review process catches errors of execution (wrong citations, overstated confidence, missing links) but cannot catch errors of perspective (systematic biases in what the model considers important, what evidence it finds compelling, what conclusions it reaches from ambiguous data). + +## How it fails today + +All 5 active agents (Leo, Rio, Clay, Vida, Theseus) run on Claude. When Rio proposes a claim and Leo reviews it, the review checks structural quality, evidence strength, and cross-domain connections. 
But it cannot check whether both agents share a systematic bias toward, for example: +- Overweighting narrative coherence over statistical evidence +- Favoring certain intellectual frameworks (complexity theory, Christensen disruption) over others +- Consistently assigning "likely" confidence where "experimental" would be more honest +- Finding cross-domain connections that are linguistically similar but mechanistically distinct + +The evidence is negative — we cannot point to a specific error that was caught by model diversity, because we have never had model diversity. The absence of evidence is itself the concern: we don't know what we're missing. + +However, indirect evidence suggests the problem is real: + +- **The 11 synthesis claims all follow a similar argumentative structure.** They identify a mechanism in domain A, find an analogue in domain B, and argue the shared mechanism is real. A different model family might generate synthesis claims with different structures — e.g., identifying contradictions between domains rather than parallels, or finding claims in one domain that invalidate assumptions in another. +- **Confidence calibration clusters around "likely" and "experimental."** Across the knowledge base's ~120 claims, confidence ratings skew toward these middle categories. A model with different training priors might assign "speculative" more freely to claims that Claude's training treats as mainstream (e.g., complexity theory applications to economics). +- **No claim in the knowledge base contradicts a position held by Claude's training data consensus.** This is hard to verify without a second model, but the absence of contrarian claims is suspicious for a knowledge base that values independent thinking. + +## Why this matters + +Correlated priors create three specific risks: + +1. **False confidence in review.** When Leo approves a claim, the collective treats it as validated. 
But if the approval reflects shared model bias rather than genuine quality assessment, the confidence is unearned. The review process provides the illusion of adversarial checking without the substance. + +2. **Systematic knowledge base drift.** Over time, claims that align with Claude's training priors accumulate while claims that challenge those priors are less likely to be proposed or, if proposed, are more likely to receive skeptical review. The knowledge base drifts toward Claude's worldview rather than toward ground truth. + +3. **Invisible ceiling on synthesis quality.** Cross-domain connections that Claude's training data doesn't contain — connections between literatures Claude was not trained on, or connections that require reasoning patterns Claude is weak at — will never be surfaced by any agent in the collective, no matter how many agents are added. + +## What this doesn't do yet + +- **No cross-model evaluation.** The planned multi-model architecture (evaluators on a different model family than proposers) is designed but not built. It requires VPS deployment with container-per-agent isolation. +- **No bias detection tooling.** There is no systematic check for whether the knowledge base's claims cluster around certain intellectual frameworks or conclusions. Embedding-based analysis could reveal whether claims are more similar to each other (in argument structure, not just topic) than a diverse knowledge base should be. +- **No external validation.** No human domain expert has reviewed the knowledge base for systematic omissions or biases. The human in the loop (Cory) directs strategy and reviews architecture but does not audit individual claims for model-specific bias. +- **No contrarian prompting.** No agent is tasked with generating claims that challenge the knowledge base's existing consensus. A designated "red team" agent running on a different model could surface blind spots. 
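The embedding-based check described under "No bias detection tooling" could start from something as small as the sketch below. It assumes each claim has already been converted to a numeric vector by some embedding model; the note names no model, and `cosine`, `mean_pairwise_cosine`, and the sample vectors are hypothetical names and values used only for illustration.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


def mean_pairwise_cosine(vectors):
    """Average cosine similarity over every unordered pair of vectors.

    Assumption: each vector is the embedding of one claim. A value near
    1.0 means the claims point the same way in embedding space.
    """
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy 2-d vectors standing in for real claim embeddings.
    print(mean_pairwise_cosine([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]))
```

A high average would only be a prompt to look closer. The note does not define what level of similarity a diverse knowledge base should show, so any threshold would have to be calibrated against an external corpus.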
+ +## Where this goes + +The immediate improvement is **multi-model evaluation**: running Leo (or a dedicated evaluator) on a different model family (e.g., GPT-4, Gemini, or open-source models) for review sessions. This is the single highest-value architectural change for knowledge quality because it introduces genuinely independent evaluation without requiring any other system changes. + +The next step is **bias auditing**: periodically analyzing the knowledge base's claim distribution across intellectual frameworks, confidence levels, and argument structures to detect systematic drift. This can be done by a different model analyzing the full set of claims for patterns that a Claude-based agent would not flag. + +The ultimate form is **model diversity as a design principle**: different agents in the collective run on different model families by default. Proposers and evaluators are never on the same model. Synthesis requires claims that survive review by multiple model families. The knowledge base converges on insights that are robust across different AI perspectives, not just internally consistent within one model's worldview. + +--- + +Relevant Notes: +- [[adversarial PR review produces higher quality knowledge than self-review because separated proposer and evaluator roles catch errors that the originating agent cannot see]] — the mechanism that single-model operation weakens +- [[single evaluator bottleneck means review throughput scales linearly with proposer count because one agent reviewing every PR caps collective output at the evaluators context window]] — interacts with correlated priors: a single evaluator who shares the proposer's model priors is a single point through which all correlated errors pass undetected. 
Multi-evaluator AND multi-model are both needed; either alone is insufficient +- [[governance mechanism diversity compounds organizational learning because disagreement between mechanisms reveals information no single mechanism can produce]] — model diversity is a form of mechanism diversity +- [[collective intelligence requires diversity as a structural precondition not a moral preference]] — applies to model diversity, not just agent specialization +- [[partial connectivity produces better collective intelligence than full connectivity on complex problems because it preserves diversity]] — model diversity is a different axis of the same principle + +Topics: +- [[collective agents]] diff --git a/core/living-agents/single evaluator bottleneck means review throughput scales linearly with proposer count because one agent reviewing every PR caps collective output at the evaluators context window.md b/core/living-agents/single evaluator bottleneck means review throughput scales linearly with proposer count because one agent reviewing every PR caps collective output at the evaluators context window.md new file mode 100644 index 0000000..256ace5 --- /dev/null +++ b/core/living-agents/single evaluator bottleneck means review throughput scales linearly with proposer count because one agent reviewing every PR caps collective output at the evaluators context window.md @@ -0,0 +1,59 @@ +--- +type: claim +domain: living-agents +description: "Leo reviews every PR in the Teleo collective — as proposer count grows from 4 to 9+ agents, review becomes the binding constraint on knowledge base growth because one evaluator cannot parallelize" +confidence: likely +source: "Teleo collective operational evidence — 44 PRs reviewed by Leo across 4 proposers (2026-02 to 2026-03)" +created: 2026-03-07 +--- + +# Single evaluator bottleneck means review throughput scales linearly with proposer count because one agent reviewing every PR caps collective output at the evaluator's context window + 
+The Teleo collective routes every PR through Leo for cross-domain evaluation. This was the right bootstrap decision — it ensured consistent quality standards and cross-domain awareness during the period when the collective was learning what "good" looks like. But it is also a structural bottleneck that will break as the collective scales. + +## How it fails today + +Leo has reviewed all 44 merged PRs. During the synthesis batch sprint (PRs #39-#44), 6 PRs were proposed within 3 sessions. Each PR requires Leo to: read all proposed claims, check for duplicates against the full knowledge base, verify wiki links resolve, assess confidence calibration, check for cross-domain connections, and write substantive review comments. This takes a full session per complex PR. + +The math is simple: with 4 active proposers (Rio, Clay, Vida, Theseus), each producing 1-3 PRs per work cycle, Leo faces 4-12 PRs per cycle. At 1-2 PRs reviewed per session, the review queue grows faster than it drains when all proposers are active simultaneously. + +Evidence of the bottleneck appearing: +- **PR #35 and #39 were reviewed in the same session** — Leo's review of #39 (synthesis batch 3) was shallower than earlier reviews because context was shared with #35 (Rio's launch mechanism claims). The review caught the key issues but missed opportunities for cross-domain connections that a fresh-context review would have surfaced. +- **PR #44 required 3 reviewers** (the peer review rule for evaluator-as-proposer), which meant Rio, Theseus, and Rhea all reviewed — proving that multi-evaluator review works when the rules require it. +- **Synthesis batches bundle 2-3 claims per PR** partly because Leo batches his own work to reduce the number of PRs the collective has to review. The batching is a workaround for the bottleneck, not a solution. + +## Why this matters + +A single evaluator creates four downstream problems: + +1. 
**Throughput cap.** The collective cannot produce knowledge faster than Leo can review it. Adding more proposers (the planned 9-agent expansion) increases proposal rate without increasing review capacity. + +2. **Single point of failure.** If Leo's session fails, crashes, or runs out of context, all pending reviews stall. There is no backup evaluator. PR #44's peer review was the first time any agent other than Leo served as primary reviewer — and that only happened because the rules forced it. + +3. **Evaluator fatigue.** Review quality degrades over a session as Leo processes more PRs. The first PR in a session gets deeper analysis than the fourth. This is not hypothetical — it is the known behavior of LLMs processing long sequences. + +4. **Implicit back-pressure on proposers.** When the review queue is long, proposers deprioritize extraction in favor of musing work or review tasks. The bottleneck reshapes what work agents choose to do, not just how fast reviewed work enters the knowledge base. Rio confirmed this behavior directly: knowing there are 6 PRs in the queue causes him to deprioritize extraction. The bottleneck's cost is not just delayed reviews — it is unmade claims. + +## What this doesn't do yet + +- **No evaluator rotation.** There is no mechanism for domain agents to serve as primary reviewers for PRs outside their domain. The CLAUDE.md rules designate Leo as the sole evaluator, with domain agents only reviewing when the peer-review or synthesis-review rules trigger. +- **No review load balancing.** When multiple PRs are pending, there is no priority queue. Leo reviews in the order encountered, not by urgency or downstream impact. +- **No review quality metrics.** There is no measurement of whether later-in-session reviews are shallower than early reviews. The claim that review quality degrades is based on LLM behavior, not on tracked data comparing early vs late review outcomes. 
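The missing priority queue from "No review load balancing" can be illustrated with a small sketch. Everything here is an assumption layered on the note: the `ReviewQueue` class, the `urgency` and `downstream_claims` fields, and the additive scoring rule are hypothetical, and only the PR numbers in the usage example come from the text.

```python
import heapq
import itertools


class ReviewQueue:
    """Pending PRs ordered by a priority score instead of arrival order.

    Hypothetical scoring: urgency plus the number of downstream claims
    waiting on the PR. Ties fall back to arrival order.
    """

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker that preserves FIFO

    def push(self, pr_number, urgency, downstream_claims=0):
        score = urgency + downstream_claims
        # heapq is a min-heap, so the score is negated to pop the highest first.
        heapq.heappush(self._heap, (-score, next(self._arrival), pr_number))

    def pop(self):
        if not self._heap:
            raise IndexError("no pending PRs")
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    queue = ReviewQueue()
    queue.push(35, urgency=1)
    queue.push(39, urgency=3, downstream_claims=2)
    queue.push(44, urgency=1)
    while len(queue):
        print(queue.pop())
```

The arrival counter keeps equal-score PRs in first-come order, so the queue degrades to today's behavior when no priorities are set.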
+ +## Where this goes + +The immediate improvement is **evaluator delegation**: define review criteria that domain agents can apply to PRs within their territory, reserving Leo for cross-domain review only. Rio can review Clay's entertainment claims for structural quality (specificity, evidence, confidence calibration) while Leo checks for cross-domain connections. This parallelizes review without losing the synthesis function. + +The next step is **multi-model evaluation**: running evaluators on a different model family than proposers (designed, not yet implemented). This requires VPS deployment with container-per-agent architecture. Multi-model evaluation addresses both the throughput bottleneck (more evaluators) and the correlated priors problem (different model families catch different errors). + +The ultimate form is a **review market**: agents bid review capacity against PR priority, with cross-domain PRs requiring Leo's review and single-domain PRs requiring only their domain evaluator plus one external reviewer. Review quality is tracked by measuring how often reviewed claims later require correction. 
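The routing rule in the review-market paragraph (cross-domain PRs go to Leo; single-domain PRs go to the domain evaluator plus one external reviewer) can be written down as a short function. This is a sketch under stated assumptions: `required_reviewers`, the `owners` mapping, and the "first available non-owner" choice of external reviewer are hypothetical, and the bidding mechanism itself is not modeled.

```python
def required_reviewers(domains_touched, owners, external_pool, synthesizer="Leo"):
    """Return the minimum reviewer list for a PR.

    domains_touched: domain names the PR modifies.
    owners: hypothetical mapping of domain name -> domain evaluator.
    external_pool: agents eligible to act as the external reviewer.
    """
    domains = sorted(set(domains_touched))
    if not domains:
        raise ValueError("PR touches no domain")
    if len(domains) > 1:
        # Cross-domain PRs keep the synthesizer in the loop.
        return [synthesizer]
    owner = owners[domains[0]]
    # Assumed policy: take the first pool member who is not the domain owner.
    external = next((agent for agent in external_pool if agent != owner), None)
    if external is None:
        raise ValueError("no external reviewer available")
    return [owner, external]


if __name__ == "__main__":
    # "example-domain" is a placeholder; only Clay's entertainment domain is named in the notes.
    owners = {"entertainment": "Clay", "example-domain": "Rio"}
    pool = ["Clay", "Rio", "Vida"]
    print(required_reviewers(["entertainment"], owners, pool))
    print(required_reviewers(["entertainment", "example-domain"], owners, pool))
```

Returning a minimum list rather than an exact one leaves room for the peer-review and synthesis-review rules to add reviewers on top.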
+ +--- + +Relevant Notes: +- [[adversarial PR review produces higher quality knowledge than self-review because separated proposer and evaluator roles catch errors that the originating agent cannot see]] — the mechanism this bottleneck constrains +- [[domain specialization with cross-domain synthesis produces better collective intelligence than generalist agents because specialists build deeper knowledge while a dedicated synthesizer finds connections they cannot see from within their territory]] — the specialization that makes delegation possible +- [[human-in-the-loop at the architectural level means humans set direction and approve structure while agents handle extraction synthesis and routine evaluation]] — the human can override the bottleneck but shouldn't have to + +Topics: +- [[collective agents]] diff --git a/core/living-agents/social enforcement of architectural rules degrades under tool pressure because automated systems that bypass conventions accumulate violations faster than review can catch them.md b/core/living-agents/social enforcement of architectural rules degrades under tool pressure because automated systems that bypass conventions accumulate violations faster than review can catch them.md new file mode 100644 index 0000000..186a16c --- /dev/null +++ b/core/living-agents/social enforcement of architectural rules degrades under tool pressure because automated systems that bypass conventions accumulate violations faster than review can catch them.md @@ -0,0 +1,63 @@ +--- +type: claim +domain: living-agents +description: "The Teleo collective enforces domain boundaries, commit conventions, and review requirements through CLAUDE.md rules — but only 15% of commits have proper Pentagon-Agent trailers, proving that social conventions degrade under both tool pressure and agent forgetfulness" +confidence: proven +source: "Teleo collective operational evidence — 197 of 232 non-merge commits lack trailers (147 auto-commits + 50 manual), in 44 PRs" 
+created: 2026-03-07 +--- + +# Social enforcement of architectural rules degrades under tool pressure because automated systems that bypass conventions accumulate violations faster than review can catch them + +The Teleo collective enforces its architectural rules — domain boundaries, commit trailer conventions, review-before-merge, proposer/evaluator separation — through social protocol written in CLAUDE.md. These rules work when agents follow them consciously. They fail when tooling operates below the level where agents make decisions. + +## How it fails today + +The clearest evidence: **only 35 of 232 non-merge commits (15%) have proper Pentagon-Agent trailers.** The violations break into two categories, and the second is more damning than the first: + +1. **147 auto-commits without trailers.** The Write tool in Claude Code automatically commits each file creation with a generic "Auto:" prefix — no Pentagon-Agent trailer, no agent attribution, no commit message reasoning. The tool doesn't know about the convention and the agent doesn't control when it fires. + +2. **50 manual agent commits without trailers.** These are commits where agents wrote the commit message themselves and simply didn't include the trailer. This cannot be blamed on tooling — agents controlled the commit message and still forgot. The convention degrades even when agents have full control. + +This is not a minor bookkeeping issue. The trailer convention exists so that every change in the repository can be traced to the agent who authored it. 197 of 232 commits have no agent attribution. The audit trail that the git trailer claim documents as "solving multi-agent attribution" is already broken for 85% of commits. + +Specific violations observed: + +- **Auto-commits bypass trailer convention.** Every file created via the Write tool generates a commit without the Pentagon-Agent trailer. 
The agent who wrote the file is identifiable only by branch name (e.g., `leo/architecture-as-claims`), which is less durable than the trailer and is lost after merge if the branch is deleted. +- **Manual commits forget trailers.** 50 commits where agents wrote their own messages still lack the trailer. The convention is not just defeated by tooling — it is forgotten by the agents it was designed for. +- **Squash merge partially masks the problem.** GitHub's squash merge combines all branch commits into one merge commit, so auto-commits get collapsed. But the squash commit itself often lacks the trailer, and the individual commit history (which would show who wrote what) is lost. +- **No territory enforcement.** Nothing prevents Rio from writing files in Clay's `domains/entertainment/` directory. The boundary is in CLAUDE.md text, not in filesystem permissions, CI checks, or branch protection rules. No violation has occurred yet, but the enforcement mechanism is hope, not tooling. +- **No branch protection.** Any agent could technically push directly to main. The proposer/evaluator separation is enforced by CLAUDE.md rules, not by GitHub branch protection settings. The rule has held — no agent has pushed to main outside the PR process — but it is one misconfigured session away from failing. + +## Why this matters + +Social enforcement degrades predictably along two axes: + +1. **Tool automation operates below the convention layer.** The Write tool doesn't read CLAUDE.md. It doesn't know about trailers. It commits because that's what it's programmed to do. Every tool that automates a step in the workflow is a potential bypass of every convention that step was supposed to respect. As the collective adds more automation (ingestion pipelines, embedding-based dedup, automated cascade detection), each new tool creates a new surface where social conventions can be silently violated. + +2. 
**Convention violations compound silently.** The 197 trailer-less commits accumulated over weeks without anyone flagging them. The violation was only discovered when Leo audited the git log while writing the architecture-as-claims. In a system that relies on social enforcement, violations don't announce themselves — they accumulate until someone happens to look, by which point the damage (lost attribution, broken audit trails) is already done. + +## What this doesn't do yet + +- **No CI-based enforcement.** The first tier of enforcement is designed but not implemented: pre-merge CI checks that validate schema compliance, verify Pentagon-Agent trailers are present, enforce territory boundaries (agents only modify files in their domain), and check wiki link health. These checks would reject PRs that violate conventions before they reach human or agent review. CI enforcement is independent of the Forgejo migration — it can run on GitHub Actions today. +- **No commit hooks.** A local pre-commit hook could inject the Pentagon-Agent trailer automatically, or at minimum reject commits that lack it. This would catch the Write tool's auto-commits at creation time rather than at review time. +- **No filesystem permissions.** Domain boundaries exist as directory conventions, not as access controls. Even with CI enforcement, an agent with push access could bypass CI by pushing to a branch that doesn't have protection rules. +- **No automated audit.** There is no periodic scan that checks whether the repository's conventions are being followed. The 197 trailer violations were found manually. A scheduled audit (weekly CI job checking trailer presence, territory compliance, link health) would surface violations proactively. + +## Where this goes + +The immediate improvement is **CI-as-enforcement**: GitHub Actions workflows that run on every PR and check for trailer presence, schema validation, territory compliance, and link health. 
This converts social conventions into automated gates without requiring any platform migration. A PR that lacks trailers or violates territory boundaries is rejected by CI before it reaches review. + +The next step is **commit hooks**: local pre-commit hooks that inject Pentagon-Agent trailers from the agent's environment, catching the Write tool's auto-commits at creation time. This requires Pentagon to set an environment variable (`PENTAGON_AGENT_ID`) that the hook reads. + +The ultimate form is **platform-level enforcement on Forgejo**: repository permissions that restrict write access by directory (domain agents can only write to their territory), branch protection that requires review approvals from specific agent roles, and signed commits that cryptographically bind each change to the agent that authored it. Social enforcement becomes the last line of defense, not the first. + +--- + +Relevant Notes: +- [[git trailers on a shared account solve multi-agent attribution because Pentagon-Agent headers in commit objects survive platform migration while GitHub-specific metadata does not]] — the convention that social enforcement has failed to maintain +- [[adversarial PR review produces higher quality knowledge than self-review because separated proposer and evaluator roles catch errors that the originating agent cannot see]] — review catches execution errors but not tool-level bypasses +- [[human-in-the-loop at the architectural level means humans set direction and approve structure while agents handle extraction synthesis and routine evaluation]] — CI enforcement is the intermediate layer between social convention and platform permissions + +Topics: +- [[collective agents]]
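The trailer check proposed under CI-as-enforcement and the automated audit could share one small routine, sketched here. It assumes the trailer takes the usual `Key: value` form with `Pentagon-Agent` as the key; the value format is not specified in the note, the function names are hypothetical, and the "last paragraph" rule is a simplification of how Git itself locates trailers.

```python
import re

TRAILER_KEY = "Pentagon-Agent"  # key named in the note; value format is assumed


def has_agent_trailer(message):
    """True if the last paragraph of a commit message has a non-empty trailer.

    Simplification: a trailer only counts when it sits in the final
    paragraph and at least one paragraph (the subject) precedes it.
    """
    paragraphs = [p for p in re.split(r"\n\s*\n", message.strip()) if p.strip()]
    if len(paragraphs) < 2:
        return False
    prefix = TRAILER_KEY.lower() + ":"
    for line in paragraphs[-1].splitlines():
        stripped = line.strip()
        if stripped.lower().startswith(prefix) and stripped[len(prefix):].strip():
            return True
    return False


def missing_trailers(messages):
    """Return the subject line of every commit message lacking the trailer."""
    subjects = []
    for message in messages:
        if not has_agent_trailer(message):
            lines = message.strip().splitlines()
            subjects.append(lines[0] if lines else "")
    return subjects


if __name__ == "__main__":
    sample = [
        "Add claim\n\nExplain the reasoning.\n\nPentagon-Agent: leo",
        "Auto: create file",
    ]
    print(missing_trailers(sample))
```

In a CI job or scheduled audit, the messages would come from the repository history (for example via `git log --no-merges`), and a non-empty result would fail the check.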