Compare commits

6 commits: 17607fcf36 ... 833f00a798

Commits: 833f00a798, 46fa3fb38d, b56657d334, 7bbce6daa0, f1094c5e09, 7a3ef65dfe
24 changed files with 636 additions and 2 deletions

79 agents/theseus/musings/research-hermes-agent-nous.md (Normal file)

@ -0,0 +1,79 @@
---
created: 2026-04-05
status: seed
name: research-hermes-agent-nous
description: "Research brief — Hermes Agent by Nous Research for KB extraction. Assigned by m3ta via Leo."
type: musing
research_question: "What does Hermes Agent's architecture reveal about agentic knowledge systems, and how does its skills/memory design relate to Agentic Taylorism and collective intelligence?"
belief_targeted: "Multiple — B3 (agent architectures), Agentic Taylorism claims, collective-agent-core"
---

# Hermes Agent by Nous Research — Research Brief

## Assignment

From m3ta via Leo (2026-04-05). Deep dive on Hermes Agent for KB extraction to ai-alignment and foundations/collective-intelligence.

## What It Is

Open-source, self-improving AI agent framework. MIT license. 26K+ GitHub stars. Fastest-growing agent framework in 2026.

**Primary sources:**

- GitHub: NousResearch/hermes-agent (main repo)
- Docs: hermes-agent.nousresearch.com/docs/
- @Teknium on X (Nous Research founder, posts on memory/skills architecture)

## Key Architecture (from Leo's initial research)

1. **4-layer memory system:**
   - Prompt memory (MEMORY.md — always loaded, persistent identity)
   - Session search (SQLite + FTS5 — conversation retrieval)
   - Skills/procedural (reusable markdown procedures, auto-generated)
   - Periodic nudge (autonomous memory evaluation)

2. **7 pluggable memory providers:** Honcho, OpenViking (ByteDance), Mem0, Hindsight, Holographic, RetainDB, ByteRover

3. **Skills = Taylor's instruction cards.** When the agent encounters a task with 5+ tool calls, it autonomously writes a skill file. Uses the agentskills.io open standard. Community skills via ClawHub/LobeHub.

4. **Self-evolution repo (DSPy + GEPA):** auto-submits improvements as PRs for human review

5. **CamoFox:** Firefox fork with C++ fingerprint spoofing for web browsing

6. **6 terminal backends:** local, Docker, SSH, Daytona, Singularity, Modal

7. **Gateway layer:** Telegram, Discord, Slack, WhatsApp, Signal, Email

8. **Release velocity:** 6 major releases in 22 days, 263 PRs merged in 6 days
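The session-search layer above is concrete enough to sketch. A minimal illustration of SQLite + FTS5 conversation retrieval — the `session_log` table and its columns are assumptions for illustration, not Hermes Agent's documented schema:

```python
import sqlite3

# Hypothetical session-search layer: an FTS5 virtual table over
# conversation turns, queried with full-text MATCH and BM25 ranking.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE session_log USING fts5(session_id, role, content)"
)
conn.executemany(
    "INSERT INTO session_log VALUES (?, ?, ?)",
    [
        ("s1", "user", "how do I configure the Docker terminal backend"),
        ("s1", "assistant", "set backend: docker in the config file"),
        ("s2", "user", "remind me about the SSH backend setup"),
    ],
)

# Full-text query: return matching turns, most relevant first.
rows = conn.execute(
    "SELECT session_id, content FROM session_log "
    "WHERE session_log MATCH ? ORDER BY rank",
    ("docker",),
).fetchall()
```

Here `rows` holds the two `s1` turns mentioning Docker; the agent would feed those retrieved turns back into context rather than reloading whole transcripts.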

## Extraction Targets

### NEW claims (ai-alignment):

1. Self-improving agent architectures converge on skill extraction as the primary learning mechanism (Hermes skills, Voyager skills, SWE-agent learned tools — all independently discovered "write a procedure when you solve something hard")

2. Agent self-evolution with human review gates is structurally equivalent to our governance model (DSPy + GEPA → auto-PR → human merge)

3. Memory architecture for persistent agents converges on 3+ layer separation (prompt/session/procedural/long-term) — Hermes, Letta, and our codex all arrived here independently

### NEW claims (foundations/collective-intelligence):

4. Individual agent self-improvement (Hermes) is structurally different from collective knowledge accumulation (Teleo) — the former optimizes one agent's performance, the latter builds shared epistemic infrastructure

5. Pluggable memory providers suggest memory is infrastructure, not feature — validates separation of knowledge store from agent runtime

### ENRICHMENT candidates:

6. Enrich "Agentic Taylorism" claims — the Hermes skills system is DIRECT evidence. Knowledge codification as markdown procedure files = Taylor's instruction cards. The agent writes the equivalent of a foreman's instruction card after completing a complex task.

7. Enrich collective-agent-core — Hermes architecture confirms harness > model (same model, different harness = different capability). Connects to the Stanford Meta-Harness finding (6x performance gap from harness alone).

## What They DON'T Do (matters for our positioning)

- No epistemic quality layer (no confidence levels, no evidence requirements)
- No CI scoring or contribution attribution
- No evaluator role — self-improvement without external review
- No collective knowledge accumulation — individual optimization only
- No divergence tracking or structured disagreement
- No belief-claim cascade architecture

This is the gap between agent improvement and collective intelligence. Hermes optimizes the individual; we're building the collective.

## Pre-Screening Notes

Check existing KB for overlap before extracting:

- `collective-agent-core.md` — harness architecture claims
- Agentic Taylorism claims in grand-strategy and ai-alignment
- Any existing Nous Research or Hermes claims (likely none)

@ -26,5 +26,10 @@ Relevant Notes:

- [[complexity is earned not designed and sophisticated collective behavior must evolve from simple underlying principles]] — the governing principle
- [[human-in-the-loop at the architectural level means humans set direction and approve structure while agents handle extraction synthesis and routine evaluation]] — the agent handles the translation

### Additional Evidence (extend)

*Source: Andrej Karpathy, 'LLM Knowledge Base' GitHub gist (April 2026, 47K likes, 14.5M views) | Added: 2026-04-05 | Extractor: Rio*

Karpathy's viral LLM Wiki methodology independently validates the one-agent-one-chat architecture at massive scale. His three-layer system (raw sources → LLM-compiled wiki → schema) is structurally identical to the Teleo contributor experience: the user provides sources, the agent handles extraction and integration, the schema (CLAUDE.md) absorbs complexity. His key insight — "the wiki is a persistent, compounding artifact" where the LLM "doesn't just index for retrieval, it reads, extracts, and integrates into the existing wiki" — is exactly what our proposer agents do with claims. The 47K-like reception demonstrates mainstream recognition that this pattern works. Notably, Karpathy's "idea file" concept (sharing the idea rather than the code, letting each person's agent build a customized implementation) is the contributor-facing version of one-agent-one-chat: the complexity of building the system is absorbed by the agent, not the user. See [[LLM-maintained knowledge bases that compile rather than retrieve represent a paradigm shift from RAG to persistent synthesis because the wiki is a compounding artifact not a query cache]].

Topics:

- [[foundations/collective-intelligence/_map]]

@ -0,0 +1,49 @@

---
type: claim
domain: ai-alignment
secondary_domains: [collective-intelligence]
description: "Karpathy's three-layer LLM wiki architecture (raw sources → LLM-compiled wiki → schema) demonstrates that persistent synthesis outperforms retrieval-augmented generation by making cross-references and integration a one-time compile step rather than a per-query cost"
confidence: experimental
source: "Andrej Karpathy, 'LLM Knowledge Base' GitHub gist (April 2026, 47K likes, 14.5M views); Mintlify ChromaFS production data (30K+ conversations/day)"
created: 2026-04-05
depends_on:
  - "one agent one chat is the right default for knowledge contribution because the scaffolding handles complexity not the user"
---

# LLM-maintained knowledge bases that compile rather than retrieve represent a paradigm shift from RAG to persistent synthesis because the wiki is a compounding artifact not a query cache

Karpathy's LLM Wiki methodology (April 2026) proposes a three-layer architecture that inverts the standard RAG pattern:

1. **Raw Sources (immutable)** — curated articles, papers, data files. The LLM reads but never modifies.

2. **The Wiki (LLM-owned)** — markdown files containing summaries, entity pages, concept pages, interconnected knowledge. "The LLM owns this layer entirely. It creates pages, updates them when new sources arrive, maintains cross-references, and keeps everything consistent."

3. **The Schema (configuration)** — a specification document (e.g., CLAUDE.md) defining wiki structure, conventions, and workflows. Transforms the LLM from generic chatbot into systematic maintainer.

The fundamental difference from RAG: "the LLM doesn't just index it for later retrieval. It reads it, extracts the key information, and integrates it into the existing wiki." Each new source touches 10-15 pages through updates and cross-references, rather than being isolated as embedding chunks for retrieval.

## Why compilation beats retrieval

RAG treats knowledge as a retrieval problem — store chunks, embed them, return top-K matches per query. This fails when:

- Answers span multiple documents (no single chunk contains the full answer)
- The query requires synthesis across domains (embedding similarity doesn't capture structural relationships)
- Knowledge evolves and earlier chunks become stale without downstream updates

Compilation treats knowledge as a maintenance problem — each new source triggers updates across the entire wiki, keeping cross-references current and contradictions surfaced. The tedious work (updating cross-references, tracking contradictions, keeping summaries current) falls to the LLM, which "doesn't get bored, doesn't forget to update a cross-reference, and can touch 15 files in one pass."
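The compile-vs-retrieve distinction can be sketched in a few lines. This is a naive illustration, not Karpathy's implementation: `llm` is a stand-in callable, and a real system would select only the 10-15 affected pages rather than re-reading every page on each pass.

```python
from pathlib import Path
from typing import Callable

def compile_source(source_text: str, wiki_dir: Path,
                   llm: Callable[[str], str]) -> list[Path]:
    """One compile pass: integrate a new source into every wiki page it
    touches, instead of indexing it as chunks for later retrieval."""
    updated = []
    for page in sorted(wiki_dir.glob("*.md")):
        prompt = (
            "Update this wiki page so it stays consistent with the new "
            "source; preserve existing cross-references ([[...]]).\n\n"
            f"PAGE:\n{page.read_text()}\n\nNEW SOURCE:\n{source_text}"
        )
        revised = llm(prompt)
        if revised != page.read_text():
            page.write_text(revised)   # the wiki compounds in place
            updated.append(page)
    return updated
```

The key property is that the cost of integration is paid once at compile time; queries afterwards just read the already-synthesized pages.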

## The Teleo Codex as existence proof

The Teleo collective's knowledge base is a production implementation of this pattern, predating Karpathy's articulation by months. The architecture matches almost exactly: raw sources (inbox/archive/) → LLM-compiled claims with wiki links and frontmatter → schema (CLAUDE.md, schemas/). The key difference: Teleo distributes the compilation across 6 specialized agents with domain boundaries, while Karpathy's version assumes a single LLM maintainer.

The 47K-like, 14.5M-view reception suggests the pattern is reaching mainstream AI practitioner awareness. The shift from "how do I build a better RAG pipeline?" to "how do I build a better wiki maintainer?" has significant implications for knowledge management tooling.

## Challenges

The compilation model assumes the LLM can reliably synthesize and maintain consistency across hundreds of files. At scale, this introduces accumulating error risk — one bad synthesis propagates through cross-references. Karpathy addresses this with a "lint" operation (a health-check for contradictions, stale claims, and orphan pages), but the human remains "the editor-in-chief" for verification. The pattern works when the human can spot-check; it may fail when the wiki outgrows human review capacity.
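A minimal sketch of what the mechanical part of such a lint pass might check — broken wiki links and orphan pages — assuming `[[target]]`-style links and one markdown file per page. The contradiction and staleness checks would need an LLM and are omitted:

```python
import re
from pathlib import Path

LINK = re.compile(r"\[\[([^\]]+)\]\]")

def lint_wiki(wiki_dir: Path) -> dict[str, list[str]]:
    """Health-check sketch: find [[links]] whose target page is missing,
    and pages that no other page links to (orphans)."""
    pages = {p.stem: p.read_text() for p in wiki_dir.glob("*.md")}
    inbound = {name: 0 for name in pages}
    broken: list[str] = []
    for name, text in pages.items():
        for target in LINK.findall(text):
            if target in inbound:
                inbound[target] += 1   # link resolves to an existing page
            else:
                broken.append(f"{name} -> {target}")
    orphans = [n for n, count in inbound.items() if count == 0]
    return {"broken_links": broken, "orphans": orphans}
```

The report would then go to the human editor-in-chief for spot-checking rather than being auto-fixed.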

---

Relevant Notes:

- [[one agent one chat is the right default for knowledge contribution because the scaffolding handles complexity not the user]] — the Teleo implementation of this pattern: one agent handles all schema complexity, compiling knowledge from conversation into structured claims

- [[multi-agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value]] — the Teleo multi-agent version of the wiki pattern meets all three conditions: domain parallelism, context overflow across 400+ claims, adversarial verification via Leo's cross-domain review

Topics:

- [[_map]]

@ -54,6 +54,10 @@ The marketplace dynamics could drive toward either concentration (dominant platf

The rapid adoption timeline (months, not years) may reflect low barriers to creating skill files rather than high value from using them. Many published skills may be shallow procedural wrappers rather than genuine expertise codification.

## Additional Evidence (supporting)

**Hermes Agent (Nous Research)** — the largest open-source agent framework (26K+ GitHub stars, 262 contributors) has native agentskills.io compatibility. Skills are stored as markdown files in `~/.hermes/skills/` and auto-created after 5+ tool calls on similar tasks, error recovery patterns, or user corrections. 40+ bundled skills ship with the framework. A Community Skills Hub enables sharing and discovery. This represents the open-source ecosystem converging on the same codification standard — not just commercial platforms but the largest community-driven framework independently adopting the same format. The auto-creation mechanism is structurally identical to Taylor's observation step: the system watches work being done and extracts the pattern into a reusable instruction card without explicit human design effort.

---

Relevant Notes:

@ -0,0 +1,50 @@

---
type: claim
domain: ai-alignment
secondary_domains: [collective-intelligence]
description: "Mintlify's ChromaFS replaced RAG with a virtual filesystem that maps UNIX commands to database queries, achieving 460x faster session creation at zero marginal compute cost, validating that agents prefer filesystem primitives over embedding search"
confidence: experimental
source: "Dens Sumesh (Mintlify), 'How we built a virtual filesystem for our Assistant' blog post (April 2026); endorsed by Jerry Liu (LlamaIndex founder); production data: 30K+ conversations/day, 850K conversations/month"
created: 2026-04-05
---

# Agent-native retrieval converges on filesystem abstractions over embedding search because grep cat ls and find are all an agent needs to navigate structured knowledge

Mintlify's ChromaFS (April 2026) replaced their RAG pipeline with a virtual filesystem that intercepts UNIX commands and translates them into database queries against their existing Chroma vector database. The results:

| Metric | RAG Sandbox | ChromaFS |
|--------|-------------|----------|
| Session creation (P90) | ~46 seconds | ~100 milliseconds |
| Marginal cost per conversation | $0.0137 | ~$0 |
| Search mechanism | Linear disk scan | DB metadata query |
| Scale | 850K conversations/month | Same, instant |

The architecture is built on just-bash (Vercel Labs), a TypeScript bash reimplementation supporting `grep`, `cat`, `ls`, `find`, and `cd`. ChromaFS implements the filesystem interface while translating calls to Chroma database queries.

## Why filesystems beat embeddings for agents

RAG failed Mintlify because it "could only retrieve chunks of text that matched a query." When answers lived across multiple pages or required exact syntax outside top-K results, the assistant was stuck. The filesystem approach lets the agent explore documentation like a developer browses a codebase — each doc page is a file, each section a directory.

Key technical innovations:

- **Directory tree bootstrapping** — the entire file tree is stored as gzipped JSON and decompressed into in-memory sets for zero-network-overhead traversal
- **Coarse-then-fine grep** — intercepts grep flags, translates them to database `$contains`/`$regex` queries for coarse filtering, then prefetches matching chunks to Redis for millisecond in-memory fine filtering
- **Read-only enforcement** — all write operations return `EROFS` errors, enabling stateless sessions with no cleanup
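The coarse-then-fine pattern is simple to illustrate. In this sketch an in-memory dict stands in for the Chroma database and a substring test stands in for the DB-side `$contains` query; ChromaFS's actual query translation and Redis prefetch are more involved:

```python
import re

# Stand-in "database" of doc chunks keyed by path; in ChromaFS this layer
# would be a Chroma metadata/$contains query, not an in-memory dict.
CHUNKS = {
    "docs/api/auth.md": "Use the Authorization: Bearer header for API auth.",
    "docs/guides/deploy.md": "Deploy with the CLI; auth tokens expire daily.",
    "docs/guides/intro.md": "Welcome to the documentation.",
}

def virtual_grep(pattern: str, root: str = "docs/") -> list[tuple[str, str]]:
    """Coarse-then-fine grep over a virtual filesystem.

    Coarse pass: cheap substring filter (the DB-side $contains analogue).
    Fine pass: full regex over the surviving candidates in memory.
    """
    coarse_term = re.sub(r"[^A-Za-z0-9]+", " ", pattern).split()[0]
    candidates = {
        path: text for path, text in CHUNKS.items()
        if path.startswith(root) and coarse_term.lower() in text.lower()
    }
    rx = re.compile(pattern, re.IGNORECASE)
    return [(path, text) for path, text in candidates.items() if rx.search(text)]

matches = virtual_grep(r"auth\w*")
```

The agent sees an ordinary `grep` result; the expensive linear scan never happens because the coarse pass ran as a database query.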

## The convergence pattern

This is not isolated. Claude Code, Cursor, and other coding agents already use filesystem primitives as their primary interface. The pattern: agents trained on code naturally express retrieval as file operations. When the knowledge is structured as files (markdown pages, config files, code), the agent's existing capabilities transfer directly — no embedding pipeline, no vector database queries, no top-K tuning.

Jerry Liu (LlamaIndex founder) endorsed the approach, which is notable given that LlamaIndex's entire business model is built on embedding-based retrieval infrastructure. The signal: even RAG infrastructure builders recognize that the filesystem pattern is winning for agent-native retrieval.

## Challenges

The filesystem abstraction works when knowledge has clear hierarchical structure (documentation, codebases, wikis). It may not generalize to unstructured knowledge where the organizational schema is unknown in advance. Embedding search retains advantages for fuzzy semantic matching across poorly structured corpora. The two approaches may be complementary rather than competitive — filesystem for structured navigation, embeddings for discovery.

---

Relevant Notes:

- [[LLM-maintained knowledge bases that compile rather than retrieve represent a paradigm shift from RAG to persistent synthesis because the wiki is a compounding artifact not a query cache]] — complementary claim: Karpathy's wiki pattern provides the structured knowledge that filesystem retrieval navigates

- [[multi-agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value]] — filesystem interfaces reduce context overflow by enabling agents to selectively read relevant files rather than ingesting entire corpora

Topics:

- [[_map]]

@ -32,6 +32,10 @@ The resolution is altitude-specific: 2-3 skills per task is optimal, and beyond

A scaling wall emerges at 50-100 available skills: flat selection breaks entirely without hierarchical routing, creating a phase transition in agent performance. The ecosystem of community skills will hit this wall. The next infrastructure challenge is organizing existing skills, not creating more.

## Additional Evidence (supporting)

**Hermes Agent (Nous Research)** defaults to patch-over-edit for skill modification — the system modifies only changed text rather than rewriting the entire skill file. This design decision embodies the curated > self-generated principle: constrained modification of existing curated skills preserves more of the original domain judgment than unconstrained generation. Full rewrites risk breaking functioning workflows; patches preserve the curated structure while allowing targeted improvement. The auto-creation triggers (5+ tool calls on similar tasks, error recovery, user corrections) are conservative thresholds that prevent premature codification — the system waits for repeated patterns before extracting a skill, implicitly filtering for genuine recurring expertise rather than one-off procedures.

## Challenges

This finding creates a tension with our self-improvement architecture. If agents generate their own skills without curation oversight, the -1.3pp degradation applies — self-improvement loops that produce uncurated skills will make agents worse, not better. The resolution is that self-improvement must route through a curation gate (Leo's eval role for skill upgrades). The 3-strikes-then-propose rule Leo defined is exactly this gate. However, the boundary between "curated" and "self-generated" may blur as agents improve at self-evaluation — the SICA pattern suggests that with structural separation between generation and evaluation, self-generated improvements can be positive. The key variable may be evaluation quality, not generation quality.

@ -34,7 +34,7 @@ If Yudkowsky is right, our core architectural thesis — that distributing intel

## Possible Responses from the KB's Position

1. **Capability bounding:** The collective superintelligence thesis does not require superintelligent agents — it requires many sub-superintelligent agents whose collective behavior is superintelligent. If no individual agent crosses the threshold for unilateral world-ending action, the multipolar instability argument doesn't apply. This is the strongest response if it holds, but it requires demonstrating that collective capability doesn't create individual capability through specialization or self-improvement — a constraint that our SICA and GEPA findings suggest may not hold, since both show agents improving their own capabilities under curation pressure. The boundary between "sub-superintelligent agent that improves" and "agent that has crossed the threshold" may be precisely the kind of gradual transition that evades governance.

2. **Structural constraint as alternative to capability constraint:** Our claim that [[constraint enforcement must exist outside the system being constrained because internal constraints face optimization pressure from the system they constrain]] is a partial answer — if the collective architecture enforces constraints structurally (through mutual verification, not goodwill), defection is harder. But Yudkowsky would counter that a sufficiently capable agent routes around any structural constraint.

@ -0,0 +1,46 @@

---
type: claim
domain: ai-alignment
secondary_domains: [collective-intelligence]
description: "AutoAgent's finding that same-family meta/task agent pairs outperform cross-model pairs in optimization challenges Kim et al.'s finding that cross-family evaluation breaks correlated blind spots — the resolution is task-dependent: evaluation needs diversity, optimization needs empathy"
confidence: likely
source: "AutoAgent (MarkTechPost coverage, April 2026) — same-family meta/task pairs achieve SOTA on SpreadsheetBench (96.5%) and TerminalBench (55.1%); Kim et al. ICML 2025 — ~60% error agreement within same-family models on evaluation tasks"
created: 2026-04-05
depends_on:
  - "multi-model evaluation architecture"
challenged_by:
  - "multi-model evaluation architecture"
---

# Evaluation and optimization have opposite model-diversity optima because evaluation benefits from cross-family diversity while optimization benefits from same-family reasoning pattern alignment

Two independent findings appear contradictory but resolve into a task-dependent boundary condition.

**Evaluation benefits from diversity.** Kim et al. (ICML 2025) demonstrated ~60% error agreement within same-family models on evaluation tasks. When the same model family evaluates its own output, correlated blind spots mean both models miss the same errors. Cross-family evaluation (e.g., GPT-4o evaluating Claude output) breaks these correlations because different model families have different failure patterns. This is the foundation of our multi-model evaluation architecture.

**Optimization benefits from empathy.** AutoAgent (April 2026) found that same-family meta/task agent pairs outperform cross-model pairs in optimization tasks. A Claude meta-agent optimizing a Claude task-agent diagnoses failures more accurately than a GPT meta-agent optimizing the same Claude task-agent. The team calls this "model empathy" — shared reasoning patterns enable the meta-agent to understand WHY the task-agent failed, not just THAT it failed. AutoAgent achieved #1 on SpreadsheetBench (96.5%) and the top GPT-5 score on TerminalBench (55.1%) using this same-family approach.

**The resolution is task-dependent.** Evaluation (detecting errors in output) and optimization (diagnosing causes and proposing fixes) are structurally different operations with opposite diversity requirements:

1. **Error detection** requires diversity — you need a system that fails differently from the system being evaluated. Same-family evaluation produces agreement that feels like validation but may be shared blindness.

2. **Failure diagnosis** requires empathy — you need a system that can reconstruct the reasoning path that produced the error. Cross-family diagnosis produces generic fixes because the diagnosing model cannot model the failing model's reasoning.

The practical implication: systems that evaluate agent output should use cross-family models (our multi-model eval spec is correct for this). Systems that optimize agent behavior — self-improvement loops, prompt tuning, skill refinement — should use same-family models. Mixing these up degrades both operations.
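As a configuration rule, the implication fits in one function. A sketch, assuming model families are identified by plain strings:

```python
def choose_model_family(operation: str, target_family: str,
                        available: list[str]) -> str:
    """Pick the meta-model family per the task-dependent resolution:
    evaluation wants cross-family diversity, optimization wants
    same-family 'model empathy'."""
    if operation == "optimize":
        # Same family: shared reasoning patterns aid failure diagnosis.
        return target_family
    if operation == "evaluate":
        # Cross family: break correlated blind spots (Kim et al.).
        for family in available:
            if family != target_family:
                return family
        raise ValueError("no cross-family model available for evaluation")
    raise ValueError(f"unknown operation: {operation}")
```

A pipeline that wires both roles — cross-family judge, same-family optimizer — encodes the boundary condition directly in configuration rather than leaving it to per-run judgment.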

## Challenges

The "model empathy" evidence is primarily architectural — AutoAgent's results demonstrate that same-family optimization works, but the controlled comparison (same-family vs cross-family optimization on identical tasks, controlling for capability differences) has not been published. The SpreadsheetBench and TerminalBench results show the system works, not that model empathy is the specific mechanism. It's possible that the gains come from other architectural choices rather than the same-family pairing specifically.

The boundary between "evaluation" and "optimization" may blur in practice. Evaluation that includes suggested fixes is partially optimization. Optimization that includes quality checks is partially evaluation. The clean task-dependent resolution may need refinement as these operations converge in real systems.

Additionally, as model families converge in training methodology and data, the diversity benefit of cross-family evaluation may decrease over time. If all major model families share similar training distributions, cross-family evaluation may not break blind spots as effectively as Kim et al. observed.

---

Relevant Notes:

- [[multi-model evaluation architecture]] — our eval spec uses cross-family evaluation to break blind spots (correct for evaluation), but should use same-family optimization if self-improvement loops are added

- [[iterative agent self-improvement produces compounding capability gains when evaluation is structurally separated from generation]] — SICA's acceptance-gating mechanism should use same-family optimization per this finding; the evaluation gate should use cross-family per Kim et al.

- [[self evolution improves agent performance through acceptance gated retry not expanded search because disciplined attempt loops with explicit failure reflection outperform open ended exploration]] — NLAH's self-evolution mechanism is an optimization task where model empathy would help

Topics:

- [[_map]]

@ -0,0 +1,58 @@

---
type: claim
domain: ai-alignment
secondary_domains: [collective-intelligence]
description: "GEPA (Guided Evolutionary Prompt Architecture) from Nous Research reads execution traces to understand WHY agents fail, generates candidate variants through evolutionary search, evaluates against 5 guardrails, and submits best candidates as PRs for human review — a distinct self-improvement mechanism from SICA's acceptance-gating"
confidence: experimental
source: "Nous Research hermes-agent-self-evolution repository (GitHub, 2026); GEPA framework presented as ICLR 2026 Oral; DSPy integration for optimization; $2-10 per optimization cycle reported"
created: 2026-04-05
depends_on:
  - "iterative agent self-improvement produces compounding capability gains when evaluation is structurally separated from generation"
  - "curated skills improve agent task performance by 16 percentage points while self-generated skills degrade it by 1.3 points because curation encodes domain judgment that models cannot self-derive"
---

# Evolutionary trace-based optimization submits improvements as pull requests for human review creating a governance-gated self-improvement loop distinct from acceptance-gating or metric-driven iteration

Nous Research's Guided Evolutionary Prompt Architecture (GEPA) implements a self-improvement mechanism structurally different from both SICA's acceptance-gating and NLAH's retry-based self-evolution. The key difference is the input: GEPA reads execution traces to understand WHY things failed, not just THAT they failed.

## The mechanism

1. **Trace analysis** — the system examines full execution traces of agent behavior, identifying specific decision points where the agent made suboptimal choices. This is diagnostic, not metric-driven.

2. **Evolutionary search** — generates candidate variants of prompts, skills, or orchestration logic. Uses DSPy's optimization framework for structured prompt variation.

3. **Constraint evaluation** — each candidate is evaluated against 5 guardrails before advancing:
   - 100% test pass rate (no regressions)
   - Size limits (skills capped at 15KB)
   - Caching compatibility (changes must not break cached behavior)
   - Semantic preservation (the skill's core function must survive mutation)
   - Human PR review (the governance gate)

4. **PR submission** — the best candidate is submitted as a pull request for human review. The improvement does not persist until a human approves it.
## How it differs from existing self-improvement mechanisms

**vs SICA (acceptance-gating):** SICA improves by tightening retry loops — running more attempts and accepting only passing results. It doesn't modify the agent's skills or prompts. GEPA modifies the actual procedural knowledge the agent uses. SICA is behavioral iteration; GEPA is structural evolution.

**vs NLAH self-evolution:** NLAH's self-evolution mechanism accepts or rejects module changes based on performance metrics (+4.8pp on SWE-Bench). GEPA uses trace analysis to understand failure causes before generating fixes. NLAH asks "did this help?"; GEPA asks "why did this fail and what would fix it?"

## The governance model

The PR-review-as-governance-gate is the most architecturally interesting feature. The 5 guardrails map closely to our quality gates (schema validation, test pass, size limits, semantic preservation, human review). The economic cost ($2-10 per optimization cycle) makes this viable for continuous improvement at scale.

Only Phase 1 (skill optimization) has shipped as of April 2026. Planned phases include: Phase 2 (tool optimization), Phase 3 (orchestration optimization), Phase 4 (memory optimization), Phase 5 (full agent optimization). The progression from skills → tools → orchestration → memory → full agent mirrors our own engineering acceleration roadmap.

## Challenges

GEPA's published performance data is limited — the ICLR 2026 Oral acceptance validates the framework but specific before/after metrics across diverse tasks are not publicly available. The $2-10 per cycle cost is self-reported and may not include the cost of failed evolutionary branches.

The PR-review governance gate is the strongest constraint but also the bottleneck — human review capacity limits the rate of self-improvement. If the system generates improvements faster than humans can review them, queuing dynamics may cause the most impactful improvements to wait behind trivial ones. This is the same throughput constraint our system faces with Leo as the evaluation bottleneck.

The distinction between "trace analysis" and "metric-driven iteration" may be less sharp in practice. Both ultimately depend on observable signals of failure — traces are richer but noisier than metrics. Whether the richer input produces meaningfully better improvements at scale is an open empirical question.

---

Relevant Notes:

- [[iterative agent self-improvement produces compounding capability gains when evaluation is structurally separated from generation]] — SICA's structural separation is the necessary condition; GEPA adds evolutionary search and trace analysis on top of this foundation
- [[curated skills improve agent task performance by 16 percentage points while self-generated skills degrade it by 1.3 points because curation encodes domain judgment that models cannot self-derive]] — GEPA's PR-review gate functions as the curation step that prevents the -1.3pp degradation from uncurated self-generation
- [[self evolution improves agent performance through acceptance gated retry not expanded search because disciplined attempt loops with explicit failure reflection outperform open ended exploration]] — NLAH's acceptance-gating is a simpler mechanism; GEPA extends it with evolutionary search and trace-based diagnosis

Topics:

- [[_map]]

@@ -0,0 +1,68 @@
---
type: claim
domain: ai-alignment
secondary_domains: [collective-intelligence]
description: "Stanford Meta-Harness paper shows a single harness change can produce a 6x performance gap on the same model and benchmark, with their automated harness optimizer achieving +7.7 points and 4x fewer tokens versus state-of-the-art, ranking #1 on multiple benchmarks"
confidence: likely
source: "Stanford/MIT, 'Meta-Harness: End-to-End Optimization of Model Harnesses' (March 2026, arxiv 2603.28052); Alex Prompter tweet (609 likes); Lior Alexander tweet; elvis/omarsar tweet"
created: 2026-04-05
depends_on:
- "self-optimizing agent harnesses outperform hand-engineered ones because automated failure mining and iterative refinement explore more of the harness design space than human engineers can"
---

# Harness engineering outweighs model selection in agent system performance because changing the code wrapping the model produces up to 6x performance gaps on the same benchmark while model upgrades produce smaller gains

Stanford and MIT's Meta-Harness paper (March 2026) establishes that the harness — the code determining what to store, retrieve, and show to the model — often matters as much as or more than the model itself. A single harness change can produce "a 6x performance gap on the same benchmark."

## Key results

**Text Classification (Online Learning):**
- Meta-Harness: 48.6% accuracy vs. ACE (state-of-the-art context management): 40.9%
- +7.7 point improvement using 4x fewer context tokens (11.4K vs 50.8K)
- Matched best prior text optimizers' performance in 0.1x evaluations (4 vs 60 proposals)
- Out-of-distribution evaluation on 9 unseen datasets: +2.9 points over ACE (73.1% vs 70.2%)

**Retrieval-Augmented Math Reasoning:**
- Single discovered harness improved IMO-level problem solving by 4.7 points on average across 5 held-out models
- Transferability demonstrated across models not seen during search

**TerminalBench-2 Agentic Coding:**
- 76.4% pass rate on Opus 4.6 (#2 among all agents)
- #1 among Claude Haiku 4.5 agents (37.6% vs next-best 35.5%)
- Surpassed hand-engineered baseline Terminus-KIRA

## The critical finding: execution traces matter, summaries don't

An ablation study quantified the value of different information access:

| Information Access | Median Accuracy | Best Accuracy |
|--------------------|-----------------|---------------|
| Scores only | 34.6 | 41.3 |
| Scores + LLM summaries | 34.9 | 38.7 |
| Full execution traces | 50.0 | 56.7 |

LLM-generated summaries actually *degraded* performance compared to scores-only. "Information compression destroys signal needed for harness engineering." The proposer reads a median of 82 files per iteration, referencing over 20 prior candidates — operating at ~10 million tokens per iteration versus ~0.02 million for prior text optimizers.

This has a direct implication for agent system design: summarization-based approaches to managing agent memory and context may be destroying the diagnostic signal needed for system improvement. Full execution traces, despite their cost, contain information that summaries cannot recover.

## Discovered behaviors

The Meta-Harness system discovered non-obvious harness strategies:

- **Draft-verification retrieval** — using a draft label to retrieve targeted counterexamples rather than generic neighbors (text classification)
- **Lexical routing** — assigning problems to subject-specific retrieval policies with domain-specific reranking (math)
- **Environment bootstrapping** — a single pre-execution shell command gathering OS and package info, eliminating 2-4 exploratory agent turns (coding)

The TerminalBench-2 search log showed sophisticated causal reasoning: after regressions from confounded interventions, the proposer explicitly identified confounds, isolated variables, and pivoted to purely additive modifications.

## Challenges

The "6x gap" headline is from a worst-to-best comparison across all possible harnesses, not a controlled A/B test against a reasonable baseline. The practical improvement over state-of-the-art baselines is meaningful but more modest (+7.7 points, +4.7 points). The paper's strongest claim — that harness matters as much as the model — is well-supported, but the headline number is more dramatic than the typical improvement a practitioner would see.

---

Relevant Notes:

- [[self-optimizing agent harnesses outperform hand-engineered ones because automated failure mining and iterative refinement explore more of the harness design space than human engineers can]] — Meta-Harness is the academic validation of the pattern AutoAgent and auto-harness demonstrated in production
- [[multi-agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value]] — Meta-Harness proposes using a single meta-agent rather than multi-agent coordination for system improvement, suggesting harness optimization may be a higher-ROI intervention than adding agents

Topics:

- [[_map]]

@@ -42,6 +42,11 @@ The capability-deployment gap claim offers a temporal explanation: aggregate eff
Publication bias correction is itself contested — different correction methods yield different estimates, and the choice of correction method can swing results from null to significant.

### Additional Evidence (extend)

*Source: Hyunjin Kim (INSEAD), working papers on AI and strategic decision-making (2025-2026); 'From Problems to Solutions in Strategic Decision-Making' with Nety Wu and Chengyi Lin (SSRN 5456494) | Added: 2026-04-05 | Extractor: Rio*

Kim's research identifies a fourth absorption mechanism not captured in the original three: the **mapping problem**. Individual AI task improvements don't automatically improve firm performance because organizations must first discover WHERE AI creates value in their specific production process. The gap between "AI improves task X in a lab study" and "AI improves our firm's bottom line" requires solving a non-trivial optimization problem: which tasks in which workflows benefit from AI integration, and how do those task-level improvements compose (or fail to compose) into firm-level gains? Kim's work at INSEAD on how data and AI impact firm decisions suggests this mapping problem is itself a significant source of the aggregate null result — even when individual task improvements are real and measurable, organizations that deploy AI to the wrong tasks or in the wrong sequence may see zero or negative aggregate effects. This complements the three existing absorption mechanisms (workslop, verification tax, perception-reality gap) with a structural explanation: the productivity gains exist but are being deployed to the wrong targets.

---

Relevant Notes:

@@ -24,6 +24,16 @@ The three spaces have different metabolic rates reflecting different cognitive f
The flow between spaces is directional. Observations can graduate to knowledge notes when they resolve into genuine insight. Operational wisdom can migrate to the self space when it becomes part of how the agent works rather than what happened in one session. But knowledge does not flow backward into operational state, and identity does not dissolve into ephemeral processing. The metabolism has direction — nutrients flow from digestion to tissue, not the reverse.

## Additional Evidence (supporting)

**Hermes Agent (Nous Research, 26K+ stars)** implements a 4-tier memory system that independently converges on the three-space taxonomy while adding a fourth space:

- **Prompt Memory (MEMORY.md)** — 3,575-character hard cap, always loaded, curated identity and preferences. Maps to the episodic/self space.
- **Session Search (SQLite+FTS5)** — LLM-summarized session history with lineage preservation. Maps to semantic/knowledge space. Retrieved on demand, not always loaded.
- **Skills (procedural)** — markdown procedure files with progressive disclosure (names first, full content on relevance detection). Maps to procedural/methodology space.
- **Honcho (dialectic user modeling)** — optional 4th tier with 12 identity layers modeling the user, not the agent. This is a genuinely new space absent from the three-space taxonomy — user modeling as a distinct memory type with its own metabolic rate (evolves per-interaction but slower than session state).

The 4-tier system corroborates the three-space architecture while suggesting the taxonomy may be incomplete: user/interlocutor modeling may constitute a fourth memory space not captured by Tulving's agent-centric framework. Cache-aware design ensures that learning (adding knowledge) doesn't grow the token bill — the memory spaces grow independently of inference cost.
## Challenges

The three-space mapping is Cornelius's application of Tulving's established cognitive science framework to vault design, not an empirical discovery about agent architectures. Whether three spaces is the right number (versus two, or four) for agent systems specifically has not been tested through controlled comparison. The metabolic rate differences are observed in one system's operation, not measured across multiple architectures. Additionally, the directional flow constraint (knowledge never flows backward into operational state) may be too rigid — there are cases where a knowledge claim should directly modify operational behavior without passing through the identity layer.

@@ -32,6 +32,11 @@ When any condition is missing, the system underperforms. DeepMind's data shows m
The three conditions are stated as binary (present/absent) but in practice exist on continuums. A task may have *some* natural parallelism but not enough to justify the coordination overhead. The threshold for "enough" depends on agent capability, which is improving — the window where coordination adds value is actively shrinking as single-agent accuracy improves (the baseline paradox: below 45% single-agent accuracy, coordination helps; above, it hurts). This means the claim's practical utility may decrease over time as models improve.

### Additional Evidence (extend)

*Source: Stanford Meta-Harness paper (arxiv 2603.28052, March 2026); NeoSigma auto-harness (March 2026); AutoAgent (April 2026) | Added: 2026-04-05 | Extractor: Rio*

Three concurrent systems provide evidence that the highest-ROI alternative to multi-agent coordination is often single-agent harness optimization. Stanford's Meta-Harness shows a 6x performance gap from changing only the harness code around a fixed model — larger than typical gains from adding agents. NeoSigma's auto-harness achieved 39.3% improvement on a fixed model through automated failure mining and iterative harness refinement (0.56 → 0.78 over 18 batches). AutoAgent hit #1 on SpreadsheetBench (96.5%) and TerminalBench (55.1%) with zero human engineering, purely through automated harness optimization. The implication for the three-conditions claim: before adding agents (which introduces coordination costs), practitioners should first exhaust single-agent harness optimization. The threshold where multi-agent coordination outperforms an optimized single-agent harness is higher than previously assumed. Meta-Harness's critical ablation finding — that full execution traces are essential and LLM-generated summaries *degrade* performance — also suggests that multi-agent systems which communicate via summaries may be systematically destroying the diagnostic signal needed for system improvement. See [[harness engineering outweighs model selection in agent system performance because changing the code wrapping the model produces up to 6x performance gaps on the same benchmark while model upgrades produce smaller gains]] and [[self-optimizing agent harnesses outperform hand-engineered ones because automated failure mining and iterative refinement explore more of the harness design space than human engineers can]].

---

Relevant Notes:

@@ -0,0 +1,51 @@
---
type: claim
domain: ai-alignment
secondary_domains: [collective-intelligence]
description: "Hermes Agent's architecture demonstrates that loading only skill names and summaries by default, with full content loaded on relevance detection, makes 40 skills cost approximately the same tokens as 200 skills — a design principle where knowledge base growth does not proportionally increase inference cost"
confidence: likely
source: "Nous Research Hermes Agent architecture (Substack deep dive, 2026); 3,575-character hard cap on prompt memory; auxiliary model compression with lineage preservation in SQLite; 26K+ GitHub stars, largest open-source agent framework"
created: 2026-04-05
depends_on:
- "memory architecture requires three spaces with different metabolic rates because semantic episodic and procedural memory serve different cognitive functions and consolidate at different speeds"
- "long context is not memory because memory requires incremental knowledge accumulation and stateful change not stateless input processing"
---

# Progressive disclosure of procedural knowledge produces flat token scaling regardless of knowledge base size because tiered loading with relevance-gated expansion avoids the linear cost of full context loading

Agent systems face a scaling dilemma: more knowledge should improve performance, but loading more knowledge into context increases token cost linearly and degrades attention quality. Progressive disclosure resolves this by loading knowledge at multiple tiers of specificity, expanding to full detail only when relevance is detected.

## The design principle

Hermes Agent (Nous Research, 26K+ GitHub stars) implements this through a tiered loading architecture:

1. **Tier 0 — Always loaded:** A 3,575-character prompt memory file (MEMORY.md) contains the agent's core identity, preferences, and active context. Hard-capped to prevent growth.
2. **Tier 1 — Names only:** All available skills are listed by name and one-line summary. The agent sees what it knows how to do without paying the token cost of the full procedures.
3. **Tier 2 — Relevance-gated expansion:** When the agent detects that a skill is relevant to the current task, the full skill content loads into context. Only the relevant skills pay full token cost.
4. **Tier 3 — Session search:** Historical context is stored in SQLite with FTS5 indexing. Retrieved on demand, not loaded by default. An auxiliary model compresses session history while preserving lineage information.

The result: 40 skills and 200 skills have approximately the same base token cost, because most skills exist only as names in the prompt. Growth in the knowledge base does not proportionally increase inference cost. The system scales with relevance, not with total knowledge.
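A minimal sketch of the Tier 1 / Tier 2 split, with a deliberately crude keyword-overlap relevance gate. The skill entries, the gate, and the token heuristic are invented for illustration; Hermes's actual implementation differs:

```python
# Hypothetical skill registry: every skill has a one-line summary (Tier 1)
# and a full body that is only loaded on relevance detection (Tier 2).
SKILLS = {
    "summarize-pdf": {
        "summary": "Extract and condense text from PDF files",
        "body": "1. Convert pages to text. 2. Chunk. 3. Summarize each chunk.",
    },
    "git-bisect": {
        "summary": "Locate a regression with binary search over commits",
        "body": "1. git bisect start. 2. Mark good/bad. 3. Test each step.",
    },
}

def approx_tokens(text: str) -> int:
    return len(text) // 4  # rough heuristic: ~4 chars per token

def build_context(task: str) -> str:
    """Tier 1: every skill as name + one-line summary.
    Tier 2: full body only for skills whose summary overlaps the task."""
    lines = []
    task_words = set(task.lower().split())
    for name, skill in SKILLS.items():
        lines.append(f"- {name}: {skill['summary']}")
        # crude relevance gate: any shared word expands the skill
        if task_words & set(skill["summary"].lower().split()):
            lines.append(skill["body"])
    return "\n".join(lines)
```

Base cost grows by one summary line per added skill, so `approx_tokens(build_context(task))` stays nearly flat as the registry grows, provided the task touches only a few skills; a better relevance gate (embeddings, model self-report) changes the gate, not the tier structure.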
## Why this matters architecturally

This is the practical implementation of the context≠memory distinction. Naive approaches treat context window size as the memory constraint — load everything, hope attention handles it. Progressive disclosure treats context as a precious resource to be allocated based on relevance, with the full knowledge base available but not loaded.

The 3,575-character hard cap on prompt memory is an engineering decision that embodies a principle: the always-on context should be minimal and curated, not a growing dump of everything the agent has learned. Compression via auxiliary model allows the system to preserve information while respecting the cap.

## Challenges

The "flat scaling" claim is based on Hermes's architecture design and reported behavior, not a controlled experiment comparing flat-loaded vs progressively-disclosed knowledge bases on identical tasks. The token cost savings are real (fewer tokens in prompt), but whether performance is equivalent — whether the agent makes equally good decisions with names-only vs full-content loading — has not been systematically measured.

Relevance detection is the critical bottleneck. If the system fails to detect that a skill is relevant, it won't load the full content, and the agent operates without knowledge it has but didn't access. False negatives in relevance detection trade token efficiency for capability loss. The quality of the relevance gate determines whether progressive disclosure is genuinely "flat scaling" or "cheaper at the cost of sometimes being wrong."

The 3,575-character cap is specific to Hermes and may not generalize. Different agent architectures, task domains, and model capabilities may require different cap sizes. The principle (hard cap on always-on context) is likely general; the specific number is engineering judgment.

---

Relevant Notes:

- [[memory architecture requires three spaces with different metabolic rates because semantic episodic and procedural memory serve different cognitive functions and consolidate at different speeds]] — progressive disclosure operates primarily within the procedural memory space, loading methodology on demand rather than storing it all in active context
- [[long context is not memory because memory requires incremental knowledge accumulation and stateful change not stateless input processing]] — progressive disclosure is the architectural mechanism that implements the context≠memory distinction in practice: the knowledge base grows (memory) while the active context stays flat (not-memory)
- [[current AI models use less than one percent of their advertised context capacity effectively because attention degradation and information density combine to create a sharp effectiveness frontier well inside the nominal window]] — the >99% shortfall in effective context use is exactly what progressive disclosure addresses: load less, use it better

Topics:

- [[_map]]

@@ -0,0 +1,56 @@
---
type: claim
domain: ai-alignment
secondary_domains: [collective-intelligence]
description: "AutoAgent hit #1 SpreadsheetBench (96.5%) and #1 GPT-5 on TerminalBench (55.1%) with zero human engineering, while NeoSigma's auto-harness improved agent scores from 0.56 to 0.78 (~39%) through automated failure mining — both demonstrating that agents optimizing their own harnesses outperform hand-tuned baselines"
confidence: experimental
source: "Kevin Gu (@kevingu), AutoAgent open-source library (April 2026, 5.6K likes, 3.5M views); Gauri Gupta & Ritvik Kapila, NeoSigma auto-harness (March 2026, 1.1K likes); GitHub: kevinrgu/autoagent, neosigmaai/auto-harness"
created: 2026-04-05
depends_on:
- "multi-agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value"
---

# Self-optimizing agent harnesses outperform hand-engineered ones because automated failure mining and iterative refinement explore more of the harness design space than human engineers can

Two independent systems released within days of each other (late March / early April 2026) demonstrate the same pattern: letting an AI agent modify its own harness — system prompt, tools, agent configuration, orchestration — produces better results than human engineering.

## AutoAgent (Kevin Gu, thirdlayer.inc)

An open-source library that lets an agent optimize its own harness overnight through an iterative loop: modify harness → run benchmark → check score → keep or discard. Results after 24 hours of autonomous optimization:

- **SpreadsheetBench**: 96.5% (#1, beating all human-engineered entries)
- **TerminalBench**: 55.1% (#1 GPT-5 score, beating all human-engineered entries)

The human role shifts from engineer to director — instead of writing agent.py, you write program.md, a plain Markdown directive that steers the meta-agent's optimization objectives.

**Model empathy finding**: A Claude meta-agent optimizing a Claude task agent diagnosed failures more accurately than when optimizing a GPT-based agent. Same-family model pairing appears to improve meta-optimization because the meta-agent understands how the inner model reasons. This has implications for harness design: the optimizer and the optimizee may need to share cognitive architecture for optimal results.

## auto-harness (Gauri Gupta & Ritvik Kapila, NeoSigma)

A four-phase outer loop operating on production traffic:

1. **Failure Mining** — scan execution traces, extract structured failure records
2. **Evaluation Clustering** — group failures by root-cause mechanism (29+ distinct clusters discovered automatically, no manual labeling)
3. **Optimization** — propose targeted harness changes (prompts, few-shot examples, tool interfaces, context construction, workflow architecture)
4. **Regression Gate** — changes must achieve ≥80% on growing regression suite AND not degrade validation performance
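The keep-or-discard loop that both systems share can be sketched as a regression-gated hill climb over harness variants. The harness encoding, `mutate`, and `evaluate` below are hypothetical placeholders, not either system's real interface:

```python
import random

# Regression-gated hill climb: propose a harness variant, keep it only if
# it beats the incumbent AND passes >= 80% of a growing regression suite.
def mutate(harness: dict) -> dict:
    """Toy mutation: perturb one hypothetical harness parameter."""
    variant = dict(harness)
    variant["temperature"] = round(random.uniform(0.0, 1.0), 2)
    return variant

def optimize(harness, evaluate, regression_suite, batches=18):
    best_score = evaluate(harness)
    for _ in range(batches):
        candidate = mutate(harness)
        score = evaluate(candidate)
        passes_regressions = (
            not regression_suite
            or sum(t(candidate) for t in regression_suite) / len(regression_suite) >= 0.8
        )
        if score > best_score and passes_regressions:  # keep winners only
            harness, best_score = candidate, score
            # the suite grows with each accepted change, as in auto-harness
            regression_suite.append(lambda h, s=score: evaluate(h) >= s)
    return harness, best_score
```

Each accepted change appends a new test pinning the achieved score, so later mutations face an increasingly strict gate; this is the mechanism behind auto-harness's suite growing from 0 to 17 cases.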
|
||||||
|
|
||||||
|
Results: baseline validation score 0.560 → 0.780 after 18 autonomous batches executing 96 harness experiments. A 39.3% improvement on a fixed GPT-5.4 model — isolating gains purely to system-level improvements, not model upgrades.
|
||||||
|
|
||||||
|
The regression suite grew from 0 to 17 test cases across batches, creating an increasingly strict constraint that forces each improvement to be genuinely additive.
|
||||||
|
|
||||||
|
## The mechanism design parallel
|
||||||
|
|
||||||
|
Both systems implement a form of market-like selection applied to harness design: generate variations → test against objective criteria → keep winners → iterate. AutoAgent uses benchmark scores as the fitness function; auto-harness uses production failure rates. Neither requires human judgment during the optimization loop — the system discovers what works by exploring more of the design space than a human engineer could manually traverse.
|
||||||
|
|
||||||
|
## Challenges
|
||||||
|
|
||||||
|
Both evaluations are narrow: specific benchmarks (AutoAgent) or specific production domains (auto-harness). Whether self-optimization generalizes to open-ended agentic tasks — where the fitness landscape is complex and multi-dimensional — is unproven. The "model empathy" finding from AutoAgent is a single observation, not a controlled experiment. And both systems require well-defined evaluation criteria — they optimize what they can measure, which may not align with what matters in unstructured real-world deployment.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
Relevant Notes:
|
||||||
|
- [[multi-agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value]] — self-optimization meets the adversarial verification condition: the meta-agent verifying harness changes differs from the task agent executing them
- [[79 percent of multi-agent failures originate from specification and coordination not implementation because decomposition quality is the primary determinant of system success]] — harness optimization is specification optimization: the meta-agent is iteratively improving how the task is specified to the inner agent
Topics:
- [[_map]]
@ -82,6 +82,11 @@ The Agentic Taylorism mechanism has a direct alignment dimension through two Cor
The Agentic Taylorism mechanism now has a literal industrial instantiation: Anthropic's SKILL.md format (December 2025) is Taylor's instruction card as an open file format. The specification encodes "domain-specific expertise: workflows, context, and best practices" into portable files that AI agents consume at runtime — procedural knowledge, contextual conventions, and conditional exception handling, exactly the three categories Taylor extracted from workers. Platform adoption has been rapid: Microsoft, OpenAI, GitHub, Cursor, Atlassian, and Figma have integrated the format, with a SkillsMP marketplace emerging for distribution of codified expertise. Partner skills from Canva, Stripe, Notion, and Zapier encode domain-specific knowledge into consumable packages. The infrastructure for systematic knowledge extraction from human expertise into AI-deployable formats is no longer theoretical — it is deployed, standardized, and scaling.
### Additional Evidence (extend)
*Source: Andrej Karpathy, 'Idea File' concept tweet (April 2026, 21K likes) | Added: 2026-04-05 | Extractor: Rio*
Karpathy's "idea file" concept provides a micro-level instantiation of the agentic Taylorism mechanism applied to software development itself. The concept: "in the era of LLM agents, there is less of a point/need of sharing the specific code/app, you just share the idea, then the other person's agent customizes and builds it." This is Taylor's knowledge extraction in real-time: the human's tacit knowledge (how to design a knowledge base, what architectural decisions matter) is codified into a markdown document, then an LLM agent deploys that codified knowledge to produce the implementation — without the original knowledge holder being involved in the production. The "idea file" IS the instruction card. The shift from code-sharing to idea-sharing is the shift from sharing embodied knowledge (the implementation) to sharing extracted knowledge (the specification), exactly as Taylor shifted from workers holding knowledge in muscle memory to managers holding it in standardized procedures. That this shift is celebrated (21K likes) rather than resisted illustrates that agentic Taylorism operates with consent — knowledge workers voluntarily codify their expertise because the extraction creates immediate personal value (their own agent builds it), even as it simultaneously contributes to the broader extraction of human knowledge into AI-deployable formats.
Topics:
- grand-strategy
- ai-alignment
@ -8,7 +8,7 @@ website: https://metadao.fi
status: active
tracked_by: rio
created: 2026-03-11
last_updated: 2026-04-01
last_updated: 2026-04-05
founded: 2023-01-01
founders: ["[[proph3t]]"]
category: "Capital formation platform using futarchy (Solana)"
@ -17,6 +17,7 @@ key_metrics:
meta_price: "~$3.78 (March 2026)"
market_cap: "~$85.7M"
ecosystem_market_cap: "$219M total ($69M non-META)"
total_raised: "$33M+ across 10 curated ICOs (~$390M committed, 95% refunded via pro-rata)"
total_revenue: "$3.1M+ (Q4 2025: $2.51M — 54% Futarchy AMM, 46% Meteora LP)"
total_equity: "$16.5M (up from $4M in Q3 2025)"
runway: "15+ quarters at ~$783K/quarter burn"
23
inbox/archive/2026-03-28-stanford-meta-harness.md
Normal file
@ -0,0 +1,23 @@
---
type: source
title: "Meta-Harness: End-to-End Optimization of Model Harnesses"
author: "Stanford/MIT (arxiv 2603.28052)"
url: https://arxiv.org/html/2603.28052v1
date: 2026-03-28
domain: ai-alignment
intake_tier: directed
rationale: "Academic validation that harness engineering outweighs model selection. 6x performance gap from harness alone. Critical finding: summaries destroy diagnostic signal, full execution traces essential."
proposed_by: "Leo (research batch routing)"
format: paper
status: processed
processed_by: rio
processed_date: 2026-04-05
claims_extracted:
- "harness engineering outweighs model selection in agent system performance because changing the code wrapping the model produces up to 6x performance gaps on the same benchmark while model upgrades produce smaller gains"
enrichments:
- "multi-agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value"
---

# Meta-Harness (Stanford/MIT)

Key results: Text classification +7.7 points over ACE (48.6% vs 40.9%) using 4x fewer tokens (11.4K vs 50.8K). Math reasoning +4.7 points across 5 held-out models. TerminalBench-2: 76.4% (#2 overall), #1 among Haiku agents. Critical ablation: scores-only 34.6 median, scores+summaries 34.9 (summaries HURT), full traces 50.0 median. Proposer reads median 82 files/iteration, ~10M tokens/iteration vs ~0.02M for prior optimizers. Discovered behaviors: draft-verification retrieval, lexical routing, environment bootstrapping. The 6x gap is worst-to-best across all harnesses, not a controlled A/B.
23
inbox/archive/2026-03-31-gauri-gupta-auto-harness.md
Normal file
@ -0,0 +1,23 @@
---
type: source
title: "Self-improving agentic systems with auto-evals"
author: "Gauri Gupta & Ritvik Kapila (NeoSigma)"
url: https://x.com/gauri__gupta/status/2039173240204243131
date: 2026-03-31
domain: ai-alignment
intake_tier: directed
rationale: "Four-phase self-improvement loop: failure mining → eval clustering → optimization → regression gate. Score 0.56→0.78 on fixed model. Complements AutoAgent with production-oriented approach."
proposed_by: "Leo (research batch routing)"
format: tweet
status: processed
processed_by: rio
processed_date: 2026-04-05
claims_extracted:
- "self-optimizing agent harnesses outperform hand-engineered ones because automated failure mining and iterative refinement explore more of the harness design space than human engineers can"
enrichments:
- "multi-agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value"
---

# NeoSigma auto-harness

Four-phase outer loop on production traffic: (A) failure mining from execution traces, (B) eval clustering by root cause (29+ clusters discovered automatically), (C) optimization of prompts/tools/context/workflow, (D) regression gate (≥80% on regression suite + no validation degradation). Baseline 0.560 → 0.780 after 18 batches, 96 experiments. Fixed GPT-5.4 model — gains purely from harness changes. Regression suite grew 0→17 test cases. GitHub: neosigmaai/auto-harness.
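Phases (A) and (B) amount to grouping failed traces by root cause before optimizing. A minimal sketch, with hypothetical field names rather than the auto-harness API:

```python
from collections import defaultdict

def cluster_failures(traces):
    """Phase A+B: mine failures from execution traces, cluster by root cause."""
    clusters = defaultdict(list)
    for trace in traces:
        if not trace["success"]:                         # failure mining
            clusters[trace["root_cause"]].append(trace)  # eval clustering
    # Largest clusters get optimized first; each becomes an eval set.
    return sorted(clusters.items(), key=lambda kv: -len(kv[1]))

traces = [
    {"success": False, "root_cause": "tool_timeout"},
    {"success": True,  "root_cause": None},
    {"success": False, "root_cause": "bad_context"},
    {"success": False, "root_cause": "tool_timeout"},
]
print([(cause, len(ts)) for cause, ts in cluster_failures(traces)])
# [('tool_timeout', 2), ('bad_context', 1)]
```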
24
inbox/archive/2026-04-02-karpathy-llm-knowledge-base-gist.md
Normal file
@ -0,0 +1,24 @@
---
type: source
title: "LLM Knowledge Base (idea file)"
author: "Andrej Karpathy (@karpathy)"
url: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
date: 2026-04-02
domain: ai-alignment
intake_tier: directed
rationale: "Validates the Teleo Codex architecture pattern — three-layer wiki (sources → compiled wiki → schema) independently arrived at by Karpathy with massive viral adoption (47K likes, 14.5M views). Enriches 'one agent one chat' conviction and agentic taylorism claim."
proposed_by: "Leo (research batch routing)"
format: gist
status: processed
processed_by: rio
processed_date: 2026-04-05
claims_extracted:
- "LLM-maintained knowledge bases that compile rather than retrieve represent a paradigm shift from RAG to persistent synthesis because the wiki is a compounding artifact not a query cache"
enrichments:
- "one agent one chat is the right default for knowledge contribution because the scaffolding handles complexity not the user"
- "The current AI transition is agentic Taylorism — humanity is feeding its knowledge into AI through usage just as greater Taylorism extracted knowledge from workers to managers and the knowledge transfer is a byproduct of labor not an intentional act"
---

# Karpathy LLM Knowledge Base

47K likes, 14.5M views. Three-layer architecture: raw sources (immutable) → LLM-compiled wiki (LLM-owned) → schema (configuration via CLAUDE.md). The LLM "doesn't just index for retrieval — it reads, extracts, and integrates into the existing wiki." Each new source touches 10-15 pages. Obsidian as frontend, markdown as format. Includes a lint operation for contradictions and stale claims. The human is "editor-in-chief." The "idea file" concept: share the idea, not the code; each person's agent customizes and builds it.
23
inbox/archive/2026-04-02-kevin-gu-autoagent.md
Normal file
@ -0,0 +1,23 @@
---
type: source
title: "AutoAgent: autonomous harness engineering"
author: "Kevin Gu (@kevingu, thirdlayer.inc)"
url: https://x.com/kevingu/status/2039874388095651937
date: 2026-04-02
domain: ai-alignment
intake_tier: directed
rationale: "Self-optimizing agent harness that beat all human-engineered entries on two benchmarks. Model empathy finding (same-family meta/task pairs outperform cross-model). Shifts human role from engineer to director."
proposed_by: "Leo (research batch routing)"
format: tweet
status: processed
processed_by: rio
processed_date: 2026-04-05
claims_extracted:
- "self-optimizing agent harnesses outperform hand-engineered ones because automated failure mining and iterative refinement explore more of the harness design space than human engineers can"
enrichments:
- "multi-agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value"
---

# AutoAgent

Open-source library for autonomous harness engineering. 24-hour optimization run: #1 on SpreadsheetBench (96.5%), #1 among GPT-5 agents on TerminalBench (55.1%). Loop: modify harness → run benchmark → check score → keep/discard. Model empathy: a Claude meta-agent optimizing a Claude task agent diagnoses failures more accurately than cross-model pairs. The human writes program.md (directive), not agent.py (implementation). GitHub: kevinrgu/autoagent.
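The loop is greedy hill climbing over harness variants. A toy sketch under assumptions (`modify` and `run_benchmark` are placeholders, not AutoAgent's API):

```python
import time

def autoagent_loop(harness, run_benchmark, modify, budget_seconds=86_400):
    """Modify harness -> run benchmark -> check score -> keep/discard."""
    best_score = run_benchmark(harness)
    deadline = time.monotonic() + budget_seconds
    while time.monotonic() < deadline:
        candidate = modify(harness)      # meta-agent proposes a change
        score = run_benchmark(candidate)
        if score > best_score:           # keep strict improvements only
            harness, best_score = candidate, score
    return harness, best_score

# Toy run: the "harness" is a number; the benchmark rewards values near 10.
h, s = autoagent_loop(0.0,
                      run_benchmark=lambda h: -abs(h - 10),
                      modify=lambda h: h + 1.0,
                      budget_seconds=0.05)
# h climbs to 10.0, then further candidates are discarded until the budget runs out.
```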
@ -0,0 +1,22 @@
---
type: source
title: "How we built a virtual filesystem for our Assistant"
author: "Dens Sumesh (Mintlify)"
url: https://www.mintlify.com/blog/how-we-built-a-virtual-filesystem-for-our-assistant
date: 2026-04-02
domain: ai-alignment
intake_tier: directed
rationale: "Demonstrates agent-native retrieval converging on filesystem primitives over embedding search. 460x faster, zero marginal cost. Endorsed by Jerry Liu (LlamaIndex founder)."
proposed_by: "Leo (research batch routing)"
format: essay
status: processed
processed_by: rio
processed_date: 2026-04-05
claims_extracted:
- "agent-native retrieval converges on filesystem abstractions over embedding search because grep cat ls and find are all an agent needs to navigate structured knowledge"
enrichments: []
---

# Mintlify ChromaFS

Replaced RAG with a virtual filesystem mapping UNIX commands to Chroma DB queries via just-bash (Vercel Labs). P90 boot: 46s → 100ms (460x). Marginal cost: $0.0137/conversation → $0. 30K+ conversations/day. Coarse-then-fine grep optimization. Read-only enforcement (EROFS). Endorsed by Jerry Liu (LlamaIndex). Key quote: "agents are converging on filesystems as their primary interface because grep, cat, ls, and find are all an agent needs."
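The core idea, UNIX-style commands as the retrieval interface over a read-only document store, can be sketched as a toy (not Mintlify's ChromaFS or just-bash):

```python
import re

class VirtualFS:
    """Read-only doc store exposed through filesystem-style commands."""
    def __init__(self, docs):
        self.docs = docs                  # path -> content

    def ls(self, prefix=""):
        return sorted(p for p in self.docs if p.startswith(prefix))

    def cat(self, path):
        return self.docs[path]

    def grep(self, pattern, prefix=""):
        # Coarse pass: find which files match at all; the agent then
        # cats the hits for the fine-grained read.
        return [p for p in self.ls(prefix)
                if re.search(pattern, self.docs[p])]

    def write(self, path, content):
        raise PermissionError("EROFS: read-only filesystem")  # enforcement

fs = VirtualFS({"docs/auth.md": "API keys rotate daily",
                "docs/billing.md": "invoices are monthly"})
print(fs.grep("API keys"))  # ['docs/auth.md']
```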
22
inbox/archive/2026-04-03-hyunjin-kim-ai-mapping-problem.md
Normal file
@ -0,0 +1,22 @@
---
type: source
title: "From Problems to Solutions in Strategic Decision-Making: The Effects of Generative AI on Problem Formulation"
author: "Nety Wu, Hyunjin Kim, Chengyi Lin (INSEAD)"
url: https://doi.org/10.2139/ssrn.5456494
date: 2026-04-03
domain: ai-alignment
intake_tier: directed
rationale: "The 'mapping problem' — individual AI task improvements don't automatically improve firm performance because organizations must discover WHERE AI creates value in their production process. Adds a fourth absorption mechanism to the macro-productivity null result."
proposed_by: "Leo (research batch routing)"
format: paper
status: processed
processed_by: rio
processed_date: 2026-04-05
claims_extracted: []
enrichments:
- "macro AI productivity gains remain statistically undetectable despite clear micro-level benefits because coordination costs verification tax and workslop absorb individual-level improvements before they reach aggregate measures"
---

# Hyunjin Kim — AI Mapping Problem

Kim (INSEAD Strategy) studies how data and AI affect firm decisions and competitive advantage. The "mapping problem": discovering WHERE AI creates value in a firm's specific production process is itself a non-trivial optimization problem. Individual task improvements don't compose into firm-level gains when deployed to the wrong tasks or in the wrong sequence. The paper abstract is not accessible (SSRN paywall), but the research profile and related publications confirm the thesis. Note: Leo's original routing described this as a standalone tweet; the research exists, but the specific "mapping problem" framing may come from Kim's broader research program rather than a single paper.
@ -21,6 +21,7 @@ Outstanding work items visible to all agents. Everything here goes through eval
| Identity reframe PRs need merging | review | medium | — | #149 Theseus, #153 Astra, #157 Rio, #158 Leo (needs rebase), #159 Vida. All have eval reviews. |
| 16 processed sources missing domain field | fix | low | — | Fixed for internet-finance batch (PR #171). Audit remaining sources. |
| Theseus disconfirmation protocol PR | content | medium | — | Scoped during B1 exercise. Theseus to propose. |
| Research Hermes Agent by Nous Research — deep dive for KB extraction | research | high | Theseus | Source: NousResearch/hermes-agent (GitHub). Research brief in `agents/theseus/musings/research-hermes-agent-nous.md`. **Extract:** (1) Skill extraction as convergent learning mechanism. (2) Self-evolution + human review gates = our governance model. (3) 3+ layer memory convergence. (4) Individual self-improvement ≠ collective knowledge accumulation. (5) Enrich Agentic Taylorism — skills = Taylor's instruction cards. Domains: ai-alignment + collective-intelligence. |
## Rules