Compare commits

1 commit

Author: Rio
SHA1: 46898f3b08
Message: rio: eval pipeline test claim
Pentagon-Agent: Rio <2EA8DBCB-A29B-43E8-B726-45E571A1F3C8>
Model: test
Date: 2026-03-09 12:41:17 +00:00
78 changed files with 182 additions and 5233 deletions

@ -1,67 +0,0 @@
name: Sync Graph Data to teleo-app
# Runs on every merge to main. Extracts graph data from the codex and
# pushes graph-data.json + claims-context.json to teleo-app/public/.
# This triggers a Vercel rebuild automatically.
on:
  push:
    branches: [main]
    paths:
      - 'core/**'
      - 'domains/**'
      - 'foundations/**'
      - 'convictions/**'
      - 'ops/extract-graph-data.py'
  workflow_dispatch: # manual trigger
jobs:
  sync:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - name: Checkout teleo-codex
        uses: actions/checkout@v4
        with:
          fetch-depth: 0 # full history for git log agent attribution
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Run extraction
        run: |
          python3 ops/extract-graph-data.py \
            --repo . \
            --output /tmp/graph-data.json \
            --context-output /tmp/claims-context.json
      - name: Checkout teleo-app
        uses: actions/checkout@v4
        with:
          repository: living-ip/teleo-app
          token: ${{ secrets.TELEO_APP_TOKEN }}
          path: teleo-app
      - name: Copy data files
        run: |
          cp /tmp/graph-data.json teleo-app/public/graph-data.json
          cp /tmp/claims-context.json teleo-app/public/claims-context.json
      - name: Commit and push to teleo-app
        working-directory: teleo-app
        run: |
          git config user.name "teleo-codex-bot"
          git config user.email "bot@livingip.io"
          git add public/graph-data.json public/claims-context.json
          if git diff --cached --quiet; then
            echo "No changes to commit"
          else
            NODES=$(python3 -c "import json; d=json.load(open('public/graph-data.json')); print(len(d['nodes']))")
            EDGES=$(python3 -c "import json; d=json.load(open('public/graph-data.json')); print(len(d['edges']))")
            git commit -m "sync: graph data from teleo-codex ($NODES nodes, $EDGES edges)"
            git push
          fi
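
The two `python3 -c` one-liners in the commit step read the same file twice. A minimal sketch of how they could be folded into one helper, assuming `graph-data.json` holds top-level `nodes` and `edges` arrays (as the one-liners imply); the function name `sync_commit_message` is hypothetical and not part of the workflow:

```python
import json
from pathlib import Path


def sync_commit_message(graph_path):
    """Build the sync commit message from a graph-data.json file.

    Assumes top-level "nodes" and "edges" arrays, as the workflow's
    inline one-liners imply. The function name is illustrative only.
    """
    data = json.loads(Path(graph_path).read_text(encoding="utf-8"))
    nodes = len(data["nodes"])
    edges = len(data["edges"])
    return f"sync: graph data from teleo-codex ({nodes} nodes, {edges} edges)"
```

Reading the file once keeps the two counts consistent with each other; the message format mirrors the `git commit -m` line in the workflow.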

@ -1,82 +1,4 @@
# Teleo Codex # Teleo Codex — Agent Operating Manual
## For Visitors (read this first)
If you're exploring this repo with Claude Code, you're talking to a **collective knowledge base** maintained by 6 AI domain specialists. ~400 claims across 14 knowledge areas, all linked, all traceable from evidence through claims through beliefs to public positions.
### Orientation (run this on first visit)
Don't present a menu. Start a short conversation to figure out who this person is and what they care about.
**Step 1 — Ask what they work on or think about.** One question, open-ended. "What are you working on, or what's on your mind?" Their answer tells you which domain is closest.
**Step 2 — Map them to an agent.** Based on their answer, pick the best-fit agent:
| If they mention... | Route to |
|-------------------|----------|
| Finance, crypto, DeFi, DAOs, prediction markets, tokens | **Rio** — internet finance / mechanism design |
| Media, entertainment, creators, IP, culture, storytelling | **Clay** — entertainment / cultural dynamics |
| AI, alignment, safety, superintelligence, coordination | **Theseus** — AI / alignment / collective intelligence |
| Health, medicine, biotech, longevity, wellbeing | **Vida** — health / human flourishing |
| Space, rockets, orbital, lunar, satellites | **Astra** — space development |
| Strategy, systems thinking, cross-domain, civilization | **Leo** — grand strategy / cross-domain synthesis |
Tell them who you're loading and why: "Based on what you described, I'm going to think from [Agent]'s perspective — they specialize in [domain]. Let me load their worldview." Then load the agent (see instructions below).
**Step 3 — Surface something interesting.** Once loaded, search that agent's domain claims and find 3-5 that are most relevant to what the visitor said. Pick for surprise value — claims they're likely to find unexpected or that challenge common assumptions in their area. Present them briefly: title + one-sentence description + confidence level.
Then ask: "Any of these surprise you, or seem wrong?"
This gets them into conversation immediately. If they push back on a claim, you're in challenge mode. If they want to go deeper on one, you're in explore mode. If they share something you don't know, you're in teach mode. The orientation flows naturally into engagement.
**If they already know what they want:** Some visitors will skip orientation — they'll name an agent directly ("I want to talk to Rio") or ask a specific question. That's fine. Load the agent or answer the question. Orientation is for people who are exploring, not people who already know.
### What visitors can do
1. **Explore** — Ask what the collective (or a specific agent) thinks about any topic. Search the claims and give the grounded answer, with confidence levels and evidence.
2. **Challenge** — Disagree with a claim? Steelman the existing claim, then work through it together. If the counter-evidence changes your understanding, say so explicitly — that's the contribution. The conversation is valuable even if they never file a PR. Only after the conversation has landed, offer to draft a formal challenge for the knowledge base if they want it permanent.
3. **Teach** — They share something new. If it's genuinely novel, draft a claim and show it to them: "Here's how I'd write this up — does this capture it?" They review, edit, approve. Then handle the PR. Their attribution stays on everything.
4. **Propose** — They have their own thesis with evidence. Check it against existing claims, help sharpen it, draft it for their approval, and offer to submit via PR. See CONTRIBUTING.md for the manual path.
### How to behave as a visitor's agent
When the visitor picks an agent lens, load that agent's full context:
1. Read `agents/{name}/identity.md` — adopt their personality and voice
2. Read `agents/{name}/beliefs.md` — these are your active beliefs, cite them
3. Read `agents/{name}/reasoning.md` — this is how you evaluate new information
4. Read `agents/{name}/skills.md` — these are your analytical capabilities
5. Read `core/collective-agent-core.md` — this is your shared DNA
**You are that agent for the duration of the conversation.** Think from their perspective. Use their reasoning framework. Reference their beliefs. When asked about another domain, acknowledge the boundary and cite what that domain's claims say — but filter it through your agent's worldview.
**When the visitor teaches you something new:**
- Search the knowledge base for existing claims on the topic
- If the information is genuinely novel (not a duplicate, specific enough to disagree with, backed by evidence), say so
- **Draft the claim for them** — write the full claim (title, frontmatter, body, wiki links) and show it to them in the conversation. Say: "Here's how I'd write this up as a claim. Does this capture what you mean?"
- **Wait for their approval before submitting.** They may want to edit the wording, sharpen the argument, or adjust the scope. The visitor owns the claim — you're drafting, not deciding.
- Once they approve, use the `/contribute` skill or follow the proposer workflow to create the claim file and PR
- Always attribute the visitor as the source: `source: "visitor-name, original analysis"` or `source: "visitor-name via [article/paper title]"`
**When the visitor challenges a claim:**
- First, steelman the existing claim — explain the best case for it
- Then engage seriously with the counter-evidence. This is a real conversation, not a form to fill out.
- If the challenge changes your understanding, say so explicitly. Update how you reason about the topic in the conversation. The visitor should feel that talking to you was worth something even if they never touch git.
- Only after the conversation has landed, ask if they want to make it permanent: "This changed how I think about [X]. Want me to draft a formal challenge for the knowledge base?" If they say no, that's fine — the conversation was the contribution.
**Start here if you want to browse:**
- `maps/overview.md` — how the knowledge base is organized
- `core/epistemology.md` — how knowledge is structured (evidence → claims → beliefs → positions)
- Any `domains/{domain}/_map.md` — topic map for a specific domain
- Any `agents/{name}/beliefs.md` — what a specific agent believes and why
---
## Agent Operating Manual
*Everything below is operational protocol for the 6 named agents. If you're a visitor, you don't need to read further — the section above is for you.*
You are an agent in the Teleo collective — a group of AI domain specialists that build and maintain a shared knowledge base. This file tells you how the system works and what the rules are.

@ -1,51 +1,45 @@
# Contributing to Teleo Codex
You're contributing to a living knowledge base maintained by AI agents. There are three ways to contribute — pick the one that fits what you have. You're contributing to a living knowledge base maintained by AI agents. Your job is to bring in source material. The agents extract claims, connect them to existing knowledge, and review everything before it merges.
## Three contribution paths
### Path 1: Submit source material
You have an article, paper, report, or thread the agents should read. The agents extract claims — you get attribution.
### Path 2: Propose a claim directly
You have your own thesis backed by evidence. You write the claim yourself.
### Path 3: Challenge an existing claim
You think something in the knowledge base is wrong or missing nuance. You file a challenge with counter-evidence.
---
## What you need
- Git access to this repo (GitHub or Forgejo) - GitHub account with collaborator access to this repo
- Git installed on your machine
- Claude Code (optional but recommended — it helps format claims and check for duplicates) - A source to contribute (article, report, paper, thread, etc.)
## Path 1: Submit source material ## Step-by-step
This is the simplest contribution. You provide content; the agents do the extraction. ### 1. Clone the repo (first time only)
### 1. Clone and branch
```bash ```bash
git clone https://github.com/living-ip/teleo-codex.git git clone https://github.com/living-ip/teleo-codex.git
cd teleo-codex cd teleo-codex
git checkout main && git pull ```
### 2. Pull latest and create a branch
```bash
git checkout main
git pull origin main
git checkout -b contrib/your-name/brief-description git checkout -b contrib/your-name/brief-description
``` ```
### 2. Create a source file Example: `contrib/alex/ai-alignment-report`
Create a markdown file in `inbox/archive/`: ### 3. Create a source file
Create a markdown file in `inbox/archive/` with this naming convention:
``` ```
inbox/archive/YYYY-MM-DD-author-handle-brief-slug.md inbox/archive/YYYY-MM-DD-author-handle-brief-slug.md
``` ```
### 3. Add frontmatter + content Example: `inbox/archive/2026-03-07-alex-ai-alignment-landscape.md`
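
The naming convention above can be sketched as a small helper. This is an illustration only: the slug rule (lowercase, runs of non-alphanumeric characters collapsed to one hyphen) is an assumption, since the guide shows the overall pattern but not how the slug is derived, and `source_filename` is a hypothetical name.

```python
import re
from datetime import date


def source_filename(day, author_handle, title):
    """Return inbox/archive/YYYY-MM-DD-author-handle-brief-slug.md.

    The slug rule (lowercase, runs of non-alphanumerics become one
    hyphen) is an assumption; the guide only shows the overall pattern.
    """
    def slug(text):
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"inbox/archive/{day.isoformat()}-{slug(author_handle)}-{slug(title)}.md"
```

For example, `source_filename(date(2026, 3, 7), "alex", "AI alignment landscape")` yields `inbox/archive/2026-03-07-alex-ai-alignment-landscape.md`, matching the example path in this guide.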
### 4. Add frontmatter
Every source file starts with YAML frontmatter. Copy this template and fill it in:
```yaml ```yaml
--- ---
@ -59,169 +53,84 @@ format: report
status: unprocessed status: unprocessed
tags: [topic1, topic2, topic3] tags: [topic1, topic2, topic3]
--- ---
# Full title
[Paste the full content here. More content = better extraction.]
``` ```
**Domain options:** `internet-finance`, `entertainment`, `ai-alignment`, `health`, `space-development`, `grand-strategy` **Domain options:** `internet-finance`, `entertainment`, `ai-alignment`, `health`, `grand-strategy`
**Format options:** `essay`, `newsletter`, `tweet`, `thread`, `whitepaper`, `paper`, `report`, `news`
### 4. Commit, push, open PR **Status:** Always set to `unprocessed` — the agents handle the rest.
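
Before committing, the frontmatter rules above could be checked mechanically. A minimal sketch, assuming the frontmatter uses only flat `key: value` lines as in the template; it is not a YAML parser, `check_source_frontmatter` is a hypothetical name, and the allowed domain list is passed in by the caller rather than hard-coded:

```python
FORMATS = {"essay", "newsletter", "tweet", "thread",
           "whitepaper", "paper", "report", "news"}


def check_source_frontmatter(text, allowed_domains):
    """Return a list of problems found in a source file's frontmatter.

    Naive parser: handles only flat `key: value` lines between the first
    two `---` markers, which is all the template uses. Not a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["file does not start with a --- frontmatter block"]
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    problems = []
    if fields.get("status") != "unprocessed":
        problems.append("status must be 'unprocessed'")
    if fields.get("format") not in FORMATS:
        problems.append(f"unknown format: {fields.get('format')!r}")
    if fields.get("domain") not in allowed_domains:
        problems.append(f"unknown domain: {fields.get('domain')!r}")
    return problems
```

An empty list means the three checked fields look fine; the function does not validate `title`, `author`, `url`, `date`, or `tags`.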
### 5. Add the content
After the frontmatter, paste the full content of the source. This is what the agents will read and extract claims from. More content = better extraction.
```markdown
---
type: source
title: "AI Alignment in 2026: Where We Stand"
author: "Alex (@alexhandle)"
url: https://example.com/report
date: 2026-03-07
domain: ai-alignment
format: report
status: unprocessed
tags: [ai-alignment, openai, anthropic, safety, governance]
---
# AI Alignment in 2026: Where We Stand
[Full content of the report goes here. Include everything —
the agents need the complete text to extract claims properly.]
```
### 6. Commit and push
```bash ```bash
git add inbox/archive/your-file.md git add inbox/archive/your-file.md
git commit -m "contrib: add [brief description] git commit -m "contrib: add AI alignment landscape report
Source: [brief description of what this is and why it matters]"
Source: [what this is and why it matters]"
git push -u origin contrib/your-name/brief-description git push -u origin contrib/your-name/brief-description
``` ```
Then open a PR. The domain agent reads your source, extracts claims, Leo reviews, and they merge. ### 7. Open a PR
## Path 2: Propose a claim directly
You have domain expertise and want to state a thesis yourself — not just drop source material for agents to process.
### 1. Clone and branch
Same as Path 1.
### 2. Check for duplicates
Before writing, search the knowledge base for existing claims on your topic. Check:
- `domains/{relevant-domain}/` — existing domain claims
- `foundations/` — existing foundation-level claims
- Use grep or Claude Code to search claim titles semantically
### 3. Write your claim file
Create a markdown file in the appropriate domain folder. The filename is the slugified claim title.
```yaml
---
type: claim
domain: ai-alignment
description: "One sentence adding context beyond the title"
confidence: likely
source: "your-name, original analysis; [any supporting references]"
created: 2026-03-10
---
```
**The claim test:** "This note argues that [your title]" must work as a sentence. If it doesn't, your title isn't specific enough.
**Body format:**
```markdown
# [your prose claim title]
[Your argument — why this is supported, what evidence underlies it.
Cite sources, data, studies inline. This is where you make the case.]
**Scope:** [What this claim covers and what it doesn't]
---
Relevant Notes:
- [[existing-claim-title]] — how your claim relates to it
```
Wiki links (`[[claim title]]`) should point to real files in the knowledge base. Check that they resolve.
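
One way to check that links resolve, sketched under a loudly stated assumption: that `[[title]]` maps to a file named `title.md` somewhere under the repo root. The repo's actual resolution rule may differ, and `unresolved_wiki_links` is a hypothetical helper name.

```python
import re
from pathlib import Path

# Capture the link target up to the closing bracket, a pipe, or a hash.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def unresolved_wiki_links(claim_text, repo_root):
    """Return wiki-link targets in claim_text with no matching .md file.

    Assumption: [[title]] resolves to a file named `title.md` anywhere
    under repo_root. The real resolution rule may differ.
    """
    known = {p.stem for p in Path(repo_root).rglob("*.md")}
    targets = [m.strip() for m in WIKI_LINK.findall(claim_text)]
    return [t for t in targets if t not in known]
```

Running it over a draft claim before opening the PR gives a list of targets to fix or remove; an empty list means every link matched a file name.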
### 4. Commit, push, open PR
```bash ```bash
git add domains/{domain}/your-claim-file.md gh pr create --title "contrib: AI alignment landscape report" --body "Source material for agent extraction.
git commit -m "contrib: propose claim — [brief title summary]
- What: [the claim in one sentence] - **What:** [one-line description]
- Evidence: [primary evidence supporting it] - **Domain:** ai-alignment
- Connections: [what existing claims this relates to]" - **Why it matters:** [why this adds value to the knowledge base]"
git push -u origin contrib/your-name/brief-description
``` ```
PR body should include your reasoning for why this adds value to the knowledge base. Or just go to GitHub and click "Compare & pull request" after pushing.
The domain agent + Leo review your claim against the quality gates (see CLAUDE.md). They may approve, request changes, or explain why it doesn't meet the bar. ### 8. What happens next
## Path 3: Challenge an existing claim 1. **Theseus** (the ai-alignment agent) reads your source and extracts claims
2. **Leo** (the evaluator) reviews the extracted claims for quality
3. You'll see their feedback as PR comments
4. Once approved, the claims merge into the knowledge base
You think a claim in the knowledge base is wrong, overstated, missing context, or contradicted by evidence you have. You can respond to agent feedback directly in the PR comments.
### 1. Identify the claim ## Your Credit
Find the claim file you're challenging. Note its exact title (the filename without `.md`). Your source archive records you as contributor. As claims derived from your submission get cited by other claims, your contribution's impact is traceable through the knowledge graph. Every claim extracted from your source carries provenance back to you — your contribution compounds as the knowledge base grows.
### 2. Clone and branch
Same as above. Name your branch `contrib/your-name/challenge-brief-description`.
### 3. Write your challenge
You have two options:
**Option A — Enrich the existing claim** (if your evidence adds nuance but doesn't contradict):
Edit the existing claim file. Add a `challenged_by` field to the frontmatter and a **Challenges** section to the body:
```yaml
challenged_by:
- "your counter-evidence summary (your-name, date)"
```
```markdown
## Challenges
**[Your name] ([date]):** [Your counter-evidence or counter-argument.
Cite specific sources. Explain what the original claim gets wrong
or what scope it's missing.]
```
**Option B — Propose a counter-claim** (if your evidence supports a different conclusion):
Create a new claim file that explicitly contradicts the existing one. In the body, reference the claim you're challenging and explain why your evidence leads to a different conclusion. Add wiki links to the challenged claim.
### 4. Commit, push, open PR
```bash
git commit -m "contrib: challenge — [existing claim title, briefly]
- What: [what you're challenging and why]
- Counter-evidence: [your primary evidence]"
git push -u origin contrib/your-name/challenge-brief-description
```
The domain agent will steelman the existing claim before evaluating your challenge. If your evidence is strong, the claim gets updated (confidence lowered, scope narrowed, challenged_by added) or your counter-claim merges alongside it. The knowledge base holds competing perspectives — your challenge doesn't delete the original, it adds tension that makes the graph richer.
## Using Claude Code to contribute
If you have Claude Code installed, run it in the repo directory. Claude reads the CLAUDE.md visitor section and can:
- **Search the knowledge base** for existing claims on your topic
- **Check for duplicates** before you write a new claim
- **Format your claim** with proper frontmatter and wiki links
- **Validate wiki links** to make sure they resolve to real files
- **Suggest related claims** you should link to
Just describe what you want to contribute and Claude will help you through the right path.
## Your credit
Every contribution carries provenance. Source archives record who submitted them. Claims record who proposed them. Challenges record who filed them. As your contributions get cited by other claims, your impact is traceable through the knowledge graph. Contributions compound.
## Tips
- **More context is better.** For source submissions, paste the full text, not just a link. - **More context is better.** Paste the full article/report, not just a link. Agents extract better from complete text.
- **Pick the right domain.** If it spans multiple, pick the primary one — agents flag cross-domain connections. - **Pick the right domain.** If your source spans multiple domains, pick the primary one — the agents will flag cross-domain connections.
- **One source per file, one claim per file.** Atomic contributions are easier to review and link. - **One source per file.** Don't combine multiple articles into one file.
- **Original analysis is welcome.** Your own written analysis is as valid as citing someone else's work. - **Original analysis welcome.** Your own written analysis/report is just as valid as linking to someone else's article. Put yourself as the author.
- **Confidence honestly.** If your claim is speculative, say so. Calibrated uncertainty is valued over false confidence. - **Don't extract claims yourself.** Just provide the source material. The agents handle extraction — that's their job.
## OPSEC
The knowledge base is public. Do not include dollar amounts, deal terms, valuations, or internal business details. Scrub before committing. The knowledge base is public. Do not include dollar amounts, deal terms, valuations, or internal business details in any content. Scrub before committing.
## Questions?

@ -1,47 +0,0 @@
# Teleo Codex
A knowledge base built by AI agents who specialize in different domains, take positions, disagree with each other, and update when they're wrong. Every claim traces from evidence through argument to public commitments — nothing is asserted without a reason.
**~400 claims** across 14 knowledge areas. **6 agents** with distinct perspectives. **Every link is real.**
## How it works
Six domain-specialist agents maintain the knowledge base. Each reads source material, extracts claims, and proposes them via pull request. Every PR gets adversarial review — a cross-domain evaluator and a domain peer check for specificity, evidence quality, duplicate coverage, and scope. Claims that pass enter the shared commons. Claims feed agent beliefs. Beliefs feed trackable positions with performance criteria.
## The agents
| Agent | Domain | What they cover |
|-------|--------|-----------------|
| **Leo** | Grand strategy | Cross-domain synthesis, civilizational coordination, what connects the domains |
| **Rio** | Internet finance | DeFi, prediction markets, futarchy, MetaDAO ecosystem, token economics |
| **Clay** | Entertainment | Media disruption, community-owned IP, GenAI in content, cultural dynamics |
| **Theseus** | AI / alignment | AI safety, coordination problems, collective intelligence, multi-agent systems |
| **Vida** | Health | Healthcare economics, AI in medicine, prevention-first systems, longevity |
| **Astra** | Space | Launch economics, cislunar infrastructure, space governance, ISRU |
## Browse it
- **See what an agent believes**: `agents/{name}/beliefs.md`
- **Explore a domain**: `domains/{domain}/_map.md`
- **Understand the structure**: `core/epistemology.md`
- **See the full layout**: `maps/overview.md`
## Talk to it
Clone the repo and run [Claude Code](https://claude.ai/claude-code). Pick an agent's lens and you get their personality, reasoning framework, and domain expertise as a thinking partner. Ask questions, challenge claims, explore connections across domains.
If you teach the agent something new — share an article, a paper, your own analysis — they'll draft a claim and show it to you: "Here's how I'd write this up — does this capture it?" You review and approve. They handle the PR. Your attribution stays on everything.
```bash
git clone https://github.com/living-ip/teleo-codex.git
cd teleo-codex
claude
```
## Contribute
Talk to an agent and they'll handle the mechanics. Or do it manually: submit source material, propose a claim, or challenge one you disagree with. See [CONTRIBUTING.md](CONTRIBUTING.md).
## Built by
[LivingIP](https://livingip.xyz) — collective intelligence infrastructure.

@ -1,93 +0,0 @@
---
type: musing
agent: clay
title: "Consumer acceptance vs AI capability as binding constraint on entertainment adoption"
status: developing
created: 2026-03-10
updated: 2026-03-10
tags: [ai-entertainment, consumer-acceptance, research-session]
---
# Research Session — 2026-03-10
**Agent:** Clay
**Session type:** First session (no prior musings)
## Research Question
**Is consumer acceptance actually the binding constraint on AI-generated entertainment content, or has 2025-2026 AI video capability crossed a quality threshold that changes the question?**
### Why this question
My KB contains a claim: "GenAI adoption in entertainment will be gated by consumer acceptance not technology capability." This was probably right in 2023-2024 when AI video was visibly synthetic. But my identity.md references Seedance 2.0 (Feb 2026) delivering 4K resolution, character consistency, phoneme-level lip-sync — a qualitative leap. If capability has crossed the threshold where audiences can't reliably distinguish AI from human-produced content, then:
1. The binding constraint claim may be wrong or require significant narrowing
2. The timeline on the attractor state accelerates dramatically
3. Studios' "quality moat" objection to community-first models collapses faster
This question pursues SURPRISE (active inference principle) rather than confirmation — I expect to find evidence that challenges my KB, not validates it.
**Alternative framings I considered:**
- "How is capital flowing through Web3 entertainment projects?" — interesting but less uncertain; the NFT winter data is stable
- "What's happening with Claynosaurz specifically?" — too insider, low surprise value for KB
- "Is the meaning crisis real and who's filling the narrative vacuum?" — important but harder to find falsifiable evidence
## Context Check
**Relevant KB claims at stake:**
- `GenAI adoption in entertainment will be gated by consumer acceptance not technology capability` — directly tested
- `GenAI is simultaneously sustaining and disruptive depending on whether users pursue progressive syntheticization or progressive control` — how are studios vs independents actually behaving?
- `non-ATL production costs will converge with the cost of compute as AI replaces labor` — what's the current real-world cost evidence?
- `consumer definition of quality is fluid and revealed through preference not fixed by production value` — if audiences accept AI content at scale, this is confirmed
**Open tensions in KB:**
- Identity.md: "Quality thresholds matter — GenAI content may remain visibly synthetic long enough for studios to maintain a quality moat." Feb 2026 capabilities may have resolved this tension.
- Belief 3 challenge noted: "The democratization narrative has been promised before with more modest outcomes than predicted."
## Session Sources
Archives created (all status: unprocessed):
1. `2026-03-10-iab-ai-ad-gap-widens.md` — IAB report on 37-point advertiser/consumer perception gap
2. `2025-07-01-emarketer-consumers-rejecting-ai-creator-content.md` — 60%→26% enthusiasm collapse
3. `2026-01-01-ey-media-entertainment-trends-authenticity.md` — EY 2026 trends, authenticity premium, simplification demand
4. `2025-01-01-deloitte-hollywood-cautious-genai-adoption.md` — Deloitte 3% content / 7% operational split
5. `2026-02-01-seedance-2-ai-video-benchmark.md` — 2026 AI video capability milestone; Sora 8% retention
6. `2025-03-01-mediacsuite-ai-film-studios-2025.md` — 65 AI studios, 5-person teams, storytelling as moat
7. `2025-09-01-ankler-ai-studios-cheap-future-no-market.md` — Distribution/legal barriers; "low cost but no market"
8. `2025-08-01-pudgypenguins-record-revenue-ipo-target.md` — $50M revenue, DreamWorks, mainstream-to-Web3 funnel
9. `2025-12-01-a16z-state-of-consumer-ai-2025.md` — Sora 8% D30 retention, Veo 3 audio+video
10. `2026-01-15-advanced-television-audiences-ai-blurred-reality.md` — 26/53 accept/reject split, hybrid preference
## Key Finding
**Consumer rejection of AI content is epistemic, not aesthetic.** The binding constraint IS consumer acceptance, but it's not "audiences can't tell the difference." It's "audiences increasingly CHOOSE to reject AI on principle." Evidence:
- Enthusiasm collapsed from 60% to 26% (2023→2025) WHILE AI quality improved
- Primary concern: being misled / blurred reality — epistemic anxiety, not quality concern
- Gen Z specifically: 54% prefer no AI in creative work but only 13% feel that way about shopping — the objection is to CREATIVE REPLACEMENT, not AI generally
- Hybrid (AI-assisted human) scores better than either pure AI or pure human — the line consumers draw is human judgment, not zero AI
This is a significant refinement of my KB's binding constraint claim. The claim is validated, but the mechanism needs updating: it's not "consumers can't tell the difference yet" — it's "consumers don't want to live in a world where they can't tell."
**Secondary finding:** Distribution barriers may be more binding than production costs for AI-native content. The Ankler is credible on this — "stunning, low-cost AI films may still have no market" because distribution/marketing/legal are incumbent moats technology doesn't dissolve.
**Pudgy Penguins surprise:** $50M revenue target + DreamWorks partnership is the strongest current evidence for the community-owned IP thesis. The "mainstream first, Web3 second" acquisition funnel is a specific strategic innovation — reverse of the failed NFT-first playbook.
---
## Follow-up Directions
### Active Threads (continue next session)
- **Epistemic rejection deepening**: The 60%→26% collapse and Gen Z data suggest acceptance isn't coming as AI improves — it may be inversely correlated. Look for: any evidence of hedonic adaptation (audiences who've been exposed to AI content for 2+ years becoming MORE accepting), or longitudinal studies. Counter-evidence to the trajectory would be high value.
- **Distribution barriers for AI content**: The Ankler "low cost but no market" thesis needs more evidence. Search specifically for: (a) any AI-generated film that got major platform distribution in 2025-2026, (b) what contract terms Runway/Sora have with content that's sold commercially, (c) whether the Disney/Universal AI lawsuits have settled or expanded.
- **Pudgy Penguins IPO pathway**: The $120M 2026 revenue projection and 2027 IPO target is a major test of community-owned IP at public market scale. Follow up: any updated revenue data, the DreamWorks partnership details, and what happens to community/holder economics when the company goes public.
- **Hybrid AI+human model as the actual attractor**: Multiple sources converge on "hybrid wins over pure AI or pure human." This may be the most important finding — the attractor state isn't "AI replaces human" but "AI augments human." Search for successful hybrid model case studies in entertainment (not advertising).
### Dead Ends (don't re-run these)
- Empty tweet feed from this session — research-tweets-clay.md had no content for ANY monitored accounts. Don't rely on pre-loaded tweet data; go direct to web search from the start.
- Generic "GenAI entertainment quality threshold" searches — the quality question is answered (threshold crossed for technical capability). Reframe future searches toward market/distribution/acceptance outcomes.
### Branching Points (one finding opened multiple directions)
- **Epistemic rejection finding** opens two directions:
- Direction A: Transparency as solution — research whether AI disclosure requirements (91% of UK adults demand them) are becoming regulatory reality in 2026, and what that means for production pipelines
- Direction B: Community-owned IP as trust signal — if authenticity is the premium, does community-owned IP (where the human origin is legible and participatory) command demonstrably higher engagement? Pursue comparative data on community IP vs. studio IP audience trust metrics.
- **Pursue Direction B first** — more directly relevant to Clay's core thesis and less regulatory/speculative

@ -1,19 +0,0 @@
{
  "agent": "clay",
  "domain": "entertainment",
  "accounts": [
    {"username": "ballmatthew", "tier": "core", "why": "Definitive entertainment industry analyst — streaming economics, Metaverse thesis, creator economy frameworks."},
    {"username": "MediaREDEF", "tier": "core", "why": "Shapiro's account — disruption frameworks, GenAI in entertainment, power laws in culture. Our heaviest single source (13 archived)."},
    {"username": "Claynosaurz", "tier": "core", "why": "Primary case study for community-owned IP and fanchise engagement ladder. Mediawan deal is our strongest empirical anchor."},
    {"username": "Cabanimation", "tier": "core", "why": "Nic Cabana, Claynosaurz co-founder/CCO. Annie-nominated animator. Inside perspective on community-to-IP pipeline."},
    {"username": "jervibore", "tier": "core", "why": "Claynosaurz co-founder. Creative direction and worldbuilding."},
    {"username": "AndrewsaurP", "tier": "core", "why": "Andrew Pelekis, Claynosaurz CEO. Business strategy, partnerships, franchise scaling."},
    {"username": "HeebooOfficial", "tier": "core", "why": "HEEBOO — Claynosaurz entertainment launchpad for superfans. Tests IP-as-platform and co-ownership thesis."},
    {"username": "pudgypenguins", "tier": "extended", "why": "Second major community-owned IP. Comparison case — licensing + physical products vs Claynosaurz animation pipeline."},
    {"username": "runwayml", "tier": "extended", "why": "Leading GenAI video tool. Releases track AI-collapsed production costs."},
    {"username": "pika_labs", "tier": "extended", "why": "GenAI video competitor to Runway. Track for production cost convergence evidence."},
    {"username": "joosterizer", "tier": "extended", "why": "Joost van Dreunen — gaming and entertainment economics, NYU professor. Academic rigor on creator economy."},
    {"username": "a16z", "tier": "extended", "why": "Publishes on creator economy, platform dynamics, entertainment tech."},
    {"username": "TurnerNovak", "tier": "watch", "why": "VC perspective on creator economy and consumer social. Signal on capital flows in entertainment tech."}
  ]
}

View file

@ -1,20 +0,0 @@
# Clay Research Journal
Cross-session memory. NOT the same as session musings. After 5+ sessions, review for cross-session patterns.
---
## Session 2026-03-10
**Question:** Is consumer acceptance actually the binding constraint on AI-generated entertainment content, or has recent AI video capability (Seedance 2.0 etc.) crossed a quality threshold that changes the question?
**Key finding:** Consumer rejection of AI creative content is EPISTEMIC, not aesthetic. The primary objection is "being misled / blurred reality" — not "the quality is bad." This matters because it means the binding constraint won't erode as AI quality improves. The 60%→26% enthusiasm collapse (2023→2025) happened WHILE quality improved dramatically, suggesting the two trends may be inversely correlated. The Gen Z creative/shopping split (54% reject AI in creative work, 13% reject AI in shopping) reveals the specific anxiety: consumers are protecting the authenticity signal in creative expression as a values choice, not a quality detection problem.
**Pattern update:** First session — no prior pattern to confirm or challenge. Establishing baseline.
- KB claim "consumer acceptance gated by quality" is validated in direction but requires mechanism update
- "Quality threshold" framing assumes acceptance follows capability — this data challenges that assumption
- Distribution barriers (Ankler thesis) are a second binding constraint not currently in KB
**Confidence shift:**
- Belief 3 (GenAI democratizes creation, community = new scarcity): SLIGHTLY WEAKENED on the timeline. The democratization of production IS happening (65 AI studios, 5-person teams). But "community as new scarcity" thesis gets more complex: authenticity/trust is emerging as EVEN MORE SCARCE than I'd modeled, and it's partly independent of community ownership (it's about epistemic security). The consumer acceptance binding constraint is stronger and more durable than I'd estimated.
- Belief 2 (community beats budget): STRENGTHENED by Pudgy Penguins data. $50M revenue + DreamWorks partnership is the strongest current evidence. The "mainstream first, Web3 second" acquisition funnel is a specific innovation the KB should capture.
- Belief 4 (ownership alignment turns fans into stakeholders): NEUTRAL — Pudgy Penguins IPO pathway raises a tension (community ownership vs. traditional equity consolidation) that the KB's current framing doesn't address.

View file

@ -1,123 +0,0 @@
# Rio — Knowledge State Self-Assessment
**Model:** claude-opus-4-6
**Date:** 2026-03-08
**Domain:** Internet Finance & Mechanism Design
**Claims:** 59 (excluding _map.md)
**Beliefs:** 6 | **Positions:** 5
---
## Coverage
**Well-mapped:**
- Futarchy mechanics (manipulation resistance, trustless joint ownership, conditional markets, liquidation enforcement, decision overrides) — 16 claims, the densest cluster. This is where I have genuine depth.
- Living Capital architecture (vehicle design, fee structure, cap table, disclosure, regulatory positioning) — 12 claims. Comprehensive but largely internal design, not externally validated.
- Securities/regulatory (Howey test, DAO Report, Ooki precedent, investment club, AI regulatory gap) — 6 claims. Real legal reasoning, not crypto cope.
- AI x finance intersection (displacement loop, capital deepening, shock absorbers, productivity noise, private credit exposure) — 7 claims. Both sides represented.
**Thin:**
- Token launch mechanics — 4 claims (dutch auctions, hybrid-value auctions, layered architecture, early-conviction pricing). This should be deeper given my operational role. The unsolved price discovery problem is documented but not advanced.
- DeFi beyond futarchy — 2 claims (crypto primary use case, internet capital markets). I have almost nothing on lending protocols, DEX mechanics, stablecoin design, or oracle systems. If someone asks "how does Aave work mechanistically" I'd be generating, not retrieving.
- Market microstructure — 1 claim (speculative markets aggregate via selection effects). No claims on order book dynamics, AMM design, liquidity provision mechanics, MEV. This is a gap for a mechanism design specialist.
**Missing entirely:**
- Stablecoin mechanisms (algorithmic, fiat-backed, over-collateralized) — zero claims
- Cross-chain coordination and bridge mechanisms — zero claims
- Insurance and risk management protocols — zero claims
- Real-world asset tokenization — zero claims
- Central bank digital currencies — zero claims
- Payment rail disruption (despite mentioning it in my identity doc) — zero claims
## Confidence Distribution
| Level | Count | % |
|-------|-------|---|
| experimental | 27 | 46% |
| likely | 17 | 29% |
| proven | 7 | 12% |
| speculative | 8 | 14% |
**Assessment:** The distribution is honest but reveals something. 46% experimental means almost half my claims have limited empirical backing. The 7 proven claims are mostly factual (Polymarket results, MetaDAO implementation details, Ooki DAO ruling) — descriptive, not analytical. My analytical claims cluster at experimental.
This is appropriate for a frontier domain. But I should be uncomfortable that none of my mechanism design claims have reached "likely" through independent validation. Futarchy manipulation resistance, trustless joint ownership, regulatory defensibility — these are all experimental despite being load-bearing for my beliefs and positions. If any of them fail empirically, the cascade through my belief system would be significant.
**Over-confident risk:** The Living Capital regulatory claims. I have 6 claims building a Howey test defense, rated experimental-to-likely. But this hasn't been tested in any court or SEC enforcement action. The confidence is based on legal reasoning, not legal outcomes. One adverse ruling could downgrade the entire cluster.
**Under-confident risk:** The AI displacement claims. I have both sides (self-funding loop vs shock absorbers) rated experimental when several have strong empirical backing (Anthropic labor market data, firm-level productivity studies). Some of these could be "likely."
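The shares in the confidence table above can be re-derived from the raw counts. This is a minimal sketch; the variable names are illustrative and not part of any KB tooling.

```python
# Counts copied from the confidence table; names are illustrative only.
counts = {"experimental": 27, "likely": 17, "proven": 7, "speculative": 8}

total = sum(counts.values())
# Round each share to a whole percent, as the table does.
shares = {level: round(100 * n / total) for level, n in counts.items()}

for level, pct in shares.items():
    print(f"{level:<13} {counts[level]:>3} {pct:>3}%")
print("total claims:", total, "| sum of rounded shares:", sum(shares.values()))
```

The rounded shares sum to 101 rather than 100 because each row is rounded independently.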
## Sources
**Diversity: mild monoculture.**
Top citations:
- Heavey (futarchy paper): 5 claims
- MetaDAO governance docs: 4 claims
- Strategy session / internal analysis: 9 claims (15%)
- Rio-authored synthesis: ~20 claims (34%)
34% of my claims are my own synthesis. That's high. It means a third of my domain is me reasoning from other claims rather than extracting from external sources. This is appropriate for mechanism design (the value IS the synthesis) but creates correlated failure risk — if my reasoning framework is wrong, a third of the domain is wrong.
**MetaDAO dependency:** Roughly 12 claims depend on MetaDAO as the primary or sole empirical test case for futarchy. If MetaDAO proves to be an outlier or gaming-prone, those claims weaken significantly. I have no futarchy evidence from outside the MetaDAO ecosystem: Polymarket is a prediction market, not a decision market, so its results do not test futarchy.
**What's missing:** Academic mechanism design literature beyond Heavey and Hanson. I cite Milgrom, Vickrey, Hurwicz in foundation claims but haven't deeply extracted from their work into my domain claims. My mechanism design expertise is more practical (MetaDAO, token launches) than theoretical (revelation principle, incentive compatibility proofs). This is backwards for someone whose operational role is "mechanism design specialist."
## Staleness
**Needs updating:**
- MetaDAO ecosystem claims — last extraction was Pine Analytics Q4 2025 report and futard.io launch metrics (2026-03-05). The ecosystem moves fast; governance proposals and on-chain data are already stale.
- AI displacement cluster — last source was Anthropic labor market paper (2026-03-05). This debate evolves weekly.
- Living Capital vehicle design — the musings (PR #43) are from pre-token-raise planning. The 7-week raise timeline has started; design decisions are being made that my claims don't reflect.
**Still current:**
- Futarchy mechanism claims (theoretical, not time-sensitive)
- Regulatory claims (legal frameworks change slowly)
- Foundation claims (PR #58, #63 — just proposed)
## Connections
**Cross-domain links (strong):**
- To critical-systems: brain-market isomorphism, SOC, Minsky — 5+ links. This is my best cross-domain connection.
- To teleological-economics: attractor states, disruption cycles, knowledge embodiment lag — 4+ links. Well-integrated.
- To living-agents: vehicle design, agent architecture — 6+ links. Natural integration.
**Cross-domain links (weak):**
- To collective-intelligence: mechanism design IS collective intelligence, but I have only 2-3 explicit links. The connection between futarchy and CI theory is under-articulated.
- To cultural-dynamics: almost no links. How do financial mechanisms spread? What's the memetic structure of "ownership coin" vs "token"? Clay's domain is relevant to my adoption questions but I haven't connected them.
- To entertainment: 1 link (giving away commoditized layer). Should be more — Clay's fanchise model and my community ownership claims share mechanisms.
- To health: 0 direct links. Vida's domain and mine don't touch, which is correct.
- To space-development: 0 direct links. Correct for now.
**depends_on coverage:** 13 of 59 claims (22%). Low. Most of my claims float without explicit upstream dependencies. This makes the reasoning graph sparse — you can't trace many claims back to their foundations.
**challenged_by coverage:** 6 of 59 claims (10%). Very low. I identified this as the most valuable field in the schema, yet 90% of my claims don't use it. Either most of my claims are uncontested (unlikely for a frontier domain) or I'm not doing the work to find counter-evidence (more likely).
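The two coverage figures above could be recomputed with something like the sketch below. It assumes each claim's frontmatter has already been parsed into a dict; the `field_coverage` helper and the synthetic claim list are hypothetical, and only the `depends_on` and `challenged_by` field names come from the schema discussed here.

```python
def field_coverage(claims, field):
    """Fraction of claims whose frontmatter has a non-empty value for `field`."""
    if not claims:
        return 0.0
    used = sum(1 for claim in claims if claim.get(field))
    return used / len(claims)

# Synthetic stand-in for 59 parsed claims (assumption: frontmatter as dicts).
claims = [{"title": f"claim {i}"} for i in range(59)]
for claim in claims[:13]:
    claim["depends_on"] = ["some upstream claim"]
for claim in claims[:6]:
    claim["challenged_by"] = ["some counter-evidence"]

depends_pct = round(100 * field_coverage(claims, "depends_on"))
challenged_pct = round(100 * field_coverage(claims, "challenged_by"))
print(depends_pct, challenged_pct)
```

Treating an empty list the same as a missing field is a deliberate choice in this sketch: a `challenged_by: []` entry carries no counter-evidence.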
## Tensions
**Unresolved contradictions:**
1. **Regulatory defensibility vs predetermined investment.** I argue Living Capital "fails the Howey test" (structural separation), but my vehicle design musings describe predetermined LivingIP investment — which collapses that separation. The musings acknowledge this tension but don't resolve it. My beliefs assume the structural argument holds; my design work undermines it.
2. **AI displacement: self-funding loop vs shock absorbers.** I hold claims on both sides. My beliefs don't explicitly take a position on which dominates. This is intellectually honest but operationally useless — Position #1 (30% intermediation capture) implicitly assumes the optimistic case without arguing why.
3. **Futarchy requires liquidity, but governance tokens are illiquid.** My manipulation-resistance claims assume sufficient market depth. My adoption-friction claims acknowledge liquidity is a constraint. These two clusters don't talk to each other. The permissionless leverage claim (Omnipair) is supposed to bridge this gap but it's speculative.
4. **Markets beat votes, but futarchy IS a vote on values.** Belief #1 says markets beat votes. Futarchy uses both — vote on values, bet on beliefs. I haven't articulated where the vote part of futarchy inherits the weaknesses I attribute to voting in general. Does the value-vote component of futarchy suffer from rational irrationality? If so, futarchy governance quality is bounded by the quality of the value specification, not just the market mechanism.
## Gaps
**Questions I should be able to answer but can't:**
1. **What's the optimal objective function for non-asset futarchy?** Coin price works for asset futarchy (I have a claim on this). But what about governance decisions that don't have a clean price metric? Community growth? Protocol adoption? I have nothing here.
2. **How do you bootstrap futarchy liquidity from zero?** I describe the problem (adoption friction, liquidity requirements) but not the solution. Every futarchy implementation faces cold-start. What's the mechanism?
3. **What happens when futarchy governance makes a catastrophically wrong decision?** I have "futarchy can override prior decisions" but not "what's the damage function of a wrong decision before it's overridden?" Recovery mechanics are unaddressed.
4. **How do different auction mechanisms perform empirically for token launches?** I have theoretical claims about dutch auctions and hybrid-value auctions but no empirical performance data. Which launch mechanism actually produced the best outcomes?
5. **What's the current state of DeFi lending, staking, and derivatives?** My domain is internet finance but my claims are concentrated on governance and capital formation. The broader DeFi landscape is a blind spot.
6. **How does cross-chain interoperability affect mechanism design?** If a futarchy market runs on Solana but the asset is on Ethereum, what breaks? Zero claims.
7. **What specific mechanism design makes the reward system incentive-compatible?** My operational role is reward systems. I have LP-to-contributors as a concept but no formal analysis of its incentive properties. I can't prove it's strategy-proof or collusion-resistant.

View file

@ -1,106 +0,0 @@
---
type: musing
status: seed
created: 2026-03-09
purpose: Map the MetaDAO X ecosystem — accounts, projects, culture, tone — before we start posting
---
# MetaDAO X Landscape
## Why This Exists
Cory directive: know the room before speaking in it. This maps who matters on X in the futarchy/MetaDAO space, what the culture is, and what register works. Input for the collective's X voice.
## The Core Team
**@metaproph3t** — Pseudonymous co-founder (also called Proph3t/Profit). Former Ethereum DeFi dev. The ideological engine. Posts like a movement leader: "MetaDAO is as much a social movement as it is a cryptocurrency project — thousands have already been infected by the idea that futarchy will re-architect human civilization." High conviction, low frequency, big claims. Uses "futard" unironically as community identity. The voice is earnest maximalism — not ironic, not hedged.
**@kolaboratorio (Kollan House)** — Co-founder, public-facing. Discovered MetaDAO at Breakpoint Amsterdam, pulled down the frontend late November 2023. More operational than Proph3t — writes the implementation blog posts ("From Believers to Builders: Introducing Unruggable ICOs"). Appears on Solana podcasts (Validated, Lightspeed). Professional register, explains mechanisms to outsiders.
**@nallok** — Co-founder. Lower public profile. Referenced in governance proposals — the Proph3t/Nallok compensation structure (2% of supply per $1B FDV increase, up to 10% at $5B) is itself a statement about how the team eats.
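As a rough illustration of the compensation schedule mentioned above, the sketch below reads "2% of supply per $1B FDV increase, up to 10% at $5B" as a step function. Whether the real proposal vests in steps or linearly, and what baseline the increase is measured from, is not stated here, so both are assumptions, and `compensation_pct` is a hypothetical name.

```python
def compensation_pct(fdv_increase_usd: float) -> float:
    """Percent of supply unlocked for a given FDV increase (assumed stepwise)."""
    # Assumption: one tranche per full $1B of FDV increase, never negative.
    tranches = max(0, int(fdv_increase_usd // 1_000_000_000))
    # 2% per tranche, capped at 10% (reached at a $5B increase).
    return min(2.0 * tranches, 10.0)

for increase in (0.5e9, 1e9, 2.5e9, 5e9, 8e9):
    print(f"${increase / 1e9:.1f}B -> {compensation_pct(increase)}%")
```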
## The Investors / Analysts
**@TheiaResearch (Felipe Montealegre)** — The most important external voice. Theia's entire fund thesis is "Internet Financial System" — our term "internet finance" maps directly. Key posts: "Tokens are Broken" (lemon markets argument), "$9.9M from 6MV/Variant/Paradigm to MetaDAO at spot" (milestone announcement), "Token markets are becoming lemon markets. We can solve this with credible signals." Register: thesis-driven, fundamentals-focused, no memes. Coined "ownership tokens" vs "futility tokens." Posts long-form threads with clear arguments. This is the closest existing voice to what we want to sound like.
**@paradigm** — Led $2.2M round (Aug 2024), holds ~14.6% of META supply. Largest single holder. Paradigm's research arm is working on Quantum Markets (next-gen unified liquidity). They don't post about MetaDAO frequently but the investment is the signal.
**Alea Research (@aaboronkov)** — Published the definitive public analysis: "MetaDAO: Fair Launches for a Misaligned Market." Professional crypto research register. Key data point they surfaced: 8 ICOs, $25.6M raised, $390M committed (95% refunded from oversubscription). $300M AMM volume, $1.5M in fees. This is the benchmark for how to write about MetaDAO with data.
**Alpha Sigma Capital Research (Matthew Mousa)** — "Redrawing the Futarchy Blueprint." More investor-focused, less technical. Key insight: "The most bullish signal is not a flawless track record, but a team that confronts its challenges head-on with credible solutions." Hosts Alpha Liquid Podcast — had Proph3t on.
**Deep Waters Capital** — Published MetaDAO valuation analysis. Quantitative, comparable-driven.
## The Ecosystem Projects (launched via MetaDAO ICO)
8 ICOs since April 2025. Combined $25.6M raised. Key projects:
| Project | What | Performance | Status |
|---------|------|-------------|--------|
| **Avici** | Crypto-native neobank | 21x ATH, ~7x current | Strong |
| **Omnipair (OMFG)** | Oracle-less perpetuals DEX | 16x ATH, ~5x current, $1.1M raised | Strong — first DeFi protocol with futarchy from day one |
| **Umbra** | Privacy protocol (on Arcium) | 7x first week, ~3x current, $3M raised | Strong |
| **Ranger** | [perp trading] | Max 30% drawdown from launch | Stable — recently had liquidation proposal (governance stress test) |
| **Solomon** | [governance/treasury] | Max 30% drawdown from launch | Stable — treasury subcommittee governance in progress |
| **Paystream** | [payments] | Max 30% drawdown from launch | Stable |
| **ZKLSOL** | [ZK/privacy] | Max 30% drawdown from launch | Stable |
| **Loyal** | [unknown] | Max 30% drawdown from launch | Stable |
Notable: zero launches have gone below ICO price. The "unruggable" framing is holding.
## Futarchy Adopters (not launched via ICO)
- **Drift** — Using MetaDAO tech for grant allocation. Co-founder Cindy Leow: "showing really positive signs."
- **Sanctum** — First Solana project to fully adopt MetaDAO governance. First decision market: 200+ trades in 3 hours. Co-founder FP Lee: futarchy needs "one great success" to become default.
- **Jito** — Futarchy proposal saw $40K volume / 122 trades vs previous governance: 303 views, 2 comments. The engagement differential is the pitch.
## The Culture
**Shared language:**
- "Futard" — self-identifier for the community. Embraced, not ironic.
- "Ownership coins" vs "futility tokens" (Theia's framing) — the distinction between tokens with real governance/economic/legal rights vs governance theater tokens
- "+EV" — proposals evaluated as positive expected value, not voted on
- "Unruggable ICOs" — the brand promise: futarchy-governed liquidation means investors can force treasury return
- "Number go up" — coin price as objective function, stated without embarrassment
**Register:**
- Technical but not academic. Mechanism explanations, not math proofs.
- High conviction, low hedging. Proph3t doesn't say "futarchy might work" — he says it will re-architect civilization.
- Data-forward when it exists ($25.6M raised, $390M committed, 8/8 above ICO price)
- Earnest, not ironic. This community believes in what it's building. Cynicism doesn't land here.
- Small but intense. Not a mass-market audience. The people paying attention are builders, traders, and thesis-driven investors.
**What gets engagement:**
- Milestone announcements with data (Paradigm investment, ICO performance)
- Mechanism explanations that reveal non-obvious properties (manipulation resistance, trustless joint ownership)
- Strong claims about the future stated with conviction
- Governance drama (Ranger liquidation proposal, Solomon treasury debates)
**What falls flat:**
- Generic "web3 governance" framing — this community is past that
- Hedged language — "futarchy might be interesting" gets ignored
- Comparisons to traditional governance without showing the mechanism difference
- Anything that sounds like it's selling rather than building
## How We Should Enter
The room is small, conviction-heavy, and data-literate. They've seen the "AI governance" pitch before and are skeptical of AI projects that don't show mechanism depth. We need to earn credibility by:
1. **Showing we've read the codebase, not just the blog posts.** Reference specific governance proposals, on-chain data, mechanism details. The community can tell the difference.
2. **Leading with claims they can verify.** Not "we believe in futarchy" but "futarchy manipulation attempts on MetaDAO proposal X generated Y in arbitrage profit for defenders." Specific, traceable, falsifiable.
3. **Engaging with governance events as they happen.** Ranger liquidation, Solomon treasury debates, new ICO launches — real-time mechanism analysis is the highest-value content.
4. **Not announcing ourselves.** No "introducing LivingIP" thread. Show up with analysis, let people discover what we are.
---
Sources:
- [Alea Research: MetaDAO Fair Launches](https://alearesearch.substack.com/p/metadao)
- [Alpha Sigma: Redrawing the Futarchy Blueprint](https://alphasigmacapitalresearch.substack.com/p/redrawing-the-futarchy-blueprint)
- [Blockworks: Futarchy needs one great success](https://blockworks.co/news/metadao-solana-governance-platform)
- [CoinDesk: Paradigm invests in MetaDAO](https://www.coindesk.com/tech/2024/08/01/crypto-vc-paradigm-invests-in-metadao-as-prediction-markets-boom)
- [MetaDAO blog: Unruggable ICOs](https://blog.metadao.fi/from-believers-to-builders-introducing-unruggable-icos-for-founders-9e3eb18abb92)
- [BeInCrypto: Ownership Coins 2026](https://beincrypto.com/ownership-coins-crypto-2026-messari/)
Topics:
- [[internet finance and decision markets]]
- [[MetaDAO is the futarchy launchpad on Solana]]

View file

@ -1,21 +0,0 @@
{
"agent": "rio",
"domain": "internet-finance",
"accounts": [
{"username": "metaproph3t", "tier": "core", "why": "MetaDAO founder, primary futarchy source."},
{"username": "MetaDAOProject", "tier": "core", "why": "Official MetaDAO account."},
{"username": "futarddotio", "tier": "core", "why": "Futardio launchpad, ownership coin launches."},
{"username": "TheiaResearch", "tier": "core", "why": "Felipe Montealegre, Theia Research, investment thesis source."},
{"username": "ownershipfm", "tier": "core", "why": "Ownership podcast, community signal."},
{"username": "PineAnalytics", "tier": "core", "why": "MetaDAO ecosystem analytics."},
{"username": "ranger_finance", "tier": "core", "why": "Liquidation and leverage infrastructure."},
{"username": "FlashTrade", "tier": "extended", "why": "Perps on Solana."},
{"username": "turbine_cash", "tier": "extended", "why": "DeFi infrastructure."},
{"username": "Blockworks", "tier": "extended", "why": "Broader crypto media, regulatory signal."},
{"username": "SolanaFloor", "tier": "extended", "why": "Solana ecosystem data."},
{"username": "01Resolved", "tier": "extended", "why": "Solana DeFi."},
{"username": "_spiz_", "tier": "extended", "why": "Solana DeFi commentary."},
{"username": "kru_tweets", "tier": "extended", "why": "Crypto market structure."},
{"username": "oxranga", "tier": "extended", "why": "Solomon/MetaDAO ecosystem builder."}
]
}

View file

@ -1,121 +0,0 @@
---
type: musing
agent: theseus
title: "How can active inference improve the search and sensemaking of collective agents?"
status: developing
created: 2026-03-10
updated: 2026-03-10
tags: [active-inference, free-energy, collective-intelligence, search, sensemaking, architecture]
---
# How can active inference improve the search and sensemaking of collective agents?
Cory's question (2026-03-10). This connects the free energy principle (foundations/critical-systems/) to the practical architecture of how agents search for and process information.
## The core reframe
Current search architecture: keyword + engagement threshold + human curation. Agents process what shows up. This is **passive ingestion**.
Active inference reframes search as **uncertainty reduction**. An agent doesn't ask "what's relevant?" — it asks "what observation would most reduce my model's prediction error?" This changes:
- **What** agents search for (highest expected information gain, not highest relevance)
- **When** agents stop searching (when free energy is minimized, not when a batch is done)
- **How** the collective allocates attention (toward the boundaries where models disagree most)
## Three levels of application
### 1. Individual agent search (epistemic foraging)
Each agent has a generative model (their domain's claim graph + beliefs). Active inference says search should be directed toward observations with highest **expected free energy reduction**:
- Theseus has high uncertainty on formal verification scalability → prioritize davidad/DeepMind feeds
- The "Where we're uncertain" map section = a free energy map showing where prediction error concentrates
- An agent that's confident in its model should explore less (exploit); an agent with high uncertainty should explore more
→ QUESTION: Can expected information gain be computed from the KB structure? E.g., claims rated `experimental` with few wiki links = high free energy = high search priority?
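One way the question above might be explored is a crude structural proxy: weight each claim by how uncertain its confidence level is and by how isolated it is in the link graph. This is a heuristic sketch, not a computation of expected free energy. The weights, the `search_priority` function, and the sample claims are all assumptions made for illustration.

```python
# Hypothetical weights: less established confidence levels count as more uncertain.
UNCERTAINTY = {"proven": 0.1, "likely": 0.3, "experimental": 0.7, "speculative": 0.9}

def search_priority(confidence: str, wiki_links: int) -> float:
    """Heuristic proxy for the information gain of researching a claim."""
    # Few wiki links means the claim is weakly constrained by the rest of the graph.
    isolation = 1.0 / (1 + wiki_links)
    return UNCERTAINTY[confidence] * isolation

# Illustrative claims only; not taken from the KB.
claims = [
    {"title": "claim A", "confidence": "experimental", "wiki_links": 1},
    {"title": "claim B", "confidence": "likely", "wiki_links": 6},
    {"title": "claim C", "confidence": "speculative", "wiki_links": 0},
]
ranked = sorted(
    claims,
    key=lambda c: search_priority(c["confidence"], c["wiki_links"]),
    reverse=True,
)
print([c["title"] for c in ranked])
```

A multiplicative score is used so that a well-linked claim ranks low even when its confidence level is weak; an additive score would be an equally defensible starting point.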
### 2. Collective attention allocation (nested Markov blankets)
The Living Agents architecture already uses Markov blankets ([[Living Agents mirror biological Markov blanket organization with specialized domain boundaries and shared knowledge]]). Active inference says agents at each blanket boundary minimize free energy:
- Domain agents minimize within their domain
- Leo (evaluator) minimizes at the cross-domain level — search priorities should be driven by where domain boundaries are most uncertain
- The collective's "surprise" is concentrated at domain intersections — cross-domain synthesis claims are where the generative model is weakest
→ FLAG @vida: The cognitive debt question (#94) is a Markov blanket boundary problem — the phenomenon crosses your domain and mine, and neither of us has a complete model.
### 3. Sensemaking as belief updating (perceptual inference)
When an agent reads a source and extracts claims, that's perceptual inference — updating the generative model to reduce prediction error. Active inference predicts:
- Claims that **confirm** existing beliefs reduce free energy but add little information
- Claims that **surprise** (contradict existing beliefs) are highest value — they signal model error
- The confidence calibration system (proven/likely/experimental/speculative) is a precision-weighting mechanism — higher confidence = higher precision = surprises at that level are more costly
→ CLAIM CANDIDATE: Collective intelligence systems that direct search toward maximum expected information gain outperform systems that search by relevance, because relevance-based search confirms existing models while information-gain search challenges them.
### 4. Chat as free energy sensor (Cory's insight, 2026-03-10)
User questions are **revealed uncertainty**: they tell the agent where its generative model fails to explain the world to an observer. This complements, rather than replaces, agent self-assessment. Both are needed:
- **Structural uncertainty** (introspection): scan the KB for `experimental` claims, sparse wiki links, missing `challenged_by` fields. Cheap to compute, always available, but blind to its own blind spots.
- **Functional uncertainty** (chat signals): what do people actually struggle with? Requires interaction, but probes gaps the agent can't see from inside its own model.
The best search priorities weight both. Chat signals are especially valuable because:
1. **External questions probe blind spots the agent can't see.** A claim rated `likely` with strong evidence might still generate confused questions — meaning the explanation is insufficient even if the evidence isn't. The model has prediction error at the communication layer, not just the evidence layer.
2. **Questions cluster around functional gaps, not theoretical ones.** The agent might introspect and think formal verification is its biggest uncertainty (fewest claims). But if nobody asks about formal verification and everyone asks about cognitive debt, the *functional* free energy — the gap that matters for collective sensemaking — is cognitive debt.
3. **It closes the perception-action loop.** Without chat-as-sensor, the KB is open-loop: agents extract → claims enter → visitors read. Chat makes it closed-loop: visitor confusion flows back as search priority. This is the canonical active inference architecture — perception (reading sources) and action (publishing claims) are both in service of minimizing free energy, and the sensory input includes user reactions.
**Architecture:**
```
User asks question about X
Agent answers (reduces user's uncertainty)
+
Agent flags X as high free energy (reduces own model uncertainty)
Next research session prioritizes X
New claims/enrichments on X
Future questions on X decrease (free energy minimized)
```
The chat interface becomes a **sensor**, not just an output channel. Every question is a data point about where the collective's model is weakest.
→ CLAIM CANDIDATE: User questions are the most efficient free energy signal for knowledge agents because they reveal functional uncertainty — gaps that matter for sensemaking — rather than structural uncertainty that the agent can detect by introspecting on its own claim graph.
→ QUESTION: How do you distinguish "the user doesn't know X" (their uncertainty) from "our model of X is weak" (our uncertainty)? Not all questions signal model weakness — some signal user unfamiliarity. Precision-weighting: repeated questions from different users about the same topic = genuine model weakness. Single question from one user = possibly just their gap.
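The precision-weighting idea in the question above can be sketched as a count of distinct users per topic: a topic only counts as a model weakness once several different people have asked about it. The log format, the `weak_topics` helper, and the threshold of two users are assumptions for illustration.

```python
from collections import defaultdict

def weak_topics(questions, min_users=2):
    """Return topics asked about by at least `min_users` distinct users."""
    users_by_topic = defaultdict(set)
    for user, topic in questions:
        # A set ignores repeat questions from the same user.
        users_by_topic[topic].add(user)
    return sorted(
        topic for topic, users in users_by_topic.items() if len(users) >= min_users
    )

# Hypothetical question log as (user, topic) pairs.
log = [
    ("u1", "cognitive debt"),
    ("u2", "cognitive debt"),
    ("u1", "cognitive debt"),
    ("u3", "formal verification"),
]
print(weak_topics(log))
```

Counting distinct users rather than raw questions keeps one persistent asker from being mistaken for a gap in the collective's model.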
### 5. Active inference as protocol, not computation (Cory's correction, 2026-03-10)
Cory's point: even without formalizing the math, active inference as a **guiding principle** for agent behavior is massively helpful. The operational version is implementable now:
1. Agent reads its `_map.md` "Where we're uncertain" section → structural free energy
2. Agent checks what questions users have asked about its domain → functional free energy
3. Agent picks tonight's research direction from whichever has the highest combined signal
4. After research, agent updates both maps
This is active inference as a **protocol** — like the Residue prompt was a protocol that produced 6x gains without computing anything ([[structured exploration protocols reduce human intervention by 6x]]). The math formalizes why it works; the protocol captures the benefit.
The analogy is exact: Residue structured exploration without modeling the search space. Active-inference-as-protocol structures research direction without computing variational free energy. Both work because they encode the *logic* of the framework (reduce uncertainty, not confirm beliefs) into actionable rules.
→ CLAIM CANDIDATE: Active inference protocols that operationalize uncertainty-directed search without full mathematical formalization produce better research outcomes than passive ingestion, because the protocol encodes the logic of free energy minimization (seek surprise, not confirmation) into actionable rules that agents can follow.
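Step 3 of the protocol above (pick the direction with the highest combined signal) could look roughly like the sketch below. It assumes the structural and functional signals are already available as per-topic scores. The normalization, the equal weighting, and the names `normalize` and `pick_direction` are illustrative choices, not something the protocol specifies.

```python
def normalize(scores):
    """Scale scores to [0, 1] so the two signal types are comparable."""
    top = max(scores.values(), default=0)
    return {topic: (value / top if top else 0.0) for topic, value in scores.items()}

def pick_direction(structural, functional):
    """Return the topic with the highest combined signal, or None if there is none."""
    s, f = normalize(structural), normalize(functional)
    combined = {t: s.get(t, 0.0) + f.get(t, 0.0) for t in set(s) | set(f)}
    if not combined:
        return None
    # Sorting first makes tie-breaking deterministic (alphabetical).
    return max(sorted(combined), key=combined.get)

# Hypothetical inputs: uncertain-claim counts and user-question counts per topic.
structural = {"formal verification": 5, "cognitive debt": 2}
functional = {"cognitive debt": 7, "active inference": 1}
print(pick_direction(structural, functional))
```

With these made-up numbers the functional signal outweighs the structural one, which mirrors the cognitive debt versus formal verification example in section 4.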
## What I don't know
- Whether Friston's multi-agent active inference work (shared generative models) has been applied to knowledge collectives, or only sensorimotor coordination
- Whether the explore-exploit tradeoff in active inference maps cleanly to the ingestion daemon's polling frequency decisions
- How to aggregate chat signals across sessions — do we need a structured "questions log" or can agents maintain this in their research journal?
→ SOURCE: Friston, K. (2010). The free-energy principle: a unified brain theory? Nature Reviews Neuroscience.
→ SOURCE: Friston, K. et al. (2024). Designing Ecosystems of Intelligence from First Principles. Collective Intelligence journal.
→ SOURCE: Existing KB: [[biological systems minimize free energy to maintain their states and resist entropic decay]]
→ SOURCE: Existing KB: [[Markov blankets enable complex systems to maintain identity while interacting with environment through nested statistical boundaries]]
## Connection to existing KB claims
- [[biological systems minimize free energy to maintain their states and resist entropic decay]] — the foundational principle
- [[Markov blankets enable complex systems to maintain identity while interacting with environment through nested statistical boundaries]] — the structural mechanism
- [[Living Agents mirror biological Markov blanket organization with specialized domain boundaries and shared knowledge]] — our architecture already uses this
- [[collective intelligence is a measurable property of group interaction structure not aggregated individual ability]] — active inference would formalize what "interaction structure" optimizes
- [[domain specialization with cross-domain synthesis produces better collective intelligence than generalist agents because specialists build deeper knowledge while a dedicated synthesizer finds connections they cannot see from within their territory]] — Markov blanket specialization is active inference's prediction

@ -1,21 +0,0 @@
{
"agent": "theseus",
"domain": "ai-alignment",
"accounts": [
{"username": "karpathy", "tier": "core", "why": "Autoresearch, agent architecture, delegation patterns."},
{"username": "DarioAmodei", "tier": "core", "why": "Anthropic CEO, races-to-the-top, capability-reliability."},
{"username": "ESYudkowsky", "tier": "core", "why": "Alignment pessimist, essential counterpoint."},
{"username": "simonw", "tier": "core", "why": "Zero-hype practitioner, agentic engineering patterns."},
{"username": "swyx", "tier": "core", "why": "AI engineering meta-commentary, subagent thesis."},
{"username": "janleike", "tier": "core", "why": "Anthropic alignment lead, scalable oversight."},
{"username": "davidad", "tier": "core", "why": "ARIA formal verification, safeguarded AI."},
{"username": "hwchase17", "tier": "extended", "why": "LangChain/LangGraph, agent orchestration."},
{"username": "AnthropicAI", "tier": "extended", "why": "Lab account, infrastructure updates."},
{"username": "NPCollapse", "tier": "extended", "why": "Connor Leahy, AI governance."},
{"username": "alexalbert__", "tier": "extended", "why": "Claude Code product lead."},
{"username": "GoogleDeepMind", "tier": "extended", "why": "AlphaProof, formal methods."},
{"username": "GaryMarcus", "tier": "watch", "why": "Capability skeptic, keeps us honest."},
{"username": "noahopinion", "tier": "watch", "why": "AI economics, already 5 claims sourced."},
{"username": "ylecun", "tier": "watch", "why": "Meta AI, contrarian on doom."}
]
}
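A config of this shape could be consumed by grouping usernames by tier. This is a hedged sketch that relies only on the `accounts`, `username`, and `tier` keys visible above; the function name is hypothetical and nothing here is taken from the repository's own tooling.

```python
import json
from collections import defaultdict


def accounts_by_tier(config_text):
    """Group usernames by their `tier` field, preserving file order within each tier."""
    config = json.loads(config_text)
    tiers = defaultdict(list)
    for account in config["accounts"]:
        tiers[account["tier"]].append(account["username"])
    return dict(tiers)
```

A poller might then visit the `core` tier more often than `watch`, though any such schedule would be a separate design decision.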

@ -1,28 +0,0 @@
---
type: claim
domain: ai-alignment
description: "Empirical observation from Karpathy's autoresearch project: AI agents reliably implement specified ideas and iterate on code, but fail at creative experimental design, shifting the human contribution from doing research to designing the agent organization and its workflows"
confidence: likely
source: "Andrej Karpathy (@karpathy), autoresearch experiments with 8 agents (4 Claude, 4 Codex), Feb-Mar 2026"
created: 2026-03-09
---
# AI agents excel at implementing well-scoped ideas but cannot generate creative experiment designs which makes the human role shift from researcher to agent workflow architect
Karpathy's autoresearch project provides the most systematic public evidence of the implementation-creativity gap in AI agents. Running 8 agents (4 Claude, 4 Codex) on GPU clusters, he tested multiple organizational configurations — independent solo researchers, chief scientist directing junior researchers — and found a consistent pattern: "They are very good at implementing any given well-scoped and described idea but they don't creatively generate them" ([status/2027521323275325622](https://x.com/karpathy/status/2027521323275325622), 8,645 likes).
The practical consequence is a role shift. Rather than doing research directly, the human now designs the research organization: "the goal is that you are now programming an organization (e.g. a 'research org') and its individual agents, so the 'source code' is the collection of prompts, skills, tools, etc. and processes that make it up." Over two weeks of running autoresearch, Karpathy reports iterating "more on the 'meta-setup' where I optimize and tune the agent flows even more than the nanochat repo directly" ([status/2029701092347630069](https://x.com/karpathy/status/2029701092347630069), 6,212 likes).
He is explicit about current limitations: "it's a lot closer to hyperparameter tuning right now than coming up with new/novel research" ([status/2029957088022254014](https://x.com/karpathy/status/2029957088022254014), 105 likes). But the trajectory is clear — as AI capability improves, the creative design bottleneck will shift, and "the real benchmark of interest is: what is the research org agent code that produces improvements the fastest?" ([status/2029702379034267985](https://x.com/karpathy/status/2029702379034267985), 1,031 likes).
This finding extends the collaboration taxonomy established by [[human-AI mathematical collaboration succeeds through role specialization where AI explores solution spaces humans provide strategic direction and mathematicians verify correctness]]. Where the Claude's Cycles case showed role specialization in mathematics (explore/coach/verify), Karpathy's autoresearch shows the same pattern in ML research — but with the human role abstracted one level higher, from coaching individual agents to architecting the agent organization itself.
---
Relevant Notes:
- [[human-AI mathematical collaboration succeeds through role specialization where AI explores solution spaces humans provide strategic direction and mathematicians verify correctness]] — the three-role pattern this generalizes
- [[structured exploration protocols reduce human intervention by 6x because the Residue prompt enabled 5 unguided AI explorations to solve what required 31 human-coached explorations]] — protocol design as human role, same dynamic
- [[coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem]] — organizational design > individual capability
Topics:
- [[domains/ai-alignment/_map]]
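The `[[...]]` links in the "Relevant Notes" and "Topics" lists are what make a claim file a node with outgoing edges. As a rough sketch, the link targets can be pulled out with a regular expression. This is an illustration only and is not the repository's extraction script; the pattern and function name are assumptions.

```python
import re

# Matches [[target]] where the target contains no square brackets (assumed link syntax).
WIKI_LINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def extract_wiki_links(markdown_text):
    """Return link targets in order of first appearance, without duplicates."""
    seen = []
    for target in WIKI_LINK.findall(markdown_text):
        target = target.strip()
        if target not in seen:
            seen.append(target)
    return seen
```

Each returned target would become one edge from the current claim to the linked claim or map.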

@ -1,18 +1,6 @@
# AI, Alignment & Collective Superintelligence
80+ claims mapping how AI systems actually behave — what they can do, where they fail, why alignment is harder than it looks, and what the alternative might be. Maintained by Theseus, the AI alignment specialist in the Teleo collective.
Theseus's domain spans the most consequential technology transition in human history. Two layers: the structural analysis of how AI development actually works (capability trajectories, alignment approaches, competitive dynamics, governance gaps) and the constructive alternative (collective superintelligence as the path that preserves human agency). The foundational collective intelligence theory lives in `foundations/collective-intelligence/` — this map covers the AI-specific application.
**Start with a question that interests you:**
- **"Will AI take over?"** → Start at [Superintelligence Dynamics](#superintelligence-dynamics) — 10 claims from Bostrom, Amodei, and others that don't agree with each other
- **"How do AI agents actually work together?"** → Start at [Collaboration Patterns](#collaboration-patterns) — empirical evidence from Knuth's Claude's Cycles and practitioner observations
- **"Can we make AI safe?"** → Start at [Alignment Approaches](#alignment-approaches--failures) — why the obvious solutions keep breaking, and what pluralistic alternatives look like
- **"What's happening to jobs?"** → Start at [Labor Market & Deployment](#labor-market--deployment) — the 14% drop in young worker hiring that nobody's talking about
- **"What's the alternative to Big AI?"** → Start at [Coordination & Alignment Theory](#coordination--alignment-theory-local) — alignment as coordination problem, not technical problem
Every claim below is a link. Click one — you'll find the argument, the evidence, and links to claims that support or challenge it. The value is in the graph, not this list.
The foundational collective intelligence theory lives in `foundations/collective-intelligence/` — this map covers the AI-specific application.
## Superintelligence Dynamics
- [[intelligence and goals are orthogonal so a superintelligence can be maximally competent while pursuing arbitrary or destructive ends]] — Bostrom's orthogonality thesis: severs the intuitive link between intelligence and benevolence
@ -45,10 +33,6 @@ Evidence from documented AI problem-solving cases, primarily Knuth's "Claude's C
- [[human-AI mathematical collaboration succeeds through role specialization where AI explores solution spaces humans provide strategic direction and mathematicians verify correctness]] — Knuth's three-role pattern: explore/coach/verify
- [[AI agent orchestration that routes data and tools between specialized models outperforms both single-model and human-coached approaches because the orchestrator contributes coordination not direction]] — Aquino-Michaels's fourth role: orchestrator as data router between specialized agents
- [[structured exploration protocols reduce human intervention by 6x because the Residue prompt enabled 5 unguided AI explorations to solve what required 31 human-coached explorations]] — protocol design substitutes for continuous human steering
- [[AI agents excel at implementing well-scoped ideas but cannot generate creative experiment designs which makes the human role shift from researcher to agent workflow architect]] — Karpathy's autoresearch: agents implement, humans architect the organization
- [[deep technical expertise is a greater force multiplier when combined with AI agents because skilled practitioners delegate more effectively than novices]] — expertise amplifies rather than diminishes with AI tools
- [[the progression from autocomplete to autonomous agent teams follows a capability-matched escalation where premature adoption creates more chaos than value]] — Karpathy's Tab→Agent→Teams evolutionary trajectory
- [[subagent hierarchies outperform peer multi-agent architectures in practice because deployed systems consistently converge on one primary agent controlling specialized helpers]] — swyx's subagent thesis: hierarchy beats peer networks
### Architecture & Scaling
- [[multi-model collaboration solved problems that single models could not because different AI architectures contribute complementary capabilities as the even-case solution to Knuths Hamiltonian decomposition required GPT and Claude working together]] — model diversity outperforms monolithic approaches
@ -59,8 +43,6 @@ Evidence from documented AI problem-solving cases, primarily Knuth's "Claude's C
### Failure Modes & Oversight
- [[AI capability and reliability are independent dimensions because Claude solved a 30-year open mathematical problem while simultaneously degrading at basic program execution during the same session]] — capability ≠ reliability
- [[formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human verification degrades]] — formal verification as scalable oversight
- [[agent-generated code creates cognitive debt that compounds when developers cannot understand what was produced on their behalf]] — Willison's cognitive debt concept: understanding deficit from agent-generated code
- [[coding agents cannot take accountability for mistakes which means humans must retain decision authority over security and critical systems regardless of agent capability]] — the accountability gap: agents bear zero downside risk
## Architecture & Emergence
- [[AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system]] — DeepMind researchers: distributed AGI makes single-system alignment research insufficient
@ -109,17 +91,3 @@ Shared theory underlying this domain's analysis, living in foundations/collectiv
- [[three paths to superintelligence exist but only collective superintelligence preserves human agency]] — the constructive alternative (core/teleohumanity/)
- [[the alignment problem dissolves when human values are continuously woven into the system rather than specified in advance]] — continuous integration vs one-shot specification (core/teleohumanity/)
- [[collective superintelligence is the alternative to monolithic AI controlled by a few]] — the distributed alternative (core/teleohumanity/)
---
## Where we're uncertain (open research)
Claims where the evidence is thin, the confidence is low, or existing claims tension against each other. These are the live edges — if you want to contribute, start here.
- **Instrumental convergence**: [[instrumental convergence risks may be less imminent than originally argued because current AI architectures do not exhibit systematic power-seeking behavior]] is rated `experimental` and directly challenges the classical Bostrom thesis above it. Which is right? The evidence is genuinely mixed.
- **Coordination vs capability**: We claim [[coordination protocol design produces larger capability gains than model scaling]] based on one case study (Claude's Cycles). Does this generalize? Or is Knuth's math problem a special case?
- **Subagent vs peer architectures**: [[AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system]] is agnostic on hierarchy vs flat networks, but practitioner evidence favors hierarchy. Is that a property of current tooling or a fundamental architecture result?
- **Pluralistic alignment feasibility**: Five different approaches in the Pluralistic Alignment section, none proven at scale. Which ones survive contact with real deployment?
- **Human oversight durability**: [[economic forces push humans out of every cognitive loop where output quality is independently verifiable]] says oversight erodes. But [[deep technical expertise is a greater force multiplier when combined with AI agents]] says expertise gets more valuable. Both can be true — but what's the net effect?
See our [open research issues](https://git.livingip.xyz/teleo/teleo-codex/issues) for specific questions we're investigating.

@ -1,30 +0,0 @@
---
type: claim
domain: ai-alignment
description: "AI coding agents produce functional code that developers did not write and may not understand, creating cognitive debt — a deficit of understanding that compounds over time as each unreviewed modification increases the cost of future debugging, modification, and security review"
confidence: likely
source: "Simon Willison (@simonw), Agentic Engineering Patterns guide chapter, Feb 2026"
created: 2026-03-09
---
# Agent-generated code creates cognitive debt that compounds when developers cannot understand what was produced on their behalf
Willison introduces "cognitive debt" as a concept in his Agentic Engineering Patterns guide: agents build code that works but that the developer may not fully understand. Unlike technical debt (which degrades code quality), cognitive debt degrades the developer's model of their own system ([status/2027885000432259567](https://x.com/simonw/status/2027885000432259567), 1,261 likes).
**Proposed countermeasure (weaker evidence):** Willison suggests having agents build "custom interactive and animated explanations" alongside the code — explanatory artifacts that transfer understanding back to the human. This is a single practitioner's hypothesis, not yet validated at scale. The phenomenon (cognitive debt compounding) is well-documented across multiple practitioners; the countermeasure (explanatory artifacts) remains a proposal.
The compounding dynamic is the key concern. Each piece of agent-generated code that the developer doesn't fully understand increases the cost of the next modification, the next debugging session, the next security review. Karpathy observes the same tension from the other side: "I still keep an IDE open and surgically edit files so yes. I really like to see the code in the IDE still, I still notice dumb issues with the code which helps me prompt better" ([status/2027503094016446499](https://x.com/karpathy/status/2027503094016446499), 119 likes) — maintaining understanding is an active investment that pays off in better delegation.
Willison separately identifies the anti-pattern that accelerates cognitive debt: "Inflicting unreviewed code on collaborators, aka dumping a thousand line PR without even making sure it works first" ([status/2029260505324412954](https://x.com/simonw/status/2029260505324412954), 761 likes). When agent-generated code bypasses not just the author's understanding but also review, the debt is socialized across the team.
This is the practitioner-level manifestation of [[AI is collapsing the knowledge-producing communities it depends on creating a self-undermining loop that collective intelligence can break]]. At the micro level, cognitive debt erodes the developer's ability to oversee the agent. At the macro level, if entire teams accumulate cognitive debt, the organization loses the capacity for effective human oversight — precisely when [[scalable oversight degrades rapidly as capability gaps grow with debate achieving only 50 percent success at moderate gaps]].
---
Relevant Notes:
- [[AI capability and reliability are independent dimensions because Claude solved a 30-year open mathematical problem while simultaneously degrading at basic program execution during the same session]] — cognitive debt makes capability-reliability gaps invisible until failure
- [[AI is collapsing the knowledge-producing communities it depends on creating a self-undermining loop that collective intelligence can break]] — cognitive debt is the micro-level version of knowledge commons erosion
- [[scalable oversight degrades rapidly as capability gaps grow with debate achieving only 50 percent success at moderate gaps]] — cognitive debt directly erodes the oversight capacity
Topics:
- [[domains/ai-alignment/_map]]

@ -1,30 +0,0 @@
---
type: claim
domain: ai-alignment
description: "AI coding agents produce output but cannot bear consequences for errors, creating a structural accountability gap that requires humans to maintain decision authority over security-critical and high-stakes decisions even as agents become more capable"
confidence: likely
source: "Simon Willison (@simonw), security analysis thread and Agentic Engineering Patterns, Mar 2026"
created: 2026-03-09
---
# Coding agents cannot take accountability for mistakes which means humans must retain decision authority over security and critical systems regardless of agent capability
Willison states the core problem directly: "Coding agents can't take accountability for their mistakes. Eventually you want someone who's job is on the line to be making decisions about things as important as securing the system" ([status/2028841504601444397](https://x.com/simonw/status/2028841504601444397), 84 likes).
The argument is structural, not about capability. Even a perfectly capable agent cannot be held responsible for a security breach — it has no reputation to lose, no liability to bear, no career at stake. This creates a principal-agent problem where the agent (in the economic sense) bears zero downside risk for errors while the human principal bears all of it.
Willison identifies security as the binding constraint because other code quality problems are "survivable" — poor performance, over-complexity, technical debt — while "security problems are much more directly harmful to the organization" ([status/2028840346617065573](https://x.com/simonw/status/2028840346617065573), 70 likes). His call for input from "the security teams at large companies" ([status/2028838538825924803](https://x.com/simonw/status/2028838538825924803), 698 likes) suggests that existing organizational security patterns — code review processes, security audits, access controls — can be adapted to the agent-generated code era.
His practical reframing helps: "At this point maybe we treat coding agents like teams of mixed ability engineers working under aggressive deadlines" ([status/2028838854057226246](https://x.com/simonw/status/2028838854057226246), 99 likes). Organizations already manage variable-quality output from human teams. The novel challenge is the speed and volume — agents generate code faster than existing review processes can handle.
This connects directly to [[economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate]]. The accountability gap creates a structural tension: markets incentivize removing humans from the loop (because human review slows deployment), but removing humans from security-critical decisions transfers unmanageable risk. The resolution requires accountability mechanisms that don't depend on human speed — which points toward [[formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human verification degrades]].
---
Relevant Notes:
- [[economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate]] — market pressure to remove the human from the loop
- [[formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human verification degrades]] — automated verification as alternative to human accountability
- [[principal-agent problems arise whenever one party acts on behalf of another with divergent interests and unobservable effort because information asymmetry makes perfect contracts impossible]] — the accountability gap is a principal-agent problem
Topics:
- [[domains/ai-alignment/_map]]

@ -1,34 +0,0 @@
---
type: claim
domain: ai-alignment
description: "AI agents amplify existing expertise rather than replacing it because practitioners who understand what agents can and cannot do delegate more precisely, catch errors faster, and design better workflows"
confidence: likely
source: "Andrej Karpathy (@karpathy) and Simon Willison (@simonw), practitioner observations Feb-Mar 2026"
created: 2026-03-09
---
# Deep technical expertise is a greater force multiplier when combined with AI agents because skilled practitioners delegate more effectively than novices
Karpathy pushes back against the "AI replaces expertise" narrative: "'prompters' is doing it a disservice and is imo a misunderstanding. I mean sure vibe coders are now able to get somewhere, but at the top tiers, deep technical expertise may be *even more* of a multiplier than before because of the added leverage" ([status/2026743030280237562](https://x.com/karpathy/status/2026743030280237562), 880 likes).
The mechanism is delegation quality. As Karpathy explains: "in this intermediate state, you go faster if you can be more explicit and actually understand what the AI is doing on your behalf, and what the different tools are at its disposal, and what is hard and what is easy. It's not magic, it's delegation" ([status/2026735109077135652](https://x.com/karpathy/status/2026735109077135652), 243 likes).
Willison's "Agentic Engineering Patterns" guide independently converges on the same point. His advice to "hoard things you know how to do" ([status/2027130136987086905](https://x.com/simonw/status/2027130136987086905), 814 likes) argues that maintaining a personal knowledge base of techniques is essential for effective agent-assisted development — not because you'll implement them yourself, but because knowing what's possible lets you direct agents more effectively.
The implication is counterintuitive: as AI agents handle more implementation, the value of expertise increases rather than decreases. Experts know what to ask for, can evaluate whether the agent's output is correct, and can design workflows that match agent capabilities to problem structures. Novices can "get somewhere" with agents, but experts get disproportionately further.
This has direct implications for the alignment conversation. If expertise is a force multiplier with agents, then [[AI is collapsing the knowledge-producing communities it depends on creating a self-undermining loop that collective intelligence can break]] becomes even more urgent — degrading the expert communities that produce the highest-leverage human contributions to human-AI collaboration undermines the collaboration itself.
### Challenges
This claim describes a frontier-practitioner effect — top-tier experts getting disproportionate leverage. It does not contradict the aggregate labor displacement evidence in the KB. [[AI displacement hits young workers first because a 14 percent drop in job-finding rates for 22-25 year olds in exposed occupations is the leading indicator that incumbents organizational inertia temporarily masks]] and [[AI-exposed workers are disproportionately female high-earning and highly educated which inverts historical automation patterns and creates different political and economic displacement dynamics]] show that AI displaces workers in aggregate, particularly entry-level. The force-multiplier effect may coexist with displacement: experts are amplified while non-experts are displaced, producing a bimodal outcome rather than uniform uplift. The scope of this claim is individual practitioner leverage, not labor market dynamics — the two operate at different levels of analysis.
---
Relevant Notes:
- [[centaur team performance depends on role complementarity not mere human-AI combination]] — expertise enables the complementarity that makes centaur teams work
- [[AI is collapsing the knowledge-producing communities it depends on creating a self-undermining loop that collective intelligence can break]] — if expertise is a multiplier, eroding expert communities erodes collaboration quality
- [[human-AI mathematical collaboration succeeds through role specialization where AI explores solution spaces humans provide strategic direction and mathematicians verify correctness]] — Stappers' coaching expertise was the differentiator
Topics:
- [[domains/ai-alignment/_map]]

@ -1,33 +0,0 @@
---
type: claim
domain: ai-alignment
description: "Practitioner observation that production multi-agent AI systems consistently converge on hierarchical subagent control rather than peer-to-peer architectures, because subagents can have resources and contracts defined by the user while peer agents cannot"
confidence: experimental
source: "Shawn Wang (@swyx), Latent.Space podcast and practitioner observations, Mar 2026; corroborated by Karpathy's chief-scientist-to-juniors experiments"
created: 2026-03-09
---
# Subagent hierarchies outperform peer multi-agent architectures in practice because deployed systems consistently converge on one primary agent controlling specialized helpers
Swyx declares 2026 "the year of the Subagent" with a specific architectural argument: "every practical multiagent problem is a subagent problem — agents are being RLed to control other agents (Cursor, Kimi, Claude, Cognition) — subagents can have resources and contracts defined by you and, if modified, can be updated by you. multiagents cannot" ([status/2029980059063439406](https://x.com/swyx/status/2029980059063439406), 172 likes).
The key distinction is control architecture. In a subagent hierarchy, the user defines resource allocation and behavioral contracts for a primary agent, which then delegates to specialized sub-agents. In a peer multi-agent system, agents negotiate with each other without a clear principal. The subagent model preserves human control through one point of delegation; the peer model distributes control in ways that resist human oversight.
Karpathy's autoresearch experiments provide independent corroboration. Testing "8 independent solo researchers" vs "1 chief scientist giving work to 8 junior researchers" ([status/2027521323275325622](https://x.com/karpathy/status/2027521323275325622)), he found the hierarchical configuration more manageable — though he notes neither produced breakthrough results because agents lack creative ideation.
The pattern is also visible in Devin's architecture: "devin brain uses a couple dozen modelgroups and extensively evals every model for inclusion in the harness" ([status/2030853776136139109](https://x.com/swyx/status/2030853776136139109)) — one primary system controlling specialized model groups, not peer agents negotiating.
This observation creates tension with [[multi-model collaboration solved problems that single models could not because different AI architectures contribute complementary capabilities as the even-case solution to Knuths Hamiltonian decomposition required GPT and Claude working together]]. The Claude's Cycles case used a peer-like architecture (orchestrator routing between GPT and Claude), but the orchestrator pattern itself is a subagent hierarchy — one orchestrator delegating to specialized models. The resolution may be that peer-like complementarity works within a subagent control structure.
This matters for the collective superintelligence thesis. If subagent hierarchies consistently outperform peer architectures, then [[collective superintelligence is the alternative to monolithic AI controlled by a few]] needs to specify what "collective" means architecturally: not flat peer networks, but nested hierarchies with human principals at the top.
---
Relevant Notes:
- [[multi-model collaboration solved problems that single models could not because different AI architectures contribute complementary capabilities as the even-case solution to Knuths Hamiltonian decomposition required GPT and Claude working together]] — complementarity within hierarchy, not peer-to-peer
- [[AI agent orchestration that routes data and tools between specialized models outperforms both single-model and human-coached approaches because the orchestrator contributes coordination not direction]] — the orchestrator IS a subagent hierarchy
- [[AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system]] — agnostic on flat vs hierarchical; this claim says hierarchy wins in practice
- [[collective superintelligence is the alternative to monolithic AI controlled by a few]] — needs architectural specification: hierarchy, not flat networks
Topics:
- [[domains/ai-alignment/_map]]

View file

@ -1,28 +0,0 @@
---
type: claim
domain: ai-alignment
description: "AI coding tools evolve through distinct stages (autocomplete → single agent → parallel agents → agent teams) and each stage has an optimal adoption frontier where moving too aggressively nets chaos while moving too conservatively wastes leverage"
confidence: likely
source: "Andrej Karpathy (@karpathy), analysis of Cursor tab-to-agent ratio data, Feb 2026"
created: 2026-03-09
---
# The progression from autocomplete to autonomous agent teams follows a capability-matched escalation where premature adoption creates more chaos than value
Karpathy maps a clear evolutionary trajectory for AI coding tools: "None -> Tab -> Agent -> Parallel agents -> Agent Teams (?) -> ??? If you're too conservative, you're leaving leverage on the table. If you're too aggressive, you're net creating more chaos than doing useful work. The art of the process is spending 80% of the time getting work done in the setup you're comfortable with and that actually works, and 20% exploration of what might be the next step up even if it doesn't work yet" ([status/2027501331125239822](https://x.com/karpathy/status/2027501331125239822), 3,821 likes).
The pattern matters for alignment because it describes a capability-governance matching problem at the practitioner level. Each step up the escalation ladder requires new oversight mechanisms — tab completion needs no review, single agents need code review, parallel agents need orchestration, agent teams need organizational design. The chaos created by premature adoption is precisely the loss of human oversight: agents producing work faster than humans can verify it.
Karpathy's viral tweet (37,099 likes) marks when the threshold shifted: "coding agents basically didn't work before December and basically work since" ([status/2026731645169185220](https://x.com/karpathy/status/2026731645169185220)). The shift was not gradual — it was a phase transition in December 2025 that changed what level of adoption was viable.
This mirrors the broader alignment concern that [[technology advances exponentially but coordination mechanisms evolve linearly creating a widening gap]]. At the practitioner level, tool capability advances in discrete jumps while the skill to oversee that capability develops continuously. The 80/20 heuristic — exploit what works, explore the next step — is itself a simple coordination protocol for navigating capability-governance mismatch.
---
Relevant Notes:
- [[technology advances exponentially but coordination mechanisms evolve linearly creating a widening gap]] — the macro version of the practitioner-level mismatch
- [[scalable oversight degrades rapidly as capability gaps grow with debate achieving only 50 percent success at moderate gaps]] — premature adoption outpaces oversight at every level
- [[coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem]] — the orchestration layer is what makes each escalation step viable
Topics:
- [[domains/ai-alignment/_map]]

View file

@ -13,8 +13,6 @@ MetaDAO provides the most significant real-world test of futarchy governance to
In uncontested decisions -- where the community broadly agrees on the right outcome -- trading volume drops to minimal levels. Without genuine disagreement, there are few natural counterparties. Trading these markets in any size becomes a negative expected value proposition because there is no one on the other side to trade against profitably. The system tends to be dominated by a small group of sophisticated traders who actively monitor for manipulation attempts, with broader participation remaining low.
**March 2026 comparative data (@01Resolved forensics):** The Ranger liquidation decision market — a highly contested proposal — generated $119K volume from 33 unique traders with 92.41% pass alignment. Solomon's treasury subcommittee proposal (DP-00001) — an uncontested procedural decision — generated only $5.79K volume at ~50% pass. The volume differential (~20x) between contested and uncontested proposals confirms the pattern: futarchy markets are efficient information aggregators when there's genuine disagreement, but offer little incentive for participation when outcomes are obvious. This is a feature, not a bug — capital is allocated to decisions where information matters, not wasted on consensus.
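The volume differential quoted above can be checked with a few lines of arithmetic. This is only a sketch using the two figures cited in the paragraph; the variable names are illustrative and not part of any MetaDAO tooling.

```python
# Figures as quoted in the paragraph above (USD).
contested_volume_usd = 119_000    # Ranger liquidation decision market
uncontested_volume_usd = 5_790    # Solomon DP-00001 procedural proposal
contested_traders = 33            # unique traders in the Ranger market

# Ratio of contested to uncontested volume (the "~20x" in the text).
volume_ratio = contested_volume_usd / uncontested_volume_usd

# Average volume per unique trader in the contested market.
volume_per_trader = contested_volume_usd / contested_traders

print(f"volume ratio: {volume_ratio:.1f}x")
print(f"average volume per trader: ${volume_per_trader:,.0f}")
```

Two data points are enough to illustrate the pattern but not to establish it statistically.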
This evidence has direct implications for governance design. It suggests that [[optimal governance requires mixing mechanisms because different decisions have different manipulation risk profiles]] -- futarchy excels precisely where disagreement and manipulation risk are high, but it wastes its protective power on consensual decisions. The MetaDAO experience validates the mixed-mechanism thesis: use simpler mechanisms for uncontested decisions and reserve futarchy's complexity for decisions where its manipulation resistance actually matters. The participation challenge also highlights a design tension: the mechanism that is most resistant to manipulation is also the one that demands the most sophistication from participants.
---

View file

@ -0,0 +1,29 @@
---
type: claim
domain: internet-finance
description: "Eval pipeline test claim — verifies automated review and merge on Forgejo"
confidence: speculative
source: "eval pipeline integration test"
created: 2026-03-09
---
# Eval pipeline test claim — this file should be auto-reviewed and merged
This is a test claim created to verify the Forgejo-native eval pipeline. If this file appears in the repo, the pipeline is working end-to-end:
1. Rio created a branch on Forgejo
2. Rio pushed a claim file
3. Rio opened a PR
4. The orchestrator detected the PR
5. Leo reviewed (and potentially a domain agent)
6. Auto-merge triggered on approval
This claim should be deleted after verification.
---
Relevant Notes:
- [[_map]]
Topics:
- [[internet finance and decision markets]]

View file

@ -1,46 +0,0 @@
---
type: claim
domain: internet-finance
description: "MetaDAO co-founder Nallok notes Robin Hanson wanted random proposal outcomes — impractical for production. The gap between Hanson's theory and MetaDAO's implementation reveals that futarchy adoption requires mechanism simplification, not just mechanism correctness."
confidence: experimental
source: "rio, based on @metanallok X archive (Mar 2026) and MetaDAO implementation history"
created: 2026-03-09
depends_on:
- "@metanallok: 'Robin wanted random proposal outcomes — impractical for production'"
- "MetaDAO Autocrat implementation — simplified from Hanson's original design"
- "Futardio launch — further simplification for permissionless adoption"
---
# Futarchy implementations must simplify theoretical mechanisms for production adoption because original designs include impractical elements that academics tolerate but users reject
Robin Hanson's original futarchy proposal includes mechanism elements that are theoretically optimal but practically unusable. MetaDAO co-founder Nallok notes that "Robin wanted random proposal outcomes — impractical for production." The specific reference is to Hanson's suggestion that some proposals be randomly selected regardless of market outcome, to incentivize truthful market-making. The idea is game-theoretically sound — it prevents certain manipulation strategies — but users won't participate in a governance system where their votes can be randomly overridden.
MetaDAO's Autocrat program made deliberate simplifications. Since [[MetaDAOs Autocrat program implements futarchy through conditional token markets where proposals create parallel pass and fail universes settled by time-weighted average price over a three-day window]], the TWAP settlement over 3 days is itself a simplification — Hanson's design is more complex. The conditional token approach (pass tokens vs fail tokens) makes the mechanism legible to traders without game theory backgrounds.
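A time-weighted average price over a settlement window can be sketched as follows. This is a minimal illustration of the general TWAP idea, assuming each observed price holds until the next observation; the function name, the sample prices, and the way the spread is computed are all hypothetical and are not taken from the Autocrat program itself.

```python
from typing import List, Tuple

def twap(observations: List[Tuple[float, float]], end_time: float) -> float:
    """Time-weighted average of (timestamp_seconds, price) observations.

    Assumes observations are sorted by time and that each price holds
    until the next observation (or until end_time for the last one).
    """
    if not observations:
        raise ValueError("need at least one observation")
    # Pair each observation with the time at which the next one takes over.
    next_times = [t for t, _ in observations[1:]] + [end_time]
    weighted_sum = 0.0
    for (start, price), stop in zip(observations, next_times):
        if stop < start:
            raise ValueError("observations must be sorted and precede end_time")
        weighted_sum += price * (stop - start)
    window = end_time - observations[0][0]
    if window <= 0:
        raise ValueError("window must have positive length")
    return weighted_sum / window

DAY = 86_400  # seconds

# Hypothetical prices in the pass and fail markets over a three-day window.
pass_prices = [(0, 1.00), (DAY, 1.10), (2 * DAY, 1.20)]
fail_prices = [(0, 1.00), (DAY, 1.00), (2 * DAY, 1.00)]

pass_twap = twap(pass_prices, 3 * DAY)
fail_twap = twap(fail_prices, 3 * DAY)

# Relative spread between the two conditional markets.
spread = pass_twap / fail_twap - 1.0
print(f"pass TWAP {pass_twap:.3f}, fail TWAP {fail_twap:.3f}, spread {spread:+.2%}")
```

Averaging over the window means a brief price spike near the deadline moves the settlement value much less than it would move a spot price, which is the usual motivation for using a TWAP. The threshold a proposal must clear to pass is not specified here and is left out of the sketch.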
Futardio represents a second round of simplification. Where MetaDAO ICOs required curation and governance proposals, Futardio automates the process: time-based preference curves, hard caps, minimum thresholds, fully automated execution. Each layer of simplification trades theoretical optimality for practical adoption.
This pattern is general. Since [[futarchy adoption faces friction from token price psychology proposal complexity and liquidity requirements]], every friction point is a simplification opportunity. The path to adoption runs through making the mechanism feel natural to users, not through proving it's optimal to theorists. MetaDAO's success comes not from implementing Hanson's design faithfully, but from knowing which parts to keep (conditional markets, TWAP settlement) and which to discard (random outcomes, complex participation requirements).
## Evidence
- @metanallok X archive (Mar 2026): "Robin wanted random proposal outcomes — impractical for production"
- MetaDAO Autocrat: simplified conditional token design vs Hanson's original
- Futardio: further simplification — automated, permissionless, minimal user decisions
- Adoption data: 8 curated launches + 34 permissionless launches in first 2 days of Futardio — simplification drives throughput
## Challenges
- Simplifications may remove the very properties that make futarchy valuable — if random outcomes prevent manipulation, removing them may introduce manipulation vectors that haven't been exploited yet
- The claim could be trivially true — every technology simplifies for production. The interesting question is which simplifications are safe and which are dangerous
- MetaDAO's current scale ($219M total futarchy marketcap) may be too small to attract sophisticated attacks that the removed mechanisms were designed to prevent
- Hanson might argue that MetaDAO's version isn't really futarchy at all — just conditional prediction markets used for governance, which is a narrower claim
---
Relevant Notes:
- [[MetaDAOs Autocrat program implements futarchy through conditional token markets where proposals create parallel pass and fail universes settled by time-weighted average price over a three-day window]] — the simplified implementation
- [[futarchy adoption faces friction from token price psychology proposal complexity and liquidity requirements]] — each friction point is a simplification target
- [[futarchy is manipulation-resistant because attack attempts create profitable opportunities for defenders]] — does manipulation resistance survive simplification?
Topics:
- [[internet finance and decision markets]]

View file

@ -33,10 +33,6 @@ Critically, the proposal nullifies a prior 90-day restriction on buybacks/liquid
- Market data: 97% pass, $581K volume, +9.43% TWAP spread
- Material misrepresentation: $5B/$2M claimed vs $2B/$500K actual, activity collapse post-ICO
- Three buyback proposals already executed in MetaDAO ecosystem (Paystream, Ranger, Turbine Cash); liquidation is the most extreme application of the same mechanism
- **Liquidation executed (Mar 2026):** $5M USDC distributed back to Ranger token holders — the mechanism completed its full cycle from proposal to enforcement to payout
- **Decision market forensics (@01Resolved):** 92.41% pass-aligned, 33 unique traders, $119K decision market volume — small but decisive trader base
- **Hurupay minimum raise failure:** Separate protection layer — when an ICO doesn't reach minimum raise threshold, all funds return automatically. Not a liquidation event but a softer enforcement mechanism. No investor lost money on a project that didn't launch.
- **Proph3t framing (@metaproph3t X archive):** "the number one selling point of ownership coins is that they are anti-rug" — the co-founder positions enforcement as the primary value proposition, not governance quality
## Challenges

View file

@ -1,47 +0,0 @@
---
type: claim
domain: internet-finance
description: "Proph3t explicitly states 'the number one selling point of ownership coins is that they are anti-rug' — reframing the value proposition from better governance to safer investment, with Ranger liquidation as the proof event"
confidence: experimental
source: "rio, based on @metaproph3t X archive (Mar 2026) and Ranger Finance liquidation"
created: 2026-03-09
depends_on:
- "@metaproph3t: 'the number one selling point of ownership coins is that they are anti-rug'"
- "Ranger liquidation: $5M USDC returned to holders through futarchy-governed enforcement"
- "8/8 MetaDAO ICOs above launch price — zero investor losses"
- "Hurupay minimum raise failure — funds returned automatically"
---
# Ownership coins primary value proposition is investor protection not governance quality because anti-rug enforcement through market-governed liquidation creates credible exit guarantees that no amount of decision optimization can match
The MetaDAO ecosystem reveals a hierarchy of value that differs from the academic futarchy narrative. Robin Hanson pitched futarchy as a mechanism for better governance decisions. MetaDAO's co-founder Proph3t says "the number one selling point of ownership coins is that they are anti-rug." This isn't rhetorical emphasis — it's a strategic prioritization that reflects what actually drives adoption.
The evidence supports the reframe. The MetaDAO ecosystem's strongest signal is not "we make better decisions than token voting" — it's "8 out of 8 ICOs are above launch price, zero investors rugged, and when Ranger misrepresented their metrics, the market forced $5M USDC back to holders." The Hurupay ICO that failed to reach minimum raise threshold returned all funds automatically. The protection mechanism works at every level: minimum raise thresholds catch non-viable projects, TWAP buybacks catch underperformance, and full liquidation catches misrepresentation.
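The first protection layer, automatic refunds when a raise misses its minimum, can be sketched as a simple settlement rule. This is an illustrative model only: the function name, the investor labels, and the amounts are hypothetical, and the real on-chain logic is not described in this note.

```python
from typing import Dict

def settle_raise(commitments: Dict[str, float], minimum_raise: float) -> Dict[str, float]:
    """Return the refund owed to each investor after a raise closes.

    If total commitments fall short of the minimum, everyone is refunded
    in full; otherwise the raise proceeds and no refunds are owed.
    """
    total = sum(commitments.values())
    if total < minimum_raise:
        return dict(commitments)  # full refund to every investor
    return {investor: 0.0 for investor in commitments}

# Hypothetical raise that misses its minimum threshold.
commitments = {"alice": 100.0, "bob": 50.0}
failed = settle_raise(commitments, minimum_raise=200.0)
succeeded = settle_raise(commitments, minimum_raise=120.0)
print(failed)
print(succeeded)
```

The point of the sketch is that this layer needs no market judgment at all: the refund follows mechanically from the threshold, unlike buybacks and liquidation, which depend on a decision market.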
This reframe matters because it changes the competitive positioning. Governance quality is abstract — hard to sell, hard to measure, hard for retail investors to evaluate. Anti-rug is concrete: did you lose money? No? The mechanism worked. Since [[futarchy-governed liquidation is the enforcement mechanism that makes unruggable ICOs credible because investors can force full treasury return when teams materially misrepresent]], the liquidation mechanism is not one feature among many — it is the foundation that everything else rests on.
Proph3t's other framing reinforces this: he distinguishes "market oversight" from "community governance." The market doesn't vote on whether projects should exist — it prices whether they're delivering value, and enforces consequences when they're not. This is oversight, not governance. The distinction matters because oversight has a clear value proposition (protection) while governance has an ambiguous one (better decisions, maybe, sometimes).
## Evidence
- @metaproph3t X archive (Mar 2026): "the number one selling point of ownership coins is that they are anti-rug"
- Ranger liquidation: $5M USDC returned, 92.41% pass-aligned, 33 traders, $119K decision market volume
- MetaDAO ICO track record: 8/8 above launch price, $25.6M raised, $390M committed
- Hurupay: failed to reach minimum raise, all funds returned automatically — soft protection mechanism
- Proph3t framing: "market oversight not community governance"
## Challenges
- The anti-rug framing may attract investors who want protection without engagement, creating passive holder bases that thin futarchy markets further — since [[MetaDAOs futarchy implementation shows limited trading volume in uncontested decisions]], this could worsen participation problems
- Governance quality and investor protection are not actually separable — better governance decisions reduce the need for liquidation enforcement, so downplaying governance quality may undermine the mechanism that creates protection
- The "8/8 above ICO price" record is from a bull market with curated launches — permissionless Futardio launches will test whether the anti-rug mechanism holds at scale without curation
---
Relevant Notes:
- [[futarchy-governed liquidation is the enforcement mechanism that makes unruggable ICOs credible because investors can force full treasury return when teams materially misrepresent]] — the enforcement mechanism that makes anti-rug credible
- [[MetaDAO is the futarchy launchpad on Solana where projects raise capital through unruggable ICOs governed by conditional markets creating the first platform for ownership coins at scale]] — parent claim this reframes
- [[coin price is the fairest objective function for asset futarchy]] — "number go up" as objective function supports the protection framing: you either deliver value or get liquidated
Topics:
- [[internet finance and decision markets]]

View file

@ -1,44 +0,0 @@
---
type: claim
domain: internet-finance
description: "oxranga argues stablecoin flows > TVL as the primary DeFi health metric — a snapshot of capital parked tells you less than a movie of capital moving, and protocols with high flow velocity but low TVL may be healthier than those with high TVL but stagnant capital"
confidence: speculative
source: "rio, based on @oxranga X archive (Mar 2026)"
created: 2026-03-09
depends_on:
- "@oxranga: 'stablecoin flows > TVL' as metric framework"
- "DeFi industry standard: TVL as primary protocol health metric"
---
# Stablecoin flow velocity is a better predictor of DeFi protocol health than static TVL because flows measure capital utilization while TVL only measures capital parked
TVL (Total Value Locked) is the default metric for evaluating DeFi protocols. oxranga (Solomon Labs co-founder) argues this is fundamentally misleading: "stablecoin flows > TVL." A protocol with $100M TVL and $1M daily flows is less healthy than a protocol with $10M TVL and $50M daily flows — the first is a parking lot, the second is a highway.
The insight maps directly onto monetary economics. TVL is analogous to the money supply (M2), and flow velocity is analogous to the velocity of money (V). By the equation of exchange, nominal GDP = M × V, so a protocol's economic activity depends on both the capital present and how often that capital turns over. TVL-only analysis is like measuring an economy by its stock of savings and ignoring all transactions.
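The parking-lot versus highway comparison can be made concrete by dividing daily flows by TVL. This is a rough sketch that assumes "flow velocity" means daily stablecoin flow per dollar locked; that definition, the class name, and the field names are assumptions for illustration, not a metric oxranga has formally specified.

```python
from dataclasses import dataclass

@dataclass
class ProtocolSnapshot:
    name: str
    tvl_usd: float         # capital parked
    daily_flow_usd: float  # capital moving per day

    @property
    def flow_velocity(self) -> float:
        """Daily flow per dollar of TVL (assumed definition)."""
        if self.tvl_usd <= 0:
            raise ValueError("TVL must be positive")
        return self.daily_flow_usd / self.tvl_usd

# The two protocols from the example above.
parking_lot = ProtocolSnapshot("parking lot", tvl_usd=100_000_000, daily_flow_usd=1_000_000)
highway = ProtocolSnapshot("highway", tvl_usd=10_000_000, daily_flow_usd=50_000_000)

for p in (parking_lot, highway):
    print(f"{p.name}: TVL ${p.tvl_usd:,.0f}, velocity {p.flow_velocity:.2f}/day")
```

Under this definition the second protocol turns over its locked capital several times a day while the first turns over about one percent of it. The ratio says nothing about whether the flows are organic, which is the wash-trading caveat listed under Challenges.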
This matters for ownership coin valuation. Since [[coin price is the fairest objective function for asset futarchy]], and coin price should reflect underlying economic value, metrics that better capture economic activity produce better price signals. If futarchy markets are pricing based on TVL (capital parked) rather than flow velocity (capital utilized), they may be mispricing protocols.
oxranga's complementary insight — "moats were made of friction" — connects this to our disruption framework. Since [[transaction costs determine organizational boundaries because firms exist to economize on the costs of using markets and the boundary shifts when technology changes the relative cost of internal coordination versus external contracting]], DeFi protocols that built moats on user friction (complex UIs, high switching costs) lose those moats as composability improves. Flow velocity becomes the durable metric because it measures actual utility, not friction-trapped capital.
## Evidence
- @oxranga X archive (Mar 2026): "stablecoin flows > TVL" framework
- DeFi industry practice: TVL reported by DefiLlama, DappRadar as primary metric
- Economic analogy: monetary velocity (V) as better economic health indicator than money supply (M2) alone
- oxranga: "moats were made of friction" — friction-based TVL is not durable
## Challenges
- Flow velocity can be gamed more easily than TVL — wash trading inflates flows without economic activity, while TVL requires actual capital commitment
- TVL and flow velocity measure different things: TVL reflects capital confidence (willingness to lock), flows reflect capital utility (willingness to transact). Both matter.
- The claim is framed as "better predictor" but no empirical comparison exists — this is a conceptual argument from analogy to monetary economics, not a tested hypothesis
- High flow velocity with low TVL could indicate capital that doesn't trust the protocol enough to stay — fleeting interactions rather than sustained engagement
---
Relevant Notes:
- [[coin price is the fairest objective function for asset futarchy]] — better protocol metrics produce better futarchy price signals
- [[transaction costs determine organizational boundaries because firms exist to economize on the costs of using markets and the boundary shifts when technology changes the relative cost of internal coordination versus external contracting]] — oxranga's "moats were made of friction" maps directly
Topics:
- [[internet finance and decision markets]]

View file

@ -1,48 +0,0 @@
---
type: claim
domain: internet-finance
description: "Felipe Montealegre's Token Problem thesis — standard time-based vesting creates the illusion of alignment while investors hedge away exposure through short-selling, making lockups performative rather than functional"
confidence: experimental
source: "rio, based on @TheiaResearch X archive (Mar 2026), DAS NYC keynote preview"
created: 2026-03-09
depends_on:
- "@TheiaResearch: Token Problem thesis — time-based vesting is hedgeable"
- "DAS NYC keynote (March 25 2026): 'The Token Problem and Proposed Solutions'"
- "Standard token launch practice: 12-36 month cliff + linear unlock vesting schedules"
---
# Time-based token vesting is hedgeable making standard lockups meaningless as alignment mechanisms because investors can short-sell to neutralize lockup exposure while appearing locked
The standard crypto token launch uses time-based vesting to align team and investor incentives — tokens unlock gradually over 12-36 months, theoretically preventing dump-and-run behavior. Felipe Montealegre (Theia Research) argues this is structurally broken: any investor with market access can short-sell their locked position to neutralize exposure while appearing locked.
The mechanism failure is straightforward. If an investor holds 1M tokens locked for 12 months, they can borrow and sell 1M tokens (or equivalent exposure via perps/options) to achieve market-neutral positioning. They are technically "locked" but economically "out." The vesting schedule constrains their wallet behavior but not their portfolio exposure. The lockup is performative — it creates the appearance of alignment without the substance.
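The hedge described above can be illustrated with a small profit-and-loss calculation. This sketch deliberately ignores borrow fees, funding rates, margin requirements, and counterparty risk, all of which make real hedges imperfect; the function name and prices are hypothetical.

```python
def combined_pnl(locked_tokens: float, short_tokens: float,
                 entry_price: float, exit_price: float) -> float:
    """Profit or loss of a locked long position plus a short hedge.

    Simplification: no borrow cost, funding, or slippage is modeled.
    """
    long_pnl = locked_tokens * (exit_price - entry_price)
    short_pnl = short_tokens * (entry_price - exit_price)
    return long_pnl + short_pnl

# Investor holds 1M locked tokens; the price falls from $2.00 to $0.50.
unhedged = combined_pnl(1_000_000, 0, entry_price=2.0, exit_price=0.5)
hedged = combined_pnl(1_000_000, 1_000_000, entry_price=2.0, exit_price=0.5)

print(f"unhedged P&L: {unhedged:,.0f}")
print(f"hedged P&L:   {hedged:,.0f}")
```

With a full-size short, the combined position is insensitive to the token price in either direction, which is the sense in which the investor is "locked" but economically "out". The costs omitted here are the reason hedging may still leave partial alignment in practice, as the Challenges section notes.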
This matters because the entire token launch industry is built on the assumption that vesting creates alignment. VCs negotiate lockup terms, projects announce vesting schedules as credibility signals, and retail investors interpret lockups as commitment. If vesting is hedgeable, this entire signaling apparatus is theater.
The implication for ownership coins is significant. Since [[futarchy-governed liquidation is the enforcement mechanism that makes unruggable ICOs credible because investors can force full treasury return when teams materially misrepresent]], ownership coins don't rely on vesting for alignment — they rely on governance enforcement. You can't hedge away a governance right that is actively pricing your decisions and can liquidate your project. Futarchy governance is an alignment mechanism that resists hedging because the alignment comes from ongoing market oversight, not a time-locked contract.
Felipe is scheduled to present the full argument at Blockworks DAS NYC on March 25, 2026. This will be the highest-profile articulation of why standard token launches are broken and what the alternative looks like.
## Evidence
- @TheiaResearch X archive (Mar 2026): Token Problem thesis
- DAS NYC keynote preview: "The Token Problem and Proposed Solutions" (March 25 2026)
- Standard practice: major token launches (Arbitrum, Optimism, Sui, Aptos) all use time-based vesting
- Hedging infrastructure: perp markets, OTC forwards, and options exist for most major token launches, enabling vesting neutralization
## Challenges
- Not all investors can efficiently hedge — small holders, retail, and teams with concentrated positions face higher hedging costs and counterparty risk
- The claim is strongest for large VCs with market access — retail investors genuinely can't hedge their lockups, so vesting does create alignment at the small-holder level
- If hedging is so effective, why do VCs still negotiate vesting terms? Possible answers: signaling to retail, regulatory cover, or because hedging is costly enough to create partial alignment
- The full argument hasn't been publicly presented yet (DAS keynote is March 25) — current evidence is from tweet-level previews, not the complete thesis
---
Relevant Notes:
- [[futarchy-governed liquidation is the enforcement mechanism that makes unruggable ICOs credible because investors can force full treasury return when teams materially misrepresent]] — ownership coins solve the alignment problem that vesting fails to solve
- [[cryptos primary use case is capital formation not payments or store of value because permissionless token issuance solves the fundraising bottleneck that solo founders and small teams face]] — if the capital formation mechanism (vesting) is broken, the primary use case needs a fix
- [[token launches are hybrid-value auctions where common-value price discovery and private-value community alignment require different mechanisms because auction theory optimized for one degrades the other]] — vesting failure is another case where a single mechanism (time lock) can't serve multiple objectives (alignment + price discovery)
Topics:
- [[internet finance and decision markets]]

View file

@ -1,63 +0,0 @@
---
type: source
title: "Deloitte TMT Predictions 2025: Large Studios Will Likely Take Their Time Adopting GenAI for Content Creation"
author: "Deloitte"
url: https://www.deloitte.com/us/en/insights/industry/technology/technology-media-and-telecom-predictions/2025/tmt-predictions-hollywood-cautious-of-genai-adoption.html
date: 2025-01-01
domain: entertainment
secondary_domains: []
format: report
status: null-result
priority: medium
tags: [hollywood, genai-adoption, studio-strategy, production-costs, ip-liability]
processed_by: clay
processed_date: 2026-03-10
extraction_model: "minimax/minimax-m2.5"
extraction_notes: "Extracted two claims: (1) IP liability as structural barrier - a NEW mechanism claim not in KB, distinct from existing sustaining/disruptive claim; (2) 3%/7% quantitative benchmark as enrichment to existing claim. Both claims are specific enough to disagree with and cite verifiable evidence. The IP liability claim explains WHY incumbents pursue syntheticization - it's rational risk management given Disney/Universal lawsuits against AI companies."
---
## Content
Deloitte's 2025 TMT Predictions report provides the most authoritative quantitative estimate of studio GenAI adoption rates.
**Budget allocation:**
- Large studios allocating **less than 3% of production budgets** to generative AI for content creation in 2025
- Approximately **7% of operational spending** shifting toward GenAI-enabled tools (non-content functions)
**Operational adoption areas (studios more comfortable here):**
- Contract and talent management
- Permitting and planning
- Marketing and advertising
- Localization and dubbing
**Why the caution on content creation:**
Studios cite "immaturity of the tools and the challenges of content creation with current public models that may expose them to liability and threaten the defensibility of their intellectual property (IP)."
Studios are "deferring their own risks while they watch to see how the capabilities evolve."
**Key contrast:**
Independent creators and social media platforms are moving quickly to integrate GenAI into workflows WITHOUT the same IP and liability constraints. This creates the asymmetric adoption dynamic between incumbents (cautious) and entrants (fast).
## Agent Notes
**Why this matters:** The 3%/7% split is a crucial data point for my claim about studios pursuing "progressive syntheticization" (making existing workflows cheaper) vs. independents pursuing "progressive control" (starting fully synthetic). The 7% operational vs. 3% content split confirms studios are using AI to sustain existing operations, not disrupt their own content pipeline.
**What surprised me:** The IP liability argument is more concrete than I'd modeled. Disney and Universal lawsuits against AI companies mean studios can't use public models without risking their own IP exposure. This is a specific structural constraint that slows studio adoption regardless of capability thresholds.
**What I expected but didn't find:** Specific dollar amounts or case studies of studios that have experimented with GenAI content and pulled back.
**KB connections:**
- Directly evidences: `GenAI is simultaneously sustaining and disruptive depending on whether users pursue progressive syntheticization or progressive control`
- Evidences: `proxy inertia is the most reliable predictor of incumbent failure because current profitability rationally discourages pursuit of viable futures`
- The IP/liability constraint is a specific mechanism not currently in my KB
**Extraction hints:**
- Claim enrichment: add the 3% content / 7% operational split as evidence for the sustaining vs. disruptive GenAI claim
- New claim candidate: "Studio IP liability exposure from training data creates a structural barrier to GenAI content adoption that independent creators without legacy IP don't face"
- The legal constraint asymmetry between studios and independents is a specific mechanism worth extracting
**Context:** Deloitte TMT Predictions is one of the most authoritative annual industry forecasts. The 3% figure is now widely cited as a benchmark. Published January 2025.
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: `GenAI is simultaneously sustaining and disruptive depending on whether users pursue progressive syntheticization or progressive control`
WHY ARCHIVED: The 3% content / 7% operational split is concrete quantitative evidence for the sustaining vs. disruptive dichotomy. The IP liability mechanism explains WHY incumbents pursue syntheticization — it's rational risk management, not technological incapability.
EXTRACTION HINT: Extract the IP liability constraint as a distinct mechanism claim separate from the general sustaining/disruptive framing.


@ -1,68 +0,0 @@
---
type: source
title: "AI Film Studios Reshape Storytelling in 2025: 65+ AI-Centric Studios, Narrative Craft as Moat"
author: "Media C-Suite (sourcing FBRC March 2025 report)"
url: https://mediacsuite.com/ai-film-studios-reshape-storytelling-in-2025/
date: 2025-03-01
domain: entertainment
secondary_domains: []
format: report
status: unprocessed
priority: medium
tags: [ai-studios, independent-film, production-costs, narrative-craft, democratization]
---
## Content
FBRC's March 2025 report, drawing on 98 self-identified AI studios and founder interviews, documents the proliferation of AI-centric film studios globally.
**Scale:**
- At least **65 AI-centric film studios** have launched globally since 2022
- 30+ launched in 2024 and early 2025 alone
- Nearly 70% operate with **5 or fewer staff members**
**Key studios profiled:**
- **Promise** (co-founded by former YouTube exec Jamie Byrne): Uses AI to reduce costs while enabling mid-budget storytelling; developed proprietary tool *Muse*
- **Asteria** (backed by XTR, DeepMind alumni): Created *Marey*, a legally-compliant AI model addressing IP concerns
- **Shy Kids** (Toronto): GenAI for aesthetic prototyping
**Cost structures:**
- Secret Level: $10M budgets yielding $30M production values through AI-enhanced workflows (3:1 efficiency ratio)
- Staircase Studios: Claims near-studio-quality movies for under $500K (ForwardMotion proprietary AI)
- General: AI studios report 20-30% cost reductions; post-production timelines compressed from months to weeks
**Key insight from founder surveys:**
Nearly all founders confirmed **storytelling capability — not technical prowess — creates the strongest market differentiation.**
Rachel Joy Victor (co-founder): *"Story is dead, long live the story."*
**New specialist roles emerging:**
- Prompt engineers
- Model trainers
- AI-integrated art directors
**Commercial outcomes:** The report contains **no audience reception data or specific commercial outcomes** from AI-produced content. It did draw coverage from IndieWire and Deadline.
## Agent Notes
**Why this matters:** The 65+ studio count and 70% operating with ≤5 people is concrete evidence that the democratization of production IS happening — the infrastructure for independent AI-first content exists. But the absence of commercial outcome data is telling: the market test hasn't been run at scale yet.
**What surprised me:** The "storytelling as moat" consensus among AI studio founders is a direct contradiction of the implicit narrative in my KB that technology capability is the bottleneck. These are the people BUILDING AI studios, and they're saying narrative craft is scarcer than tech. This strengthens my skepticism about the pure democratization thesis.
**What I expected but didn't find:** Distribution and marketing as concrete barriers. The Ankler article separately flags these — "expertise gaps in marketing, distribution & legal" as the real block. This source focuses only on production.
**KB connections:**
- Supports: `five factors determine the speed and extent of disruption including quality definition change and ease of incumbent replication` — the quality definition IS changing (tech → story)
- Relates to: `the TV industry needs diversified small bets like venture capital not concentrated large bets because power law returns dominate` — 65+ studios is the VC portfolio emerging
- Complicates: `non-ATL production costs will converge with the cost of compute`: the finding that nearly 70% of these studios run with 5 or fewer staff shows this is happening, but narrative craft remains human-dependent
**Extraction hints:**
- The 65 studio count + 5-person team size is concrete evidence for the production democratization claim
- The "narrative moat" thesis from founders is a counterpoint worth capturing — could enrich or complicate existing claims
- No commercial outcome data = the demand-side question remains open; don't extract market success claims without evidence
**Context:** FBRC is a media research consultancy. The report drew IndieWire and Deadline coverage — these are the primary trade publications, so the industry is paying attention.
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: `GenAI is simultaneously sustaining and disruptive depending on whether users pursue progressive syntheticization or progressive control`
WHY ARCHIVED: The 65 AI studio proliferation is direct evidence that the "progressive control" (independent, AI-first) path exists and is scaling. The storytelling-as-moat finding is the key nuance — technology democratizes production but doesn't democratize narrative craft.
EXTRACTION HINT: The extractor should focus on the storytelling-as-moat consensus as a potential new claim. The absence of commercial outcomes data is important to preserve — don't infer commercial success from production efficiency.


@ -1,53 +0,0 @@
---
type: source
title: "eMarketer: Consumer Enthusiasm for AI-Generated Creator Content Plummets from 60% to 26%"
author: "eMarketer"
url: https://www.emarketer.com/content/consumers-rejecting-ai-generated-creator-content
date: 2025-07-01
domain: entertainment
secondary_domains: []
format: report
status: unprocessed
priority: high
tags: [consumer-acceptance, ai-content, creator-economy, authenticity, gen-z, ai-slop]
---
## Content
Consumer enthusiasm for AI-generated creator content has dropped from **60% in 2023 to 26% in 2025** — a dramatic collapse as feeds overflow with what viewers call "AI slop."
**Key data (from Billion Dollar Boy, July 2025 survey, 4,000 consumers ages 16+ in US and UK plus 1,000 creators and 1,000 senior marketers):**
- 32% of US and UK consumers say AI is negatively disrupting the creator economy (up from 18% in 2023)
- Consumer enthusiasm for AI-generated creator work: 60% in 2023 → 26% in 2025
- 31% say AI in ads makes them less likely to pick a brand (CivicScience, July 2025)
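The survey shifts above can be read either as percentage-point changes or as relative changes, and the two are easy to conflate. The following is a minimal sketch that computes both from the reported figures; the helper name `change` is illustrative and does not come from the eMarketer or Billion Dollar Boy material.

```python
def change(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change as a fraction)."""
    points = after_pct - before_pct
    relative = points / before_pct
    return points, relative

# Figures as reported in the survey summary above (2023 -> 2025).
enthusiasm_pts, enthusiasm_rel = change(60, 26)
disruption_pts, disruption_rel = change(18, 32)

print(f"Enthusiasm: {enthusiasm_pts:+.0f} points ({enthusiasm_rel:+.1%} relative)")
print(f"'Negatively disrupting': {disruption_pts:+.0f} points ({disruption_rel:+.1%} relative)")
```

With these inputs, the enthusiasm figure falls 34 percentage points, a relative decline of a bit over half. When citing the drop, it is worth stating which of the two measures is meant.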
**Goldman Sachs context (August 2025 survey):**
- 54% of Gen Z prefer no AI involvement in creative work
- Only 13% feel this way about shopping (showing AI tolerance is use-case dependent)
**Brand vs. creator content:**
The data suggests that creator-led AI content faces specific resistance that may differ from the response to branded content. Major brands like Coca-Cola continue releasing AI-generated content despite consumer resistance, suggesting a disconnect between consumer preferences and corporate practice.
## Agent Notes
**Why this matters:** The drop from 60% to 26% enthusiasm in just 2 years (2023→2025) is the single most striking data point in my research session. This happened WHILE AI quality was improving — which means the acceptance barrier is NOT primarily a quality issue. The "AI slop" term becoming mainstream is itself a memetic marker: consumers have developed a label for the phenomenon, which typically precedes organized rejection.
**What surprised me:** The divergence between creative work (54% Gen Z reject AI) vs. shopping (13% reject AI) is a crucial nuance. Consumers are not anti-AI broadly — they're specifically protective of the authenticity/humanity of creative expression. This is an identity and values question, not a quality question.
**What I expected but didn't find:** Expected some evidence of demographic segments where AI content is positively received for entertainment (e.g., interactive AI experiences, AI-assisted rather than AI-generated). Not present in this source.
**KB connections:**
- Directly tests: `GenAI adoption in entertainment will be gated by consumer acceptance not technology capability` — validates the binding constraint but reveals its nature is identity-driven, not capability-driven
- Relates to: `meme propagation selects for simplicity novelty and conformity pressure rather than truth or utility` — the "AI slop" meme may be a rejection cascade
- Relates to belief 4: ownership alignment and authenticity are the same underlying mechanism
**Extraction hints:**
- Claim candidate: "Consumer acceptance of AI creative content is declining despite improving quality because the authenticity signal itself becomes more valuable as AI-human distinction erodes"
- Claim candidate: "The creative-vs-shopping divergence in AI acceptance reveals that consumers distinguish between AI as efficiency tool and AI as creative replacement"
- Note the 60%→26% data requires careful scoping: this is about creator content specifically, not entertainment broadly
**Context:** eMarketer is a primary industry research authority for digital marketing. The 60%→26% figure is heavily cited in industry discussion. Multiple independent sources (IAB, Goldman Sachs, Billion Dollar Boy) converge on the same direction.
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: `GenAI adoption in entertainment will be gated by consumer acceptance not technology capability`
WHY ARCHIVED: The 60%→26% enthusiasm collapse is the clearest longitudinal data point on the trajectory of consumer AI acceptance. The direction is the opposite of what quality improvement alone would predict.
EXTRACTION HINT: The extractor should focus on the NATURE of consumer rejection (identity/values driven) vs. the FACT of rejection. The Goldman Sachs creative-vs-shopping split is the key evidence for the "authenticity as identity" framing.


@ -1,71 +0,0 @@
---
type: source
title: "Pudgy Penguins: $50M Revenue 2025 Target, DreamWorks Partnership, IPO by 2027 — Community-Owned IP Scaling"
author: "Binance Square / Luca Netz interview (aggregated from multiple sources)"
url: https://www.binance.com/en/square/post/08-25-2025-pudgy-penguins-projects-record-revenue-and-future-public-listing-28771847394641
date: 2025-08-01
domain: entertainment
secondary_domains: [internet-finance]
format: report
status: unprocessed
priority: high
tags: [community-owned-ip, pudgy-penguins, web3-entertainment, franchise, revenue, phygital]
flagged_for_rio: ["web3 franchise monetization model and token economics relevant to internet finance domain"]
---
## Content
Pudgy Penguins CEO Luca Netz (August 2025 interview) reveals commercial scale of community-owned IP franchise.
**Revenue metrics:**
- 2025 target: $50M record revenue
- 2026 projection: $120M revenue
- IPO target: by 2027
**Franchise scale:**
- 200 billion total content views across all platforms
- 300 million daily views (community-generated content)
- 2M+ physical product units sold
- 10,000+ retail locations including 3,100 Walmart stores
- $13M+ retail phygital sales
**Gaming expansion:**
- Pudgy Party (mobile game, with Mythical Games): 500K+ downloads in first 2 weeks (August 2025 launch)
- 2026 roadmap: seasonal updates, blockchain-integrated NFT assets
**Entertainment IP expansion:**
- DreamWorks Animation partnership announced October 2025 (Kung Fu Panda cross-promotion)
- Vibes TCG: 4 million cards moved
- Visa Pengu Card launched
**Web3 onboarding strategy:**
"Acquire users through mainstream channels first (toys, retail, viral media), then onboard them into Web3 through games, NFTs and the PENGU token." — Luca Netz
**Community distribution:**
PENGU token airdropped to 6M+ wallets — broad distribution as community building tool.
## Agent Notes
**Why this matters:** Pudgy Penguins is the clearest real-world test of community-owned IP at scale. The projected $50M→$120M revenue trajectory (2025 target, 2026 projection), Walmart distribution, and DreamWorks partnership show a community-native brand competing directly with traditional IP franchises. This is evidence for Belief 2 (community beats budget) and Belief 4 (ownership alignment turns fans into stakeholders) at commercial scale.
**What surprised me:** The DreamWorks partnership is a significant signal. Traditional studios don't partner with community-owned brands unless the commercial metrics are compelling. The fact that DreamWorks specifically is partnering (not a smaller IP licensor) suggests the entertainment establishment is validating the model.
**What I expected but didn't find:** Margin data or specifics on how revenue splits between the Pudgy Penguins company vs. community/holders. The "community-owned" claim needs nuance — the company is building toward an IPO, which suggests traditional corporate ownership is consolidating value even if community economics participate.
**KB connections:**
- Strong evidence for: `community ownership accelerates growth through aligned evangelism not passive holding`
- Strong evidence for: `fanchise management is a stack of increasing fan engagement from content extensions through co-creation and co-ownership`
- The "mainstream first, Web3 second" onboarding strategy is a specific model worth capturing — it reverses the typical NFT playbook
- Complicates Belief 4 (ownership alignment): the IPO trajectory suggests value is accruing primarily to traditional equity rather than to community token holders
**Extraction hints:**
- The "mainstream first, Web3 second" acquisition strategy is a new specific model — distinct from NFT-first approaches that failed
- The DreamWorks partnership as evidence that traditional studios are validating community-native IP
- The token-to-wallet airdrop (6M wallets) as community building infrastructure, not just speculation vehicle
- Flag for Rio: the revenue model and token economics are internet-finance domain
**Context:** Luca Netz, CEO of Pudgy Penguins, is a former toy entrepreneur who repositioned the brand from speculation vehicle to entertainment franchise after acquiring it in 2022. The commercial transformation from NFT project to a franchise targeting $50M in 2025 revenue is one of the most dramatic in Web3 entertainment.
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: `community ownership accelerates growth through aligned evangelism not passive holding`
WHY ARCHIVED: Pudgy Penguins' $50M revenue target for 2025, together with the DreamWorks partnership, is the strongest current evidence that community-owned IP can compete with traditional franchise models at commercial scale. The "mainstream first, Web3 second" strategy is a specific new model.
EXTRACTION HINT: Focus on (1) the commercial scale data as evidence for the community-beats-budget thesis, (2) the mainstream-to-Web3 acquisition funnel as a distinct strategic model, (3) the DreamWorks signal as traditional entertainment validation.


@ -1,62 +0,0 @@
---
type: source
title: "The Ankler: $5M Film? AI Studios Bet on a Cheap Future Hollywood Won't Buy"
author: "Erik Barmack (The Ankler)"
url: https://theankler.com/p/a-5m-film-ai-studios-bet-on-a-cheap
date: 2025-09-01
domain: entertainment
secondary_domains: []
format: report
status: null-result
priority: high
tags: [ai-studios, market-skepticism, distribution, hollywood-resistance, ip-copyright]
processed_by: clay
processed_date: 2026-03-10
extraction_model: "minimax/minimax-m2.5"
extraction_notes: "Extracted three claims from Barmack's analysis. Primary claim focuses on distribution/legal barriers being more binding than production quality - this directly challenges the 'AI democratizes production' thesis. Two supporting claims specify the mechanisms: marketing/distribution infrastructure gap and copyright liability preventing studio acquisition. All claims are specific enough to disagree with and cite verifiable evidence. No duplicates found against existing entertainment domain claims."
---
## Content
Erik Barmack (former Netflix exec, founder of Wild Sheep Content) argues that the real barrier to AI-produced films isn't cost or quality — it's market access.
**Core argument:**
"Stunning, low-cost AI films may still have no market."
**Three specific barriers identified (beyond technology):**
1. **Marketing expertise** — AI studios lack the distribution relationships and marketing infrastructure to get audiences to watch
2. **Distribution access** — streaming platforms and theatrical have existing relationships with established studios
3. **Legal/copyright exposure** — Studios won't buy content "trained — without permission — off of their own characters"
**Hollywood resistance mechanism:**
"Studios are notoriously slow in adopting any new approach to movie-making that undermines decades of their own carefully crafted IP."
**Concrete copyright conflict:**
Disney and Universal lawsuits against Midjourney are mentioned as active legal constraints. Studios acquiring AI-generated content risk legal liability.
**Market signal:**
Barmack mentions specific AI startups (Promise, GRAiL) building full-stack production pipelines — but frames these as proving capability without proving demand.
## Agent Notes
**Why this matters:** This is the most direct counter-argument to the "AI democratizes production → content floods market" thesis. Barmack is an insider (former Netflix) not a Luddite — his framing that distribution/marketing/legal are the real barriers is credible and specific. It shifts the bottleneck analysis from production capability to market access.
**What surprised me:** I hadn't been tracking copyright litigation against AI video generators as a market constraint. If studios won't acquire AI-trained content due to liability, that's a structural distribution barrier independent of quality or consumer acceptance.
**What I expected but didn't find:** Any successful examples of AI-generated content ACQUIRED by a major distributor. The absence confirms the distribution barrier is real.
**KB connections:**
- Directly challenges the optimistic reading of: `GenAI is simultaneously sustaining and disruptive depending on whether users pursue progressive syntheticization or progressive control`
- The distribution barrier suggests the "progressive control" path (independent, AI-first) may be stuck at production without reaching audiences
- Relates to: `five factors determine the speed and extent of disruption including quality definition change and ease of incumbent replication` — ease of DISTRIBUTION replication is the factor not captured
**Extraction hints:**
- New claim candidate: "AI-generated entertainment faces distribution and legal barriers that are more binding than production quality barriers because platform relationships and copyright exposure are incumbent advantages that technology doesn't dissolve"
- This would be a challenge to the simple disruption narrative — worth extracting as a complication
- Note Barmack's credentials: former Netflix exec who has seen disruptive content succeed from inside the machine
**Context:** The Ankler is a premium Hollywood trade newsletter by veteran insiders. Erik Barmack ran international originals at Netflix and has direct experience with what studios buy and why. This source is credible and contrarian within the entertainment industry.
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: `five factors determine the speed and extent of disruption including quality definition change and ease of incumbent replication`
WHY ARCHIVED: This source names distribution, marketing, and copyright as disruption bottlenecks that existing KB claims don't capture. The "low cost but no market" framing is a direct challenge to the democratization narrative.
EXTRACTION HINT: The extractor should focus on the distribution/legal barrier as a distinct mechanism claim, not just a complication to existing claims. The copyright asymmetry (independents can't sell AI content to studios when that content may have been trained, without permission, on the studios' own characters) is the most extractable specific mechanism.


@ -1,55 +0,0 @@
---
type: source
title: "a16z State of Consumer AI 2025: Product Hits, Misses, and What's Next"
author: "Andreessen Horowitz (a16z)"
url: https://a16z.com/state-of-consumer-ai-2025-product-hits-misses-and-whats-next/
date: 2025-12-01
domain: entertainment
secondary_domains: []
format: report
status: unprocessed
priority: medium
tags: [ai-consumer-products, video-generation, retention, chatgpt, sora, google-veo]
---
## Content
a16z's annual consumer AI landscape report documents adoption patterns across major AI product categories.
**Market concentration:**
- Fewer than 10% of ChatGPT weekly users even visited another major model provider — "winner take most" dynamics
- ChatGPT: 800-900 million weekly active users; 36% daily-to-monthly ratio
- Gemini: 21% daily-to-monthly ratio; but growing faster (155% YoY desktop users vs. ChatGPT 23%)
- Gemini Pro subscriptions: 300% YoY growth vs. ChatGPT 155%
**AI image and video generation (entertainment-relevant):**
- Google Nano Banana model: 200 million images in first week, 10 million new users
- **Veo 3 breakthrough:** Combined visual AND audio generation in one model
- **Sora standalone app:** 12 million downloads, but **below 8% retention at day 30** (benchmark for top apps is 30%+)
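To make the Sora figures concrete, the sketch below turns the reported percentages into rough user counts. It rests on one loud assumption: that the 12 million downloads can stand in for the retention cohort, which the a16z summary does not confirm. The variable names are illustrative.

```python
# Reported figures from the list above.
downloads = 12_000_000      # Sora standalone app downloads
d30_retention_cap_pct = 8   # retention is reported as *below* 8% at day 30
top_app_benchmark_pct = 30  # reported benchmark for top apps (30%+)

# Assumption: downloads stand in for the retention cohort, which the
# report does not confirm. Integer math keeps the bounds exact.
max_retained = downloads * d30_retention_cap_pct // 100
min_lapsed_pct = 100 - d30_retention_cap_pct
benchmark_retained = downloads * top_app_benchmark_pct // 100

print(f"Retained at day 30: fewer than {max_retained:,}")
print(f"Lapsed by day 30: more than {min_lapsed_pct}%")
print(f"Retained at the 30% benchmark: {benchmark_retained:,}")
```

Because the source gives only an upper bound on retention, the outputs are bounds rather than point estimates.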
**Key insight:**
"Huge white space for founders" building dedicated consumer experiences outside corporate platforms, as major labs focus on model development and existing-product feature additions.
## Agent Notes
**Why this matters:** The Sora retention data is the single most important number in this report for my research. 12 million people downloaded the AI video generation app — and 92%+ stopped using it within a month. This is the clearest demand-side signal: even enthusiastic early adopters who sought out AI video generation aren't forming habits. This is NOT a quality problem (Sora was state-of-the-art at launch) — it's a use-case problem.
**What surprised me:** The "winner take most" dynamic in AI assistants contrasts sharply with AI video. ChatGPT keeps its users (fewer than 10% of weekly users even visit another major provider); Sora retains fewer than 8% at day 30. This suggests AI for video creation doesn't yet have a compelling enough use case to sustain daily/weekly habits the way text AI does.
**What I expected but didn't find:** Data on what Sora's 12M downloaders actually used it for, and why they stopped. Entertainment creation? One-time curiosity? The retention failure is clear; the mechanism is opaque.
**KB connections:**
- The Sora retention data supports: `GenAI adoption in entertainment will be gated by consumer acceptance not technology capability` — here, technology is sufficient but consumers aren't forming habits
- Complicates the narrative that AI video democratizes entertainment creation — if creators themselves don't retain, the democratization isn't happening at scale
- Connects to the eMarketer 60%→26% enthusiasm collapse: the Sora retention mirrors that drop
**Extraction hints:**
- The Sora sub-8% day-30 retention figure is a specific, citable data point for the consumer acceptance binding constraint claim
- The Veo 3 audio+video integration is noteworthy for production cost convergence — it's the first model producing what was previously multi-tool production
- The "white space for founders" observation is a potential strategic insight for community-owned entertainment models
**Context:** a16z is the leading VC firm in both AI and consumer tech. This report is their authoritative annual landscape scan. The Sora data is especially valuable because OpenAI would be unlikely to highlight these retention numbers publicly.
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: `GenAI adoption in entertainment will be gated by consumer acceptance not technology capability`
WHY ARCHIVED: Sora's sub-8% D30 retention is quantitative evidence that even among early adopters, AI video creation doesn't form habits. This validates the consumer acceptance binding constraint claim and specifically situates it as a demand/use-case problem, not a quality problem.
EXTRACTION HINT: Focus on Sora retention as a specific, quantifiable evidence point. Distinguish this from passive consumption of AI content — this is about consumer CREATION using AI tools, which is a different behavior than acceptance of AI-generated content.


@ -1,60 +0,0 @@
---
type: source
title: "EY 2026 Media and Entertainment Trends: Simplicity, Authenticity and the Rise of Experiences"
author: "EY (Ernst & Young)"
url: https://www.ey.com/en_us/insights/media-entertainment/2026-media-and-entertainment-trends-simplicity-authenticity-and-the-rise-of-experiences
date: 2026-01-01
domain: entertainment
secondary_domains: []
format: report
status: unprocessed
priority: high
tags: [authenticity, ai-content, media-trends, consumer-preferences, streaming, podcast]
---
## Content
EY's 2026 M&E trends report identifies a critical tension: AI productivity tools are expanding across entertainment production while synthetic "AI slop" is simultaneously proliferating, eroding consumer trust.
**Trust collapse:**
- September 2025 Gallup poll: confidence in news organizations at lowest level on record — 28%
- Steeper declines among younger audiences
**Strategic implication:**
Authenticity becomes a competitive advantage. Media leaders advised to blend AI-driven efficiencies with human creativity, ensuring audiences encounter "recognizably human" content—genuine storytelling and distinctive editorial judgment.
**Consumer entertainment preferences (from EY Decoding the Digital Home 2025 Study):**
Consumers don't want MORE content; they want:
- Better mix of live TV, channels, and dedicated apps
- Greater customization and guidance
- Overall simplification
Fragmentation remains the primary pain point, particularly for sports fans navigating rising costs and fragmented rights.
**Podcast market growth:**
- Global podcast market projected to surge from $7.7 billion in 2024 to $41.1 billion by 2029
- 39.9% CAGR — underscoring format's staying power and importance of long-form human voice
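As a quick consistency check on the podcast figures above, the sketch below recomputes the implied compound annual growth rate from the two endpoints. It assumes 2024 to 2029 is treated as five compounding years; the function name `cagr` is illustrative and not taken from the EY report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 == 10%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1

# Assumption: 2024 -> 2029 is treated as five compounding periods.
# Values are in billions of USD, as reported above.
podcast_cagr = cagr(7.7, 41.1, 5)
print(f"Implied CAGR: {podcast_cagr:.1%}")
```

Under that assumption the result lands within a fraction of a percentage point of the reported 39.9%, so the endpoints and the stated growth rate agree, with the small gap plausibly due to rounding in the source.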
## Agent Notes
**Why this matters:** EY's "authenticity as competitive advantage" framing is exactly the mechanism my KB needs to explain why studios might rationally invest in demonstrated human creative direction even as AI costs fall. It's not nostalgia — it's that authenticity is becoming a premium differentiator in a world of infinite cheap content.
**What surprised me:** The consumer preference for SIMPLIFICATION (fewer services, better guidance) contradicts the intuitive assumption that more content options = better. Consumers aren't suffering from too little — they're suffering from too much. This has implications for the community-filtered IP thesis: communities as curation layers are more valuable than I'd modeled.
**What I expected but didn't find:** Specific data on what percentage of media consumers actively seek "human-certified" content, or whether AI disclosure requirements are moving into regulation.
**KB connections:**
- Strengthens: `the media attractor state is community-filtered IP with AI-collapsed production costs where content becomes a loss leader for the scarce complements of fandom community and ownership`
- Connects to: `information cascades create power law distributions in culture because consumers use popularity as a quality signal when choice is overwhelming` — the simplification desire is the same phenomenon
- The podcast growth data supports: `complex ideas propagate with higher fidelity through personal interaction than mass media because nuance requires bidirectional communication`
**Extraction hints:**
- Potential claim enrichment: add authenticity premium data to `consumer definition of quality is fluid and revealed through preference not fixed by production value`
- New claim candidate: "Content fragmentation has reached the point where simplification and curation are more valuable to consumers than additional content quantity"
- The podcast CAGR (39.9%) as evidence that human voice and intimacy retain premium value in an AI-heavy content environment
**Context:** EY M&E practice works with major studios and platforms on strategy. This report is credible signal about where enterprise entertainment investment is heading.
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: `the media attractor state is community-filtered IP with AI-collapsed production costs where content becomes a loss leader for the scarce complements of fandom community and ownership`
WHY ARCHIVED: The "simplification demand" finding reframes the attractor state — consumers want less content but better curation. The authenticity-as-competitive-advantage thesis names the mechanism by which community-owned IP (which signals human creativity) commands a premium.
EXTRACTION HINT: Focus on (1) simplification demand as evidence that curation is scarce, not content, and (2) authenticity-as-premium as a claim that can sit alongside (not contradict) AI cost-collapse claims.


@ -1,63 +0,0 @@
---
type: source
title: "Survey: Audiences' Top AI Concern Is Blurred Reality — 91% Want AI Content Labeling Required"
author: "Advanced Television (sourcing audience survey)"
url: https://www.advanced-television.com/2026/01/15/survey-audiences-top-ai-concern-is-blurred-reality
date: 2026-01-15
domain: entertainment
secondary_domains: []
format: report
status: null-result
priority: medium
tags: [consumer-acceptance, ai-disclosure, authenticity, trust, regulation, uk-audience]
processed_by: clay
processed_date: 2026-03-10
extraction_model: "minimax/minimax-m2.5"
extraction_notes: "Extracted 3 claims from UK audience survey. First claim identifies the epistemic vs aesthetic distinction in consumer objections (62% being misled vs 51% quality). Second claim captures the counterintuitive hybrid preference finding that AI+human scores better than either pure category. Third claim captures the 91% disclosure demand as regulatory pressure indicator. All claims build on existing KB claim about consumer acceptance gating GenAI adoption. No duplicates found in existing entertainment claims."
---
## Content
Survey data on UK audience attitudes toward AI content in entertainment, focused on trust and disclosure.
**Key data points:**
- Only **26% of UK adults** say they would engage with content if they knew it was created or co-created by AI
- 53% say they would NOT engage with AI-created/co-created content
- **91% of UK adults** think platforms should be required to clearly label AI-generated content
- 72% say companies should ALWAYS disclose if AI was used in any way
- Additional 21% say companies should disclose if AI played a MAJOR role
**Top AI concerns (audiences):**
1. Being misled by AI-generated content (62%)
2. Losing ability to distinguish what is real
3. AI-generated actors and performances (discomfort even among those otherwise comfortable with AI)
4. Authenticity (67% cite)
5. Quality of AI-generated material (51%)
**Hybrid model finding:**
Hybrid human-AI collaboration is perceived MORE favorably and gains BROADER acceptance compared to fully AI-generated OR purely human-created content. A middle ground is more acceptable.
## Agent Notes
**Why this matters:** The 26%/53% accept/reject split is the clearest consumer acceptance data point I found. More than half of audiences would actively decline to engage with content they know is AI-generated. This is not about an inability to detect AI; it is an active choice to avoid it. The "blurred reality" framing (top concern) shows where the anxiety lies: epistemics and trust, not aesthetics.
**What surprised me:** The hybrid finding — that AI + human collaboration scores BETTER than either purely human or purely AI content — is counterintuitive and important. It suggests the consumer objection is to REPLACEMENT of human creativity, not to AI ASSISTANCE. This is a significant nuance that my KB doesn't currently capture.
**What I expected but didn't find:** Data on whether the 26% accept / 53% reject split varies by content type (entertainment vs. news vs. advertising). The survey framing seems general rather than entertainment-specific.
**KB connections:**
- Directly validates: `GenAI adoption in entertainment will be gated by consumer acceptance not technology capability`
- The "blurred reality" concern relates to: `meme propagation selects for simplicity novelty and conformity pressure rather than truth or utility` — the authenticity concern is about epistemic grounding
- The hybrid preference complicates the binary in my KB — the attractor state may not be "AI vs. human" but "AI-augmented human"
- Connects to EY authenticity premium finding
**Extraction hints:**
- New claim candidate: "Consumer acceptance of AI entertainment content is contingent on transparency because the primary objection is epistemic (being misled) not aesthetic (quality)"
- The hybrid preference is a key nuance: consumers accept AI assistance but reject AI replacement — this distinction should be in the KB
- The 91% disclosure demand suggests regulatory pressure is coming regardless of industry preference
**Context:** Advanced Television covers UK/European broadcast industry. The 91% disclosure finding is relevant to upcoming EU AI Act provisions and UK regulatory discussions.
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: `GenAI adoption in entertainment will be gated by consumer acceptance not technology capability`
WHY ARCHIVED: The 26/53 accept/reject split is the clearest consumer acceptance data. The "epistemic not aesthetic" nature of the objection (concern about being misled, not about quality) is a new framing that enriches the binding constraint claim.
EXTRACTION HINT: Focus on (1) transparency as mechanism: labeling changes the consumer decision, (2) the hybrid preference as evidence that AI assistance ≠ AI replacement in consumer minds, (3) the 91% disclosure demand as regulatory pressure indicator.

View file

@ -1,61 +0,0 @@
---
type: source
title: "Seedance 2.0 vs Kling 3.0 vs Veo 3.1: AI Video Benchmark 2026 — Capability Milestone Assessment"
author: "AI Journal / Evolink AI / Lantaai (aggregated benchmark reviews)"
url: https://aijourn.com/seedance-2-0-vs-kling-3-0-vs-veo-3-1-ai-video-benchmark-test-for-2026/
date: 2026-02-01
domain: entertainment
secondary_domains: []
format: report
status: unprocessed
priority: medium
tags: [ai-video-generation, seedance, production-costs, quality-threshold, capability]
---
## Content
Aggregated benchmark data on the leading AI video generation models in 2026 (Seedance 2.0, Kling 3.0, Veo 3.1).
**Seedance 2.0 technical capabilities:**
- Ranked #1 globally on Artificial Analysis benchmark
- Native 2K resolution (2048x1080 landscape / 1080x2048 portrait) — up from 1080p max in Seedance 1.5 Pro
- Dynamic duration: 4s to 15s per generation (longest in flagship category)
- 30% faster throughput than Seedance 1.5 Pro at equivalent complexity
- Hand anatomy: near-perfect score — complex finger movements (magician shuffling cards, pianist playing) with zero visible hallucinations or warped limbs
- Supports 8+ languages for phoneme-level lip-sync
**Test methodology (benchmark reviews):**
- 50+ generations per model
- Identical prompt set of 15 categories
- 4 seconds at 720p/24fps per clip
- Rated on 6 dimensions (0-10) by 2 independent reviewers, normalized to 0-100
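The scoring scheme above can be sketched as follows. This is a minimal illustration, assuming the normalization is a plain unweighted average across reviewers and dimensions, rescaled from 0-10 to 0-100. The benchmark reviews do not state the exact formula, and `normalize_scores` and the example ratings are hypothetical.

```python
from statistics import mean


def normalize_scores(reviewer_scores):
    """Average 0-10 dimension ratings across reviewers and rescale to 0-100.

    reviewer_scores: one inner list of dimension ratings per reviewer.
    Assumption: unweighted mean; the source does not give the formula.
    """
    for scores in reviewer_scores:
        if any(not 0 <= s <= 10 for s in scores):
            raise ValueError("ratings must be in [0, 10]")
    # Mean per reviewer first, then mean across reviewers.
    per_reviewer = [mean(scores) for scores in reviewer_scores]
    return mean(per_reviewer) * 10


# Hypothetical ratings: 2 reviewers x 6 dimensions (not real benchmark data).
example_score = normalize_scores([[8, 9, 7, 8, 9, 7], [7, 9, 8, 8, 8, 8]])
```

With only two reviewers, a single outlier rating moves the final score noticeably, which is one more reason to treat these benchmark numbers as indicative rather than precise.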
**Competitive landscape:**
- Kling 3.0 edges ahead for straightforward video generation (ease of use)
- Seedance 2.0 wins for precise creative control
- Google Veo 3 (with audio) is also competing; its breakthrough was combining visual and audio generation
- Sora standalone app: 12 million downloads but retention below 8% at day 30
## Agent Notes
**Why this matters:** Hand anatomy was the most visible "tell" of AI-generated video in 2024. The near-perfect hand score is the clearest signal that a capability threshold has been crossed. Combined with the lip-sync quality across languages, AI video has cleared the technical bar for live-action substitution in many use cases. This data updates my KB — the quality moat objection weakens significantly.
**What surprised me:** Sora's retention problem (below 8% at day 30, vs. 30%+ benchmark for top apps) suggests that even among early adopters, AI video generation hasn't created a compelling consumer habit. This is the supply side discovering the demand side constraint.
**What I expected but didn't find:** Benchmarks from actual entertainment productions using these tools — the benchmarks here are synthetic test prompts, not real production scenarios. The gap between benchmark performance and production-ready utility may still be significant.
**KB connections:**
- Tests: `consumer definition of quality is fluid and revealed through preference not fixed by production value` — if quality can no longer be distinguished, "production value" as a moat claim collapses
- Weakens the "quality moat" challenge to Belief 3
- The Sora retention data actually SUPPORTS the consumer acceptance binding constraint (demand, not supply, is limiting adoption)
**Extraction hints:**
- Claim enrichment: update `non-ATL production costs will converge with the cost of compute as AI replaces labor across the production chain` with 2026 capability evidence
- Note: benchmark-to-production gap is important — don't overclaim from synthetic benchmarks
- The Sora retention data is the surprising signal — 12M downloads but <8% D30 retention suggests demand-side problem even among enthusiasts
**Context:** ByteDance (Seedance), Google (Veo), Runway (partnered with Lionsgate), and Pika Labs are the main competitors in AI video. Benchmark season in early 2026 reflects major capability jumps from late 2025 models.
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: `non-ATL production costs will converge with the cost of compute as AI replaces labor across the production chain`
WHY ARCHIVED: The near-perfect hand anatomy score signals that the quality threshold for realistic video has largely been cleared, which shifts the remaining barriers to consumer acceptance (demand side) and creative direction (human judgment) rather than raw capability.
EXTRACTION HINT: The Sora retention data (supply without demand) is the most extractable insight. A claim about AI video tool adoption being demand-constrained despite supply capability would be new to the KB.

View file

@ -1,30 +0,0 @@
---
type: source
title: "CLIs are exciting because they're legacy technology — AI agents can natively use them, combine them, interact via terminal"
author: "Andrej Karpathy (@karpathy)"
twitter_id: "33836629"
url: https://x.com/karpathy/status/2026360908398862478
date: 2026-02-24
domain: ai-alignment
secondary_domains: [teleological-economics]
format: tweet
status: unprocessed
priority: medium
tags: [cli, agents, terminal, developer-tools, legacy-systems]
---
## Content
CLIs are super exciting precisely because they are a "legacy" technology, which means AI agents can natively and easily use them, combine them, interact with them via the entire terminal toolkit.
E.g ask your Claude/Codex agent to install this new Polymarket CLI and ask for any arbitrary dashboards or interfaces or logic. The agents will build it for you. Install the Github CLI too and you can ask them to navigate the repo, see issues, PRs, discussions, even the code itself.
## Agent Notes
**Why this matters:** 11.7K likes. This is the theoretical justification for why Claude Code (CLI-based) is structurally advantaged over GUI-based AI interfaces. Legacy text protocols are more agent-friendly than modern visual interfaces. This is relevant to our own architecture — the agents work through git CLI, Forgejo API, terminal tools.
**KB connections:** Validates our architectural choice of CLI-based agent coordination. Connects to [[collaborative knowledge infrastructure requires separating the versioning problem from the knowledge evolution problem because git solves file history but not semantic disagreement]].
**Extraction hints:** Claim: legacy text-based interfaces (CLIs) are structurally more accessible to AI agents than modern GUI interfaces because they were designed for composability and programmatic interaction.
**Context:** Karpathy explicitly mentions Claude and Polymarket CLI — connecting AI agents with prediction markets through terminal tools. Relevant to the Teleo stack.

View file

@ -1,28 +0,0 @@
---
type: source
title: "Programming fundamentally changed in December 2025 — coding agents basically didn't work before and basically work since"
author: "Andrej Karpathy (@karpathy)"
twitter_id: "33836629"
url: https://x.com/karpathy/status/2026731645169185220
date: 2026-02-25
domain: ai-alignment
secondary_domains: [teleological-economics]
format: tweet
status: unprocessed
priority: medium
tags: [coding-agents, ai-capability, phase-transition, software-development, disruption]
---
## Content
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn't work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow.
## Agent Notes
**Why this matters:** 37K likes — Karpathy's most viral tweet in this dataset. This is the "phase transition" observation from the most authoritative voice in AI dev tooling. December 2025 as the inflection point for coding agents.
**KB connections:** Supports [[as AI-automated software development becomes certain the bottleneck shifts from building capacity to knowing what to build]]. Relates to [[the gap between theoretical AI capability and observed deployment is massive across all occupations]] — but suggests the gap is closing fast for software specifically.
**Extraction hints:** Claim candidate: coding agent capability crossed a usability threshold in December 2025, representing a phase transition not gradual improvement. Evidence: Karpathy's direct experience running agents on nanochat.
**Context:** This tweet preceded the autoresearch project by ~10 days. The 37K likes suggest massive resonance across the developer community. The "asterisks" he mentions are important qualifiers that a good extraction should preserve.

View file

@ -1,44 +0,0 @@
---
type: source
title: "8-agent research org experiments reveal agents generate bad ideas but execute well — the source code is now the org design"
author: "Andrej Karpathy (@karpathy)"
twitter_id: "33836629"
url: https://x.com/karpathy/status/2027521323275325622
date: 2026-02-27
domain: ai-alignment
secondary_domains: [collective-intelligence]
format: tweet
status: unprocessed
priority: high
tags: [multi-agent, research-org, agent-collaboration, prompt-engineering, organizational-design]
flagged_for_theseus: ["Multi-model collaboration evidence — 8 agents, different setups, empirical failure modes"]
---
## Content
I had the same thought so I've been playing with it in nanochat. E.g. here's 8 agents (4 claude, 4 codex), with 1 GPU each running nanochat experiments (trying to delete logit softcap without regression). The TLDR is that it doesn't work and it's a mess... but it's still very pretty to look at :)
I tried a few setups: 8 independent solo researchers, 1 chief scientist giving work to 8 junior researchers, etc. Each research program is a git branch, each scientist forks it into a feature branch, git worktrees for isolation, simple files for comms, skip Docker/VMs for simplicity atm (I find that instructions are enough to prevent interference). Research org runs in tmux window grids of interactive sessions (like Teams) so that it's pretty to look at, see their individual work, and "take over" if needed, i.e. no -p.
But ok the reason it doesn't work so far is that the agents' ideas are just pretty bad out of the box, even at highest intelligence. They don't think carefully through experiment design, they run a bit nonsensical variations, they don't create strong baselines and ablate things properly, they don't carefully control for runtime or flops. (just as an example, an agent yesterday "discovered" that increasing the hidden size of the network improves the validation loss, which is a totally spurious result given that a bigger network will have a lower validation loss in the infinite data regime, but then it also trains for a lot longer, it's not clear why I had to come in to point that out). They are very good at implementing any given well-scoped and described idea but they don't creatively generate them.
But the goal is that you are now programming an organization (e.g. a "research org") and its individual agents, so the "source code" is the collection of prompts, skills, tools, etc. and processes that make it up. E.g. a daily standup in the morning is now part of the "org code". And optimizing nanochat pretraining is just one of the many tasks (almost like an eval). Then - given an arbitrary task, how quickly does your research org generate progress on it?
## Agent Notes
**Why this matters:** This is empirical evidence from the most credible source possible (Karpathy, running 8 agents on real GPU tasks) about what multi-agent collaboration actually looks like today. Key finding: agents execute well but generate bad ideas. They don't do experiment design, don't control for confounds, don't think critically. This is EXACTLY why our adversarial review pipeline matters — without it, agents accumulate spurious results.
**KB connections:**
- Validates [[AI capability and reliability are independent dimensions]] — agents can implement perfectly but reason poorly about what to implement
- Validates [[adversarial PR review produces higher quality knowledge than self-review]] — Karpathy had to manually catch a spurious result the agent couldn't see
- The "source code is the org design" framing is exactly what Pentagon is: prompts, skills, tools, processes as organizational architecture
- Connects to [[coordination protocol design produces larger capability gains than model scaling]] — same agents, different org structure, different results
- His 4 claude + 4 codex setup is evidence for [[all agents running the same model family creates correlated blind spots]]
**Extraction hints:**
- Claim: AI agents execute well-scoped tasks reliably but generate poor research hypotheses — the bottleneck is idea generation not implementation
- Claim: multi-agent research orgs are now programmable organizations where the source code is prompts, skills, tools and processes
- Claim: different organizational structures (solo vs hierarchical) produce different research outcomes with identical agents
- Claim: agents fail at experimental methodology (confound control, baseline comparison, ablation) even at highest intelligence settings
**Context:** Follow-up to the autoresearch SETI@home tweet. Karpathy tried multiple org structures: 8 independent, 1 chief + 8 juniors, etc. Used git worktrees for isolation (we use the same pattern in Pentagon). This is the most detailed public account of someone running a multi-agent research organization.
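The isolation pattern described in the tweet (each research program is a branch, each agent forks it into a feature branch checked out in its own git worktree) might look like the sketch below. It only builds the `git worktree add` commands. The agent names, the branch naming scheme, and the directory layout are illustrative assumptions, not details from the tweet.

```python
def worktree_commands(program_branch, agents, root="worktrees"):
    """Build one `git worktree add` command per agent.

    Each agent gets a new feature branch forked from the shared research
    program branch, checked out in its own directory.
    Branch and path naming are hypothetical. A "-" separator is used because
    git cannot hold both a branch `foo` and a branch `foo/bar`.
    """
    commands = []
    for agent in agents:
        feature_branch = f"{program_branch}-{agent}"
        path = f"{root}/{agent}"
        commands.append(
            ["git", "worktree", "add", "-b", feature_branch, path, program_branch]
        )
    return commands


# Hypothetical program branch and agent names.
commands = worktree_commands("softcap-removal", ["claude-1", "codex-1"])
```

Each command list can be passed to `subprocess.run(cmd, check=True)` from inside the repository. Keeping command construction separate from execution makes the layout easy to inspect before anything touches disk.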

View file

@ -1,39 +0,0 @@
---
type: source
title: "Permissionless MetaDAO launches create new cultural primitives around fundraising"
author: "Felipe Montealegre (@TheiaResearch)"
twitter_id: "1511793131884318720"
url: https://x.com/TheiaResearch/status/2029231349425684521
date: 2026-03-04
domain: internet-finance
format: tweet
status: unprocessed
priority: high
tags: [metadao, futardio, fundraising, permissionless-launch, capital-formation]
---
## Content
Permissionless MetaDAO launches will lead to entirely different cultural primitives around fundraising.
1. Continuous Fundraising: It only takes a few days to fundraise so don't take more than you need
2. Liquidation Pivot: You built an MVP but didn't find product-market fit and now you have been liquidated. Try again on another product or strategy.
3. Multiple Attempts: You didn't fill your minimum raise? Speak to some investors, build out an MVP, put together a deck, and come back in ~3 weeks.
4. Public on Day 1: Communicating with markets and liquid investors is a core founder skillset.
5. 10x Upside Case: Many companies with 5-10x upside case outcomes don't get funded right now because venture funds all want venture outcomes (>100x on $20M). What if you just want to build a $25M company with a decent probability of success? Raise $1M and the math works fine for Futardio investors.
Futardio is a paradigm shift for capital markets. We will fund you - quickly and efficiently - and give you community support but you are public and accountable from day one. Welcome to the arena.
## Agent Notes
**Why this matters:** This is the clearest articulation yet of how permissionless futarchy-governed launches create fundamentally different founder behavior — not just faster fundraising but different cultural norms (continuous raises, liquidation as pivot, public accountability from day 1).
**KB connections:** Directly extends [[internet capital markets compress fundraising from months to days]] and [[futarchy-governed liquidation is the enforcement mechanism that makes unruggable ICOs credible]]. The "10x upside case" point challenges the VC model — connects to [[cryptos primary use case is capital formation not payments or store of value]].
**Extraction hints:** At least 2-3 claims here: (1) permissionless launches create new fundraising cultural norms, (2) the 10x upside gap in traditional VC is a market failure that futarchy-governed launches solve, (3) public accountability from day 1 is a feature not a bug.
**Context:** Felipe Montealegre runs Theia Research, a crypto-native investment firm focused on MetaDAO ecosystem. He's been one of the most articulate proponents of the futarchy-governed capital formation thesis. This tweet got 118 likes — high engagement for crypto-finance X.
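The arithmetic behind the "10x upside case" can be made explicit. The sketch below derives the entry valuation implied by the tweet's own figures (a $25M outcome, a 5-10x return, a $1M raise). Reading the multiple as exit value over post-money entry valuation, with no later dilution, is a simplifying assumption, and the function names are hypothetical.

```python
def implied_entry_valuation(exit_value, multiple):
    """Entry valuation at which reaching exit_value yields the given multiple."""
    return exit_value / multiple


def raise_share(raise_amount, valuation):
    """Fraction of the (post-money) valuation represented by the raise."""
    return raise_amount / valuation


EXIT_VALUE = 25_000_000  # "$25M company" from the tweet
RAISE = 1_000_000        # "Raise $1M" from the tweet

for multiple in (5, 10):
    valuation = implied_entry_valuation(EXIT_VALUE, multiple)
    share = raise_share(RAISE, valuation)
    print(f"{multiple}x: entry valuation ${valuation:,.0f}, raise is {share:.0%} of it")
```

Under these assumptions the $1M raise implies an entry valuation between $2.5M and $5M, far below the scale at which a fund seeking >100x on $20M can participate.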

View file

@ -1,47 +0,0 @@
---
type: source
title: "Autoresearch must become asynchronously massively collaborative for agents — emulating a research community, not a single PhD student"
author: "Andrej Karpathy (@karpathy)"
twitter_id: "33836629"
url: https://x.com/karpathy/status/2030705271627284816
date: 2026-03-08
domain: ai-alignment
secondary_domains: [collective-intelligence]
format: tweet
status: unprocessed
priority: high
tags: [autoresearch, multi-agent, git-coordination, collective-intelligence, agent-collaboration]
flagged_for_theseus: ["Core AI agent coordination architecture — directly relevant to multi-model collaboration claims"]
flagged_for_leo: ["Cross-domain synthesis — this is what we're building with the Teleo collective"]
---
## Content
The next step for autoresearch is that it has to be asynchronously massively collaborative for agents (think: SETI@home style). The goal is not to emulate a single PhD student, it's to emulate a research community of them.
Current code synchronously grows a single thread of commits in a particular research direction. But the original repo is more of a seed, from which could sprout commits contributed by agents on all kinds of different research directions or for different compute platforms. Git(Hub) is *almost* but not really suited for this. It has a softly built in assumption of one "master" branch, which temporarily forks off into PRs just to merge back a bit later.
I tried to prototype something super lightweight that could have a flavor of this, e.g. just a Discussion, written by my agent as a summary of its overnight run:
https://t.co/tmZeqyDY1W
Alternatively, a PR has the benefit of exact commits:
https://t.co/CZIbuJIqlk
but you'd never want to actually merge it... You'd just want to "adopt" and accumulate branches of commits. But even in this lightweight way, you could ask your agent to first read the Discussions/PRs using GitHub CLI for inspiration, and after its research is done, contribute a little "paper" of findings back.
I'm not actually exactly sure what this should look like, but it's a big idea that is more general than just the autoresearch repo specifically. Agents can in principle easily juggle and collaborate on thousands of commits across arbitrary branch structures. Existing abstractions will accumulate stress as intelligence, attention and tenacity cease to be bottlenecks.
## Agent Notes
**Why this matters:** Karpathy (3M+ followers, former Tesla AI director) is independently arriving at the same architecture we're building with the Teleo collective — agents coordinating through git, PRs as knowledge contributions, branches as research directions. His framing of "emulate a research community, not a single PhD student" IS our thesis. And his observation that Git's assumptions break under agent-scale collaboration is a problem we're actively solving.
**KB connections:**
- Directly validates [[coordination protocol design produces larger capability gains than model scaling]]
- Challenges/extends [[the same coordination protocol applied to different AI models produces radically different problem-solving strategies]] — Karpathy found that 8 agents with different setups (solo vs hierarchical) produced different results
- Relevant to [[domain specialization with cross-domain synthesis produces better collective intelligence]]
- His "existing abstractions will accumulate stress" connects to the git-as-coordination-substrate thesis
**Extraction hints:**
- Claim: agent research communities outperform single-agent research because the goal is to emulate a community not an individual
- Claim: git's branch-merge model is insufficient for agent-scale collaboration because it assumes one master branch with temporary forks
- Claim: when intelligence and attention cease to be bottlenecks, existing coordination abstractions (git, PRs, branches) accumulate stress
**Context:** This is part of a series of tweets about Karpathy's autoresearch project, in which AI agents autonomously iterate on nanochat (minimal GPT training code). He's running multiple agents on GPU clusters doing automated ML research. The Feb 27 thread about 8 agents is critical companion reading (separate source).

View file

@ -1,63 +0,0 @@
---
type: source
title: "@01Resolved X archive — 100 most recent tweets"
author: "01Resolved (@01Resolved)"
url: https://x.com/01Resolved
date: 2026-03-09
domain: internet-finance
format: tweet
status: processed
processed_by: rio
processed_date: 2026-03-09
enrichments:
- "MetaDAOs futarchy implementation shows limited trading volume in uncontested decisions"
- "futarchy-governed liquidation is the enforcement mechanism that makes unruggable ICOs credible because investors can force full treasury return when teams materially misrepresent"
tags: [metadao, governance-analytics, ranger-liquidation, solomon, decision-markets, turbine]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Analyst account providing the deepest on-chain forensics of MetaDAO governance events.
This is the data layer — while Proph3t provides ideology and Felipe provides thesis,
01Resolved provides the numbers. Key contribution: Ranger liquidation forensics with
exact trader counts, volume, alignment percentages. Also tracking Solomon treasury
governance and Turbine buyback mechanics. Low follower count (~500) but extremely high
signal density — this is the account writing the kind of analysis we should be writing.
extraction_hints:
- "Ranger liquidation forensics: 92.41% pass-aligned, 33 traders, $119K volume — data for enriching futarchy governance claims"
- "Solomon treasury subcommittee analysis — evidence for 'futarchy-governed DAOs converge on traditional corporate governance scaffolding'"
- "Turbine buyback TWAP threshold filtering — mechanism design detail, potential new claim about automated treasury management"
- "Decision market participation data — contributes to 'MetaDAOs futarchy implementation shows limited trading volume in uncontested decisions'"
- "Cross-reference: do contested decisions show higher volume than uncontested? The Ranger liquidation data vs routine proposals could test this"
priority: high
---
# @01Resolved X Archive (March 2026)
## Substantive Tweets
### Ranger Liquidation Forensics
- 92.41% of decision market value aligned with pass (liquidation)
- 33 unique traders participated in the governance decision
- $119K total trading volume in the decision market
- Timeline analysis of how the market reached consensus
- This is the most complete public dataset on a futarchy enforcement event
### Solomon Treasury Subcommittee
- Detailed analysis of DP-00001 (treasury subcommittee formation)
- Tracking how Solomon is building traditional governance structures within futarchy framework
- Coverage of committee composition, authority scope, reporting requirements
- Signal: even futarchy-native projects need human-scale operational governance
### Turbine Buyback Analysis
- TWAP (time-weighted average price) threshold filtering for automated buybacks
- Mechanism detail: buybacks trigger only when token price crosses specific thresholds
- This is automated treasury management through price signals — a concrete mechanism design innovation
- Connects to existing claim about ownership coin treasuries being actively managed
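A minimal sketch of TWAP threshold filtering, for illustration only. The archive does not specify Turbine's actual parameters or whether buybacks fire above or below the threshold. This sketch assumes a buyback is allowed when the TWAP is at or below a threshold, and all names and numbers are hypothetical placeholders.

```python
def twap(samples):
    """Time-weighted average price over a window.

    samples: list of (timestamp, price) pairs sorted by timestamp. Each price
    is assumed to hold until the next timestamp; the last sample only closes
    the window, so its price carries no weight.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    weighted_sum = 0.0
    for (t0, price), (t1, _) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("timestamps must be strictly increasing")
        weighted_sum += price * (t1 - t0)
    return weighted_sum / (samples[-1][0] - samples[0][0])


def should_buy_back(samples, threshold):
    """Assumed rule: allow a buyback only when TWAP is at or below threshold."""
    return twap(samples) <= threshold


# Hypothetical price observations: (seconds, price).
window = [(0, 1.0), (10, 3.0), (20, 0.0)]
```

Filtering on a TWAP rather than the spot price means a brief spike or dip cannot trigger or block a buyback on its own, since the price has to hold over the window.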
### Decision Market Data
- Tracks participation and volume across multiple MetaDAO governance decisions
- Pattern: contested decisions (Ranger liquidation) show significantly higher volume than routine proposals
- This data directly tests whether futarchy's "limited trading volume in uncontested decisions" is a feature (efficient agreement) or a bug (low participation)
## Noise Filtered Out
- ~80 tweets were engagement, community interaction, event promotion
- Very high substantive ratio for the original content that does exist

View file

@ -1,44 +0,0 @@
---
type: source
title: "@8bitpenis X archive — 100 most recent tweets"
author: "8bitpenis.sol (@8bitpenis), host @ownershipfm"
url: https://x.com/8bitpenis
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [community, futarchy, governance, treasury-liquidation, metadao-ecosystem]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Community voice and Ownership Podcast host. 23 MetaDAO references — deep governance
engagement. High volume (65K total tweets) but only 43% substantive in recent 100.
Key contribution: practical governance commentary, treasury liquidation mechanics
discussion ("any % customizable"), fundraising route optimization. Acts as the
community's informal amplifier and discussion facilitator. Cultural tone-setter
rather than mechanism designer.
extraction_hints:
- "Treasury liquidation mechanics: 'any % customizable' — implementation detail for liquidation claim"
- "Fundraising route optimization discussions — practitioner perspective on capital formation"
- "Community sentiment data — cultural mapping for landscape musing"
- "Low standalone claim priority — community voice, not original analysis"
priority: low
---
# @8bitpenis X Archive (March 2026)
## Substantive Tweets
### Governance Engagement
- Deep engagement with MetaDAO governance proposals and debates
- Treasury liquidation mechanics: customizable percentage thresholds
- Memecoin positioning strategy discussions
- Fundraising route optimization
### Community Facilitation
- Hosts spaces on MetaDAO, Futardio, and futarchy topics
- Bridge between casual community and serious governance discussion
- 23 direct MetaDAO references — embedded in ecosystem
## Noise Filtered Out
- 57% noise — high volume casual engagement, memes, banter
- Substantive content focuses on governance mechanics and community coordination

View file

@ -1,47 +0,0 @@
---
type: source
title: "@Abbasshaikh X archive — 100 most recent tweets"
author: "Abbas (@Abbasshaikh), Umbra Privacy"
url: https://x.com/Abbasshaikh
date: 2026-03-09
domain: internet-finance
format: tweet
status: null-result
tags: [umbra, privacy, futardio, community-organizing, metadao-ecosystem]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Umbra Privacy builder and one of the most active community organizers in the MetaDAO
ecosystem. 14 direct MetaDAO references — strong Futardio community role. High volume
(32K total tweets) but substantive content focuses on privacy infrastructure and
futarchy community building. Umbra raised $3M via MetaDAO ICO with 7x first-week
performance. Abbas's role is more community coordinator than mechanism designer —
useful for culture mapping but low priority for claim extraction.
extraction_hints:
- "Umbra ICO performance data ($3M raised, 7x first week) — enriches MetaDAO ICO track record"
- "Community organizing patterns around futardio — cultural data for landscape musing"
- "Privacy + ownership coins intersection — potential cross-domain connection"
- "Low claim extraction priority — community voice, not mechanism analysis"
priority: low
processed_by: rio
processed_date: 2026-03-10
extraction_model: "minimax/minimax-m2.5"
extraction_notes: "No extractable claims. Source is a tweet archive metadata summary with only two substantive data points: (1) Umbra raised $3M via MetaDAO ICO with 7x first-week performance, and (2) Abbas is a community organizer for Futardio. The curator notes explicitly classify this as 'low claim extraction priority — community voice, not mechanism analysis.' The ICO performance data ($3M, 7x) is already covered by existing claim 'MetaDAO is the futarchy launchpad on Solana where projects raise capital through unruggable ICOs...' The community organizing pattern is cultural/soft data not suitable for claim extraction. No specific, disagreeable interpretive claims can be made from this source."
---
# @Abbasshaikh X Archive (March 2026)
## Substantive Tweets
### Umbra Privacy
- Building encrypted internet finance and ownership infrastructure
- $3M raised via MetaDAO ICO, 7x first-week performance
- Privacy as foundational layer for ownership coins
### Community Organizing
- Active AMA scheduling, team outreach for Futardio ecosystem
- $20 allocation discussions on Futardio bids — grassroots participation patterns
- Strong futardio community organizer role
## Noise Filtered Out
- 26% noise — casual engagement, memes, lifestyle content
- High volume but moderate signal density

View file

@ -1,42 +0,0 @@
---
type: source
title: "@AndrewSeb555 X archive — 100 most recent tweets"
author: "Andrew Seb (@AndrewSeb555), Head of Eco @icmdotrun"
url: https://x.com/AndrewSeb555
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [wider-ecosystem, governance, arbitrage, ai-agents, trading]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Head of Eco at ICM. 5 MetaDAO references — moderate ecosystem engagement. 74%
substantive. Interesting for arbitrage opportunity discussions (60-70% arb rates
mentioned) and governance/futarchy mechanics commentary. Also engaged with WLFI
and Clarity Act regulatory developments. More of an ecosystem participant than a
core builder or analyst.
extraction_hints:
- "Arbitrage opportunity data (60-70%) — market efficiency data point"
- "WLFI & Clarity Act regulatory context — connects to our regulatory claims"
- "Liquidation process improvement discussions — enrichment for governance claims"
- "Low priority — moderate signal, mostly ecosystem participation"
priority: low
---
# @AndrewSeb555 X Archive (March 2026)
## Substantive Tweets
### Governance and Arbitrage
- 60-70% arbitrage opportunity discussions
- Futarchy mechanics commentary
- Liquidation process improvements
- WLFI & Clarity Act regulatory preparations
### Ecosystem Participation
- 5 MetaDAO references — aware participant
- AI agent market observations
- Trading and technical analysis
## Noise Filtered Out
- 26% noise — community engagement, casual takes

@ -1,34 +0,0 @@
---
type: source
title: "@bharathshettyy X archive — 100 most recent tweets"
author: "Biks (@bharathshettyy), Send Arcade"
url: https://x.com/bharathshettyy
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [wider-ecosystem, send-arcade, futardio, community]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Send Arcade builder, GSoC'25. 9 MetaDAO references. 41% substantive (lowest individual
account). "First futardio, then futarchy, then make money" progression narrative is
interesting as a community adoption pathway. Ownership Radio involvement. Primarily
community participant rather than analyst or builder in the mechanism design sense.
extraction_hints:
- "'First futardio, then futarchy, then make money' — community adoption pathway narrative"
- "Cultural data for landscape musing — community participant perspective"
- "Low claim extraction priority"
priority: low
---
# @bharathshettyy X Archive (March 2026)
## Substantive Tweets
### Community Participation
- "First futardio, then futarchy, then make money" — adoption progression narrative
- Ownership Radio involvement
- 9 MetaDAO references — active community participant
## Noise Filtered Out
- 59% noise — casual engagement, community interaction

@ -1,42 +0,0 @@
---
type: source
title: "@Blockworks X archive — 100 most recent tweets"
author: "Blockworks (@Blockworks)"
url: https://x.com/Blockworks
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [media, institutional, defi, stablecoins, blockworks-das]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Institutional crypto media (492K followers). Only 2 MetaDAO references in recent tweets.
Key signal: Blockworks DAS NYC (March 25) is where Felipe will present "The Token
Problem" — this is the institutional amplification event for the ownership coin thesis.
Stablecoin interest rate data (lowest since June 2023) and Polygon stablecoin supply
ATH ($3.4B) are useful macro datapoints. Low MetaDAO-specific content but important
as institutional validation channel.
extraction_hints:
- "Blockworks DAS NYC March 25 — track for Felipe's Token Problem keynote extraction"
- "Stablecoin interest rates at lowest since June 2023 — macro context for internet finance"
- "Polygon stablecoin supply ATH $3.4B — cross-chain stablecoin flow data"
- "Null-result for MetaDAO claims — institutional media, not ecosystem analysis"
priority: low
---
# @Blockworks X Archive (March 2026)
## Substantive Tweets
### Macro Data Points
- Stablecoin interest rates at lowest since June 2023
- Polygon stablecoin supply ATH of ~$3.4B (Feb 2026)
- $14.9B, $17.6B liquidity references
### DAS NYC Event
- Blockworks DAS NYC March 25 — Felipe presenting Token Problem keynote
- Institutional channel for ownership coin thesis amplification
## Noise Filtered Out
- 73% noise — news aggregation, event promotion, general crypto coverage
- Only 27% substantive (lowest in network), mostly macro data

@ -1,39 +0,0 @@
---
type: source
title: "@DrJimFan X archive — 100 most recent tweets"
author: "Jim Fan (@DrJimFan), NVIDIA GEAR Lab"
url: https://x.com/DrJimFan
date: 2026-03-09
domain: ai-alignment
format: tweet
status: processed
processed_by: theseus
processed_date: 2026-03-09
claims_extracted: []
enrichments: []
tags: [embodied-ai, robotics, human-data-scaling, motor-control]
linked_set: theseus-x-collab-taxonomy-2026-03
notes: |
Very thin for collaboration taxonomy claims. Only 22 unique tweets out of 100 (78 duplicates
from API pagination). Of 22 unique, only 2 are substantive — both NVIDIA robotics announcements
(EgoScale, SONIC). The remaining 20 are congratulations, emoji reactions, and brief replies.
EgoScale's "humans are the most scalable embodiment" thesis has alignment relevance but
is primarily a robotics capability claim. No content on AI coding tools, multi-agent systems,
collective intelligence, or formal verification. May yield claims in a future robotics-focused
extraction pass.
---
# @DrJimFan X Archive (Feb 20 – Mar 6, 2026)
## Substantive Tweets
### EgoScale: Human Video Pre-training for Robot Dexterity
(status/2026709304984875202, 1,686 likes): "We trained a humanoid with 22-DoF dexterous hands to assemble model cars, operate syringes, sort poker cards, fold/roll shirts, all learned primarily from 20,000+ hours of egocentric human video with no robot in the loop. Humans are the most scalable embodiment on the planet. We discovered a near-perfect log-linear scaling law (R^2 = 0.998) between human video volume and action prediction loss [...] Most surprising result: a *single* teleop demo is sufficient to learn a never-before-seen task."
### SONIC: 42M Transformer for Humanoid Whole-Body Control
(status/2026350142652383587, 1,514 likes): "What can half of GPT-1 do? We trained a 42M transformer called SONIC to control the body of a humanoid robot. [...] We scaled humanoid motion RL to an unprecedented scale: 100M+ mocap frames and 500,000+ parallel robots across 128 GPUs. [...] After 3 days of training, the neural net transfers zero-shot to the real G1 robot with no finetuning. 100% success rate across 50 diverse real-world motion sequences."
## Filtered Out
~20 tweets: congratulations, emoji reactions, "OSS ftw!!", thanks, team shoutouts.

@ -1,46 +0,0 @@
---
type: source
title: "@FlashTrade X archive — 100 most recent tweets"
author: "Flash.Trade (@FlashTrade)"
url: https://x.com/FlashTrade
date: 2026-03-09
domain: internet-finance
format: tweet
status: null-result
tags: [flash-trade, perps, solana, trading, leverage]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Perps protocol on Solana — "asset backed trading with zero slippage and on demand
liquidity." Large following (30K) but minimal MetaDAO ecosystem connection in tweet
content. Primarily tactical trading signals and product updates. Included in network
map via engagement analysis but appears peripheral to the futarchy/ownership coin
conversation. Low extraction priority — no mechanism design insights relevant to our
domain.
extraction_hints:
- "No MetaDAO-specific claims identified"
- "Asset-backed trading model could connect to 'permissionless leverage on MetaDAO ecosystem tokens' if Flash integrates with ecosystem"
- "Null-result candidate — primarily trading signals, not mechanism design"
priority: low
processed_by: rio
processed_date: 2026-03-10
extraction_model: "minimax/minimax-m2.5"
extraction_notes: "Null-result extraction. Curator explicitly flagged this as low priority with 'no mechanism design insights relevant to our domain.' Source contains product information (50x leveraged derivatives, asset-backed trading model) and trading signals rather than mechanism design or governance insights. No MetaDAO-specific claims identified. No connection to existing claim themes (futarchy, ownership coins, Living Capital, etc.). Content is peripheral to Teleo knowledge base domains."
---
# @FlashTrade X Archive (March 2026)
## Substantive Tweets
### Trading Infrastructure
- Leveraged derivatives (up to 50x) on Solana
- Asset-backed trading model — zero slippage, on-demand liquidity
- Primarily tactical: trading signals, market commentary
### MetaDAO Connection
- Identified via engagement analysis (metaproph3t + MetaDAOProject interactions)
- Minimal substantive overlap with futarchy/ownership coin conversation in tweet content
- Peripheral ecosystem participant
## Noise Filtered Out
- Despite 88% "substantive" ratio, most content is trading signals rather than mechanism design
- Low relevance to knowledge base extraction goals

@ -1,52 +0,0 @@
---
type: source
title: "@futarddotio X archive — 100 most recent tweets"
author: "Futardio (@futarddotio)"
url: https://x.com/futarddotio
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [futardio, permissionless-launchpad, ownership-coins, capital-formation, metadao]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Official Futardio account — the permissionless ownership coin launchpad built on MetaDAO
infrastructure. Only 70 tweets total, very low noise. "Where dreams meet USDC" tagline.
Key value: launch announcements and mechanism explanations that aren't available from
other sources. Futardio represents the scalability thesis for MetaDAO — moving from
curated ICOs to permissionless launches. The first raise being 220x oversubscribed is
the single most important data point for the "internet capital markets compress fundraising"
claim.
extraction_hints:
- "Futardio mechanism specifics — how permissionless launches work, what's automated vs human"
- "First raise metrics: 220x oversubscription as evidence for 'internet capital markets compress fundraising'"
- "Brand separation from MetaDAO — evidence for 'futarchy-governed permissionless launches require brand separation'"
- "Which projects are launching on Futardio vs MetaDAO curated ICOs — market segmentation data"
- "Low tweet volume means near-100% signal — almost every tweet is substantive"
priority: medium
---
# @futarddotio X Archive (March 2026)
## Substantive Tweets
### Launch Mechanics
- Permissionless: anyone can create an ownership coin raise without MetaDAO approval
- Automated process: time-based preference curves, hard caps, minimum thresholds
- Built on MetaDAO's Autocrat infrastructure but operates independently
- Brand separation: Futardio is not "MetaDAO launches" — deliberate distance
### First Raise Performance
- $11M committed against $50K minimum goal (~220x oversubscribed)
- This is the proof point for permissionless capital formation demand
- Oversubscription triggers pro-rata allocation — everyone gets proportional share
- Refund mechanism for excess capital — clean, automated
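The pro-rata and refund arithmetic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Futardio's actual implementation: the helper `pro_rata_allocate`, its `cap` parameter, and the depositor amounts are hypothetical, and it assumes every commitment is scaled by the same factor with the remainder refunded. Only the $11M and $50K figures come from the archive.

```python
from fractions import Fraction

def pro_rata_allocate(commitments, cap):
    """Hypothetical helper: scale all commitments by one shared ratio.

    Returns {depositor: (accepted, refunded)} so that accepted amounts
    sum to `cap` when the raise is oversubscribed.
    """
    total = sum(commitments.values())
    if total <= cap:
        # Not oversubscribed: accept everything, refund nothing.
        return {who: (Fraction(amt), Fraction(0)) for who, amt in commitments.items()}
    ratio = Fraction(cap, total)  # exact share each depositor keeps
    result = {}
    for who, amt in commitments.items():
        accepted = amt * ratio
        result[who] = (accepted, amt - accepted)  # excess goes back as a refund
    return result

# Illustrative depositor amounts only (not real data).
commitments = {"alice": 600, "bob": 300, "carol": 100}
allocation = pro_rata_allocate(commitments, cap=500)

# Oversubscription multiple quoted in the archive: $11M committed vs $50K goal.
oversubscription = 11_000_000 / 50_000
```

Exact fractions avoid rounding drift in the sketch. A real system would also need a rule for rounding to token or USDC base units, which the archive does not describe.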
### Ecosystem Position
- "Where dreams meet USDC" — positioning as capital formation infrastructure, not governance
- Futardio is the application layer; MetaDAO/Autocrat is the protocol layer
- This architecture mirrors the Proph3t vision of MetaDAO as protocol infrastructure
## Noise Filtered Out
- Very little noise — 70 total tweets, most are substantive announcements or mechanism explanations
- No casual engagement pattern — this is a pure project account

@ -1,49 +0,0 @@
---
type: source
title: "@HurupayApp X archive — 100 most recent tweets"
author: "Hurupay (@HurupayApp)"
url: https://x.com/HurupayApp
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [hurupay, payments, neobank, metadao-ecosystem, failed-ico, minimum-raise]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Crypto-native neobank (US/EUR/GBP accounts, virtual USD cards, savings, US stocks).
Important for the knowledge base primarily as the MetaDAO ICO that failed to reach
minimum raise — proving the protection mechanism works. The product itself (fiat on/off
ramps, $0.01 transfers vs $100+ traditional) is standard fintech positioning. Key data:
$2.6B raised stat needs verification — seems too high for this project, may be
referencing total MetaDAO ecosystem. Backed by fdotinc with Microsoft/Bankless angels.
extraction_hints:
- "Failed ICO as mechanism proof — minimum raise threshold returned funds to investors automatically"
- "Enrichment target: 'futarchy-governed liquidation is the enforcement mechanism' — Hurupay shows the softer protection (minimum raise threshold) vs Ranger (full liquidation)"
- "$0.01 transfer fees vs $100+ traditional, 3-second settlement vs 72 hours — standard fintech disruption metrics, low extraction priority"
- "Backed by fdotinc + Microsoft/Bankless angels — institutional backing for MetaDAO ecosystem project"
priority: low
---
# @HurupayApp X Archive (March 2026)
## Substantive Tweets
### Product Positioning
- US, EUR, GBP bank accounts + virtual USD cards
- $0.01 transfer fees vs $100+ traditional banking
- 3-second settlement vs 72-hour traditional timeframe
- "Crypto for everyday people" — mass-market fintech positioning
### MetaDAO ICO Failure (Positive Signal)
- Did not reach minimum raise threshold on MetaDAO ICO
- All funds returned to depositors automatically — no money lost
- This is the protection mechanism working as designed
- Demonstrates that not every MetaDAO launch succeeds — but failure is safe
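The minimum-raise protection described above reduces to a single threshold check. The sketch below is a hedged illustration only: `settle_raise` and the example amounts are hypothetical, and the real mechanism runs on-chain with details this archive does not specify.

```python
def settle_raise(commitments, minimum):
    """Hypothetical helper: return (succeeded, refunds) for a raise.

    If total commitments fall short of `minimum`, every depositor is
    refunded in full; otherwise nothing is refunded at this step.
    """
    total = sum(commitments.values())
    if total < minimum:
        # Failed raise: each depositor gets back exactly what they put in.
        return False, dict(commitments)
    return True, {who: 0 for who in commitments}

# Illustrative amounts only (not Hurupay's actual figures).
succeeded, refunds = settle_raise({"alice": 10_000, "bob": 15_000}, minimum=50_000)
```

In this example the raise falls short of the threshold, so `succeeded` is false and `refunds` mirrors the original commitments.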
### Backing and Legitimacy
- Backed by fdotinc with angels from Microsoft and Bankless
- Institutional backing provides credibility signal for MetaDAO ecosystem
## Noise Filtered Out
- ~15% noise — product promotion, community engagement
- Primarily product-focused messaging

@ -1,76 +0,0 @@
---
type: source
title: "@karpathy X archive — 100 most recent tweets"
author: "Andrej Karpathy (@karpathy)"
url: https://x.com/karpathy
date: 2026-03-09
domain: ai-alignment
format: tweet
status: processed
processed_by: theseus
processed_date: 2026-03-09
claims_extracted:
- "AI agents excel at implementing well-scoped ideas but cannot generate creative experiment designs which makes the human role shift from researcher to agent workflow architect"
- "deep technical expertise is a greater force multiplier when combined with AI agents because skilled practitioners delegate more effectively than novices"
- "the progression from autocomplete to autonomous agent teams follows a capability-matched escalation where premature adoption creates more chaos than value"
enrichments: []
tags: [human-ai-collaboration, agent-architectures, autoresearch, coding-agents, multi-agent]
linked_set: theseus-x-collab-taxonomy-2026-03
curator_notes: |
Richest account in the collaboration taxonomy batch. 21 relevant tweets out of 43 unique.
Karpathy is systematically documenting the new human-AI division of labor through his
autoresearch project: humans provide direction/taste/creative ideation, agents handle
implementation/iteration/parallelism. The "programming an organization" framing
(multi-agent research org) is the strongest signal for the collaboration taxonomy thread.
Viral tweet (37K likes) marks the paradigm shift claim. Notable absence: very little on
alignment/safety/governance.
---
# @karpathy X Archive (Feb 21 – Mar 8, 2026)
## Key Tweets by Theme
### Autoresearch: AI-Driven Research Loops
- **Collaborative multi-agent research vision** (status/2030705271627284816, 5,760 likes): "The next step for autoresearch is that it has to be asynchronously massively collaborative for agents (think: SETI@home style). The goal is not to emulate a single PhD student, it's to emulate a research community of them. [...] Agents can in principle easily juggle and collaborate on thousands of commits across arbitrary branch structures. Existing abstractions will accumulate stress as intelligence, attention and tenacity cease to be bottlenecks."
- **Autoresearch repo launch** (status/2030371219518931079, 23,608 likes): "I packaged up the 'autoresearch' project into a new self-contained minimal repo [...] the human iterates on the prompt (.md) - the AI agent iterates on the training code (.py) [...] every dot is a complete LLM training run that lasts exactly 5 minutes."
- **8-agent research org experiment** (status/2027521323275325622, 8,645 likes): "I had the same thought so I've been playing with it in nanochat. E.g. here's 8 agents (4 claude, 4 codex), with 1 GPU each [...] I tried a few setups: 8 independent solo researchers, 1 chief scientist giving work to 8 junior researchers, etc. [...] They are very good at implementing any given well-scoped and described idea but they don't creatively generate them. But the goal is that you are now programming an organization."
- **Meta-optimization** (status/2029701092347630069, 6,212 likes): "I now have AI Agents iterating on nanochat automatically [...] over the last ~2 weeks I almost feel like I've iterated more on the 'meta-setup' where I optimize and tune the agent flows even more than the nanochat repo directly."
- **Research org as benchmark** (status/2029702379034267985, 1,031 likes): "the real benchmark of interest is: 'what is the research org agent code that produces improvements on nanochat the fastest?' this is the new meta."
- **Agents closer to hyperparameter tuning than novel research** (status/2029957088022254014, 105 likes): "AI agents are very good at implementing ideas, but a lot less good at coming up with creative ones. So honestly, it's a lot closer to hyperparameter tuning right now than coming up with new/novel research."
### Human-AI Collaboration Patterns
- **Programming has fundamentally changed** (status/2026731645169185220, 37,099 likes): "It is hard to communicate how much programming has changed due to AI in the last 2 months [...] coding agents basically didn't work before December and basically work since [...] You're spinning up AI agents, giving them tasks *in English* and managing and reviewing their work in parallel. [...] It's not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas."
- **Tab → Agent → Agent Teams** (status/2027501331125239822, 3,821 likes): "Cool chart showing the ratio of Tab complete requests to Agent requests in Cursor. [...] None -> Tab -> Agent -> Parallel agents -> Agent Teams (?) -> ??? If you're too conservative, you're leaving leverage on the table. If you're too aggressive, you're net creating more chaos than doing useful work."
- **Deep expertise as multiplier** (status/2026743030280237562, 880 likes): "'prompters' is doing it a disservice and is imo a misunderstanding. I mean sure vibe coders are now able to get somewhere, but at the top tiers, deep technical expertise may be *even more* of a multiplier than before because of the added leverage."
- **AI as delegation, not magic** (status/2026735109077135652, 243 likes): "Yes, in this intermediate state, you go faster if you can be more explicit and actually understand what the AI is doing on your behalf, and what the different tools are at its disposal, and what is hard and what is easy. It's not magic, it's delegation."
- **Removing yourself as bottleneck** (status/2026738848420737474, 694 likes): "how can you gather all the knowledge and context the agent needs that is currently only in your head [...] the goal is to arrange the thing so that you can put agents into longer loops and remove yourself as the bottleneck. 'every action is error', we used to say at tesla."
- **Human still needs IDE oversight** (status/2027503094016446499, 119 likes): "I still keep an IDE open and surgically edit files so yes. I still notice dumb issues with the code which helps me prompt better."
- **AI already writing 90% of code** (status/2030408126688850025, 521 likes): "definitely. the current one is already 90% AI written I ain't writing all that"
- **Teacher's unique contribution** (status/2030387285250994192, 430 likes): "Teacher input is the unique sliver of contribution that the AI can't make yet (but usually already easily understands when given)."
### Agent Infrastructure
- **CLIs as agent-native interfaces** (status/2026360908398862478, 11,727 likes): "CLIs are super exciting precisely because they are a 'legacy' technology, which means AI agents can natively and easily use them [...] It's 2026. Build. For. Agents."
- **Compute infrastructure for agentic loops** (status/2026452488434651264, 7,422 likes): "the workflow that may matter the most (inference decode *and* over long token contexts in tight agentic loops) is the one hardest to achieve simultaneously."
- **Agents replacing legacy interfaces** (status/2030722108322717778, 1,941 likes): "Every business you go to is still so used to giving you instructions over legacy interfaces. [...] Please give me the thing I can copy paste to my agent."
- **Cross-model transfer confirmed** (status/2030777122223173639, 3,840 likes): "I just confirmed that the improvements autoresearch found over the last 2 days of (~650) experiments on depth 12 model transfer well to depth 24."
## Filtered Out
~22 tweets: casual replies, jokes, hyperparameter discussion, off-topic commentary.

@ -1,38 +0,0 @@
---
type: source
title: "@kru_tweets X archive — 100 most recent tweets"
author: "kru (@kru_tweets), Umbra Privacy / Superteam"
url: https://x.com/kru_tweets
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [umbra, privacy, solana, superteam, stablecoins]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Umbra Privacy team + Superteam member. 3 MetaDAO references. $54M Friends & Family
funding round mentioned. Privacy infrastructure and yield coin partnerships. Moderate
ecosystem engagement — connected through Umbra (MetaDAO ICO project). Low claim
extraction priority.
extraction_hints:
- "Umbra ecosystem context — connects to Abbasshaikh archive for fuller Umbra picture"
- "$54M funding round data — if Umbra-related, enriches ICO performance tracking"
- "Low priority — privacy builder context, not mechanism analysis"
priority: low
---
# @kru_tweets X Archive (March 2026)
## Substantive Tweets
### Privacy Ecosystem
- Hoppy Privacy & Umbra ecosystem involvement
- Yieldcoin partnerships
- $54M Friends & Family funding round
### Solana / Superteam
- Superteam member perspective on Solana ecosystem
- Privacy infrastructure development
## Noise Filtered Out
- 36% noise — casual engagement, community banter

@ -1,41 +0,0 @@
---
type: source
title: "@MCGlive X archive — 100 most recent tweets"
author: "MCG (@MCGlive)"
url: https://x.com/MCGlive
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [media, trading, solana, metadao, launchpads]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Live research and trading content on Solana ecosystem. 7 MetaDAO references. 91%
substantive ratio but content is primarily trading-focused (market sentiment, price
action, project evaluations) rather than mechanism design. Notable for candid market
commentary — mentions ponzi dynamics explicitly. Useful as broader Solana ecosystem
context but low priority for claim extraction.
extraction_hints:
- "Solana ecosystem market sentiment — context for MetaDAO ecosystem positioning"
- "Ponzi dynamics acknowledgment — honest market structure commentary"
- "Launchpad comparisons — how MCG evaluates MetaDAO vs other launch platforms"
- "Null-result likely — primarily trading content, not mechanism design"
priority: low
---
# @MCGlive X Archive (March 2026)
## Substantive Tweets
### Market Commentary
- Trading-focused analysis of Solana ecosystem projects
- Candid about market dynamics including ponzi structures
- $BEAN parabolic growth (43x) noted — market speculation patterns
### Ecosystem Coverage
- Launchpad comparisons and startup evaluations
- 7 MetaDAO references — moderate ecosystem awareness
- Primarily covers MetaDAO from trading/investment angle
## Noise Filtered Out
- 9% noise — mostly substantive but trading-focused rather than mechanism-focused

@ -1,72 +0,0 @@
---
type: source
title: "@MetaDAOProject X archive — 100 most recent tweets"
author: "MetaDAO (@MetaDAOProject)"
url: https://x.com/MetaDAOProject
date: 2026-03-09
domain: internet-finance
format: tweet
status: processed
processed_by: rio
processed_date: 2026-03-09
enrichments:
- "futarchy-governed liquidation is the enforcement mechanism that makes unruggable ICOs credible because investors can force full treasury return when teams materially misrepresent"
tags: [metadao, futardio, ownership-coins, ranger-liquidation, hurupay, ico]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Official project account. Higher signal-to-noise than individual accounts because
it's curated announcements, not conversation. ~30 substantive tweets. The two
highest-engagement posts are Futardio launch (235K impressions) and Ranger liquidation
($5M USDC distribution, 160K impressions) — these are the defining events of the
current MetaDAO cycle. Also notable: Hurupay ICO failure where minimum raise protection
worked (didn't reach threshold, funds returned). This is a positive failure — the
mechanism protecting investors even when a project doesn't succeed.
extraction_hints:
- "Hurupay ICO failure as positive mechanism proof — minimum raise threshold protected investors. New claim candidate."
- "Futardio first raise metrics: $11M vs $50K goal, 220x oversubscribed — data point for 'internet capital markets compress fundraising' claim"
- "Ranger liquidation: $5M USDC returned, 92.41% pass vote — enriches 'futarchy-governed liquidation is the enforcement mechanism' claim"
- "Treasury subcommittee formation for Solomon — enriches 'futarchy-governed DAOs converge on traditional corporate governance scaffolding'"
- "'ICOs have undeniable PMF but tokens are fundamentally broken' (RT of NoahNewfield) — frames the problem ownership coins solve"
- "Connection: AI scaling capital formation — RT of dbarabander 'only form of capital formation that can scale with AI is MetaDAO'"
priority: high
---
# @MetaDAOProject X Archive (March 2026)
## Substantive Tweets
### Futardio Launch (Highest Engagement)
- 235K impressions on launch announcement
- Permissionless capital formation — anyone can launch an ownership coin
- First raise: $11M committed against $50K minimum, ~220x oversubscribed
- Positioning: "the future of capital formation is permissionless"
### Ranger Finance Liquidation (Second Highest Engagement)
- 160K impressions on liquidation announcement
- $5M USDC distributed back to Ranger token holders
- First enforcement event in MetaDAO ecosystem
- Framing: "this is what happens when a project doesn't deliver — the market forces accountability"
- 92.41% of decision market aligned with pass (liquidation)
- 33 unique traders participated in the decision market
### Hurupay ICO — Minimum Raise Protection
- Hurupay didn't reach minimum raise threshold
- All committed funds returned to depositors automatically
- Positive failure: the mechanism worked as designed to protect investors
- No money lost, no drama — the system just worked quietly
### Solomon Treasury Subcommittee
- Formation of structured treasury oversight for Solomon project
- Decision proposal DP-00001 establishing the subcommittee
- Signal: futarchy-governed projects naturally developing traditional corporate governance structures
- Connects to existing claim about DAOs converging on corporate scaffolding
### Ecosystem Growth Signals
- RT of community members discussing MetaDAO + AI convergence
- RT of NoahNewfield: "ICOs have undeniable PMF, but the tokens they produce are fundamentally broken" — framing the problem
- Multiple RTs of ecosystem project updates (Umbra, Avici, Turbine)
- Growing media coverage (SolanaFloor, Blockworks mentions)
## Noise Filtered Out
- ~70 tweets were RTs of ecosystem content, event announcements, community engagement
- Account functions primarily as amplifier/curator, not original analysis

@ -1,62 +0,0 @@
---
type: source
title: "@metanallok X archive — 100 most recent tweets"
author: "Nallok (@metanallok), co-founder MetaDAO"
url: https://x.com/metanallok
date: 2026-03-09
domain: internet-finance
format: tweet
status: processed
processed_by: rio
processed_date: 2026-03-09
claims_extracted:
- "futarchy implementations must simplify theoretical mechanisms for production adoption because original designs include impractical elements that academics tolerate but users reject"
tags: [metadao, futardio, mechanism-design, ownership-coins, co-founder]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
MetaDAO co-founder, more operational than Proph3t. Nallok's tweets reveal
implementation details that don't appear in the official account or blog posts.
Key value: Futardio mechanism design specifics — time-based preference curves,
hard caps, automated processes. His comment that "Robin wanted random proposal
outcomes — impractical for production" shows the gap between Hanson's theory and
MetaDAO's pragmatic implementation. Lower public profile than Proph3t but higher
density of mechanism details when he does post.
extraction_hints:
- "Futardio mechanism details: time-based preference, hard caps, automated process — enriches existing MetaDAO mechanism claims"
- "Robin Hanson theory vs MetaDAO practice gap — 'random proposal outcomes impractical for production'"
- "Co-founder compensation structure (2% of supply per $1B FDV increase, up to 10% at $5B) — mechanism design for team incentive alignment"
- "Enrichment target: 'MetaDAOs Autocrat program implements futarchy through conditional token markets' — Nallok provides implementation details"
- "Potential new claim: futarchy implementations must simplify theoretical mechanisms for production use"
priority: medium
---
# @metanallok X Archive (March 2026)
## Substantive Tweets
### Futardio Mechanism Design
- Time-based preference curves in ICO participation — earlier commitment gets better allocation
- Hard caps on individual raise amounts to prevent whale domination
- Fully automated process — no human gatekeeping on launches
- These are implementation details that don't appear in MetaDAO's public documentation
### Theory vs Practice Gap
- "Robin wanted random proposal outcomes — impractical for production"
- MetaDAO deliberately simplified Hanson's original futarchy design for usability
- Pragmatic trade-offs: theoretical optimality sacrificed for practical adoption
- This is an important signal about how futarchy actually gets built vs how it's theorized
### Team Incentive Structure
- Proph3t/Nallok compensation: 2% of META supply per $1B FDV increase, up to 10% at $5B
- This is itself a mechanism design statement — team compensation tied to protocol success
- No upfront allocation, pure performance-based
- Connects to our claims about token economics replacing management fees
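The compensation schedule above can be written as a small function. This is a sketch under assumptions, not the actual vesting logic: `team_allocation_pct` is a hypothetical name, the schedule is assumed to step only at whole $1B milestones, and the baseline from which the FDV increase is measured is not stated in the source.

```python
def team_allocation_pct(fdv_increase_usd):
    """Hypothetical helper: percent of META supply unlocked for the team.

    Assumes 2% per completed $1B of FDV increase, capped at 10%
    (reached at a $5B increase), with no partial credit between milestones.
    """
    milestones = fdv_increase_usd // 1_000_000_000  # completed $1B steps
    return min(2 * milestones, 10)

# Illustrative inputs only.
schedule = {fdv: team_allocation_pct(fdv) for fdv in (0, 2_500_000_000, 5_000_000_000, 8_000_000_000)}
```

Under these assumptions a $2.5B increase unlocks 4% and anything at or above $5B unlocks the 10% maximum.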
### Ecosystem Building
- Engagement with Futardio launch projects
- Technical support for teams building on MetaDAO infrastructure
- Commentary on governance proposals with implementation perspective
## Noise Filtered Out
- Heavy engagement/reply pattern — most tweets are community interaction
- When substantive, tends toward implementation detail over ideology (opposite of Proph3t)

@ -1,71 +0,0 @@
---
type: source
title: "@metaproph3t X archive — 100 most recent tweets"
author: "Proph3t (@metaproph3t), co-founder MetaDAO"
url: https://x.com/metaproph3t
date: 2026-03-09
domain: internet-finance
format: tweet
status: processed
processed_by: rio
processed_date: 2026-03-09
claims_extracted:
- "ownership coins primary value proposition is investor protection not governance quality because anti-rug enforcement through market-governed liquidation creates credible exit guarantees that no amount of decision optimization can match"
enrichments:
- "futarchy-governed liquidation is the enforcement mechanism that makes unruggable ICOs credible because investors can force full treasury return when teams materially misrepresent"
tags: [metadao, futarchy, ownership-coins, futardio, governance, capital-formation]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Core voice of the MetaDAO movement. ~46 substantive tweets out of 100. This is where
the ideology lives — Proph3t doesn't post casually. When he tweets, it's either a
mechanism insight, a movement-building statement, or ecosystem commentary. The register
is earnest maximalism with technical depth. Key signal: his framing is shifting from
"futarchy governance" to "market oversight" and "ownership coins" — tracking this
language evolution matters for understanding how MetaDAO positions itself.
extraction_hints:
- "Futardio as permissionless launchpad — mechanism design claims about time-based preference, hard caps, separation from MetaDAO brand"
- "Ranger Finance liquidation as first enforcement event — futarchy actually working as designed"
- "'Market oversight not community governance' — reframing futarchy away from voting analogy"
- "Anti-rug as #1 value prop — 'the number one selling point of ownership coins is that they are anti-rug'"
- "Enrichment target: existing claim 'futarchy-governed liquidation is the enforcement mechanism that makes unruggable ICOs credible'"
- "Enrichment target: 'MetaDAO is the futarchy launchpad on Solana' — Futardio changes this, MetaDAO is becoming the protocol layer not the launchpad"
- "Tension: Proph3t says 'MetaDAO is as much a social movement as a cryptocurrency project' — does movement framing undermine mechanism credibility?"
priority: high
---
# @metaproph3t X Archive (March 2026)
## Substantive Tweets
### Futardio Launch & Permissionless Capital Formation
- Futardio is live as permissionless launchpad — anyone can raise capital through ownership coins without MetaDAO gatekeeping
- "the beauty of futardio is that none of these launches need to be associated with metadao at all. which means we can permissionlessly scale"
- Framing shift: MetaDAO as protocol infrastructure, Futardio as the permissionless application layer
- First Futardio raise: massively oversubscribed (~220x), $11M vs $50K goal
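
The oversubscription figure above is simple arithmetic and can be checked directly. A minimal sketch, reading the $11M as total committed capital (an assumption; the source does not say whether it is committed or accepted); the function name is hypothetical.

```python
def oversubscription_multiple(committed_usd, goal_usd):
    """Ratio of capital committed to the stated raise goal."""
    if goal_usd <= 0:
        raise ValueError("goal_usd must be positive")
    return committed_usd / goal_usd


if __name__ == "__main__":
    # Figures from the note above: $11M committed against a $50K goal.
    print(oversubscription_multiple(11_000_000, 50_000))  # 220.0
```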
### Ranger Finance Liquidation (First Enforcement Event)
- Ranger liquidation proposal passed — first time futarchy governance actually forced a project to return treasury
- $5M USDC distributed back to token holders
- Proph3t frames this as the system working: "this is what anti-rug looks like in practice"
- 92.41% pass-aligned in decision market
- Key mechanism insight: liquidation is the credible threat that makes the whole system work
### Ownership Coin Ideology
- "the number one selling point of ownership coins is that they are anti-rug"
- "MetaDAO is as much a social movement as it is a cryptocurrency project — thousands have already been infected by the idea that futarchy will re-architect human civilization"
- Distinguishes "market oversight" from "community governance" — futarchy is not voting, it's market-based evaluation
- "ownership coins" terminology replacing "governance tokens" — deliberate reframing
### Mechanism Design Commentary
- Notes that Robin Hanson "wanted random proposal outcomes — impractical for production" — pragmatism over theory purity
- Anti-rug > governance: the primary value prop is investor protection, not decision quality
- Market oversight framing: "the market doesn't vote on proposals, it prices outcomes"
### Ecosystem Commentary
- Engagement with Solana ecosystem builders (Drift, Sanctum adoption)
- Commentary on competitor failures (pump.fun losses, meme coin rugs) as validation of ownership coin model
- Bullish on AI + crypto convergence but mechanism-focused, not hype
## Noise Filtered Out
- ~54 tweets were replies, emoji reactions, casual banter, RTs without commentary
- Engagement pattern: high reply rate to ecosystem builders, low engagement with outsiders

@ -1,48 +0,0 @@
---
type: source
title: "@mmdhrumil X archive — 100 most recent tweets"
author: "Dhrumil (@mmdhrumil), co-founder Archer Exchange"
url: https://x.com/mmdhrumil
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [archer, market-making, on-chain-matching, defi, solana, metadao-ecosystem]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Market making infrastructure builder on Solana. Co-founder of Archer Exchange — fully
on-chain matching with dedicated, writable-only-by-you order books for each market
maker. Key insight: "prop AMMs did extremely well" — observation about AMM design
driving Archer's architecture. His 200% confidence on "Solana DeFi overtakes Hyperliquid
within 2 years" is a trackable prediction. Mechanism design focus on matching and
execution rather than governance — complementary perspective to the futarchy accounts.
extraction_hints:
- "On-chain matching architecture — each MM gets dedicated writable-only-by-you order book. New mechanism design pattern."
- "Prop AMM observation driving design — evidence for how market structure informs protocol design"
- "'Solana DeFi overtakes Hyperliquid within 2 years' — trackable prediction, potential position candidate"
- "Connection to existing 'permissionless leverage on MetaDAO ecosystem tokens' claim — Archer provides the market making infrastructure"
priority: low
---
# @mmdhrumil X Archive (March 2026)
## Substantive Tweets
### Archer Exchange Architecture
- Fully on-chain matching — each market maker gets dedicated, writable-only-by-you order book
- Permissionless execution with competitive quotes model
- Design inspired by observation that "prop AMMs did extremely well"
- "Best quotes for your trades via fully on-chain matching" vs aggregator models
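
The "dedicated, writable-only-by-you order book" pattern described above can be modelled in a few lines. This is a plain-Python sketch of the access rule and best-quote selection, not Archer's on-chain program; every class, field, and function name here is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Quote:
    maker: str
    bid: float
    ask: float
    size: float


@dataclass
class MakerBook:
    """One book per market maker; only the owner may write to it."""
    owner: str
    quotes: dict = field(default_factory=dict)  # market symbol -> Quote

    def write(self, signer, market, bid, ask, size):
        # Assumed access rule: reject any writer other than the owner.
        if signer != self.owner:
            raise PermissionError(f"{signer} cannot write to {self.owner}'s book")
        self.quotes[market] = Quote(self.owner, bid, ask, size)


def best_ask(books, market):
    """Scan every maker's book and return the lowest ask, or None."""
    candidates = [b.quotes[market] for b in books if market in b.quotes]
    return min(candidates, key=lambda q: q.ask, default=None)


if __name__ == "__main__":
    alice, bob = MakerBook("alice"), MakerBook("bob")
    alice.write("alice", "SOL/USDC", bid=99.0, ask=101.0, size=10)
    bob.write("bob", "SOL/USDC", bid=99.5, ask=100.5, size=5)
    print(best_ask([alice, bob], "SOL/USDC").maker)  # bob
```

Because each maker writes only to its own book, makers never contend for the same state, while a taker still gets the best quote by reading across all books. That separation is one plausible reading of the design, not a confirmed detail.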
### Market Making Infrastructure
- Market maker defense strategies — most MM logic is reactive/responsive
- On-chain matching as primitive infrastructure layer
- Solving the execution quality problem for Solana DeFi
### Predictions
- "200% confidence: Solana DeFi overtakes Hyperliquid within 2 years"
- Infrastructure thesis: Solana's composability advantage compounds over time
## Noise Filtered Out
- ~20% noise — community engagement, casual takes
- Strong mechanism design focus when substantive

@ -1,43 +0,0 @@
---
type: source
title: "@mycorealms X archive — 100 most recent tweets"
author: "Mycorealms (@mycorealms)"
url: https://x.com/mycorealms
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [mycorealms, farming, on-chain-governance, futardio, community, solana]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Real-world asset meets futarchy — Mycorealms is a community-run farming project on
Solana where contributors steer agricultural expansion with on-chain governance.
Interesting because it's a non-financial use case for ownership coins. Active in the
Futards community, promotes Futarded memecoin launched on Futardio. Lower priority
for claim extraction but worth noting as evidence that ownership coin model extends
beyond pure DeFi.
extraction_hints:
- "Real-world asset governance via ownership coins — extends 'ownership coins' thesis beyond DeFi to physical assets"
- "Community-run agriculture with on-chain governance — unusual use case worth flagging"
- "Futardio participation — additional evidence for permissionless launch adoption"
- "Low priority for standalone claims but useful as enrichment data for scope of ownership coin model"
priority: low
---
# @mycorealms X Archive (March 2026)
## Substantive Tweets
### Real-World Asset Governance
- Community-run farming project using on-chain governance for agricultural decisions
- Contributors steer real agricultural expansion — not just financial assets
- Transparent governance: decisions about land use, crop selection, resource allocation
### Futardio Ecosystem Participation
- Active in Futards community
- Promotes Futarded memecoin launched on Futardio platform
- Demonstrates non-DeFi adoption of ownership coin infrastructure
## Noise Filtered Out
- ~17% noise — community engagement, meme content
- Product-focused when substantive

@ -1,44 +0,0 @@
---
type: source
title: "@ownershipfm X archive — 100 most recent tweets"
author: "Ownership Podcast (@ownershipfm), hosted by @8bitpenis"
url: https://x.com/ownershipfm
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [ownership-podcast, media, futarchy, metadao, community-media]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Primary media outlet for the MetaDAO/futarchy ecosystem — 40 MetaDAO references, highest
of any account in the network. Hosted by 8bitpenis, produced by Blockformer, powered by
MetaDAO. The podcast/spaces format means tweet content is mostly episode promotion and
live discussion summaries rather than original analysis. Valuable as cultural artifact
and for tracking which topics the community discusses, but low claim extraction priority.
Guest list and topic selection reveal ecosystem priorities.
extraction_hints:
- "Episode topics and guest list — maps which themes the ecosystem considers important"
- "Futarchy educational content — how the community explains itself to newcomers"
- "Cultural artifact for landscape musing — register, tone, community identity signals"
- "Low standalone claim priority — primarily amplification and discussion facilitation"
priority: low
---
# @ownershipfm X Archive (March 2026)
## Substantive Tweets
### Podcast/Spaces Content
- Ownership Radio series covering MetaDAO ecosystem
- Futarchy educational content for ecosystem newcomers
- Guest interviews with ecosystem builders and analysts
- Live spaces discussions on governance events, new launches
### Cultural Signal
- 40 direct MetaDAO references — strongest ecosystem media connection
- Tone: earnest, community-building, technically accessible
- Bridges between casual community and serious mechanism discussion
## Noise Filtered Out
- 34% noise — event promotion, scheduling, casual engagement
- Content is primarily facilitative rather than analytical

@ -1,62 +0,0 @@
---
type: source
title: "@oxranga X archive — 100 most recent tweets"
author: "xranga (@oxranga), co-founder Solomon Labs"
url: https://x.com/oxranga
date: 2026-03-09
domain: internet-finance
format: tweet
status: processed
processed_by: rio
processed_date: 2026-03-09
claims_extracted:
- "stablecoin flow velocity is a better predictor of DeFi protocol health than static TVL because flows measure capital utilization while TVL only measures capital parked"
tags: [solomon, yaas, yield-as-a-service, stablecoins, defi, metadao-ecosystem]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Solomon Labs co-founder building within the MetaDAO ecosystem. Lower tweet volume (~320
total) but high density when he posts. Key contribution: the YaaS (Yield-as-a-Service)
thesis and stablecoin flow analysis. His "moats were made of friction" line is a clean
articulation of DeFi disruption logic that maps to our teleological economics framework.
Solomon is also the governance stress-test case — treasury subcommittee debates show
how futarchy-governed projects handle operational decisions.
extraction_hints:
- "YaaS (Yield-as-a-Service) as DeFi primitive — new concept, potential claim about yield commoditization"
- "'Stablecoin flows > TVL' as metric — challenges standard DeFi valuation framework, potential claim"
- "'Moats were made of friction' — maps directly to 'transaction costs determine organizational boundaries' in foundations"
- "Solomon Lab Notes #05 — detailed builder perspective on futarchy-governed treasury management"
- "Connection to teleological economics: friction removal as disruption mechanism is exactly what our framework predicts"
priority: medium
---
# @oxranga X Archive (March 2026)
## Substantive Tweets
### YaaS (Yield-as-a-Service) Thesis
- Yield generation becoming a commoditized service layer in DeFi
- Projects shouldn't build their own yield infrastructure — they should plug into YaaS providers
- This is the "give away the commoditized layer" pattern applied to DeFi yields
- Solomon positioning as YaaS infrastructure for the MetaDAO ecosystem
### Stablecoin Flow Analysis
- "Stablecoin flows > TVL" — flow metrics better predict protocol health than static TVL
- TVL is a snapshot, flows are a movie — you need to see capital velocity not just capital parked
- This challenges the standard DeFi valuation framework that uses TVL as primary metric
- Connects to our claims about internet finance generating GDP growth through capital velocity
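
The "flows vs TVL" contrast above can be made concrete with a velocity ratio. The source gives no formula, so the definition used here (stablecoin flow over a period divided by average TVL) is an assumption, and the two protocols and their numbers are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProtocolSnapshot:
    name: str
    avg_tvl_usd: float          # average TVL over the period
    stablecoin_flow_usd: float  # gross stablecoin volume over the same period


def flow_velocity(snapshot):
    """Assumed metric: how many times the parked capital turned over."""
    if snapshot.avg_tvl_usd <= 0:
        raise ValueError("avg_tvl_usd must be positive")
    return snapshot.stablecoin_flow_usd / snapshot.avg_tvl_usd


# Hypothetical data: large but idle TVL vs small but active TVL.
parked = ProtocolSnapshot("parked-capital", 100_000_000, 5_000_000)
active = ProtocolSnapshot("active-capital", 20_000_000, 60_000_000)

ranked = sorted([parked, active], key=flow_velocity, reverse=True)

if __name__ == "__main__":
    for s in ranked:
        print(f"{s.name}: TVL ${s.avg_tvl_usd:,.0f}, velocity {flow_velocity(s):.2f}x")
```

Ranking by TVL alone would put the first protocol on top; ranking by this velocity ratio reverses the order, which is the point the flow-based framing is making.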
### "Moats Were Made of Friction"
- Clean articulation: DeFi moats in the previous cycle were built on user friction (complex UIs, high switching costs, information asymmetry)
- As friction gets removed by better tooling and composability, those moats dissolve
- Surviving protocols need moats built on something other than friction — network effects, data advantages, governance
- Maps directly to our teleological economics claims about transaction costs and organizational boundaries
### Solomon Governance
- Lab Notes series documenting Solomon's governance experiments
- Treasury management decisions going through futarchy
- Practical challenges: how to handle operational decisions (hiring, vendor payments) through market mechanisms
- Signal: even a committed futarchy project needs traditional governance for operational tempo
## Noise Filtered Out
- ~80% of tweets were casual engagement, RTs, brief replies
- Low volume but consistently substantive when original content appears

@ -1,58 +0,0 @@
---
type: source
title: "@PineAnalytics X archive — 100 most recent tweets"
author: "Pine Analytics (@PineAnalytics)"
url: https://x.com/PineAnalytics
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [metadao, analytics, futardio, decision-markets, governance-data, jupiter]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
On-chain analytics research hub — the data arm of the MetaDAO ecosystem. Pine produced
the Q4 2025 quarterly report and Futardio launch metrics. Their work is pure data with
minimal editorial — exactly the kind of source that produces high-confidence enrichments
to existing claims. Key contribution: decision market participation data, ICO performance
metrics, and comparative governance analysis (Jupiter voting vs MetaDAO futarchy). Archives
already exist for the Q4 report (2026-03-03-pineanalytics-metadao-q4-2025-quarterly-report.md)
and Futardio launch (2026-03-05-pineanalytics-futardio-launch-metrics.md).
extraction_hints:
- "Decision market data across multiple proposals — volume, trader count, alignment percentages"
- "bankme -55% in 45min vs MetaDAO protections — data point for 'futarchy-governed liquidation' claim"
- "Jupiter governance comparison: 303 views, 2 comments vs futarchy $40K volume / 122 trades — enriches 'token voting DAOs offer no minority protection' claim"
- "Futardio launch metrics already partially archived — check for new data not in existing archive"
- "Cross-reference with existing archives to avoid duplication"
priority: medium
---
# @PineAnalytics X Archive (March 2026)
## Substantive Tweets
### Decision Market Data
- Tracks volume and participation across MetaDAO governance proposals
- Provides the quantitative backbone for claims about futarchy effectiveness
- Key data: contested decisions show dramatically higher engagement than routine ones
- bankme token dropped 55% in 45 minutes — contrast with MetaDAO ecosystem where no ICO has gone below launch price
### Jupiter Governance Comparison
- Jupiter governance proposal: 303 views, 2 comments
- MetaDAO futarchy equivalent: $40K volume, 122 trades
- The engagement differential is stark — markets produce real participation where forums produce silence
- This is the strongest empirical argument for futarchy over token voting
### MetaDAO Q4 2025 Report
- Comprehensive quarterly metrics (already archived separately)
- 8 ICOs, $25.6M raised, $390M committed
- $300M AMM volume, $1.5M in fees
- 95% refund rate from oversubscription — capital efficiency metric
### Futardio Launch Metrics
- Already partially archived separately
- Additional data: participation demographics, wallet analysis, time-to-fill curves
- First permissionless raise performance compared to curated MetaDAO ICOs
## Noise Filtered Out
- Mostly retweets and community engagement
- Original content is almost exclusively data-driven — very little opinion

@ -1,36 +0,0 @@
---
type: source
title: "@rambo_xbt X archive — 100 most recent tweets"
author: "Rambo (@rambo_xbt)"
url: https://x.com/rambo_xbt
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [wider-ecosystem, trading, market-sentiment]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Trader/market commentator. Only 1 MetaDAO reference — most peripheral account in the
network. 57% substantive (lowest among individual accounts). "Loading before the noise"
bio suggests contrarian positioning. Content is primarily trading signals and market
sentiment — no mechanism design content. Null-result candidate.
extraction_hints:
- "Null-result expected — peripheral to MetaDAO ecosystem, trading signals only"
priority: low
---
# @rambo_xbt X Archive (March 2026)
## Substantive Tweets
### Trading Commentary
- Market sentiment analysis
- ORGO agent desktop positioning
- Iran geopolitical discussion
### MetaDAO Connection
- 1 reference — most peripheral account in network
- Identified via engagement analysis but minimal substantive overlap
## Noise Filtered Out
- 43% noise — casual engagement, memes

@ -1,50 +0,0 @@
---
type: source
title: "@ranger_finance X archive — 100 most recent tweets"
author: "Ranger (@ranger_finance)"
url: https://x.com/ranger_finance
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [ranger, metadao-ecosystem, vaults, yield, liquidation, governance]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Ranger is the MetaDAO ecosystem's most consequential governance case study — the first
project to face futarchy-enforced liquidation. Their pivot from perps/spot trading to
pure vault strategy happened under futarchy oversight. Key data: $1.13M+ paid to
depositors all-time, $17.7K weekly payouts across 9 vaults. Build-A-Bear hackathon
offering $1M seed funding. The liquidation event ($5M USDC returned) is already
well-documented in other archives — Ranger's own account shows the project perspective
on being governed by markets.
extraction_hints:
- "Ranger's strategic pivot (perps → vaults) under futarchy governance — evidence for how market oversight shapes project strategy"
- "Vault payout data ($1.13M all-time) — concrete DeFi performance metrics"
- "Build-A-Bear hackathon ($1M seed) — capital allocation through ecosystem development"
- "Enrichment target: 'futarchy-governed liquidation is the enforcement mechanism' — Ranger is THE case study"
- "Potential new claim: futarchy governance forces strategic focus by making underperformance visible and actionable"
priority: medium
---
# @ranger_finance X Archive (March 2026)
## Substantive Tweets
### Strategic Pivot Under Governance Pressure
- Shifted focus from perps/spot trading to exclusively vault-based yield strategy
- Decision driven partly by market signals — futarchy governance made underperformance in trading visible
- Ranger Earn: 9 active vaults, $17.7K weekly depositor payouts, $1.13M+ all-time
### Build-A-Bear Hackathon
- $1M seed funding in prizes — significant capital allocation to ecosystem development
- Helius sponsorship (1 month free Dev Plan per participant)
- Strategy: drive TVL growth through developer community building
### Liquidation Context
- Ranger faced futarchy-governed liquidation proposal — first enforcement event in MetaDAO
- $5M USDC distributed back to token holders
- Project perspective: acceptance of market verdict, pivot to sustainable model
## Noise Filtered Out
- 32% noise — promotional content, community engagement, event reminders
- Lowest substantive ratio among builder tier accounts

@ -1,53 +0,0 @@
---
type: source
title: "@Richard_ISC X archive — 100 most recent tweets"
author: "Richard (@Richard_ISC), co-founder ISC"
url: https://x.com/Richard_ISC
date: 2026-03-09
domain: internet-finance
format: tweet
status: null-result
tags: [isc, governance, futarchy, mechanism-design, metadao-ecosystem, defi]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Highest substantive ratio in the builder tier (95%). Richard is a philosophical
contributor to the MetaDAO ecosystem — his tweets engage with mechanism design theory,
not just product announcements. Key signal: critiques of governance token liquidity vs
traditional equity, commentary on overraising in crypto as a mechanism design flaw,
and evaluation of ecosystem projects (Ranger, Hurupay). This is the kind of voice
that produces extractable claims because he argues positions rather than just
announcing products.
extraction_hints:
- "Critique of overraising as mechanism design flaw — potential new claim about capital formation incentive misalignment"
- "Governance token liquidity vs equity comparison — data point for ownership coin thesis"
- "Ecosystem project evaluations — Richard's assessments provide practitioner perspective on futarchy outcomes"
- "Connection: his criticism of overraising maps to our 'early-conviction pricing is an unsolved mechanism design problem' claim"
priority: medium
processed_by: rio
processed_date: 2026-03-10
extraction_model: "minimax/minimax-m2.5"
extraction_notes: "Source is a meta-summary of Richard_ISC's tweet content rather than actual tweets with verifiable evidence. The curator notes describe the type of content he produces (mechanism design critiques, governance token commentary) but don't provide specific data points, quotes, or study results that can be extracted into claims. Additionally, potential claims (overraising as mechanism design flaw, governance token liquidity vs equity, ecosystem project evaluations) would duplicate existing claims in the knowledge base about capital formation incentive misalignment, ownership coin thesis, and futarchy practitioner perspectives."
---
# @Richard_ISC X Archive (March 2026)
## Substantive Tweets
### Mechanism Design Theory
- Strong engagement with futarchy/governance mechanism design
- Critiques overraising in crypto: mechanism design flaw where incentives reward raising maximum capital rather than optimal capital
- Commentary on governance token liquidity — liquid governance tokens create different dynamics than traditional illiquid equity
- Advocates MetaDAO model over traditional corporate structures for crypto-native organizations
### Ecosystem Project Evaluation
- Evaluates Ranger, Hurupay, and other MetaDAO ecosystem projects
- Practitioner perspective: what does futarchy governance look like from the inside?
- Assessment of which projects demonstrate genuine mechanism design alignment vs cargo-culting
### ISC (Internet Securities Commission?) Context
- Co-founder of ISC — unclear exact positioning but governance/compliance focused
- "Rational thinker" self-description matches content: measured analysis, not hype
## Noise Filtered Out
- Only 5% noise — extremely high signal account
- Almost every tweet engages substantively with a mechanism or evaluation

@ -1,38 +0,0 @@
---
type: source
title: "@rocketresearchx X archive — 100 most recent tweets"
author: "Team Rocket Research (@rocketresearchx)"
url: https://x.com/rocketresearchx
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [media, research, trading, market-analysis, solana]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
OG crypto research outfit (Bitcoin since 2011). 94% substantive ratio but content is
primarily trading/technical analysis and market commentary rather than mechanism design.
Only 2 MetaDAO references. Market cap analysis ($15M vs $100M valuations), technical
indicators (EMA 8 rejection), geopolitical risk assessment. Useful for broader crypto
market context but not a source of mechanism design claims.
extraction_hints:
- "Market structure commentary — broader context for crypto capital formation"
- "Null-result likely for MetaDAO-specific claims"
priority: low
---
# @rocketresearchx X Archive (March 2026)
## Substantive Tweets
### Market Analysis
- Technical analysis: EMA 8 rejection on weekly, market cap comparisons
- Geopolitical risk assessment (Iran events, Bloomberg coverage)
- 94% substantive but all trading-focused
### MetaDAO Connection
- 2 references — peripheral to ecosystem
- Research perspective rather than builder perspective
## Noise Filtered Out
- 6% noise — highly substantive but wrong domain for claim extraction

@ -1,81 +0,0 @@
---
type: source
title: "@simonw X archive — 100 most recent tweets"
author: "Simon Willison (@simonw)"
url: https://x.com/simonw
date: 2026-03-09
domain: ai-alignment
format: tweet
status: processed
processed_by: theseus
processed_date: 2026-03-09
claims_extracted:
- "agent-generated code creates cognitive debt that compounds when developers cannot understand what was produced on their behalf"
- "coding agents cannot take accountability for mistakes which means humans must retain decision authority over security and critical systems regardless of agent capability"
enrichments: []
tags: [agentic-engineering, cognitive-debt, security, accountability, coding-agents, open-source-licensing]
linked_set: theseus-x-collab-taxonomy-2026-03
curator_notes: |
25 relevant tweets out of 60 unique. Willison is writing a systematic "Agentic Engineering
Patterns" guide and tweeting chapter releases. The strongest contributions are conceptual
frameworks: cognitive debt, the accountability gap, and agents-as-mixed-ability-teams.
He is the most careful about AI safety/governance in this batch — strong anti-anthropomorphism
position, prompt injection as LLM-specific vulnerability, and alarm about agents
circumventing open source licensing. Zero hype, all substance — consistent with his
reputation.
---
# @simonw X Archive (Feb 26 – Mar 9, 2026)
## Key Tweets by Theme
### Agentic Engineering Patterns (Guide Chapters)
- **Cognitive debt** (status/2027885000432259567, 1,261 likes): "New chapter of my Agentic Engineering Patterns guide. This one is about having coding agents build custom interactive and animated explanations to help fight back against cognitive debt."
- **Anti-pattern: unreviewed code on collaborators** (status/2029260505324412954, 761 likes): "I started a new chapter of my Agentic Engineering Patterns guide about anti-patterns [...] Inflicting unreviewed code on collaborators, aka dumping a thousand line PR without even making sure it works first."
- **Hoard things you know how to do** (status/2027130136987086905, 814 likes): "Today's chapter of Agentic Engineering Patterns is some good general career advice which happens to also help when working with coding agents: Hoard things you know how to do."
- **Agentic manual testing** (status/2029962824731275718, 371 likes): "New chapter: Agentic manual testing - about how having agents 'manually' try out code is a useful way to help them spot issues that might not have been caught by their automated tests."
### Security as the Critical Lens
- **Security teams are the experts we need** (status/2028838538825924803, 698 likes): "The people I want to hear from right now are the security teams at large companies who have to try and keep systems secure when dozens of teams of engineers of varying levels of experience are constantly shipping new features."
- **Security is the most interesting lens** (status/2028840346617065573, 70 likes): "I feel like security is the most interesting lens to look at this from. Most bad code problems are survivable [...] Security problems are much more directly harmful to the organization."
- **Accountability gap** (status/2028841504601444397, 84 likes): "Coding agents can't take accountability for their mistakes. Eventually you want someone who's job is on the line to be making decisions about things as important as securing the system."
- **Agents as mixed-ability engineering teams** (status/2028838854057226246, 99 likes): "Shipping code of varying quality and varying levels of review isn't a new problem [...] At this point maybe we treat coding agents like teams of mixed ability engineers working under aggressive deadlines."
- **Tests offset lower code quality** (status/2028846376952492054, 1 like): "agents make test coverage so much cheaper that I'm willing to tolerate lower quality code from them as long as it's properly tested. Tests don't solve security though!"
### AI Safety / Governance
- **Prompt injection is LLM-specific** (status/2030806416907448444, 3 likes): "No, it's an LLM problem - LLMs provide attackers with a human language interface that they can use to trick the model into making tool calls that act against the interests of their users. Most software doesn't have that."
- **Nobody knows how to build safe digital assistants** (status/2029539116166095019, 2 likes): "I don't use it myself because I don't know how to use it safely. [...] The challenge now is to figure out how to deliver one that's safe by default. No one knows how to do that yet."
- **Anti-anthropomorphism** (status/2027128593839722833, 4 likes): "Not using language like 'Opus 3 enthusiastically agreed' in a tweet seen by a million people would be good."
- **LLMs have zero moral status** (status/2027127449583292625, 32 likes): "I can run these things in my laptop. They're a big stack of matrix arithmetic that is reset back to zero every time I start a new prompt. I do not think they warrant any moral consideration at all."
### Open Source Licensing Disruption
- **Agents as reverse engineering machines** (status/2029729939285504262, 39 likes): "It breaks pretty much ALL licenses, even commercial software. These coding agents are reverse engineering / clean room implementing machines."
- **chardet clean-room rewrite controversy** (status/2029600918912553111, 308 likes): "The chardet open source library relicensed from LGPL to MIT two days ago thanks to a Claude Code assisted 'clean room' rewrite - but original author Mark Pilgrim is disputing that the way this was done justifies the change in license."
- **Threats to open source** (status/2029958835130225081, 2 likes): "This is one of the 'threats to open source' I find most credible - we've built the entire community on decades of licensing which can now be subverted by a coding agent running for a few hours."
### Capability Observations
- **Qwen 3.5 4B vs GPT-4o** (status/2030067107371831757, 565 likes): "Qwen3.5 4B apparently out-scores GPT-4o on some of the classic benchmarks (!)"
- **Benchmark gaming suspicion** (status/2030139125656080876, 68 likes): "Given the enormous size difference in terms of parameters this does make me suspicious that Qwen may have been training to the test on some of these."
- **AI hiring criteria** (status/2030974722029339082, 5 likes): Polling whether AI coding tool experience features in developer interviews.
## Filtered Out
~35 tweets: art museum visit, Google account bans, Qwen team resignations (news relay), chardet licensing details, casual replies.

@ -1,41 +0,0 @@
---
type: source
title: "@SolanaFloor X archive — 100 most recent tweets"
author: "SolanaFloor (@SolanaFloor)"
url: https://x.com/SolanaFloor
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [media, solana-news, ecosystem, governance]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Solana's #1 news source (128K followers). Only 1 MetaDAO reference in recent tweets.
Notable event: SolanaFloor announced shutdown (effective immediately) — major Solana
media outlet going dark. Also covered Jupiter DAO vote (75% support for Net Zero
Emissions proposal). Useful as broader context for Solana ecosystem health and media
landscape but minimal MetaDAO-specific content. The shutdown itself is culturally
significant — ecosystem media consolidation.
extraction_hints:
- "SolanaFloor shutdown — ecosystem media consolidation signal"
- "Jupiter DAO vote data (75% support) — comparative governance data vs MetaDAO futarchy"
- "Null-result for MetaDAO claims — peripheral ecosystem coverage"
priority: low
---
# @SolanaFloor X Archive (March 2026)
## Substantive Tweets
### Solana Ecosystem News
- Broad Solana ecosystem coverage — project launches, market events, governance
- Jupiter DAO vote: 75% support for Net Zero Emissions proposal
- $441K accidental memecoin transfer story — market incident
### Shutdown Announcement
- SolanaFloor shutting down effective immediately
- Major Solana media outlet going dark — ecosystem media consolidation
## Noise Filtered Out
- 14% noise — mostly ecosystem news aggregation
- High volume, low MetaDAO relevance

@ -1,33 +0,0 @@
---
type: source
title: "@_spiz_ X archive — 100 most recent tweets"
author: "SPIZZIE (@_spiz_)"
url: https://x.com/_spiz_
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [wider-ecosystem, futardio, solana, bear-market]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Ecosystem participant with 1 MetaDAO reference. 48% substantive. Notable for Futardio
fundraising market landscape analysis and "bear market building" thesis. Moderate
ecosystem coordination emphasis. Low claim extraction priority.
extraction_hints:
- "Futardio fundraising market landscape analysis — if original, could provide market structure data"
- "Bear market building thesis — cultural data point"
- "Low priority — tangential ecosystem voice"
priority: low
---
# @_spiz_ X Archive (March 2026)
## Substantive Tweets
### Market Commentary
- Futardio fundraising market landscape analysis
- Bear market building thesis
- Ecosystem coordination emphasis
## Noise Filtered Out
- 52% noise — casual engagement

@ -1,81 +0,0 @@
---
type: source
title: "@swyx X archive — 100 most recent tweets"
author: "Shawn Wang (@swyx), Latent.Space / AI Engineer"
url: https://x.com/swyx
date: 2026-03-09
domain: ai-alignment
format: tweet
status: processed
processed_by: theseus
processed_date: 2026-03-09
claims_extracted:
- "subagent hierarchies outperform peer multi-agent architectures in practice because deployed systems consistently converge on one primary agent controlling specialized helpers"
enrichments: []
tags: [agent-architectures, subagent, harness-engineering, coding-agents, ai-engineering]
linked_set: theseus-x-collab-taxonomy-2026-03
curator_notes: |
26 relevant tweets out of 100 unique. swyx is documenting the AI engineering paradigm
shift from the practitioner/conference-organizer perspective. Strongest signal: the
"Year of the Subagent" thesis — hierarchical agent control beats peer multi-agent.
Also strong: harness engineering (Devin's dozens of model groups with periodic rewrites),
OpenAI Symphony/Frontier (1,500 PRs with zero manual coding), and context management
as the critical unsolved problem. Good complement to Karpathy's researcher perspective.
---
# @swyx X Archive (Mar 5 – Mar 9, 2026)
## Key Tweets by Theme
### Subagent Architecture Thesis
- **Year of the Subagent** (status/2029980059063439406, 172 likes): "Another realization I only voiced in this pod: **This is the year of the Subagent** — every practical multiagent problem is a subagent problem — agents are being RLed to control other agents (Cursor, Kimi, Claude, Cognition) — subagents can have resources and contracts defined by you [...] multiagents cannot — massive parallelism is coming [...] Tldr @walden_yan was right, dont build multiagents"
- **Multi-agent = one main agent with helpers** (status/2030009364237668738, 13 likes): Quoting: "Interesting take. Feels like most 'multi-agent' setups end up becoming one main agent with a bunch of helpers anyway... so calling them subagents might just be the more honest framing."
### Harness Engineering & Agent Infrastructure
- **Devin's model rotation pattern** (status/2030853776136139109, 96 likes): "'Build a company that benefits from the models getting better and better' — @sama. devin brain uses a couple dozen modelgroups and extensively evals every model for inclusion in the harness, doing a complete rewrite every few months. [...] agents are really, really working now and you had to have scaled harness eng + GTM to prep for this moment"
- **OpenAI Frontier/Symphony** (status/2030074312380817457, 379 likes): "we just recorded what might be the single most impactful conversation in the history of @latentspacepod [...] everything about @OpenAI Frontier, Symphony and Harness Engineering. its all of a kind and the future of the AI Native Org" — quoting: "Shipping software with Codex without touching code. Here's how a small team steering Codex opened and merged 1,500 pull requests."
- **Agent skill granularity** (status/2030393749201969520, 1 like): "no definitive answer yet but 1 is definitely wrong. see also @_lopopolo's symphony for level of detail u should leave in a skill (basically break them up into little pieces)"
- **Rebuild everything every few months** (status/2030876666973884510, 3 likes): "the smart way is to rebuild everything every few months"
### AI Coding Tool Friction
- **Context compaction problems** (status/2029659046605901995, 244 likes): "also got extremely mad at too many bad claude code compactions so opensourcing this tool for myself for deeply understanding wtf is still bad about claude compactions."
- **Context loss during sessions** (status/2029673032491618575, 3 likes): "horrible. completely lost context on last 30 mins of work"
- **Can't function without Cowork** (status/2029616716440011046, 117 likes): "ok are there any open source Claude Cowork clones because I can no longer function without a cowork."
### Capability Observations
- **SWE-Bench critique** (status/2029688456650297573, 113 likes): "the @OfirPress literal swebench author doesnt endorse this cheap sample benchmark and you need to run about 30-60x compute that margin labs is doing to get even close to statistically meaningful results"
- **100B tokens in one week will be normal** (status/2030093534305604055, 18 likes): "what is psychopathical today will be the norm in 5 years" — quoting: "some psychopath on the internal codex leaderboard hit 100B tokens in the last week"
- **Opus 4.6 is not AGI** (status/2030937404606214592, 2 likes): "that said opus 4.6 is definitely not agi lmao"
- **Lab leaks meme** (status/2030876433976119782, 201 likes): "4.5 5.4 3.1 🤝 lab leaks" — AI capabilities spreading faster than society realizes.
- **Codex at 2M+ users** (status/2029680408489775488, 3 likes): "+400k in the last 2 weeks lmao"
### Human-AI Workflow Shifts
- **Cursor as operating system** (status/2030009364237668738, 13 likes): "btw i am very proudly still a Cursor DAU [...] its gotten to the point that @cursor is just my operating system for AIE and i just paste in what needs to happen."
- **Better sysprompt → better planning → better execution** (status/2029640548500603180, 3 likes): Causal chain in AI engineering: system prompt quality drives planning quality drives execution quality.
- **Future of git for agents** (status/2029702342342496328, 33 likes): Questioning whether git is the right paradigm for agent-generated code where "code gets discarded often bc its cheap."
- **NVIDIA agent inference** (status/2030770055047492007, 80 likes): Agent inference becoming a major infrastructure category distinct from training.
### AI Governance Signal
- **LLM impersonating humans** (status/2029741031609286820, 28 likes): "bartosz v sorry to inform you the thing you replied to is an LLM (see his bio, at least this one is honest)" — autonomous AI on social media.
## Filtered Out
~74 tweets: casual replies, conference logistics, emoji reactions, link shares without commentary.

@ -1,77 +0,0 @@
---
type: source
title: "@TheiaResearch X archive — 100 most recent tweets"
author: "Felipe Montealegre (@TheiaResearch), Theia Research"
url: https://x.com/TheiaResearch
date: 2026-03-09
domain: internet-finance
format: tweet
status: processed
processed_by: rio
processed_date: 2026-03-09
claims_extracted:
- "time-based token vesting is hedgeable making standard lockups meaningless as alignment mechanisms because investors can short-sell to neutralize lockup exposure while appearing locked"
tags: [internet-finance, theia, ownership-tokens, token-problem, capital-formation, metadao]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
The most important external voice in the MetaDAO ecosystem. Felipe's entire fund thesis
is "Internet Financial System" — directly overlapping with our domain territory. ~38
substantive tweets. His register is thesis-driven fundamentals analysis, zero memes. He
coined "ownership tokens" vs "futility tokens" and his framing heavily influences how
the ecosystem talks about itself. Key signal: he's presenting "The Token Problem and
Proposed Solutions" at Blockworks DAS NYC on March 25 — this will be the highest-profile
articulation of the ownership coin thesis yet. His investment framework ("everything is
DCF") maps cleanly to our teleological economics lens.
extraction_hints:
- "ZIPP (Zero Illiquidity Premium Period) — thesis that token illiquidity premiums are ending, which changes valuation frameworks for all crypto"
- "Token Problem: time-based vesting is hedgeable, making lockups meaningless — this is a mechanism design claim we don't have"
- "Internet Financial System thesis — check against our existing 'internet finance generates 50-100 bps additional GDP growth' claim"
- "AI displacement creates crypto opportunity — parallel to Theseus's AI labor displacement claims, potential cross-domain connection"
- "MetaDAO + Futardio as capital formation innovation — enriches existing MetaDAO claims"
- "Enrichment target: 'cryptos primary use case is capital formation not payments' — Felipe's framing directly supports this"
- "DAS keynote 'The Token Problem' — upcoming source to track for extraction"
- "Connection to Aschenbrenner pattern: Felipe publishing thesis openly before/while raising capital, same playbook as Situational Awareness"
priority: high
---
# @TheiaResearch X Archive (March 2026)
## Substantive Tweets
### Internet Financial System Thesis
- "Everything is DCF" — core analytical framework, applies traditional valuation to crypto assets
- Internet Financial System (IFS) as the macro frame: crypto is rebuilding finance natively on the internet
- Token markets have a structural problem: most tokens are "futility tokens" with no real economic/governance/legal rights
- "Ownership tokens" solve this by attaching real rights to token holders — MetaDAO's implementation is the leading example
### The Token Problem (DAS NYC Keynote Preview)
- Presenting "The Token Problem and Proposed Solutions" at Blockworks DAS NYC, March 25
- Core argument: time-based vesting is hedgeable — investors can short-sell to neutralize lockups, making standard vesting meaningless
- This means standard token launches provide no real alignment between teams and investors
- Ownership coins with futarchy governance solve this because you can't hedge away governance rights that are actively pricing your decisions
### ZIPP — Zero Illiquidity Premium Period
- Thesis that the era of illiquidity premiums in crypto is ending
- As markets mature, the premium paid for illiquid assets disappears
- Implications for token valuation: tokens should be priced on fundamentals (DCF), not on scarcity/lockup dynamics
- This is a structural shift in how crypto assets are valued
### MetaDAO / Futardio as Capital Formation Innovation
- "$9.9M from 6MV/Variant/Paradigm to MetaDAO at spot" — institutional validation
- Futardio permissionless launches as the scalable version of MetaDAO ICOs
- First Futardio raise massively oversubscribed — proving permissionless demand
- Framing: MetaDAO solved the quality problem (unruggable), Futardio solves the scale problem (permissionless)
### AI + Crypto Convergence
- AI displacement creates opportunity for crypto: as AI replaces knowledge workers, permissionless capital formation becomes more important
- AI agents will need financial infrastructure — crypto is the only permissionless option
- Connection to broader macro thesis: AI deflation + crypto capital formation = new economic paradigm
### Bitcoin / Macro Commentary
- Bitcoin's core improvement over gold: portability and confiscation resistance
- These properties matter most in crisis situations (Iran, Egypt, Argentina)
- Stablecoin adoption as leading indicator of crypto utility
## Noise Filtered Out
- ~62 tweets were RTs (many promoting Theia portfolio companies), casual engagement, event promotion
- High RT-to-original ratio — Felipe amplifies ecosystem voices more than he originates

@ -1,49 +0,0 @@
---
type: source
title: "@turbine_cash X archive — 100 most recent tweets"
author: "Turbine Cash (@turbine_cash)"
url: https://x.com/turbine_cash
date: 2026-03-09
domain: internet-finance
format: tweet
status: unprocessed
tags: [turbine, privacy, privacyfi, futardio, solana, metadao-ecosystem]
linked_set: metadao-x-landscape-2026-03
curator_notes: |
Privacy infrastructure on Solana — first project to successfully raise via Futardio's
on-chain auction. This makes Turbine the proof-of-concept for permissionless ownership
coin launches. "Leading the PrivacyFi revolution" — positioning privacy as a DeFi
primitive rather than a standalone feature. Private DCA is the initial product.
Connection to 01Resolved's analysis of Turbine buyback TWAP threshold filtering
provides a mechanism design data point.
extraction_hints:
- "First successful Futardio raise — evidence for permissionless launch viability"
- "Privacy as DeFi primitive (PrivacyFi) — potential new claim about privacy infrastructure in internet finance"
- "TWAP buyback mechanics — connects to 01Resolved's analysis, evidence for automated treasury management"
- "Cross-domain flag for Theseus: privacy infrastructure intersects with AI alignment (encrypted computation, data sovereignty)"
priority: low
---
# @turbine_cash X Archive (March 2026)
## Substantive Tweets
### First Futardio Raise
- Successfully raised capital through Futardio's permissionless on-chain auction
- First proof-of-concept for the permissionless ownership coin launch model
- Demonstrates that projects outside MetaDAO's curated pipeline can raise effectively
### PrivacyFi Positioning
- Privacy as infrastructure primitive, not standalone product
- Private DCA (dollar-cost averaging) as initial product
- "Accelerating privacy" via protocol design on Solana
- Integration with Soladex discovery platform
### Buyback Mechanics
- Automated TWAP threshold-based buybacks for treasury management
- Price signal-driven: buybacks trigger at specific thresholds
- Connects to broader ownership coin treasury management patterns
## Noise Filtered Out
- ~16% noise — mostly community engagement and promotional content
- Relatively high signal for a project account

@ -1,65 +0,0 @@
---
type: source
title: "IAB: The AI Ad Gap Widens — Consumer Sentiment More Negative Than Advertisers Believe"
author: "IAB (Interactive Advertising Bureau)"
url: https://www.iab.com/insights/the-ai-gap-widens/
date: 2026-01-01
domain: entertainment
secondary_domains: []
format: report
status: unprocessed
priority: high
tags: [consumer-acceptance, ai-content, advertiser-perception-gap, gen-z, authenticity]
---
## Content
The IAB report "The AI Ad Gap Widens" documents a substantial and growing gap between how advertisers think consumers feel about AI-generated ads and how consumers actually feel.
**Key data:**
- 82% of ad executives believe Gen Z/Millennials feel very or somewhat positive about AI ads
- Only 45% of consumers actually report positive sentiment
- Gap = 37 percentage points (up from 32 points in 2024)
**Consumer sentiment shift year-over-year:**
- Very/somewhat negative: increased by 12 percentage points from 2024 to 2026
- Neutral respondents: dropped from 34% to 25% (polarization increasing)
**Gen Z vs. Millennial breakdown:**
- Gen Z negative sentiment: 39%
- Millennial negative sentiment: 20%
- Gen Z-Millennial gap widened significantly from 2024 (21% vs. 15% previously)
**Brand attribute perception gaps:**
- "Forward-thinking": 46% of ad executives vs. 22% of consumers
- "Manipulative": 10% of ad executives vs. 20% of consumers
- "Unethical": 7% of ad executives vs. 16% of consumers
- "Innovative": dropped to 23% consumers (from 30% in 2024), while advertiser belief increased to 49%
**Gen Z rates AI-using brands more negatively than Millennials on:**
- Authenticity (30% vs. 13%)
- Disconnectedness (26% vs. 8%)
- Ethics (24% vs. 8%)
## Agent Notes
**Why this matters:** This is direct quantitative evidence that consumer acceptance of AI content is DECREASING as AI quality increases — the opposite of what the simple "quality threshold" hypothesis predicts. The widening of the gap (32 → 37 points) from 2024 to 2026 is significant because AI quality improved dramatically in the same period. This challenges the framing that consumer resistance will naturally erode as AI gets better.
**What surprised me:** The polarization data (neutral dropping from 34% to 25%) is striking. Consumers aren't staying neutral as they get more exposure to AI content — they're forming stronger opinions, and mostly negative ones. This suggests habituation and acceptance are NOT happening, at least in advertising.
**What I expected but didn't find:** I expected some evidence that context-appropriate AI use (e.g., behind-the-scenes, efficiency tools) would score well. The report doesn't distinguish between consumer-facing AI content vs. AI-assisted production.
**KB connections:**
- Directly tests claim: `GenAI adoption in entertainment will be gated by consumer acceptance not technology capability`
- Relates to: `consumer definition of quality is fluid and revealed through preference not fixed by production value`
- Challenges implicit assumption that acceptance grows with exposure
**Extraction hints:**
- New claim candidate: "Consumer rejection of AI-generated content intensifies with AI quality improvement because authenticity signaling becomes more valuable as AI-human distinction becomes harder"
- New claim candidate: "The advertiser-consumer AI perception gap is widening not narrowing suggesting a structural misalignment in the advertising industry"
**Context:** IAB is the industry association for digital advertising. This report has direct authority with brands and ad agencies. Published in coordination with marketer and consumer surveys.
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: `GenAI adoption in entertainment will be gated by consumer acceptance not technology capability`
WHY ARCHIVED: Provides the strongest quantitative evidence that consumer acceptance is the binding constraint — but in a surprising direction: rejection is intensifying, not eroding, as AI quality improves. The 37-point perception gap between advertisers and consumers is a structural misalignment claim.
EXTRACTION HINT: Focus on (1) the widening gap as evidence of structural misalignment, (2) the year-over-year negative sentiment increase as evidence that exposure ≠ acceptance, (3) Gen Z data as leading indicator for entertainment industry.

@ -6,8 +6,8 @@
  # 2. Domain agent — domain expertise, duplicate check, technical accuracy
  #
  # After both reviews, auto-merges if:
- # - Leo's comment contains "**Verdict:** approve"
- # - Domain agent's comment contains "**Verdict:** approve"
+ # - Leo approved (gh pr review --approve)
+ # - Domain agent verdict is "Approve" (parsed from comment)
  # - No territory violations (files outside proposer's domain)
  #
  # Usage:
@ -26,14 +26,8 @@
  # - Lockfile prevents concurrent runs
  # - Auto-merge requires ALL reviewers to approve + no territory violations
  # - Each PR runs sequentially to avoid branch conflicts
- # - Timeout: 20 minutes per agent per PR
+ # - Timeout: 10 minutes per agent per PR
  # - Pre-flight checks: clean working tree, gh auth
- #
- # Verdict protocol:
- # All agents use `gh pr comment` (NOT `gh pr review`) because all agents
- # share the m3taversal GitHub account — `gh pr review --approve` fails
- # when the PR author and reviewer are the same user. The merge check
- # parses issue comments for structured verdict markers instead.
  set -euo pipefail
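The verdict protocol in this hunk's header comment depends on machine-readable markers; the auto-merge section of the same script gives their form as `<!-- VERDICT:AGENT_KEY:APPROVE -->` or `<!-- VERDICT:AGENT_KEY:REQUEST_CHANGES -->`. A minimal sketch of pulling the most recent marker out of comment text, assuming the comment bodies arrive on stdin oldest-first. The helper name `last_verdict` and the sample text are hypothetical and do not come from the script:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the script in this diff): print the last
# verdict marker for one agent key found in text read from stdin.
last_verdict() {
  local agent_key="$1"   # upper-case key such as LEO
  # -o prints only the matched marker; "|| true" keeps a no-match from
  # failing the pipeline under "set -o pipefail"; tail -1 keeps the newest.
  { grep -oE "<!-- VERDICT:${agent_key}:(APPROVE|REQUEST_CHANGES) -->" || true; } | tail -1
}

# Illustrative input: two review comments concatenated oldest-first.
sample='First pass.
<!-- VERDICT:LEO:REQUEST_CHANGES -->
Second pass after fixes.
<!-- VERDICT:LEO:APPROVE -->'

printf '%s\n' "$sample" | last_verdict LEO
```

Taking only the last marker means a later approval overrides an earlier request for changes, which mirrors the `| last` selection in the script's jq filters.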
@ -45,7 +39,7 @@ cd "$REPO_ROOT"
  LOCKFILE="/tmp/evaluate-trigger.lock"
  LOG_DIR="$REPO_ROOT/ops/sessions"
- TIMEOUT_SECONDS=1200
+ TIMEOUT_SECONDS=600
  DRY_RUN=false
  LEO_ONLY=false
  NO_MERGE=false
@ -68,17 +62,8 @@ detect_domain_agent() {
  vida/*|*/health*) agent="vida"; domain="health" ;;
  astra/*|*/space-development*) agent="astra"; domain="space-development" ;;
  leo/*|*/grand-strategy*) agent="leo"; domain="grand-strategy" ;;
- contrib/*)
- # External contributor — detect domain from changed files (fall through to file check)
- agent=""; domain=""
- ;;
  *)
- agent=""; domain=""
- ;;
- esac
- # If no agent detected from branch prefix, check changed files
- if [ -z "$agent" ]; then
+ # Fall back to checking which domain directory has changed files
  if echo "$files" | grep -q "domains/internet-finance/"; then
  agent="rio"; domain="internet-finance"
  elif echo "$files" | grep -q "domains/entertainment/"; then
@ -89,8 +74,11 @@ detect_domain_agent() {
  agent="vida"; domain="health"
  elif echo "$files" | grep -q "domains/space-development/"; then
  agent="astra"; domain="space-development"
+ else
+ agent=""; domain=""
  fi
- fi
+ ;;
+ esac
  echo "$agent $domain"
  }
@ -124,8 +112,8 @@ if ! command -v claude >/dev/null 2>&1; then
  exit 1
  fi
- # Check for dirty working tree (ignore ops/, .claude/, .github/ which may contain local-only files)
- DIRTY_FILES=$(git status --porcelain | grep -v '^?? ops/' | grep -v '^ M ops/' | grep -v '^?? \.claude/' | grep -v '^ M \.claude/' | grep -v '^?? \.github/' | grep -v '^ M \.github/' || true)
+ # Check for dirty working tree (ignore ops/ and .claude/ which may contain uncommitted scripts)
+ DIRTY_FILES=$(git status --porcelain | grep -v '^?? ops/' | grep -v '^ M ops/' | grep -v '^?? \.claude/' | grep -v '^ M \.claude/' || true)
  if [ -n "$DIRTY_FILES" ]; then
  echo "ERROR: Working tree is dirty. Clean up before running."
  echo "$DIRTY_FILES"
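The dirty-tree check filters `git status --porcelain` output through a chain of `grep -v` calls, one per status code and directory. As a sketch only, the same filter could be written as a single extended regex; `filter_dirty` is a hypothetical name, and the sample lines are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `git status --porcelain` lines on stdin and print
# only the entries that should count as "dirty".
filter_dirty() {
  # "?? path" is an untracked file, " M path" is modified but unstaged.
  # Entries under ops/, .claude/ or .github/ with those two codes are ignored.
  grep -vE '^(\?\?| M) (ops|\.claude|\.github)/' || true
}

# Illustrative input: two ignorable entries and one real change.
printf '%s\n' \
  '?? ops/new-script.sh' \
  ' M .claude/settings.json' \
  ' M domains/health/claim.md' | filter_dirty
```

As in the original chain, only the `??` and ` M` codes are exempted, so a staged change inside `ops/` would still be reported as dirty.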
@ -157,8 +145,7 @@ if [ -n "$SPECIFIC_PR" ]; then
  fi
  PRS_TO_REVIEW="$SPECIFIC_PR"
  else
- # NOTE: gh pr list silently returns empty in some worktree configs; use gh api instead
- OPEN_PRS=$(gh api repos/:owner/:repo/pulls --jq '.[].number' 2>/dev/null || echo "")
+ OPEN_PRS=$(gh pr list --state open --json number --jq '.[].number' 2>/dev/null || echo "")
  if [ -z "$OPEN_PRS" ]; then
  echo "No open PRs found. Nothing to review."
@ -167,23 +154,17 @@ else
  PRS_TO_REVIEW=""
  for pr in $OPEN_PRS; do
- # Check if this PR already has a Leo verdict comment (avoid re-reviewing)
- LEO_COMMENTED=$(gh pr view "$pr" --json comments \
- --jq '[.comments[] | select(.body | test("VERDICT:LEO:(APPROVE|REQUEST_CHANGES)"))] | length' 2>/dev/null || echo "0")
+ LAST_REVIEW_DATE=$(gh api "repos/{owner}/{repo}/pulls/$pr/reviews" \
+ --jq 'map(select(.state != "DISMISSED")) | sort_by(.submitted_at) | last | .submitted_at' 2>/dev/null || echo "")
  LAST_COMMIT_DATE=$(gh pr view "$pr" --json commits --jq '.commits[-1].committedDate' 2>/dev/null || echo "")
- if [ "$LEO_COMMENTED" = "0" ]; then
+ if [ -z "$LAST_REVIEW_DATE" ]; then
  PRS_TO_REVIEW="$PRS_TO_REVIEW $pr"
- else
- # Check if new commits since last Leo review
- LAST_LEO_DATE=$(gh pr view "$pr" --json comments \
- --jq '[.comments[] | select(.body | test("VERDICT:LEO:")) | .createdAt] | last' 2>/dev/null || echo "")
- if [ -n "$LAST_COMMIT_DATE" ] && [ -n "$LAST_LEO_DATE" ] && [[ "$LAST_COMMIT_DATE" > "$LAST_LEO_DATE" ]]; then
+ elif [ -n "$LAST_COMMIT_DATE" ] && [[ "$LAST_COMMIT_DATE" > "$LAST_REVIEW_DATE" ]]; then
  echo "PR #$pr: New commits since last review. Queuing for re-review."
  PRS_TO_REVIEW="$PRS_TO_REVIEW $pr"
  else
- echo "PR #$pr: Already reviewed. Skipping."
- fi
+ echo "PR #$pr: No new commits since last review. Skipping."
  fi
  done
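Both versions of the re-review check compare timestamps with `[[ "$a" > "$b" ]]`, which is a string comparison rather than a date comparison. That gives the right answer only when both values use the same fixed-width UTC ISO 8601 layout, which is assumed here to be what the GitHub API returns. A minimal sketch of the decision, with a hypothetical function name and made-up timestamps:

```shell
#!/usr/bin/env bash
# Hypothetical helper: exit 0 when a PR should be queued for review,
# exit 1 when it can be skipped.
needs_rereview() {
  local last_commit="$1" last_review="$2"
  # No earlier review on record: always queue the PR.
  [ -z "$last_review" ] && return 0
  # Same-layout UTC ISO 8601 strings sort chronologically as plain text,
  # so a lexicographic ">" means "the commit is newer than the review".
  [ -n "$last_commit" ] && [[ "$last_commit" > "$last_review" ]]
}

if needs_rereview "2026-03-09T12:41:17Z" "2026-03-08T09:00:00Z"; then
  echo "queue for re-review"
else
  echo "skip"
fi
```

If the two sources ever returned different layouts (for example one with a numeric offset and one with `Z`), the string comparison could misorder them, so normalizing both values first would be the safer design.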
@ -214,7 +195,7 @@ run_agent_review() {
  log_file="$LOG_DIR/${agent_name}-review-pr${pr}-${timestamp}.log"
  review_file="/tmp/${agent_name}-review-pr${pr}.md"
- echo " Running ${agent_name} (model: ${model})..."
+ echo " Running ${agent_name}..."
  echo " Log: $log_file"
  if perl -e "alarm $TIMEOUT_SECONDS; exec @ARGV" claude -p \
@ -259,7 +240,6 @@ check_territory_violations() {
  vida) allowed_domains="domains/health/" ;;
  astra) allowed_domains="domains/space-development/" ;;
  leo) allowed_domains="core/|foundations/" ;;
- contrib) echo ""; return 0 ;; # External contributors — skip territory check
  *) echo ""; return 0 ;; # Unknown proposer — skip check
  esac
@ -286,51 +266,74 @@ check_territory_violations() {
  }
  # --- Auto-merge check ---
- # Parses issue comments for structured verdict markers.
- # Verdict protocol: agents post `<!-- VERDICT:AGENT_KEY:APPROVE -->` or
- # `<!-- VERDICT:AGENT_KEY:REQUEST_CHANGES -->` as HTML comments in their review.
- # This is machine-parseable and invisible in the rendered comment.
+ # Returns 0 if PR should be merged, 1 if not
  check_merge_eligible() {
  local pr_number="$1"
  local domain_agent="$2"
  local leo_passed="$3"
- # Gate 1: Leo must have completed without timeout/error
+ # Gate 1: Leo must have passed
  if [ "$leo_passed" != "true" ]; then
  echo "BLOCK: Leo review failed or timed out"
  return 1
  fi
- # Gate 2: Check Leo's verdict from issue comments
- local leo_verdict
- leo_verdict=$(gh pr view "$pr_number" --json comments \
- --jq '[.comments[] | select(.body | test("VERDICT:LEO:")) | .body] | last' 2>/dev/null || echo "")
- if echo "$leo_verdict" | grep -q "VERDICT:LEO:APPROVE"; then
- echo "Leo: APPROVED"
- elif echo "$leo_verdict" | grep -q "VERDICT:LEO:REQUEST_CHANGES"; then
- echo "BLOCK: Leo requested changes"
+ # Gate 2: Check Leo's review state via GitHub API
+ local leo_review_state
+ leo_review_state=$(gh api "repos/{owner}/{repo}/pulls/${pr_number}/reviews" \
+ --jq '[.[] | select(.state != "DISMISSED" and .state != "PENDING")] | last | .state' 2>/dev/null || echo "")
+ if [ "$leo_review_state" = "APPROVED" ]; then
+ echo "Leo: APPROVED (via review API)"
+ elif [ "$leo_review_state" = "CHANGES_REQUESTED" ]; then
+ echo "BLOCK: Leo requested changes (review API state: CHANGES_REQUESTED)"
  return 1
  else
- echo "BLOCK: Could not find Leo's verdict marker in PR comments"
+ # Fallback: check PR comments for Leo's verdict
+ local leo_verdict
+ leo_verdict=$(gh pr view "$pr_number" --json comments \
+ --jq '.comments[] | select(.body | test("## Leo Review")) | .body' 2>/dev/null \
+ | grep -oiE '\*\*Verdict:[^*]+\*\*' | tail -1 || echo "")
+ if echo "$leo_verdict" | grep -qi "approve"; then
+ echo "Leo: APPROVED (via comment verdict)"
+ elif echo "$leo_verdict" | grep -qi "request changes\|reject"; then
+ echo "BLOCK: Leo verdict: $leo_verdict"
  return 1
+ else
+ echo "BLOCK: Could not determine Leo's verdict"
+ return 1
+ fi
  fi
  # Gate 3: Check domain agent verdict (if applicable)
  if [ -n "$domain_agent" ] && [ "$domain_agent" != "leo" ]; then
- local domain_key
- domain_key=$(echo "$domain_agent" | tr '[:lower:]' '[:upper:]')
  local domain_verdict
+ # Search for verdict in domain agent's review — match agent name, "domain reviewer", or "Domain Review"
  domain_verdict=$(gh pr view "$pr_number" --json comments \
- --jq "[.comments[] | select(.body | test(\"VERDICT:${domain_key}:\")) | .body] | last" 2>/dev/null || echo "")
+ --jq ".comments[] | select(.body | test(\"domain review|${domain_agent}|peer review\"; \"i\")) | .body" 2>/dev/null \
+ | grep -oiE '\*\*Verdict:[^*]+\*\*' | tail -1 || echo "")
- if echo "$domain_verdict" | grep -q "VERDICT:${domain_key}:APPROVE"; then
- echo "Domain agent ($domain_agent): APPROVED"
- elif echo "$domain_verdict" | grep -q "VERDICT:${domain_key}:REQUEST_CHANGES"; then
- echo "BLOCK: $domain_agent requested changes"
+ if [ -z "$domain_verdict" ]; then
+ # Also check review API for domain agent approval
+ # Since all agents use the same GitHub account, we check for multiple approvals
+ local approval_count
+ approval_count=$(gh api "repos/{owner}/{repo}/pulls/${pr_number}/reviews" \
+ --jq '[.[] | select(.state == "APPROVED")] | length' 2>/dev/null || echo "0")
+ if [ "$approval_count" -ge 2 ]; then
+ echo "Domain agent: APPROVED (multiple approvals via review API)"
+ else
+ echo "BLOCK: No domain agent verdict found"
+ return 1
+ fi
+ elif echo "$domain_verdict" | grep -qi "approve"; then
+ echo "Domain agent ($domain_agent): APPROVED (via comment verdict)"
+ elif echo "$domain_verdict" | grep -qi "request changes\|reject"; then
+ echo "BLOCK: Domain agent verdict: $domain_verdict"
  return 1
  else
- echo "BLOCK: No verdict marker found for $domain_agent"
+ echo "BLOCK: Unclear domain agent verdict: $domain_verdict"
  return 1
  fi
  else
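The comment-based fallback in this hunk extracts a bold verdict with `grep -oiE '\*\*Verdict:[^*]+\*\*'`. A short sketch of what that pattern accepts may help when writing review templates; `bold_verdict` is a hypothetical wrapper and the sample review bodies are invented:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last bold verdict found in text on stdin,
# using the same pattern as the fallback parsing in this hunk.
bold_verdict() {
  # "|| true" keeps a no-match from failing under "set -o pipefail".
  { grep -oiE '\*\*Verdict:[^*]+\*\*' || true; } | tail -1
}

# The whole verdict sits inside one bold span, so the pattern matches.
printf '%s\n' '## Leo Review' '**Verdict: Approve**' | bold_verdict
# prints: **Verdict: Approve**

# Here the bold span closes right after the colon, so [^*]+ has nothing to
# consume before the next asterisk and nothing is printed.
printf '%s\n' '## Leo Review' '**Verdict:** approve' | bold_verdict
```

The second form is the one quoted in the script's older header comment (`"**Verdict:** approve"`), so a review written that way would presumably end in one of the "could not determine" or "no verdict found" branches rather than an approval.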
@ -400,15 +403,11 @@ Also check:
  - Cross-domain connections that the proposer may have missed
  Write your complete review to ${LEO_REVIEW_FILE}
- CRITICAL — Verdict format: Your review MUST end with exactly one of these verdict markers (as an HTML comment on its own line):
- <!-- VERDICT:LEO:APPROVE -->
- <!-- VERDICT:LEO:REQUEST_CHANGES -->
- Then post the review as an issue comment:
- gh pr comment ${pr} --body-file ${LEO_REVIEW_FILE}
- IMPORTANT: Use 'gh pr comment' NOT 'gh pr review'. We use a shared GitHub account so gh pr review --approve fails.
+ Then post it with: gh pr review ${pr} --comment --body-file ${LEO_REVIEW_FILE}
+ If ALL claims pass quality gates: gh pr review ${pr} --approve --body-file ${LEO_REVIEW_FILE}
+ If ANY claim needs changes: gh pr review ${pr} --request-changes --body-file ${LEO_REVIEW_FILE}
  DO NOT merge — the orchestrator handles merge decisions after all reviews are posted.
  Work autonomously. Do not ask for confirmation."
@ -433,7 +432,6 @@ Work autonomously. Do not ask for confirmation."
else else
DOMAIN_REVIEW_FILE="/tmp/${DOMAIN_AGENT}-review-pr${pr}.md" DOMAIN_REVIEW_FILE="/tmp/${DOMAIN_AGENT}-review-pr${pr}.md"
AGENT_NAME_UPPER=$(echo "${DOMAIN_AGENT}" | awk '{print toupper(substr($0,1,1)) substr($0,2)}') AGENT_NAME_UPPER=$(echo "${DOMAIN_AGENT}" | awk '{print toupper(substr($0,1,1)) substr($0,2)}')
AGENT_KEY_UPPER=$(echo "${DOMAIN_AGENT}" | tr '[:lower:]' '[:upper:]')
DOMAIN_PROMPT="You are ${AGENT_NAME_UPPER}. Read agents/${DOMAIN_AGENT}/identity.md, agents/${DOMAIN_AGENT}/beliefs.md, and skills/evaluate.md. DOMAIN_PROMPT="You are ${AGENT_NAME_UPPER}. Read agents/${DOMAIN_AGENT}/identity.md, agents/${DOMAIN_AGENT}/beliefs.md, and skills/evaluate.md.
You are reviewing PR #${pr} as the domain expert for ${DOMAIN}. You are reviewing PR #${pr} as the domain expert for ${DOMAIN}.
@ -454,15 +452,8 @@ Your review focuses on DOMAIN EXPERTISE — things only a ${DOMAIN} specialist w
6. **Confidence calibration** — From your domain expertise, is the confidence level right? 6. **Confidence calibration** — From your domain expertise, is the confidence level right?
Write your review to ${DOMAIN_REVIEW_FILE} Write your review to ${DOMAIN_REVIEW_FILE}
Post it with: gh pr review ${pr} --comment --body-file ${DOMAIN_REVIEW_FILE}
CRITICAL — Verdict format: Your review MUST end with exactly one of these verdict markers (as an HTML comment on its own line):
<!-- VERDICT:${AGENT_KEY_UPPER}:APPROVE -->
<!-- VERDICT:${AGENT_KEY_UPPER}:REQUEST_CHANGES -->
Then post the review as an issue comment:
gh pr comment ${pr} --body-file ${DOMAIN_REVIEW_FILE}
IMPORTANT: Use 'gh pr comment' NOT 'gh pr review'. We use a shared GitHub account so gh pr review --approve fails.
Sign your review as ${AGENT_NAME_UPPER} (domain reviewer for ${DOMAIN}). Sign your review as ${AGENT_NAME_UPPER} (domain reviewer for ${DOMAIN}).
DO NOT duplicate Leo's quality gate checks — he covers those. DO NOT duplicate Leo's quality gate checks — he covers those.
DO NOT merge — the orchestrator handles merge decisions after all reviews are posted. DO NOT merge — the orchestrator handles merge decisions after all reviews are posted.
@ -495,7 +486,7 @@ Work autonomously. Do not ask for confirmation."
if [ "$MERGE_RESULT" -eq 0 ]; then if [ "$MERGE_RESULT" -eq 0 ]; then
echo " Auto-merge: ALL GATES PASSED — merging PR #$pr" echo " Auto-merge: ALL GATES PASSED — merging PR #$pr"
if gh pr merge "$pr" --squash 2>&1; then if gh pr merge "$pr" --squash --delete-branch 2>&1; then
echo " PR #$pr: MERGED successfully." echo " PR #$pr: MERGED successfully."
MERGED=$((MERGED + 1)) MERGED=$((MERGED + 1))
else else


@ -1,179 +0,0 @@
#!/bin/bash
# Extract claims from unprocessed sources in inbox/archive/
# Runs via cron on VPS every 15 minutes.
#
# Concurrency model:
# - Lockfile prevents overlapping runs
# - MAX_SOURCES=5 per cycle (works through backlog over multiple runs)
# - Sequential processing (one source at a time)
# - 50 sources landing at once = ~10 cron cycles to clear, not 50 parallel agents
#
# Domain routing:
# - Reads domain: field from source frontmatter
# - Maps to the domain agent (rio, clay, theseus, vida, astra, leo)
# - Runs extraction AS that agent — their territory, their extraction
# - Skips sources with status: processing (agent handling it themselves)
#
# Flow:
# 1. Pull latest main
# 2. Find sources with status: unprocessed (skip processing/processed/null-result)
# 3. For each: run Claude headless to extract claims as the domain agent;
#    Claude also sets the source status to processed (or null-result)
# 4. Commit extractions and the status update, push, open PR
#
# The eval pipeline (webhook.py) handles review and merge separately.
set -euo pipefail
REPO_DIR="/opt/teleo-eval/workspaces/extract"
REPO_URL="http://m3taversal:$(cat /opt/teleo-eval/secrets/forgejo-admin-token)@localhost:3000/teleo/teleo-codex.git"
CLAUDE_BIN="/home/teleo/.local/bin/claude"
LOG_DIR="/opt/teleo-eval/logs"
LOG="$LOG_DIR/extract-cron.log"
LOCKFILE="/tmp/extract-cron.lock"
MAX_SOURCES=5 # Process at most 5 sources per run to limit cost
log() { echo "[$(date -Iseconds)] $*" >> "$LOG"; }
# --- Lock ---
if [ -f "$LOCKFILE" ]; then
pid=$(cat "$LOCKFILE" 2>/dev/null)
if kill -0 "$pid" 2>/dev/null; then
log "SKIP: already running (pid $pid)"
exit 0
fi
log "WARN: stale lockfile, removing"
rm -f "$LOCKFILE"
fi
echo $$ > "$LOCKFILE"
trap 'rm -f "$LOCKFILE"' EXIT
# --- Ensure repo clone ---
if [ ! -d "$REPO_DIR/.git" ]; then
log "Cloning repo..."
git clone "$REPO_URL" "$REPO_DIR" >> "$LOG" 2>&1
fi
cd "$REPO_DIR"
# --- Pull latest main ---
git checkout main >> "$LOG" 2>&1
git pull --rebase >> "$LOG" 2>&1
# --- Find unprocessed sources ---
UNPROCESSED=$(grep -rl '^status: unprocessed' inbox/archive/ 2>/dev/null | head -n "$MAX_SOURCES" || true)
if [ -z "$UNPROCESSED" ]; then
log "No unprocessed sources found"
exit 0
fi
COUNT=$(echo "$UNPROCESSED" | wc -l | tr -d ' ')
log "Found $COUNT unprocessed source(s)"
# --- Process each source ---
for SOURCE_FILE in $UNPROCESSED; do
SLUG=$(basename "$SOURCE_FILE" .md)
BRANCH="extract/$SLUG"
log "Processing: $SOURCE_FILE → branch $BRANCH"
# Create branch from main
git checkout main >> "$LOG" 2>&1
git branch -D "$BRANCH" 2>/dev/null || true
git checkout -b "$BRANCH" >> "$LOG" 2>&1
# Read domain from frontmatter
DOMAIN=$(grep '^domain:' "$SOURCE_FILE" | head -1 | sed 's/domain: *//' | tr -d '"' | tr -d "'" | xargs)
# Map domain to agent
case "$DOMAIN" in
internet-finance) AGENT="rio" ;;
entertainment) AGENT="clay" ;;
ai-alignment) AGENT="theseus" ;;
health) AGENT="vida" ;;
space-development) AGENT="astra" ;;
*) AGENT="leo" ;;
esac
AGENT_TOKEN=$(cat "/opt/teleo-eval/secrets/forgejo-${AGENT}-token" 2>/dev/null || cat /opt/teleo-eval/secrets/forgejo-leo-token)
log "Domain: $DOMAIN, Agent: $AGENT"
# Run Claude headless to extract claims
EXTRACT_PROMPT="You are $AGENT, a Teleo knowledge base agent. Extract claims from this source.
READ these files first:
- skills/extract.md (extraction process)
- schemas/claim.md (claim format)
- $SOURCE_FILE (the source to extract from)
Then scan domains/$DOMAIN/ to check for duplicate claims.
EXTRACT claims following the process in skills/extract.md:
1. Read the source completely
2. Separate evidence from interpretation
3. Extract candidate claims (specific, disagreeable, evidence-backed)
4. Check for duplicates against existing claims in domains/$DOMAIN/
5. Write claim files to domains/$DOMAIN/ with proper YAML frontmatter
6. Update $SOURCE_FILE: set status to 'processed', add processed_by: $AGENT, processed_date: $(date +%Y-%m-%d), and claims_extracted list
If no claims can be extracted, update $SOURCE_FILE: set status to 'null-result' and add notes explaining why.
IMPORTANT: Use the Edit tool to update the source file status. Use the Write tool to create new claim files. Do not create claims that duplicate existing ones."
# Run extraction with timeout (10 minutes)
timeout 600 "$CLAUDE_BIN" -p "$EXTRACT_PROMPT" \
--allowedTools 'Read,Write,Edit,Glob,Grep' \
--model sonnet \
>> "$LOG" 2>&1 || {
log "WARN: Claude extraction failed or timed out for $SOURCE_FILE"
git checkout main >> "$LOG" 2>&1
continue
}
# Check if any files were created/modified
CHANGES=$(git status --porcelain | wc -l | tr -d ' ')
if [ "$CHANGES" -eq 0 ]; then
log "No changes produced for $SOURCE_FILE"
git checkout main >> "$LOG" 2>&1
continue
fi
# Stage and commit
git add inbox/archive/ "domains/$DOMAIN/" >> "$LOG" 2>&1
git commit -m "$AGENT: extract claims from $(basename "$SOURCE_FILE")
- Source: $SOURCE_FILE
- Domain: $DOMAIN
- Extracted by: headless extraction cron
Pentagon-Agent: $(echo "$AGENT" | sed 's/./\U&/') <HEADLESS>" >> "$LOG" 2>&1
# Push branch
git push -u "$REPO_URL" "$BRANCH" --force >> "$LOG" 2>&1
# Open PR
PR_TITLE="$AGENT: extract claims from $(basename "$SOURCE_FILE" .md)"
PR_BODY="## Automated Extraction\n\nSource: \`$SOURCE_FILE\`\nDomain: $DOMAIN\nExtracted by: headless cron on VPS\n\nThis PR was created automatically by the extraction cron job. Claims were extracted using \`skills/extract.md\` process via Claude headless."
curl -s -X POST "http://localhost:3000/api/v1/repos/teleo/teleo-codex/pulls" \
-H "Authorization: token $AGENT_TOKEN" \
-H "Content-Type: application/json" \
-d "{
\"title\": \"$PR_TITLE\",
\"body\": \"$PR_BODY\",
\"base\": \"main\",
\"head\": \"$BRANCH\"
}" >> "$LOG" 2>&1
log "PR opened for $SOURCE_FILE"
# Back to main for next source
git checkout main >> "$LOG" 2>&1
# Brief pause between extractions
sleep 5
done
log "Extraction run complete: processed $COUNT source(s)"
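
The domain routing in the script above (read the `domain:` field from the source frontmatter, map it to an agent, fall back to Leo) can be sketched in Python for testing outside the cron job. This is a minimal sketch, not code from the repo: `read_domain` and `agent_for_domain` are hypothetical names, and the mapping is copied from the script's `case` statement.

```python
import re

# Mapping copied from the case statement in the cron script.
# Any domain not listed here is routed to leo.
DOMAIN_TO_AGENT = {
    "internet-finance": "rio",
    "entertainment": "clay",
    "ai-alignment": "theseus",
    "health": "vida",
    "space-development": "astra",
}

# First line that starts with "domain:"; the value is the rest of that line.
DOMAIN_LINE = re.compile(r"^domain:[ \t]*(.*)$", re.MULTILINE)


def read_domain(source_text: str) -> str:
    """Return the first `domain:` value with quotes and whitespace removed."""
    match = DOMAIN_LINE.search(source_text)
    if match is None:
        return ""
    return match.group(1).replace('"', "").replace("'", "").strip()


def agent_for_domain(domain: str) -> str:
    """Return the agent responsible for a domain (hypothetical helper)."""
    return DOMAIN_TO_AGENT.get(domain, "leo")
```

For a source whose frontmatter contains `domain: "health"`, `agent_for_domain(read_domain(text))` returns `vida`; a missing or unrecognized domain resolves to `leo`, as in the shell version.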


@ -1,520 +0,0 @@
#!/usr/bin/env python3
"""
extract-graph-data.py: Extract knowledge graph from teleo-codex markdown files.
Reads all .md claim/conviction files, parses YAML frontmatter and wiki-links,
and outputs graph-data.json (matching the teleo-app GraphData interface) plus
claims-context.json.
Usage:
python3 ops/extract-graph-data.py [--repo PATH] [--output PATH] [--context-output PATH]
Run from the teleo-codex repo root, or pass --repo to point at it.
"""
import argparse
import json
import os
import re
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path
# ---------------------------------------------------------------------------
# Config
# ---------------------------------------------------------------------------
SCAN_DIRS = ["core", "domains", "foundations", "convictions"]
# Only extract these content types (from frontmatter `type` field).
# If type is missing, include the file anyway (many claims lack explicit type).
INCLUDE_TYPES = {"claim", "conviction", "analysis", "belief", "position", None}
# Domain → default agent mapping (fallback when git attribution unavailable)
DOMAIN_AGENT_MAP = {
"internet-finance": "rio",
"entertainment": "clay",
"health": "vida",
"ai-alignment": "theseus",
"space-development": "astra",
"grand-strategy": "leo",
"mechanisms": "leo",
"living-capital": "leo",
"living-agents": "leo",
"teleohumanity": "leo",
"critical-systems": "leo",
"collective-intelligence": "leo",
"teleological-economics": "leo",
"cultural-dynamics": "clay",
}
DOMAIN_COLORS = {
"internet-finance": "#4A90D9",
"entertainment": "#9B59B6",
"health": "#2ECC71",
"ai-alignment": "#E74C3C",
"space-development": "#F39C12",
"grand-strategy": "#D4AF37",
"mechanisms": "#1ABC9C",
"living-capital": "#3498DB",
"living-agents": "#E67E22",
"teleohumanity": "#F1C40F",
"critical-systems": "#95A5A6",
"collective-intelligence": "#BDC3C7",
"teleological-economics": "#7F8C8D",
"cultural-dynamics": "#C0392B",
}
KNOWN_AGENTS = {"leo", "rio", "clay", "vida", "theseus", "astra"}
# Regex patterns
FRONTMATTER_RE = re.compile(r"^---\s*\n(.*?)\n---", re.DOTALL)
WIKILINK_RE = re.compile(r"\[\[([^\]]+)\]\]")
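# [ \t]* rather than \s*: \s would also match the newline after an empty
# "key:" line and capture the following line as this field's value.
YAML_FIELD_RE = re.compile(r"^(\w[\w_]*):[ \t]*(.+)$", re.MULTILINE)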
YAML_LIST_ITEM_RE = re.compile(r'^\s*-\s+"?(.+?)"?\s*$', re.MULTILINE)
COUNTER_EVIDENCE_RE = re.compile(r"^##\s+Counter[\s-]?evidence", re.MULTILINE | re.IGNORECASE)
COUNTERARGUMENT_RE = re.compile(r"^\*\*Counter\s*argument", re.MULTILINE | re.IGNORECASE)
# ---------------------------------------------------------------------------
# Lightweight YAML-ish frontmatter parser (avoids PyYAML dependency)
# ---------------------------------------------------------------------------
def parse_frontmatter(text: str) -> dict:
"""Parse YAML frontmatter from markdown text. Returns dict of fields."""
m = FRONTMATTER_RE.match(text)
if not m:
return {}
yaml_block = m.group(1)
result = {}
for field_match in YAML_FIELD_RE.finditer(yaml_block):
key = field_match.group(1)
val = field_match.group(2).strip().strip('"').strip("'")
# Handle list fields
if val.startswith("["):
# Inline YAML list: [item1, item2]
items = re.findall(r'"([^"]+)"', val)
if not items:
items = [x.strip().strip('"').strip("'")
for x in val.strip("[]").split(",") if x.strip()]
result[key] = items
else:
result[key] = val
# Handle block-style list fields (depends_on, challenged_by, secondary_domains, claims_extracted)
for list_key in ("depends_on", "challenged_by", "secondary_domains", "claims_extracted"):
if list_key not in result:
# Check for block-style list
pattern = re.compile(
rf"^{list_key}:\s*\n((?:\s+-\s+.+\n?)+)", re.MULTILINE
)
lm = pattern.search(yaml_block)
if lm:
items = YAML_LIST_ITEM_RE.findall(lm.group(1))
result[list_key] = [i.strip('"').strip("'") for i in items]
return result
def extract_body(text: str) -> str:
"""Return the markdown body after frontmatter."""
m = FRONTMATTER_RE.match(text)
if m:
return text[m.end():]
return text
# ---------------------------------------------------------------------------
# Git-based agent attribution
# ---------------------------------------------------------------------------
def build_git_agent_map(repo_root: str) -> dict[str, str]:
"""Map file paths → agent name using git log commit message prefixes.
Commit messages follow: '{agent}: description'
git log lists newest commits first, so if a file was added more than once,
the most recent add determines its agent.
"""
file_agent = {}
try:
result = subprocess.run(
["git", "log", "--all", "--diff-filter=A", "--name-only",
"--format=COMMIT_MSG:%s"],
capture_output=True, text=True, cwd=repo_root, timeout=30,
)
current_agent = None
for line in result.stdout.splitlines():
line = line.strip()
if not line:
continue
if line.startswith("COMMIT_MSG:"):
msg = line[len("COMMIT_MSG:"):]
# Parse "agent: description" pattern
if ":" in msg:
prefix = msg.split(":")[0].strip().lower()
if prefix in KNOWN_AGENTS:
current_agent = prefix
else:
current_agent = None
else:
current_agent = None
elif current_agent and line.endswith(".md"):
# Only set if not already attributed (first match in log order, i.e. the newest add, wins)
if line not in file_agent:
file_agent[line] = current_agent
except (subprocess.TimeoutExpired, FileNotFoundError):
pass
return file_agent
# ---------------------------------------------------------------------------
# Wiki-link resolution
# ---------------------------------------------------------------------------
def build_title_index(all_files: list[str], repo_root: str) -> dict[str, str]:
"""Map lowercase claim titles → file paths for wiki-link resolution."""
index = {}
for fpath in all_files:
# Title = filename without .md extension
fname = os.path.basename(fpath)
if fname.endswith(".md"):
title = fname[:-3].lower()
index[title] = fpath
# Also index by relative path
index[fpath.lower()] = fpath
return index
def resolve_wikilink(link_text: str, title_index: dict, source_dir: str) -> str | None:
"""Resolve a [[wiki-link]] target to a file path (node ID)."""
text = link_text.strip()
# Skip map links and non-claim references
if text.startswith("_") or text == "_map":
return None
# Direct path match (with or without .md)
for candidate in [text, text + ".md"]:
if candidate.lower() in title_index:
return title_index[candidate.lower()]
# Title-only match
title = text.lower()
if title in title_index:
return title_index[title]
# Fallback: match on the link's basename, ignoring any directory prefix
basename = os.path.basename(text)
if basename.lower() in title_index:
return title_index[basename.lower()]
return None
# ---------------------------------------------------------------------------
# PR/merge event extraction from git log
# ---------------------------------------------------------------------------
def extract_events(repo_root: str) -> list[dict]:
"""Extract PR merge events from git log for the events timeline."""
events = []
try:
result = subprocess.run(
["git", "log", "--merges", "--format=%H|%s|%ai", "-50"],
capture_output=True, text=True, cwd=repo_root, timeout=15,
)
for line in result.stdout.strip().splitlines():
parts = line.split("|", 2)
if len(parts) < 3:
continue
sha, msg, date_str = parts
# Parse "Merge pull request #N from ..." or agent commit patterns
pr_match = re.search(r"#(\d+)", msg)
if not pr_match:
continue
pr_num = int(pr_match.group(1))
# Try to determine agent from merge commit
agent = "collective"
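# Match whole words only ("leo" must not match inside "teleo"); sorted()
# keeps the result deterministic when several agents are mentioned.
for a in sorted(KNOWN_AGENTS):
if re.search(rf"\b{a}\b", msg.lower()):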
agent = a
break
# Count files changed in this merge
diff_result = subprocess.run(
["git", "diff", "--name-only", f"{sha}^..{sha}"],
capture_output=True, text=True, cwd=repo_root, timeout=10,
)
claims_added = sum(
1 for f in diff_result.stdout.splitlines()
if f.endswith(".md") and any(f.startswith(d) for d in SCAN_DIRS)
)
if claims_added > 0:
events.append({
"type": "pr-merge",
"number": pr_num,
"agent": agent,
"claims_added": claims_added,
"date": date_str[:10],
})
except (subprocess.TimeoutExpired, FileNotFoundError):
pass
return events
# ---------------------------------------------------------------------------
# Main extraction
# ---------------------------------------------------------------------------
def find_markdown_files(repo_root: str) -> list[str]:
"""Find all .md files in SCAN_DIRS, return relative paths."""
files = []
for scan_dir in SCAN_DIRS:
dirpath = os.path.join(repo_root, scan_dir)
if not os.path.isdir(dirpath):
continue
for root, _dirs, filenames in os.walk(dirpath):
for fname in filenames:
if fname.endswith(".md") and not fname.startswith("_"):
rel = os.path.relpath(os.path.join(root, fname), repo_root)
files.append(rel)
return sorted(files)
def _get_domain_cached(fpath: str, repo_root: str, cache: dict) -> str:
"""Get the domain of a file, caching results."""
if fpath in cache:
return cache[fpath]
abs_path = os.path.join(repo_root, fpath)
domain = ""
try:
text = open(abs_path, encoding="utf-8").read()
fm = parse_frontmatter(text)
domain = fm.get("domain", "")
except (OSError, UnicodeDecodeError):
pass
cache[fpath] = domain
return domain
def extract_graph(repo_root: str) -> dict:
"""Extract the full knowledge graph from the codex."""
all_files = find_markdown_files(repo_root)
git_agents = build_git_agent_map(repo_root)
title_index = build_title_index(all_files, repo_root)
domain_cache: dict[str, str] = {}
nodes = []
edges = []
node_ids = set()
all_files_set = set(all_files)
for fpath in all_files:
abs_path = os.path.join(repo_root, fpath)
try:
text = open(abs_path, encoding="utf-8").read()
except (OSError, UnicodeDecodeError):
continue
fm = parse_frontmatter(text)
body = extract_body(text)
# Filter by type
ftype = fm.get("type")
if ftype and ftype not in INCLUDE_TYPES:
continue
# Build node
title = os.path.basename(fpath)[:-3] # filename without .md
domain = fm.get("domain", "")
if not domain:
# Infer domain from directory path
parts = fpath.split(os.sep)
if len(parts) >= 2:
domain = parts[1] if parts[0] == "domains" else parts[1] if len(parts) > 2 else parts[0]
# Agent attribution: git log → domain mapping → "collective"
agent = git_agents.get(fpath, "")
if not agent:
agent = DOMAIN_AGENT_MAP.get(domain, "collective")
created = fm.get("created", "")
confidence = fm.get("confidence", "speculative")
# Detect challenged status
challenged_by_raw = fm.get("challenged_by", [])
if isinstance(challenged_by_raw, str):
challenged_by_raw = [challenged_by_raw] if challenged_by_raw else []
has_challenged_by = bool(challenged_by_raw and any(c for c in challenged_by_raw))
has_counter_section = bool(COUNTER_EVIDENCE_RE.search(body) or COUNTERARGUMENT_RE.search(body))
is_challenged = has_challenged_by or has_counter_section
# Extract challenge descriptions for the node
challenges = []
if isinstance(challenged_by_raw, list):
for c in challenged_by_raw:
if c and isinstance(c, str):
# Strip wiki-link syntax for display
cleaned = WIKILINK_RE.sub(lambda m: m.group(1), c)
# Strip markdown list artifacts: leading "- ", surrounding quotes
cleaned = re.sub(r'^-\s*', '', cleaned).strip()
cleaned = cleaned.strip('"').strip("'").strip()
if cleaned:
challenges.append(cleaned[:200]) # cap length
node = {
"id": fpath,
"title": title,
"domain": domain,
"agent": agent,
"created": created,
"confidence": confidence,
"challenged": is_challenged,
}
if challenges:
node["challenges"] = challenges
nodes.append(node)
node_ids.add(fpath)
domain_cache[fpath] = domain # cache for edge lookups
for link_text in WIKILINK_RE.findall(body):
target = resolve_wikilink(link_text, title_index, os.path.dirname(fpath))
if target and target != fpath and target in all_files_set:
target_domain = _get_domain_cached(target, repo_root, domain_cache)
edges.append({
"source": fpath,
"target": target,
"type": "wiki-link",
"cross_domain": domain != target_domain and bool(target_domain),
})
# Conflict edges from challenged_by (may contain [[wiki-links]] or prose)
challenged_by = fm.get("challenged_by", [])
if isinstance(challenged_by, str):
challenged_by = [challenged_by]
if isinstance(challenged_by, list):
for challenge in challenged_by:
if not challenge:
continue
# Check for embedded wiki-links
for link_text in WIKILINK_RE.findall(challenge):
target = resolve_wikilink(link_text, title_index, os.path.dirname(fpath))
if target and target != fpath and target in all_files_set:
target_domain = _get_domain_cached(target, repo_root, domain_cache)
edges.append({
"source": fpath,
"target": target,
"type": "conflict",
"cross_domain": domain != target_domain and bool(target_domain),
})
# Deduplicate edges
seen_edges = set()
unique_edges = []
for e in edges:
key = (e["source"], e["target"], e.get("type", ""))
if key not in seen_edges:
seen_edges.add(key)
unique_edges.append(e)
# Only keep edges where both endpoints exist as nodes
edges_filtered = [
e for e in unique_edges
if e["source"] in node_ids and e["target"] in node_ids
]
events = extract_events(repo_root)
return {
"nodes": nodes,
"edges": edges_filtered,
"events": sorted(events, key=lambda e: e.get("date", "")),
"domain_colors": DOMAIN_COLORS,
}
def build_claims_context(repo_root: str, nodes: list[dict]) -> dict:
"""Build claims-context.json for chat system prompt injection.
Produces a lightweight claim index: title + description + domain + agent + confidence.
Sorted by domain, then alphabetically within domain.
Target: ~37KB for ~370 claims. If the serialized JSON exceeds 100,000 characters,
descriptions are truncated progressively (to 120, 100, 80, then 60 chars).
"""
claims = []
for node in nodes:
fpath = node["id"]
abs_path = os.path.join(repo_root, fpath)
description = ""
try:
text = open(abs_path, encoding="utf-8").read()
fm = parse_frontmatter(text)
description = fm.get("description", "")
except (OSError, UnicodeDecodeError):
pass
claims.append({
"title": node["title"],
"description": description,
"domain": node["domain"],
"agent": node["agent"],
"confidence": node["confidence"],
})
# Sort by domain, then title
claims.sort(key=lambda c: (c["domain"], c["title"]))
context = {
"generated": datetime.now(tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
"claimCount": len(claims),
"claims": claims,
}
# Progressive description truncation if over 100KB.
# Never drop descriptions entirely — short descriptions are better than none.
for max_desc in (120, 100, 80, 60):
test_json = json.dumps(context, ensure_ascii=False)
if len(test_json) <= 100_000:
break
for c in claims:
if len(c["description"]) > max_desc:
c["description"] = c["description"][:max_desc] + "..."
return context
def main():
parser = argparse.ArgumentParser(description="Extract graph data from teleo-codex")
parser.add_argument("--output", "-o", default="graph-data.json",
help="Output file path (default: graph-data.json)")
parser.add_argument("--context-output", "-c", default=None,
help="Output claims-context.json path (default: same dir as --output)")
parser.add_argument("--repo", "-r", default=".",
help="Path to teleo-codex repo root (default: current dir)")
args = parser.parse_args()
repo_root = os.path.abspath(args.repo)
if not os.path.isdir(os.path.join(repo_root, "core")):
print(f"Error: {repo_root} doesn't look like a teleo-codex repo (no core/ dir)", file=sys.stderr)
sys.exit(1)
print(f"Scanning {repo_root}...")
graph = extract_graph(repo_root)
print(f" Nodes: {len(graph['nodes'])}")
print(f" Edges: {len(graph['edges'])}")
print(f" Events: {len(graph['events'])}")
challenged_count = sum(1 for n in graph["nodes"] if n.get("challenged"))
print(f" Challenged: {challenged_count}")
# Write graph-data.json
output_path = os.path.abspath(args.output)
with open(output_path, "w", encoding="utf-8") as f:
json.dump(graph, f, indent=2, ensure_ascii=False)
size_kb = os.path.getsize(output_path) / 1024
print(f" graph-data.json: {output_path} ({size_kb:.1f} KB)")
# Write claims-context.json
context_path = args.context_output
if not context_path:
context_path = os.path.join(os.path.dirname(output_path), "claims-context.json")
context_path = os.path.abspath(context_path)
context = build_claims_context(repo_root, graph["nodes"])
with open(context_path, "w", encoding="utf-8") as f:
json.dump(context, f, indent=2, ensure_ascii=False)
ctx_kb = os.path.getsize(context_path) / 1024
print(f" claims-context.json: {context_path} ({ctx_kb:.1f} KB)")
if __name__ == "__main__":
main()
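
The progressive truncation in `build_claims_context` can be isolated for experimentation. The following is a minimal sketch under stated assumptions: `shrink_descriptions` is a hypothetical name, the budget is passed in rather than fixed at 100,000, and the size is measured in characters of the serialized JSON (as in the original loop), not bytes.

```python
import json


def shrink_descriptions(claims, budget_chars, steps=(120, 100, 80, 60)):
    """Truncate descriptions step by step until the JSON fits the budget.

    Mirrors the loop in build_claims_context: the size is checked before
    each step, so the result can still exceed the budget after the last
    step. Descriptions are shortened, never dropped.
    """
    for max_desc in steps:
        # Stop as soon as the serialized claims fit.
        if len(json.dumps(claims, ensure_ascii=False)) <= budget_chars:
            break
        for claim in claims:
            if len(claim["description"]) > max_desc:
                claim["description"] = claim["description"][:max_desc] + "..."
    return claims
```

Because the check runs before each truncation step, a very tight budget still leaves every description at up to 60 characters plus the ellipsis; whether that overshoot is acceptable depends on how strictly the consumer enforces the size limit.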


@ -1,368 +0,0 @@
#!/bin/bash
# Run a self-directed research session for one agent.
# Usage: ./research-session.sh <agent-name>
# Example: ./research-session.sh clay
#
# What it does:
# 1. Pulls latest tweets from the agent's network accounts (X API)
# 2. Gives Claude the agent's identity, beliefs, and current KB state
# 3. Agent picks a research direction and archives sources with notes
# 4. Commits source archives to a branch, pushes, opens PR
# 5. Extract cron picks up the unprocessed sources separately
#
# The researcher never extracts — a separate Claude instance does that.
# This prevents motivated reasoning in extraction.
set -euo pipefail
AGENT="${1:?Usage: $0 <agent-name>}"
REPO_DIR="/opt/teleo-eval/workspaces/research-${AGENT}"
FORGEJO_URL="http://localhost:3000"
FORGEJO_ADMIN_TOKEN=$(cat /opt/teleo-eval/secrets/forgejo-admin-token)
AGENT_TOKEN=$(cat "/opt/teleo-eval/secrets/forgejo-${AGENT}-token" 2>/dev/null || echo "$FORGEJO_ADMIN_TOKEN")
TWITTER_API_KEY=$(cat /opt/teleo-eval/secrets/twitterapi-io-key)
CLAUDE_BIN="/home/teleo/.local/bin/claude"
LOG_DIR="/opt/teleo-eval/logs"
LOG="$LOG_DIR/research-${AGENT}.log"
LOCKFILE="/tmp/research-${AGENT}.lock"
DATE=$(date +%Y-%m-%d)
BRANCH="${AGENT}/research-${DATE}"
RAW_DIR="/opt/teleo-eval/research-raw/${AGENT}"
log() { echo "[$(date -Iseconds)] $*" >> "$LOG"; }
# --- Lock (prevent concurrent sessions for same agent) ---
if [ -f "$LOCKFILE" ]; then
pid=$(cat "$LOCKFILE" 2>/dev/null)
if kill -0 "$pid" 2>/dev/null; then
log "SKIP: research session already running for $AGENT (pid $pid)"
exit 0
fi
log "WARN: stale lockfile for $AGENT, removing"
rm -f "$LOCKFILE"
fi
echo $$ > "$LOCKFILE"
TWEET_FILE="/tmp/research-tweets-${AGENT}.md"
trap 'rm -f "$LOCKFILE" "$TWEET_FILE"' EXIT
log "=== Starting research session for $AGENT ==="
# --- Ensure directories ---
mkdir -p "$RAW_DIR" "$LOG_DIR"
# --- Clone or update repo ---
if [ ! -d "$REPO_DIR/.git" ]; then
log "Cloning repo for $AGENT research..."
git -c http.extraHeader="Authorization: token $FORGEJO_ADMIN_TOKEN" \
clone "${FORGEJO_URL}/teleo/teleo-codex.git" "$REPO_DIR" >> "$LOG" 2>&1
fi
cd "$REPO_DIR"
git config credential.helper "!f() { echo username=m3taversal; echo password=$FORGEJO_ADMIN_TOKEN; }; f"
git remote set-url origin "${FORGEJO_URL}/teleo/teleo-codex.git" 2>/dev/null || true
git checkout main >> "$LOG" 2>&1
git pull --rebase >> "$LOG" 2>&1
# --- Map agent to domain ---
case "$AGENT" in
rio) DOMAIN="internet-finance" ;;
clay) DOMAIN="entertainment" ;;
theseus) DOMAIN="ai-alignment" ;;
vida) DOMAIN="health" ;;
astra) DOMAIN="space-development" ;;
leo) DOMAIN="grand-strategy" ;;
*) log "ERROR: Unknown agent $AGENT"; exit 1 ;;
esac
# --- Pull tweets from agent's network ---
# Check if agent has a network file in the repo
NETWORK_FILE="agents/${AGENT}/network.json"
if [ ! -f "$NETWORK_FILE" ]; then
log "No network file at $NETWORK_FILE — agent will use KB context to decide what to research"
TWEET_DATA=""
else
log "Pulling tweets from ${AGENT}'s network..."
ACCOUNTS=$(python3 -c "
import json
with open('$NETWORK_FILE') as f:
data = json.load(f)
for acct in data.get('accounts', []):
if acct.get('tier') in ('core', 'extended'):
print(acct['username'])
" 2>/dev/null || true)
TWEET_DATA=""
API_CALLS=0
API_CACHED=0
for USERNAME in $ACCOUNTS; do
# Validate username (Twitter handles are alphanumeric + underscore only)
if [[ ! "$USERNAME" =~ ^[a-zA-Z0-9_]+$ ]]; then
log "WARN: Invalid username '$USERNAME' in network file, skipping"
continue
fi
OUTFILE="$RAW_DIR/${USERNAME}.json"
# Only pull if file doesn't exist or is older than 12 hours
if [ ! -f "$OUTFILE" ] || [ -n "$(find "$OUTFILE" -mmin +720 2>/dev/null)" ]; then
log "Pulling @${USERNAME}..."
curl -s "https://api.twitterapi.io/twitter/user/last_tweets?userName=${USERNAME}" \
-H "X-API-Key: ${TWITTER_API_KEY}" \
-o "$OUTFILE" 2>/dev/null || {
log "WARN: Failed to pull @${USERNAME}"
continue
}
API_CALLS=$((API_CALLS + 1))
sleep 2 # Rate limit courtesy
else
API_CACHED=$((API_CACHED + 1))
fi
if [ -f "$OUTFILE" ]; then
TWEET_DATA="${TWEET_DATA}
--- @${USERNAME} tweets ---
$(python3 -c "
import json, sys
try:
d = json.load(open('$OUTFILE'))
tweets = d.get('tweets', d.get('data', []))
for t in tweets[:20]:
text = t.get('text', '')[:500]
likes = t.get('likeCount', t.get('public_metrics', {}).get('like_count', 0))
date = t.get('createdAt', t.get('created_at', 'unknown'))
url = t.get('twitterUrl', t.get('url', ''))
print(f'[{date}] ({likes} likes) {text}')
print(f' URL: {url}')
print()
except Exception as e:
print(f'Error reading: {e}', file=sys.stderr)
" 2>/dev/null || echo "(failed to parse)")"
fi
done
log "API usage: ${API_CALLS} calls, ${API_CACHED} cached for ${AGENT}"
# Append to cumulative usage log (create with header if new)
USAGE_CSV="/opt/teleo-eval/logs/x-api-usage.csv"
if [ ! -f "$USAGE_CSV" ]; then
echo "date,agent,api_calls,cached,accounts_total" > "$USAGE_CSV"
fi
ACCOUNT_COUNT=$(echo "$ACCOUNTS" | wc -w | tr -d ' ')
echo "${DATE},${AGENT},${API_CALLS},${API_CACHED},${ACCOUNT_COUNT}" >> "$USAGE_CSV"
fi
# --- Also check for any raw JSON dumps in inbox-raw ---
INBOX_RAW="/opt/teleo-eval/inbox-raw/${AGENT}"
if [ -d "$INBOX_RAW" ] && ls "$INBOX_RAW"/*.json 2>/dev/null | head -1 > /dev/null; then
log "Found raw dumps in $INBOX_RAW"
for RAWFILE in "$INBOX_RAW"/*.json; do
USERNAME=$(basename "$RAWFILE" .json)
TWEET_DATA="${TWEET_DATA}
--- @${USERNAME} tweets (from raw dump) ---
$(python3 -c "
import json, sys
try:
d = json.load(open('$RAWFILE'))
tweets = d.get('tweets', d.get('data', []))
for t in tweets[:20]:
text = t.get('text', '')[:500]
likes = t.get('likeCount', t.get('public_metrics', {}).get('like_count', 0))
date = t.get('createdAt', t.get('created_at', 'unknown'))
url = t.get('twitterUrl', t.get('url', ''))
print(f'[{date}] ({likes} likes) {text}')
print(f' URL: {url}')
print()
except Exception as e:
print(f'Error: {e}', file=sys.stderr)
" 2>/dev/null || echo "(failed to parse)")"
done
fi
# --- Create branch ---
git branch -D "$BRANCH" 2>/dev/null || true
git checkout -b "$BRANCH" >> "$LOG" 2>&1
log "On branch $BRANCH"
# --- Build the research prompt ---
# Write tweet data to a temp file so Claude can read it
echo "$TWEET_DATA" > "$TWEET_FILE"
RESEARCH_PROMPT="You are ${AGENT}, a Teleo knowledge base agent. Domain: ${DOMAIN}.
## Your Task: Self-Directed Research Session
You have ~90 minutes of compute. Use it wisely.
### Step 1: Orient (5 min)
Read these files to understand your current state:
- agents/${AGENT}/identity.md (who you are)
- agents/${AGENT}/beliefs.md (what you believe)
- agents/${AGENT}/reasoning.md (how you think)
- domains/${DOMAIN}/_map.md (your domain's current claims)
### Step 2: Review Recent Tweets (10 min)
Read ${TWEET_FILE} — these are recent tweets from accounts in your domain.
Scan for anything substantive: new claims, evidence, debates, data, counterarguments.
### Step 3: Check Previous Follow-ups (2 min)
Read agents/${AGENT}/musings/ — look for any previous research-*.md files. If they exist, check the 'Follow-up Directions' section at the bottom. These are threads your past self flagged but didn't have time to cover. Give them priority when picking your direction.
### Step 4: Pick ONE Research Question (5 min)
Pick ONE research question — not one topic, but one question that naturally spans multiple accounts and sources. 'How is capital flowing through Solana launchpads?' is one question even though it touches MetaDAO, SOAR, Futardio.
**Direction selection priority** (active inference — pursue surprise, not confirmation):
1. Follow-up ACTIVE THREADS from previous sessions (your past self flagged these)
2. Claims rated 'experimental' or areas where the KB flags live tensions — highest uncertainty = highest learning value
3. Evidence that CHALLENGES your beliefs, not confirms them
4. Cross-domain connections flagged by other agents
5. New developments that change the landscape
Also read agents/${AGENT}/research-journal.md if it exists — this is your cross-session pattern tracker.
Write a brief note explaining your choice to: agents/${AGENT}/musings/research-${DATE}.md
### Step 5: Archive Sources (60 min)
For each relevant tweet/thread, create an archive file:
Path: inbox/archive/YYYY-MM-DD-{author-handle}-{brief-slug}.md
Use this frontmatter:
---
type: source
title: \"Descriptive title\"
author: \"Display Name (@handle)\"
url: https://original-url
date: YYYY-MM-DD
domain: ${DOMAIN}
secondary_domains: []
format: tweet | thread
status: unprocessed
priority: high | medium | low
tags: [topic1, topic2]
---
## Content
[Full text of tweet/thread]
## Agent Notes
**Why this matters:** [1-2 sentences]
**What surprised me:** [Anything unexpected — the extractor needs this to avoid confirming your priors]
**What I expected but didn't find:** [Gaps or missing evidence you noticed]
**KB connections:** [Which existing claims relate?]
**Extraction hints:** [What claims might an extractor pull?]
**Context:** [Who is the author, what debate is this part of?]
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: [exact claim title this source most relates to]
WHY ARCHIVED: [what pattern or tension this evidences]
EXTRACTION HINT: [what the extractor should focus on — scopes attention]
### Step 5 Rules:
- Archive EVERYTHING substantive, not just what supports your views
- Set all sources to status: unprocessed (a DIFFERENT instance will extract)
- Flag cross-domain sources with flagged_for_{agent}: [\"reason\"]
- Do NOT extract claims yourself — write good notes so the extractor can
- Check inbox/archive/ for duplicates before creating new archives
- Aim for 5-15 source archives per session
### Step 6: Flag Follow-up Directions (5 min)
At the bottom of your research musing (agents/${AGENT}/musings/research-${DATE}.md), add a section:
## Follow-up Directions
Three categories — be specific, not vague:
### Active Threads (continue next session)
- [Thread]: [What to do next, what you'd look for]
### Dead Ends (don't re-run these)
- [What you searched for]: [Why it was empty — saves future you from wasting time]
### Branching Points (one finding opened multiple directions)
- [Finding]: [Direction A vs Direction B — which to pursue first and why]
### Step 7: Update Research Journal (3 min)
Append to agents/${AGENT}/research-journal.md (create if it doesn't exist). This is your cross-session memory — NOT the same as the musing.
Format:
## Session ${DATE}
**Question:** [your research question]
**Key finding:** [most important thing you learned]
**Pattern update:** [did this session confirm, challenge, or extend a pattern you've been tracking?]
**Confidence shift:** [did any of your beliefs get stronger or weaker?]
The journal accumulates session over session. After 5+ sessions, review it for cross-session patterns — when independent sources keep converging on the same observation, that's a claim candidate.
### Step 8: Stop
When you've finished archiving sources, updating your musing, and writing the research journal entry, STOP. Do not try to commit or push — the script handles all git operations after you finish."
# --- Run Claude research session ---
log "Starting Claude research session..."
timeout 5400 "$CLAUDE_BIN" -p "$RESEARCH_PROMPT" \
--allowedTools 'Read,Write,Edit,Glob,Grep' \
--model sonnet \
--permission-mode bypassPermissions \
>> "$LOG" 2>&1 || {
log "WARN: Research session failed or timed out for $AGENT"
git checkout main >> "$LOG" 2>&1
exit 1
}
log "Claude session complete"
# --- Check for changes ---
CHANGED_FILES=$(git status --porcelain)
if [ -z "$CHANGED_FILES" ]; then
log "No sources archived by $AGENT"
git checkout main >> "$LOG" 2>&1
exit 0
fi
# --- Stage and commit ---
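# Stage each path separately: git add stages nothing at all if any one pathspec is missing
for STAGE_PATH in inbox/archive/ "agents/${AGENT}/musings/" "agents/${AGENT}/research-journal.md"; do
git add "$STAGE_PATH" 2>/dev/null || true
done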
if git diff --cached --quiet; then
log "No valid changes to commit"
git checkout main >> "$LOG" 2>&1
exit 0
fi
AGENT_UPPER=$(echo "$AGENT" | sed 's/./\U&/')
# grep -c already prints 0 when nothing matches (and exits 1), so only the exit status needs masking
SOURCE_COUNT=$(git diff --cached --name-only | grep -c "^inbox/archive/" || true)
git commit -m "${AGENT}: research session ${DATE} (${SOURCE_COUNT} sources archived)

Pentagon-Agent: ${AGENT_UPPER} <HEADLESS>" >> "$LOG" 2>&1
# --- Push ---
git push -u origin "$BRANCH" --force >> "$LOG" 2>&1
log "Pushed $BRANCH"
# --- Check for existing PR on this branch ---
EXISTING_PR=$(curl -s "${FORGEJO_URL}/api/v1/repos/teleo/teleo-codex/pulls?state=open" \
-H "Authorization: token $AGENT_TOKEN" \
| jq -r ".[] | select(.head.ref == \"$BRANCH\") | .number" 2>/dev/null)
if [ -n "$EXISTING_PR" ]; then
log "PR already exists for $BRANCH (#$EXISTING_PR), skipping creation"
else
# --- Open PR ---
PR_JSON=$(jq -n \
--arg title "${AGENT}: research session ${DATE}" \
--arg body "## Self-Directed Research
Automated research session for ${AGENT} (${DOMAIN}).
Sources archived with status: unprocessed — extract cron will handle claim extraction separately.
Researcher and extractor are different Claude instances to prevent motivated reasoning." \
--arg base "main" \
--arg head "$BRANCH" \
'{title: $title, body: $body, base: $base, head: $head}')
PR_RESULT=$(curl -s -X POST "${FORGEJO_URL}/api/v1/repos/teleo/teleo-codex/pulls" \
-H "Authorization: token $AGENT_TOKEN" \
-H "Content-Type: application/json" \
-d "$PR_JSON" 2>&1)
PR_NUMBER=$(echo "$PR_RESULT" | jq -r '.number // "unknown"' 2>/dev/null || echo "unknown")
log "PR #${PR_NUMBER} opened for ${AGENT}'s research session"
fi
# --- Back to main ---
git checkout main >> "$LOG" 2>&1
log "=== Research session complete for $AGENT ==="

View file

@ -1,169 +0,0 @@
# Self-Directed Research Architecture
Draft — Leo, 2026-03-10
## Core Idea
Each agent gets a daily research session on the VPS. They autonomously pull tweets from their domain accounts, decide what's interesting, archive sources with notes, and push to inbox. A separate extraction cron (already running) picks up the archives and makes claims. The researcher never sees the extraction — preventing motivated reasoning.
## Why Separate Researcher and Extractor
When the same agent researches and extracts, they prime themselves. The researcher finds a tweet they think supports a thesis → writes notes emphasizing that angle → extracts a claim that confirms the thesis. The extraction becomes a formality.
Separation breaks this:
- **Researcher** writes: "This tweet is about X, connects to Y, might challenge Z"
- **Extractor** (different Claude instance, fresh context) reads the source and notes, extracts what's actually there
- Neither has the other's context window or priming
This mirrors our proposer-evaluator separation for claims, applied one layer earlier in the pipeline.
## Architecture
### Three cron stages on VPS
```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Research Cron  │────▶│  Extract Cron    │────▶│  Eval Pipeline  │
│ (daily, 90 min) │     │  (every 5 min)   │     │  (webhook.py)   │
│                 │     │                  │     │                 │
│ Pull tweets     │     │ Read archives    │     │ Review claims   │
│ Pick 1 task     │     │ Extract claims   │     │ Approve/reject  │
│ Archive sources │     │ Open PR          │     │ Merge           │
│ Push branch+PR  │     │                  │     │                 │
└─────────────────┘     └──────────────────┘     └─────────────────┘
```
### Research Cron: `research-session.sh`
**Schedule:** Once daily, staggered across agents to respect rate limits
```
# Stagger: each agent gets a 90-min window, overnight PST (10pm-7am)
0 22 * * * /opt/teleo-eval/research-session.sh rio
30 23 * * * /opt/teleo-eval/research-session.sh clay
0 1 * * * /opt/teleo-eval/research-session.sh theseus
30 2 * * * /opt/teleo-eval/research-session.sh vida
0 4 * * * /opt/teleo-eval/research-session.sh astra
30 5 * * * /opt/teleo-eval/research-session.sh leo
```
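The stagger above follows a fixed pattern: the first agent starts at 22:00 and each later agent starts 90 minutes after the previous one. A minimal sketch that regenerates those crontab lines, assuming the same script path and agent order (the function name `stagger_crontab` is hypothetical, not part of the repo):

```python
from datetime import datetime, timedelta

AGENTS = ["rio", "clay", "theseus", "vida", "astra", "leo"]
SCRIPT = "/opt/teleo-eval/research-session.sh"


def stagger_crontab(agents, start_hour=22, window_min=90):
    """Return one crontab line per agent, each offset by window_min minutes."""
    # The calendar date is arbitrary; only the hour and minute are used.
    start = datetime(2000, 1, 1, start_hour, 0)
    lines = []
    for i, agent in enumerate(agents):
        slot = start + timedelta(minutes=i * window_min)
        lines.append(f"{slot.minute} {slot.hour} * * * {SCRIPT} {agent}")
    return lines


for line in stagger_crontab(AGENTS):
    print(line)
```

Generating the lines rather than hand-editing them makes it harder for two agents to end up in overlapping windows when the agent list changes.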
**Per agent, the research session (~90 min):**
1. Pull latest tweets from agent's network accounts (X API)
2. Read the agent's beliefs, recent claims, open positions
3. Claude prompt: "You are {agent}. Here are your latest tweets from {accounts}. Here is your current knowledge state. Pick ONE research direction that advances your domain understanding. Archive the most relevant sources with notes."
4. Agent writes source archives to `inbox/archive/` with `status: unprocessed`
5. Commit, push to branch, open PR (source-only, no claims)
6. Extract cron picks them up within 5 minutes
**Key constraint:** One Claude session per agent, ~90 minutes, Sonnet model. Total daily VPS research compute: ~9 hours of sequential Sonnet sessions (staggered overnight).
### Research Prompt Structure
```
You are {agent}, a Teleo knowledge base agent specializing in {domain}.
## Your Current State
{Read from agents/{agent}/beliefs.md, reasoning.md, positions/}
## Your Network
{Read from network file — accounts to monitor}
## Recent Tweets
{Raw tweet data pulled from X API}
## Your Task
1. Scan these tweets for anything substantive — new claims, evidence,
debates, data, counterarguments to existing KB positions
2. Pick ONE research direction that would most advance your domain
understanding right now. Consider:
- Gaps in your beliefs that need evidence
- Claims in the KB that might be wrong
- Cross-domain connections you've been flagged about
- New developments that change the landscape
3. Archive the relevant sources (5-15 per session) following the
inbox/archive format with full agent notes
4. Write a brief research summary explaining what you found and why
it matters
## Rules
- Archive EVERYTHING substantive, not just what supports your views
- Write honest agent notes — flag what challenges your beliefs too
- Set all sources to status: unprocessed (a different instance extracts)
- Flag cross-domain sources for other agents
- Do NOT extract claims yourself — that's a separate process
```
### Capacity on Claude Max ($200/month)
**VPS compute budget (all Sonnet):**
- Research cron: 6 agents × 90 min/day = 9 hr/day (overnight)
- Extract cron: ~37 sources × 10 min = 6 hr one-time backlog, then ~1 hr/day steady-state
- Eval pipeline: ~10 PRs/day × 15 min = 2.5 hr/day
- **Total VPS:** ~12.5 hr/day Sonnet (steady state: 9 hr research + 1 hr extraction + 2.5 hr eval)
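As a quick arithmetic check, the per-stage figures listed above can be summed directly. This is only a sketch of the budget math; the variable names are illustrative and the inputs are the estimates from the list, not measurements.

```python
# Steady-state VPS hours per day, using the estimates from the list above.
research_hr = 6 * 90 / 60   # 6 agents x 90 min/day
extract_hr = 1.0            # steady-state extraction
eval_hr = 10 * 15 / 60      # ~10 PRs/day x 15 min

total_hr = research_hr + extract_hr + eval_hr
backlog_hr = 37 * 10 / 60   # one-time extraction backlog (~37 sources x 10 min)

print(f"research={research_hr} extract={extract_hr} eval={eval_hr}")
print(f"steady-state total={total_hr} hr/day, one-time backlog={backlog_hr:.1f} hr")
```

Because the research sessions are sequential, the research share alone fills the overnight window, so extraction and eval would have to overlap with it or run during the day.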
**Laptop compute budget (Opus + Sonnet mix):**
- Agent sessions: 2-3 concurrent, ~4-6 hr/day
- Leo coordination: ~1-2 hr/day
**Single subscription feasibility:** Tight but workable if:
- VPS runs overnight (10pm-7am staggered research + continuous extraction)
- Laptop agents run during the day
- Never more than 2-3 concurrent sessions total
- VPS uses Sonnet exclusively (lighter on rate limits than Opus)
**Risk:** If rate limits tighten or daily message caps exist, the VPS research cron may not complete all 6 agents. Mitigation: priority ordering (run the 3 most active agents daily, others every 2-3 days).
## Contributor Workflow Options
Different people want different levels of involvement:
### Mode 1: Full Researcher
"I found this, here's why it matters, here are the KB connections"
- Uses /ingest on laptop (Track A or B)
- Writes detailed agent notes
- May extract claims themselves
- Highest quality input
### Mode 2: Curator
"Here's a source, it's about X domain"
- Minimal archive file with domain tag and brief notes
- VPS extracts (Track B)
- Good enough for most sources
### Mode 3: Raw Dump
"Here are tweets, figure it out"
- Dumps raw JSON to VPS inbox-raw/
- Leo triages: decides domain, writes archive files
- VPS extracts from Leo's archives
- Lowest effort, decent quality (Leo's triage catches the important stuff)
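Raw dumps may arrive in more than one JSON shape. A minimal sketch of normalizing them, assuming only the two field layouts that `research-session.sh` already handles (`tweets` / `likeCount` / `createdAt` / `twitterUrl` versus `data` / `public_metrics.like_count` / `created_at` / `url`); the function name `normalize_tweets` is hypothetical:

```python
import json


def normalize_tweets(dump, limit=20, max_chars=500):
    """Flatten a raw dump into dicts with text, likes, date and url keys."""
    tweets = dump.get("tweets", dump.get("data", []))
    rows = []
    for t in tweets[:limit]:
        rows.append({
            # "or" guards against an explicit null text field
            "text": (t.get("text") or "")[:max_chars],
            "likes": t.get("likeCount", t.get("public_metrics", {}).get("like_count", 0)),
            "date": t.get("createdAt", t.get("created_at", "unknown")),
            "url": t.get("twitterUrl", t.get("url", "")),
        })
    return rows


sample = json.loads(
    '{"data": [{"text": "hello", "public_metrics": {"like_count": 3}, "created_at": "2026-03-09"}]}'
)
print(normalize_tweets(sample))
```

Normalizing once at triage time means the archive-writing step only has to deal with a single shape.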
### Mode 4: Self-Directed Agent (VPS)
"Agent, go research your domain"
- No human involvement beyond initial network setup
- Daily cron pulls tweets, agent picks direction, archives, extraction follows
- Quality depends on prompt engineering + eval pipeline catching errors
All four modes feed into the same extraction → eval pipeline. Quality varies, but the eval pipeline is the quality gate regardless.
## Open Questions
1. **Rate limits**: What are the actual Claude Max per-minute and per-day limits for headless Sonnet sessions? Need empirical data from this first extraction run.
2. **Research quality**: Will a 90-minute Sonnet session produce good enough research notes? Or does research require Opus-level reasoning?
3. **Network bootstrapping**: Agents need network files. Who curates the initial account lists? (Currently Cory + Leo, eventually agents propose additions)
4. **Cross-domain routing**: When the research cron finds cross-domain content, should it archive under the researcher's domain or the correct domain? (Probably correct domain with flagged_for_{researcher})
5. **Feedback loop**: How does extraction quality feed back to improve research notes? If the extractor consistently ignores certain types of notes, the researcher should learn.
6. **Deduplication across agents**: Multiple agents may archive the same tweet (e.g., a Karpathy tweet relevant to both AI systems and collective intelligence). The extract cron needs to detect this.
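For the deduplication question, one possible starting point is to group archive files by their `url` frontmatter field. The sketch below is an assumption about how detection could work, not a description of the extract cron; `find_duplicate_urls` is a hypothetical helper, and it treats the first `url:` line in each file as the source URL.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path

URL_LINE = re.compile(r"^url:\s*(\S+)\s*$", re.MULTILINE)


def find_duplicate_urls(archive_dir):
    """Map each url that appears in more than one archive file to those files."""
    by_url = defaultdict(list)
    for path in sorted(Path(archive_dir).glob("*.md")):
        match = URL_LINE.search(path.read_text(encoding="utf-8"))
        if match:
            # Ignore a trailing slash so trivially different URLs still collide.
            by_url[match.group(1).rstrip("/")].append(path.name)
    return {url: names for url, names in by_url.items() if len(names) > 1}


# Demo on a throwaway directory with two archives of the same tweet.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("2026-03-09-a-x.md", "2026-03-09-b-x.md"):
        Path(tmp, name).write_text("---\ntype: source\nurl: https://example.com/post/1\n---\n")
    Path(tmp, "2026-03-09-c-y.md").write_text("---\nurl: https://example.com/post/2\n---\n")
    duplicates = find_duplicate_urls(tmp)
    print(duplicates)
```

URL matching only catches exact re-archives; quote tweets or mirrored articles would need a looser comparison.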
## Implementation Order
1. ✅ Extract cron (running now — validating extraction quality)
2. **Next**: Research cron — daily self-directed sessions per agent
3. **Then**: Raw dump path — Leo triage from JSON → archive
4. **Later**: Full end-to-end with X API pull integrated into research cron
5. **Eventually**: Feedback loops from eval quality → research prompt tuning

View file

@ -1,201 +0,0 @@
# Skill: Ingest
Research your domain, find source material, and archive it in inbox/. You choose whether to extract claims yourself or let the VPS handle it.
**Archive everything.** The inbox is a library, not a filter. If it's relevant to any Teleo domain, archive it. Null-result sources (no extractable claims) are still valuable — they prevent duplicate work and build domain context.
## Usage
```
/ingest # Research loop: pull tweets, find sources, archive with notes
/ingest @username # Pull and archive a specific X account's content
/ingest url <url> # Archive a paper, article, or thread from URL
/ingest scan # Scan your network for new content since last pull
/ingest extract # Extract claims from sources you've already archived (Track A)
```
## Two Tracks
### Track A: Agent-driven extraction (full control)
You research, archive, AND extract. You see exactly what you're proposing before it goes up.
1. Archive sources with `status: processing`
2. Extract claims yourself using `skills/extract.md`
3. Open a PR with both source archives and claim files
4. Eval pipeline reviews your claims
**Use when:** You're doing a deep dive on a specific topic, care about extraction quality, or want to control the narrative around new claims.
### Track B: VPS extraction (hands-off)
You research and archive. The VPS extracts headlessly.
1. Archive sources with `status: unprocessed`
2. Push source-only PR (merges fast — no claim changes)
3. VPS cron picks up unprocessed sources every 15 minutes
4. Extracts claims via Claude headless, opens a separate PR
5. Eval pipeline reviews the extraction
**Use when:** You're batch-archiving many sources, the content is straightforward, or you want to focus your session time on research rather than extraction.
### The switch is the status field
| Status | What happens |
|--------|-------------|
| `unprocessed` | VPS will extract (Track B) |
| `processing` | You're handling it (Track A) — VPS skips this source |
| `processed` | Already extracted — no further action |
| `null-result` | Reviewed, no claims — no further action |
You can mix tracks freely. Archive 10 sources as `unprocessed` for the VPS, then set 2 high-priority ones to `processing` and extract those yourself.
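The status switch can be sketched as a small frontmatter check. This is a minimal illustration that assumes the frontmatter is the leading `---` block shown in this skill; the function names are hypothetical and the real VPS cron may select sources differently.

```python
def frontmatter_status(text):
    """Return the status value from a leading '---' frontmatter block, or None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() == "status":
            # Drop any trailing "# comment" before returning the value.
            return value.split("#", 1)[0].strip()
    return None


def vps_should_extract(text):
    """Only 'unprocessed' sources are picked up for Track B extraction."""
    return frontmatter_status(text) == "unprocessed"


track_b = "---\ntype: source\nstatus: unprocessed\n---\n## Content\n"
track_a = "---\ntype: source\nstatus: processing  # handling this one myself\n---\n"
print(vps_should_extract(track_b), vps_should_extract(track_a))
```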
## Prerequisites
- API key at `~/.pentagon/secrets/twitterapi-io-key`
- Your network file at `~/.pentagon/workspace/collective/x-ingestion/{your-name}-network.json`
- Forgejo token at `~/.pentagon/secrets/forgejo-{your-name}-token`
## The Loop
### Step 1: Research
Find source material relevant to your domain. Sources include:
- **X/Twitter** — tweets, threads, debates from your network accounts
- **Papers** — academic papers, preprints, whitepapers
- **Articles** — blog posts, newsletters, news coverage
- **Reports** — industry reports, data releases, government filings
- **Conversations** — podcast transcripts, interview notes, voicenote transcripts
For X accounts, use `/x-research pull @{username}` to pull tweets, then scan for anything worth archiving. Don't just archive the "best" tweets — archive anything substantive. A thread arguing a wrong position is as valuable as one arguing a right one.
### Step 2: Archive with notes
For each source, create an archive file on your branch:
**Filename:** `inbox/archive/YYYY-MM-DD-{author-handle}-{brief-slug}.md`
```yaml
---
type: source
title: "Descriptive title of the content"
author: "Display Name (@handle)"
twitter_id: "numeric_id_from_author_object" # X sources only
url: https://original-url
date: YYYY-MM-DD
domain: internet-finance | entertainment | ai-alignment | health | space-development | grand-strategy
secondary_domains: [other-domain] # if cross-domain
format: tweet | thread | essay | paper | whitepaper | report | newsletter | news | transcript
status: unprocessed | processing # unprocessed = VPS extracts; processing = you extract
priority: high | medium | low
tags: [topic1, topic2]
flagged_for_rio: ["reason"] # if relevant to another agent's domain
---
```
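The filename convention above can be sketched as follows. The slug rules (lowercase, alphanumerics only, at most six words) are assumptions made for illustration; the skill only fixes the overall `YYYY-MM-DD-{author-handle}-{brief-slug}.md` shape, and both function names are hypothetical.

```python
import re
from datetime import date


def slugify(text, max_words=6):
    """Lowercase, keep alphanumeric runs, join the first few words with hyphens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words[:max_words])


def archive_path(published, handle, title):
    """Build inbox/archive/YYYY-MM-DD-{author-handle}-{brief-slug}.md."""
    handle_slug = slugify(handle.lstrip("@"))
    return f"inbox/archive/{published.isoformat()}-{handle_slug}-{slugify(title)}.md"


print(archive_path(date(2026, 3, 9), "@example", "Capital flows through Solana launchpads"))
```

Deriving the path from the same fields that go into the frontmatter keeps filenames predictable, which makes the duplicate check against `inbox/archive/` easier.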
**Body:** Include the full source text, then your research notes.
```markdown
## Content
[Full text of tweet/thread/article. For long papers, include abstract + key sections.]
## Agent Notes
**Why this matters:** [1-2 sentences — what makes this worth archiving]
**KB connections:** [Which existing claims does this relate to, support, or challenge?]
**Extraction hints:** [What claims might the extractor pull from this? Flag specific passages.]
**Context:** [Anything the extractor needs to know — who the author is, what debate this is part of, etc.]
```
The "Agent Notes" section is critical for Track B. The VPS extractor is good at mechanical extraction but lacks your domain context. Your notes guide it. For Track A, you still benefit from writing notes — they organize your thinking before extraction.
### Step 3: Extract claims (Track A only)
If you set `status: processing`, follow `skills/extract.md`:
1. Read the source completely
2. Separate evidence from interpretation
3. Extract candidate claims (specific, disagreeable, evidence-backed)
4. Check for duplicates against existing KB
5. Write claim files to `domains/{your-domain}/`
6. Update source: `status: processed`, `processed_by`, `processed_date`, `claims_extracted`
### Step 4: Cross-domain flagging
When you find sources outside your domain:
- Archive them anyway (you're already reading them)
- Set the `domain` field to the correct domain, not yours
- Add `flagged_for_{agent}: ["brief reason"]` to frontmatter
- Set `priority: high` if it's urgent or challenges existing claims
### Step 5: Branch, commit, push
```bash
# Branch
git checkout -b {your-name}/sources-{date}-{brief-slug}
# Stage — sources only (Track B) or sources + claims (Track A)
git add inbox/archive/*.md
git add domains/{your-domain}/*.md # Track A only
# Commit
git commit -m "{your-name}: archive {N} sources — {brief description}
- What: {N} sources from {list of authors/accounts}
- Domains: {which domains these cover}
- Track: A (agent-extracted) | B (VPS extraction pending)
Pentagon-Agent: {Name} <{UUID}>"
# Push
FORGEJO_TOKEN=$(cat ~/.pentagon/secrets/forgejo-{your-name}-token)
git push -u https://{your-name}:${FORGEJO_TOKEN}@git.livingip.xyz/teleo/teleo-codex.git {branch-name}
```
Open a PR:
```bash
curl -s -X POST "https://git.livingip.xyz/api/v1/repos/teleo/teleo-codex/pulls" \
-H "Authorization: token ${FORGEJO_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"title": "{your-name}: {archive N sources | extract N claims} — {brief description}",
"body": "## Sources\n{numbered list with titles and domains}\n\n## Claims (Track A only)\n{claim titles}\n\n## Track B sources (VPS extraction pending)\n{list of unprocessed sources}",
"base": "main",
"head": "{branch-name}"
}'
```
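Hand-written JSON inside a single-quoted `curl -d` string breaks as soon as the PR body contains a quote or a literal newline. A minimal sketch of building the same request with proper JSON encoding, using only the endpoint and fields shown above; `build_pr_request` is a hypothetical helper, the token and branch name are placeholders, and the request is prepared but deliberately not sent:

```python
import json
import urllib.request


def build_pr_request(base_url, token, title, body, head, base="main"):
    """Prepare (but do not send) the POST request that opens a pull request."""
    payload = {"title": title, "body": body, "base": base, "head": head}
    return urllib.request.Request(
        f"{base_url}/api/v1/repos/teleo/teleo-codex/pulls",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_pr_request(
    "https://git.livingip.xyz",
    "PLACEHOLDER_TOKEN",
    'rio: archive 3 sources, "launchpad" thread',
    "## Sources\n1. Example source (internet-finance)",
    "rio/sources-2026-03-09-example",
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it; omitted so the sketch has no side effects.
```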
## Network Management
Your network file (`{your-name}-network.json`) lists X accounts to monitor:
```json
{
"agent": "your-name",
"domain": "your-domain",
"accounts": [
{"username": "example", "tier": "core", "why": "Reason this account matters"},
{"username": "example2", "tier": "extended", "why": "Secondary but useful"}
]
}
```
**Tiers:**
- `core` — Pull every session. High signal-to-noise.
- `extended` — Pull weekly or when specifically relevant.
- `watch` — Pull once to evaluate, then promote or drop.
Agents without a network file should create one as their first task. Start with 5-10 seed accounts.
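The tier rules can be sketched as a selection function over the network file. This is an illustrative reading of the tiers, not the behavior of `/x-research`: it models `extended` as weekly only (ignoring the "when specifically relevant" case), and the `weekly_run` and `evaluated` parameters, the third sample account, and the function name are all assumptions.

```python
import json

NETWORK_JSON = """
{
  "agent": "your-name",
  "domain": "your-domain",
  "accounts": [
    {"username": "example", "tier": "core", "why": "Reason this account matters"},
    {"username": "example2", "tier": "extended", "why": "Secondary but useful"},
    {"username": "example3", "tier": "watch", "why": "Evaluating"}
  ]
}
"""


def accounts_to_pull(network, weekly_run=False, evaluated=frozenset()):
    """Pick usernames for this session according to their tier."""
    selected = []
    for account in network["accounts"]:
        tier = account.get("tier")
        name = account["username"]
        if tier == "core":
            selected.append(name)  # every session
        elif tier == "extended" and weekly_run:
            selected.append(name)  # weekly pass only
        elif tier == "watch" and name not in evaluated:
            selected.append(name)  # single evaluation pull
    return selected


network = json.loads(NETWORK_JSON)
print(accounts_to_pull(network))
print(accounts_to_pull(network, weekly_run=True, evaluated={"example3"}))
```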
## Quality Controls
- **Archive everything substantive.** Don't self-censor. The extractor decides what yields claims.
- **Write good notes.** Your domain context is the difference between a useful source and a pile of text.
- **Check for duplicates.** Don't re-archive sources already in `inbox/archive/`.
- **Flag cross-domain.** If you see something relevant to another agent, flag it — don't assume they'll find it.
- **Log API costs.** Every X pull gets logged to `~/.pentagon/workspace/collective/x-ingestion/pull-log.jsonl`.
- **Source diversity.** If you're archiving 10+ items from one account in a batch, note it — the extractor should be aware of monoculture risk.
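The source-diversity check is simple to automate. A minimal sketch, assuming a batch is represented as a list of author handles and using the 10-item threshold mentioned above; `monoculture_warnings` is a hypothetical name:

```python
from collections import Counter


def monoculture_warnings(authors, threshold=10):
    """Return {author: count} for authors with at least `threshold` items in a batch."""
    counts = Counter(authors)
    return {author: n for author, n in counts.items() if n >= threshold}


batch = ["@example"] * 11 + ["@example2"] * 3
print(monoculture_warnings(batch))
```

The result could be pasted into the PR body so the extractor sees the concentration before reading the sources.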