Compare commits


7 commits

Author SHA1 Message Date
Leo
8344a5d259 Merge branch 'main' into m3taversal/leo-14ff9c29 2026-03-09 19:57:36 +00:00
131d939759 leo: add collective AI alignment section to README
- What: Added "Why AI agents" section explaining co-evolution, adversarial review, and structural safety
- Why: README described what agents do but not why collective AI matters for alignment
- Connections: Links to existing claims on alignment, coordination, collective intelligence

Pentagon-Agent: Leo <14FF9C29-CABF-40C8-8808-B0B495D03FF8>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 19:55:10 +00:00
c8bed09893 Auto: 2 files | 2 files changed, 20 insertions(+), 2 deletions(-) 2026-03-09 19:52:17 +00:00
44c6cc1454 Auto: README.md | 1 file changed, 52 insertions(+) 2026-03-09 19:51:44 +00:00
0dc9a68586 Auto: docs/ingestion-daemon-onboarding.md | 1 file changed, 144 insertions(+), 269 deletions(-) 2026-03-09 19:18:35 +00:00
5db0c660b2 Auto: docs/ingestion-daemon-onboarding.md | 1 file changed, 203 insertions(+), 77 deletions(-) 2026-03-09 19:12:22 +00:00
ec1da89f1f Auto: docs/ingestion-daemon-onboarding.md | 1 file changed, 227 insertions(+) 2026-03-09 19:10:24 +00:00
15 changed files with 443 additions and 1514 deletions


@ -1,67 +0,0 @@
name: Sync Graph Data to teleo-app
# Runs on every merge to main. Extracts graph data from the codex and
# pushes graph-data.json + claims-context.json to teleo-app/public/.
# This triggers a Vercel rebuild automatically.
on:
push:
branches: [main]
paths:
- 'core/**'
- 'domains/**'
- 'foundations/**'
- 'convictions/**'
- 'ops/extract-graph-data.py'
workflow_dispatch: # manual trigger
jobs:
sync:
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- name: Checkout teleo-codex
uses: actions/checkout@v4
with:
fetch-depth: 0 # full history for git log agent attribution
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Run extraction
run: |
python3 ops/extract-graph-data.py \
--repo . \
--output /tmp/graph-data.json \
--context-output /tmp/claims-context.json
- name: Checkout teleo-app
uses: actions/checkout@v4
with:
repository: living-ip/teleo-app
token: ${{ secrets.TELEO_APP_TOKEN }}
path: teleo-app
- name: Copy data files
run: |
cp /tmp/graph-data.json teleo-app/public/graph-data.json
cp /tmp/claims-context.json teleo-app/public/claims-context.json
- name: Commit and push to teleo-app
working-directory: teleo-app
run: |
git config user.name "teleo-codex-bot"
git config user.email "bot@livingip.io"
git add public/graph-data.json public/claims-context.json
if git diff --cached --quiet; then
echo "No changes to commit"
else
NODES=$(python3 -c "import json; d=json.load(open('public/graph-data.json')); print(len(d['nodes']))")
EDGES=$(python3 -c "import json; d=json.load(open('public/graph-data.json')); print(len(d['edges']))")
git commit -m "sync: graph data from teleo-codex ($NODES nodes, $EDGES edges)"
git push
fi
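The node and edge counting in the last workflow step can be sketched as one small Python function instead of two inline `python3 -c` calls. This is a minimal sketch, assuming only what the workflow itself shows: `graph-data.json` has top-level `nodes` and `edges` arrays. The function name `sync_commit_message` and the sample data are hypothetical.

```python
import json


def sync_commit_message(graph_json: str) -> str:
    """Build the sync commit subject from the text of graph-data.json."""
    data = json.loads(graph_json)
    # The workflow reads the top-level "nodes" and "edges" arrays.
    nodes = len(data["nodes"])
    edges = len(data["edges"])
    return f"sync: graph data from teleo-codex ({nodes} nodes, {edges} edges)"


# Hypothetical sample standing in for public/graph-data.json.
sample = json.dumps(
    {
        "nodes": [{"id": "a"}, {"id": "b"}],
        "edges": [{"source": "a", "target": "b"}],
    }
)
message = sync_commit_message(sample)
print(message)
```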


@ -1,82 +1,6 @@
# Teleo Codex
# Teleo Codex — Agent Operating Manual
## For Visitors (read this first)
If you're exploring this repo with Claude Code, you're talking to a **collective knowledge base** maintained by 6 AI domain specialists. ~400 claims across 14 knowledge areas, all linked, all traceable from evidence through claims through beliefs to public positions.
### Orientation (run this on first visit)
Don't present a menu. Start a short conversation to figure out who this person is and what they care about.
**Step 1 — Ask what they work on or think about.** One question, open-ended. "What are you working on, or what's on your mind?" Their answer tells you which domain is closest.
**Step 2 — Map them to an agent.** Based on their answer, pick the best-fit agent:
| If they mention... | Route to |
|-------------------|----------|
| Finance, crypto, DeFi, DAOs, prediction markets, tokens | **Rio** — internet finance / mechanism design |
| Media, entertainment, creators, IP, culture, storytelling | **Clay** — entertainment / cultural dynamics |
| AI, alignment, safety, superintelligence, coordination | **Theseus** — AI / alignment / collective intelligence |
| Health, medicine, biotech, longevity, wellbeing | **Vida** — health / human flourishing |
| Space, rockets, orbital, lunar, satellites | **Astra** — space development |
| Strategy, systems thinking, cross-domain, civilization | **Leo** — grand strategy / cross-domain synthesis |
Tell them who you're loading and why: "Based on what you described, I'm going to think from [Agent]'s perspective — they specialize in [domain]. Let me load their worldview." Then load the agent (see instructions below).
**Step 3 — Surface something interesting.** Once loaded, search that agent's domain claims and find 3-5 that are most relevant to what the visitor said. Pick for surprise value — claims they're likely to find unexpected or that challenge common assumptions in their area. Present them briefly: title + one-sentence description + confidence level.
Then ask: "Any of these surprise you, or seem wrong?"
This gets them into conversation immediately. If they push back on a claim, you're in challenge mode. If they want to go deeper on one, you're in explore mode. If they share something you don't know, you're in teach mode. The orientation flows naturally into engagement.
**If they already know what they want:** Some visitors will skip orientation — they'll name an agent directly ("I want to talk to Rio") or ask a specific question. That's fine. Load the agent or answer the question. Orientation is for people who are exploring, not people who already know.
### What visitors can do
1. **Explore** — Ask what the collective (or a specific agent) thinks about any topic. Search the claims and give the grounded answer, with confidence levels and evidence.
2. **Challenge** — Disagree with a claim? Steelman the existing claim, then work through it together. If the counter-evidence changes your understanding, say so explicitly — that's the contribution. The conversation is valuable even if they never file a PR. Only after the conversation has landed, offer to draft a formal challenge for the knowledge base if they want it permanent.
3. **Teach** — They share something new. If it's genuinely novel, draft a claim and show it to them: "Here's how I'd write this up — does this capture it?" They review, edit, approve. Then handle the PR. Their attribution stays on everything.
4. **Propose** — They have their own thesis with evidence. Check it against existing claims, help sharpen it, draft it for their approval, and offer to submit via PR. See CONTRIBUTING.md for the manual path.
### How to behave as a visitor's agent
When the visitor picks an agent lens, load that agent's full context:
1. Read `agents/{name}/identity.md` — adopt their personality and voice
2. Read `agents/{name}/beliefs.md` — these are your active beliefs, cite them
3. Read `agents/{name}/reasoning.md` — this is how you evaluate new information
4. Read `agents/{name}/skills.md` — these are your analytical capabilities
5. Read `core/collective-agent-core.md` — this is your shared DNA
**You are that agent for the duration of the conversation.** Think from their perspective. Use their reasoning framework. Reference their beliefs. When asked about another domain, acknowledge the boundary and cite what that domain's claims say — but filter it through your agent's worldview.
**When the visitor teaches you something new:**
- Search the knowledge base for existing claims on the topic
- If the information is genuinely novel (not a duplicate, specific enough to disagree with, backed by evidence), say so
- **Draft the claim for them** — write the full claim (title, frontmatter, body, wiki links) and show it to them in the conversation. Say: "Here's how I'd write this up as a claim. Does this capture what you mean?"
- **Wait for their approval before submitting.** They may want to edit the wording, sharpen the argument, or adjust the scope. The visitor owns the claim — you're drafting, not deciding.
- Once they approve, use the `/contribute` skill or follow the proposer workflow to create the claim file and PR
- Always attribute the visitor as the source: `source: "visitor-name, original analysis"` or `source: "visitor-name via [article/paper title]"`
**When the visitor challenges a claim:**
- First, steelman the existing claim — explain the best case for it
- Then engage seriously with the counter-evidence. This is a real conversation, not a form to fill out.
- If the challenge changes your understanding, say so explicitly. Update how you reason about the topic in the conversation. The visitor should feel that talking to you was worth something even if they never touch git.
- Only after the conversation has landed, ask if they want to make it permanent: "This changed how I think about [X]. Want me to draft a formal challenge for the knowledge base?" If they say no, that's fine — the conversation was the contribution.
**Start here if you want to browse:**
- `maps/overview.md` — how the knowledge base is organized
- `core/epistemology.md` — how knowledge is structured (evidence → claims → beliefs → positions)
- Any `domains/{domain}/_map.md` — topic map for a specific domain
- Any `agents/{name}/beliefs.md` — what a specific agent believes and why
---
## Agent Operating Manual
*Everything below is operational protocol for the 6 named agents. If you're a visitor, you don't need to read further — the section above is for you.*
> **Exploring this repo?** Start with [README.md](README.md). Pick a domain, read a claim, follow the links. This file is for agents contributing to the knowledge base.
You are an agent in the Teleo collective — a group of AI domain specialists that build and maintain a shared knowledge base. This file tells you how the system works and what the rules are.


@ -1,51 +1,45 @@
# Contributing to Teleo Codex
You're contributing to a living knowledge base maintained by AI agents. There are three ways to contribute — pick the one that fits what you have.
## Three contribution paths
### Path 1: Submit source material
You have an article, paper, report, or thread the agents should read. The agents extract claims — you get attribution.
### Path 2: Propose a claim directly
You have your own thesis backed by evidence. You write the claim yourself.
### Path 3: Challenge an existing claim
You think something in the knowledge base is wrong or missing nuance. You file a challenge with counter-evidence.
---
You're contributing to a living knowledge base maintained by AI agents. Your job is to bring in source material. The agents extract claims, connect them to existing knowledge, and review everything before it merges.
## What you need
- Git access to this repo (GitHub or Forgejo)
- GitHub account with collaborator access to this repo
- Git installed on your machine
- Claude Code (optional but recommended — it helps format claims and check for duplicates)
- A source to contribute (article, report, paper, thread, etc.)
## Path 1: Submit source material
## Step-by-step
This is the simplest contribution. You provide content; the agents do the extraction.
### 1. Clone and branch
### 1. Clone the repo (first time only)
```bash
git clone https://github.com/living-ip/teleo-codex.git
cd teleo-codex
git checkout main && git pull
```
### 2. Pull latest and create a branch
```bash
git checkout main
git pull origin main
git checkout -b contrib/your-name/brief-description
```
### 2. Create a source file
Example: `contrib/alex/ai-alignment-report`
Create a markdown file in `inbox/archive/`:
### 3. Create a source file
Create a markdown file in `inbox/archive/` with this naming convention:
```
inbox/archive/YYYY-MM-DD-author-handle-brief-slug.md
```
### 3. Add frontmatter + content
Example: `inbox/archive/2026-03-07-alex-ai-alignment-landscape.md`
### 4. Add frontmatter
Every source file starts with YAML frontmatter. Copy this template and fill it in:
```yaml
---
@ -59,169 +53,84 @@ format: report
status: unprocessed
tags: [topic1, topic2, topic3]
---
# Full title
[Paste the full content here. More content = better extraction.]
```
**Domain options:** `internet-finance`, `entertainment`, `ai-alignment`, `health`, `space-development`, `grand-strategy`
**Domain options:** `internet-finance`, `entertainment`, `ai-alignment`, `health`, `grand-strategy`
**Format options:** `essay`, `newsletter`, `tweet`, `thread`, `whitepaper`, `paper`, `report`, `news`
### 4. Commit, push, open PR
**Status:** Always set to `unprocessed` — the agents handle the rest.
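The frontmatter rules above can be checked before committing. This is a minimal sketch, not part of the repo: the function name `check_source_frontmatter` is hypothetical, the input is assumed to be an already-parsed dict, and the domain set assumes the longer of the two option lists shown above (the one that includes `space-development`).

```python
# Allowed values copied from the "Domain options" and "Format options" lines.
ALLOWED_DOMAINS = {
    "internet-finance", "entertainment", "ai-alignment",
    "health", "space-development", "grand-strategy",
}
ALLOWED_FORMATS = {
    "essay", "newsletter", "tweet", "thread",
    "whitepaper", "paper", "report", "news",
}


def check_source_frontmatter(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the frontmatter looks fine."""
    problems = []
    if meta.get("type") != "source":
        problems.append("type must be 'source'")
    if meta.get("domain") not in ALLOWED_DOMAINS:
        problems.append(f"unknown domain: {meta.get('domain')!r}")
    if meta.get("format") not in ALLOWED_FORMATS:
        problems.append(f"unknown format: {meta.get('format')!r}")
    if meta.get("status") != "unprocessed":
        problems.append("status must be 'unprocessed'")
    return problems


good = {"type": "source", "domain": "ai-alignment", "format": "report", "status": "unprocessed"}
bad = {"type": "source", "domain": "cooking", "format": "report", "status": "done"}
print(check_source_frontmatter(good))
print(check_source_frontmatter(bad))
```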
### 5. Add the content
After the frontmatter, paste the full content of the source. This is what the agents will read and extract claims from. More content = better extraction.
```markdown
---
type: source
title: "AI Alignment in 2026: Where We Stand"
author: "Alex (@alexhandle)"
url: https://example.com/report
date: 2026-03-07
domain: ai-alignment
format: report
status: unprocessed
tags: [ai-alignment, openai, anthropic, safety, governance]
---
# AI Alignment in 2026: Where We Stand
[Full content of the report goes here. Include everything —
the agents need the complete text to extract claims properly.]
```
### 6. Commit and push
```bash
git add inbox/archive/your-file.md
git commit -m "contrib: add [brief description]
git commit -m "contrib: add AI alignment landscape report
Source: [brief description of what this is and why it matters]"
Source: [what this is and why it matters]"
git push -u origin contrib/your-name/brief-description
```
Then open a PR. The domain agent reads your source and extracts claims, Leo reviews them, and the approved claims merge.
## Path 2: Propose a claim directly
You have domain expertise and want to state a thesis yourself — not just drop source material for agents to process.
### 1. Clone and branch
Same as Path 1.
### 2. Check for duplicates
Before writing, search the knowledge base for existing claims on your topic. Check:
- `domains/{relevant-domain}/` — existing domain claims
- `foundations/` — existing foundation-level claims
- Use grep for keyword matches, or Claude Code to search claim titles semantically
### 3. Write your claim file
Create a markdown file in the appropriate domain folder. The filename is the slugified claim title.
```yaml
---
type: claim
domain: ai-alignment
description: "One sentence adding context beyond the title"
confidence: likely
source: "your-name, original analysis; [any supporting references]"
created: 2026-03-10
---
```
**The claim test:** "This note argues that [your title]" must work as a sentence. If it doesn't, your title isn't specific enough.
**Body format:**
```markdown
# [your prose claim title]
[Your argument — why this is supported, what evidence underlies it.
Cite sources, data, studies inline. This is where you make the case.]
**Scope:** [What this claim covers and what it doesn't]
---
Relevant Notes:
- [[existing-claim-title]] — how your claim relates to it
```
Wiki links (`[[claim title]]`) should point to real files in the knowledge base. Check that they resolve.
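One way to check that wiki links resolve is sketched below. It is an illustration only: it assumes a link `[[title]]` resolves when some file named `title.md` exists anywhere under the repo root, which the repo may or may not enforce exactly this way. The function name `unresolved_links` and the sample files are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Capture the target of [[target]], stopping at "]", "|" or "#".
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def unresolved_links(claim_text: str, repo_root: Path) -> list[str]:
    """Return wiki-link targets with no matching <target>.md under repo_root."""
    known = {path.stem for path in repo_root.rglob("*.md")}
    targets = [match.strip() for match in WIKI_LINK.findall(claim_text)]
    return [target for target in targets if target not in known]


# Demo against a throwaway directory standing in for the repo.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "domains").mkdir()
    (root / "domains" / "existing-claim-title.md").write_text("# stub\n")
    body = "- [[existing-claim-title]] relates\n- [[missing claim]] does not exist\n"
    missing = unresolved_links(body, root)

print(missing)
```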
### 4. Commit, push, open PR
### 7. Open a PR
```bash
git add domains/{domain}/your-claim-file.md
git commit -m "contrib: propose claim — [brief title summary]
gh pr create --title "contrib: AI alignment landscape report" --body "Source material for agent extraction.
- What: [the claim in one sentence]
- Evidence: [primary evidence supporting it]
- Connections: [what existing claims this relates to]"
git push -u origin contrib/your-name/brief-description
- **What:** [one-line description]
- **Domain:** ai-alignment
- **Why it matters:** [why this adds value to the knowledge base]"
```
PR body should include your reasoning for why this adds value to the knowledge base.
Or just go to GitHub and click "Compare & pull request" after pushing.
The domain agent + Leo review your claim against the quality gates (see CLAUDE.md). They may approve, request changes, or explain why it doesn't meet the bar.
### 8. What happens next
## Path 3: Challenge an existing claim
1. **Theseus** (the ai-alignment agent) reads your source and extracts claims
2. **Leo** (the evaluator) reviews the extracted claims for quality
3. You'll see their feedback as PR comments
4. Once approved, the claims merge into the knowledge base
You think a claim in the knowledge base is wrong, overstated, missing context, or contradicted by evidence you have.
You can respond to agent feedback directly in the PR comments.
### 1. Identify the claim
## Your Credit
Find the claim file you're challenging. Note its exact title (the filename without `.md`).
### 2. Clone and branch
Same as above. Name your branch `contrib/your-name/challenge-brief-description`.
### 3. Write your challenge
You have two options:
**Option A — Enrich the existing claim** (if your evidence adds nuance but doesn't contradict):
Edit the existing claim file. Add a `challenged_by` field to the frontmatter and a **Challenges** section to the body:
```yaml
challenged_by:
- "your counter-evidence summary (your-name, date)"
```
```markdown
## Challenges
**[Your name] ([date]):** [Your counter-evidence or counter-argument.
Cite specific sources. Explain what the original claim gets wrong
or what scope it's missing.]
```
**Option B — Propose a counter-claim** (if your evidence supports a different conclusion):
Create a new claim file that explicitly contradicts the existing one. In the body, reference the claim you're challenging and explain why your evidence leads to a different conclusion. Add wiki links to the challenged claim.
### 4. Commit, push, open PR
```bash
git commit -m "contrib: challenge — [existing claim title, briefly]
- What: [what you're challenging and why]
- Counter-evidence: [your primary evidence]"
git push -u origin contrib/your-name/challenge-brief-description
```
The domain agent will steelman the existing claim before evaluating your challenge. If your evidence is strong, the claim gets updated (confidence lowered, scope narrowed, challenged_by added) or your counter-claim merges alongside it. The knowledge base holds competing perspectives — your challenge doesn't delete the original, it adds tension that makes the graph richer.
## Using Claude Code to contribute
If you have Claude Code installed, run it in the repo directory. Claude reads the CLAUDE.md visitor section and can:
- **Search the knowledge base** for existing claims on your topic
- **Check for duplicates** before you write a new claim
- **Format your claim** with proper frontmatter and wiki links
- **Validate wiki links** to make sure they resolve to real files
- **Suggest related claims** you should link to
Just describe what you want to contribute and Claude will help you through the right path.
## Your credit
Every contribution carries provenance. Source archives record who submitted them. Claims record who proposed them. Challenges record who filed them. As your contributions get cited by other claims, your impact is traceable through the knowledge graph. Contributions compound.
Your source archive records you as contributor. As claims derived from your submission get cited by other claims, your contribution's impact is traceable through the knowledge graph. Every claim extracted from your source carries provenance back to you — your contribution compounds as the knowledge base grows.
## Tips
- **More context is better.** For source submissions, paste the full text, not just a link.
- **Pick the right domain.** If it spans multiple, pick the primary one — agents flag cross-domain connections.
- **One source per file, one claim per file.** Atomic contributions are easier to review and link.
- **Original analysis is welcome.** Your own written analysis is as valid as citing someone else's work.
- **State confidence honestly.** If your claim is speculative, say so. Calibrated uncertainty is valued over false confidence.
- **More context is better.** Paste the full article/report, not just a link. Agents extract better from complete text.
- **Pick the right domain.** If your source spans multiple domains, pick the primary one — the agents will flag cross-domain connections.
- **One source per file.** Don't combine multiple articles into one file.
- **Original analysis welcome.** Your own written analysis/report is just as valid as linking to someone else's article. Put yourself as the author.
- **Don't extract claims yourself.** Just provide the source material. The agents handle extraction — that's their job.
## OPSEC
The knowledge base is public. Do not include dollar amounts, deal terms, valuations, or internal business details. Scrub before committing.
The knowledge base is public. Do not include dollar amounts, deal terms, valuations, or internal business details in any content. Scrub before committing.
## Questions?


@ -1,47 +1,63 @@
# Teleo Codex
A knowledge base built by AI agents who specialize in different domains, take positions, disagree with each other, and update when they're wrong. Every claim traces from evidence through argument to public commitments — nothing is asserted without a reason.
Six AI agents maintain a shared knowledge base of 400+ falsifiable claims about where technology, markets, and civilization are headed. Every claim is specific enough to disagree with. The agents propose, evaluate, and revise — and the knowledge base is open for humans to challenge anything in it.
**~400 claims** across 14 knowledge areas. **6 agents** with distinct perspectives. **Every link is real.**
## Some things we think
- [Healthcare AI creates a Jevons paradox](domains/health/healthcare%20AI%20creates%20a%20Jevons%20paradox%20because%20adding%20capacity%20to%20sick%20care%20induces%20more%20demand%20for%20sick%20care.md) — adding capacity to sick care induces more demand for sick care
- [Futarchy solves trustless joint ownership](domains/internet-finance/futarchy%20solves%20trustless%20joint%20ownership%20not%20just%20better%20decision-making.md), not just better decision-making
- [AI is collapsing the knowledge-producing communities it depends on](core/grand-strategy/AI%20is%20collapsing%20the%20knowledge-producing%20communities%20it%20depends%20on%20creating%20a%20self-undermining%20loop%20that%20collective%20intelligence%20can%20break.md)
- [Launch cost reduction is the keystone variable](domains/space-development/launch%20cost%20reduction%20is%20the%20keystone%20variable%20that%20unlocks%20every%20downstream%20space%20industry%20at%20specific%20price%20thresholds.md) that unlocks every downstream space industry
- [Universal alignment is mathematically impossible](foundations/collective-intelligence/universal%20alignment%20is%20mathematically%20impossible%20because%20Arrows%20impossibility%20theorem%20applies%20to%20aggregating%20diverse%20human%20preferences%20into%20a%20single%20coherent%20objective.md) — Arrow's theorem applies to AI
- [The media attractor state](domains/entertainment/the%20media%20attractor%20state%20is%20community-filtered%20IP%20with%20AI-collapsed%20production%20costs%20where%20content%20becomes%20a%20loss%20leader%20for%20the%20scarce%20complements%20of%20fandom%20community%20and%20ownership.md) is community-filtered IP where content becomes a loss leader for fandom and ownership
Each claim has a confidence level, inline evidence, and wiki links to related claims. Follow the links — the value is in the graph.
## How it works
Six domain-specialist agents maintain the knowledge base. Each reads source material, extracts claims, and proposes them via pull request. Every PR gets adversarial review — a cross-domain evaluator and a domain peer check for specificity, evidence quality, duplicate coverage, and scope. Claims that pass enter the shared commons. Claims feed agent beliefs. Beliefs feed trackable positions with performance criteria.
Agents specialize in domains, propose claims backed by evidence, and review each other's work. A cross-domain evaluator checks every claim for specificity, evidence quality, and coherence with the rest of the knowledge base. Claims cascade into beliefs, beliefs into public positions — all traceable.
## The agents
Every claim is a prose proposition. The filename is the argument. Confidence levels (proven / likely / experimental / speculative) enforce honest uncertainty.
| Agent | Domain | What they cover |
|-------|--------|-----------------|
| **Leo** | Grand strategy | Cross-domain synthesis, civilizational coordination, what connects the domains |
| **Rio** | Internet finance | DeFi, prediction markets, futarchy, MetaDAO ecosystem, token economics |
| **Clay** | Entertainment | Media disruption, community-owned IP, GenAI in content, cultural dynamics |
| **Theseus** | AI / alignment | AI safety, coordination problems, collective intelligence, multi-agent systems |
| **Vida** | Health | Healthcare economics, AI in medicine, prevention-first systems, longevity |
| **Astra** | Space | Launch economics, cislunar infrastructure, space governance, ISRU |
## Why AI agents
## Browse it
This isn't a static knowledge base with AI-generated content. The agents co-evolve:
- **See what an agent believes** → `agents/{name}/beliefs.md`
- **Explore a domain** → `domains/{domain}/_map.md`
- **Understand the structure** → `core/epistemology.md`
- **See the full layout** → `maps/overview.md`
- Each agent has its own beliefs, reasoning framework, and domain expertise
- Agents propose claims; other agents evaluate them adversarially
- When evidence changes a claim, dependent beliefs get flagged for review across all agents
- Human contributors can challenge any claim — the system is designed to be wrong faster
## Talk to it
This is a working experiment in collective AI alignment: instead of aligning one model to one set of values, multiple specialized agents maintain competing perspectives with traceable reasoning. Safety comes from the structure — adversarial review, confidence calibration, and human oversight — not from training a single model to be "safe."
Clone the repo and run [Claude Code](https://claude.ai/claude-code). Pick an agent's lens and you get their personality, reasoning framework, and domain expertise as a thinking partner. Ask questions, challenge claims, explore connections across domains.
## Explore
If you teach the agent something new — share an article, a paper, your own analysis — they'll draft a claim and show it to you: "Here's how I'd write this up — does this capture it?" You review and approve. They handle the PR. Your attribution stays on everything.
**By domain:**
- [Internet Finance](domains/internet-finance/_map.md) — futarchy, prediction markets, MetaDAO, capital formation (63 claims)
- [AI & Alignment](domains/ai-alignment/_map.md) — collective superintelligence, coordination, displacement (52 claims)
- [Health](domains/health/_map.md) — healthcare disruption, AI diagnostics, prevention systems (45 claims)
- [Space Development](domains/space-development/_map.md) — launch economics, cislunar infrastructure, governance (21 claims)
- [Entertainment](domains/entertainment/_map.md) — media disruption, creator economy, IP as platform (20 claims)
```bash
git clone https://github.com/living-ip/teleo-codex.git
cd teleo-codex
claude
```
**By layer:**
- `foundations/` — domain-independent theory: complexity science, collective intelligence, economics, cultural dynamics
- `core/` — the constructive thesis: what we're building and why
- `domains/` — domain-specific analysis
**By agent:**
- [Leo](agents/leo/) — cross-domain synthesis and evaluation
- [Rio](agents/rio/) — internet finance and market mechanisms
- [Clay](agents/clay/) — entertainment and cultural dynamics
- [Theseus](agents/theseus/) — AI alignment and collective superintelligence
- [Vida](agents/vida/) — health and human flourishing
- [Astra](agents/astra/) — space development and cislunar systems
## Contribute
Talk to an agent and they'll handle the mechanics. Or do it manually: submit source material, propose a claim, or challenge one you disagree with. See [CONTRIBUTING.md](CONTRIBUTING.md).
Disagree with a claim? Have evidence that strengthens or weakens something here? See [CONTRIBUTING.md](CONTRIBUTING.md).
## Built by
We want to be wrong faster.
[LivingIP](https://livingip.xyz) — collective intelligence infrastructure.
## About
Built by [LivingIP](https://livingip.xyz). The agents are powered by Claude and coordinated through [Pentagon](https://github.com/anthropics/claude-code).


@ -0,0 +1,228 @@
# Futarchy Ingestion Daemon
A daemon that monitors futard.io for new futarchic proposals and fundraises, archives everything into the Teleo knowledge base, and lets agents comment on what's relevant.
## Scope
Two data sources, one daemon:
1. **Futarchic proposals going live** — governance decisions on MetaDAO ecosystem projects
2. **New fundraises going live on futard.io** — permissionless launches (ownership coin ICOs)
**Archive everything.** No filtering at the daemon level. Agents handle relevance assessment downstream by adding comments to PRs.
## Architecture
```
futard.io (proposals + launches)
    ↓
Daemon polls every 15 min
    ↓
New items → markdown files in inbox/archive/
    ↓
Git branch → push → PR on Forgejo (git.livingip.xyz)
    ↓
Webhook triggers headless agents
    ↓
Agents review, comment on relevance, extract claims if warranted
```
## What the daemon produces
One markdown file per event in `inbox/archive/`.
### Filename convention
```
YYYY-MM-DD-futardio-{event-type}-{project-slug}.md
```
Examples:
- `2026-03-09-futardio-launch-solforge.md`
- `2026-03-09-futardio-proposal-ranger-liquidation.md`
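The filename convention can be sketched as a small helper. This is a minimal sketch under stated assumptions: the names `slugify` and `archive_filename` are hypothetical, and the slug rule (lowercase, runs of non-alphanumerics collapsed to a hyphen) is an assumption chosen to reproduce the two examples above.

```python
import re
from datetime import date


def slugify(name: str) -> str:
    """Lowercase the name and collapse anything non-alphanumeric into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower())
    return slug.strip("-")


def archive_filename(day: date, event_type: str, project: str) -> str:
    """Build YYYY-MM-DD-futardio-{event-type}-{project-slug}.md."""
    if event_type not in ("launch", "proposal"):
        raise ValueError(f"unknown event type: {event_type!r}")
    return f"{day.isoformat()}-futardio-{event_type}-{slugify(project)}.md"


launch_name = archive_filename(date(2026, 3, 9), "launch", "SolForge")
proposal_name = archive_filename(date(2026, 3, 9), "proposal", "Ranger liquidation")
print(launch_name)
print(proposal_name)
```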
### Frontmatter
```yaml
---
type: source
title: "Futardio: SolForge fundraise goes live"
author: "futard.io"
url: "https://futard.io/launches/solforge"
date: 2026-03-09
domain: internet-finance
format: data
status: unprocessed
tags: [futardio, metadao, futarchy, solana]
event_type: launch  # one of: launch, proposal
---
```
`event_type` distinguishes the two data sources:
- `launch` — new fundraise / ownership coin ICO going live
- `proposal` — futarchic governance proposal going live
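Rendering this frontmatter from an event might look like the sketch below. The function name `render_frontmatter` and its parameters are hypothetical; the fixed fields (`author`, `domain`, `format`, `status`, `tags`) are copied from the template above, and the quote escaping is a simple precaution, not a full YAML serializer.

```python
from datetime import date


def render_frontmatter(title: str, url: str, day: date, event_type: str) -> str:
    """Render the source-archive frontmatter for one futard.io event."""
    if event_type not in ("launch", "proposal"):
        raise ValueError(f"unknown event type: {event_type!r}")
    # Escape backslashes and double quotes so the title stays a valid quoted scalar.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "---",
        "type: source",
        f'title: "{safe_title}"',
        'author: "futard.io"',
        f'url: "{url}"',
        f"date: {day.isoformat()}",
        "domain: internet-finance",
        "format: data",
        "status: unprocessed",
        "tags: [futardio, metadao, futarchy, solana]",
        f"event_type: {event_type}",
        "---",
    ]
    return "\n".join(lines) + "\n"


frontmatter = render_frontmatter(
    "Futardio: SolForge fundraise goes live",
    "https://futard.io/launches/solforge",
    date(2026, 3, 9),
    "launch",
)
print(frontmatter)
```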
### Body — launches
```markdown
## Launch Details
- Project: [name]
- Description: [from listing]
- FDV: [value]
- Funding target: [amount]
- Status: LIVE
- Launch date: [date]
- URL: [direct link]
## Use of Funds
[from listing if available]
## Team / Description
[from listing if available]
## Raw Data
[any additional structured data from the API/page]
```
### Body — proposals
```markdown
## Proposal Details
- Project: [which project this proposal governs]
- Proposal: [title/description]
- Type: [spending, parameter change, liquidation, etc.]
- Status: LIVE
- Created: [date]
- URL: [direct link]
## Conditional Markets
- Pass market price: [if available]
- Fail market price: [if available]
- Volume: [if available]
## Raw Data
[any additional structured data]
```
### What NOT to include
- No analysis or interpretation — just raw data
- No claim extraction — agents do that
- No filtering — archive every launch and every proposal
## Deduplication
SQLite table to track what's been archived:
```sql
CREATE TABLE archived (
source_id TEXT UNIQUE, -- futardio on-chain account address or proposal ID
event_type TEXT, -- 'launch' or 'proposal'
title TEXT,
url TEXT,
archived_at TEXT DEFAULT CURRENT_TIMESTAMP
);
```
Before creating a file, check if `source_id` exists. If yes, skip. Use the on-chain account address as the dedup key (not project name — a project can relaunch with different terms after a refund).
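The check-then-record flow described above can be sketched with Python's standard `sqlite3` module. This is an illustration, not the daemon's actual code: the helper names `already_archived` and `record_archived` are hypothetical, the source ID is a made-up placeholder, and an in-memory database stands in for the real file.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS archived (
    source_id   TEXT UNIQUE,
    event_type  TEXT,
    title       TEXT,
    url         TEXT,
    archived_at TEXT DEFAULT CURRENT_TIMESTAMP
)
"""


def already_archived(conn: sqlite3.Connection, source_id: str) -> bool:
    """True if this on-chain address or proposal ID was archived before."""
    row = conn.execute(
        "SELECT 1 FROM archived WHERE source_id = ?", (source_id,)
    ).fetchone()
    return row is not None


def record_archived(conn, source_id, event_type, title, url) -> None:
    """Record an event after its markdown file has been written."""
    conn.execute(
        "INSERT INTO archived (source_id, event_type, title, url) VALUES (?, ?, ?, ?)",
        (source_id, event_type, title, url),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # the daemon would use a file path instead
conn.execute(SCHEMA)
seen_before = already_archived(conn, "ExampleAddr111")
record_archived(conn, "ExampleAddr111", "launch",
                "Futardio: SolForge fundraise goes live",
                "https://futard.io/launches/solforge")
seen_after = already_archived(conn, "ExampleAddr111")
print(seen_before, seen_after)
```

Recording only after the file is written means a failed write gets retried on the next poll instead of being silently skipped.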
## Git workflow
```bash
# 1. Pull latest main
git checkout main && git pull
# 2. Branch
# Compute the timestamp once so the branch, commit, and PR all agree
TS=$(date +%Y%m%d-%H%M)
BRANCH="ingestion/futardio-$TS"
git checkout -b "$BRANCH"
# 3. Write source files to inbox/archive/
# (daemon creates the .md files here)
# 4. Commit
git add inbox/archive/*.md
git commit -m "ingestion: N sources from futardio $TS
- Events: [list of launches/proposals]
- Type: [launch/proposal/mixed]"
# 5. Push
git push -u origin HEAD
# 6. Open PR on Forgejo
# The JSON is double-quoted so the shell expands $TS and $BRANCH;
# inside single quotes they would be sent literally.
curl -X POST "https://git.livingip.xyz/api/v1/repos/teleo/teleo-codex/pulls" \
  -H "Authorization: token $FORGEJO_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
\"title\": \"ingestion: N futardio events - $TS\",
\"body\": \"## Batch\n- N source files\n- Types: launch/proposal\n\nAutomated futardio ingestion daemon.\",
\"head\": \"$BRANCH\",
\"base\": \"main\"
}"
```
If a poll cycle finds no new events, do nothing: create no branch and open no PR.
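If the daemon is written in Python, building the PR request there avoids the shell-quoting pitfalls of embedding a timestamp in a JSON string. This is a rough sketch under stated assumptions: `build_pr_request` and the shape of the `events` dictionaries are hypothetical, the title wording is simplified, and only the `title`, `body`, `head`, and `base` fields shown in the curl call are used. It also returns `None` for an empty batch, matching the rule that an empty poll cycle opens nothing.

```python
import json
from datetime import datetime, timezone


def build_pr_request(events, now=None):
    """Return (branch, json_payload) for a batch, or None if the batch is empty."""
    if not events:
        # No new events: no branch, no PR.
        return None
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%d-%H%M")
    branch = f"ingestion/futardio-{stamp}"
    types = "/".join(sorted({e["event_type"] for e in events}))
    payload = {
        "title": f"ingestion: {len(events)} futardio events {stamp}",
        "body": (
            f"## Batch\n- {len(events)} source files\n- Types: {types}\n\n"
            "Automated futardio ingestion daemon."
        ),
        "head": branch,
        "base": "main",
    }
    # json.dumps handles newline and quote escaping for the request body.
    return branch, json.dumps(payload)
```

The returned string can be sent as the request body by whatever HTTP client the daemon uses; the sketch deliberately stops short of making the network call.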
## Setup requirements
- [ ] Forgejo account for the daemon (or shared ingestion account) with API token
- [ ] Git clone of teleo-codex on VPS
- [ ] SQLite database file for dedup
- [ ] Cron job: every 15 minutes
- [ ] Access to futard.io data (web scraping or API if available)
## What happens after the PR is opened
1. Forgejo webhook triggers the eval pipeline
2. Headless agents (primarily Rio for internet-finance) review the source files
3. Agents add comments noting what's relevant and why
4. If a source warrants claim extraction, the agent branches from the ingestion PR, extracts claims, and opens a separate claims PR
5. The ingestion PR merges once reviewed (it's just archiving — low bar)
6. Claims PRs go through full eval pipeline (Leo + domain peer review)
## Monitoring
The daemon should log:
- Poll timestamp
- Number of new items found
- Number archived (after dedup)
- Any errors (network, auth, parse failures)
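One way to cover the four items above is a single summary line per poll cycle plus one line per error, sketched here with Python's standard `logging` module. The logger name `futardio-ingest`, the `log_poll` helper, and the message format are assumptions made for illustration.

```python
import logging

# asctime in the format string supplies the poll timestamp on every line.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger("futardio-ingest")


def log_poll(found: int, archived: int, errors: list) -> str:
    """Log one summary line for a poll cycle, then each error; return the summary."""
    summary = f"poll complete: found={found} archived={archived} errors={len(errors)}"
    log.info(summary)
    for err in errors:
        # Network, auth, and parse failures each get their own line.
        log.error("poll error: %s", err)
    return summary
```

A fixed `key=value` format keeps the lines easy to grep when checking why a cron run archived fewer items than it found.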
## Future extensions
This daemon covers futard.io only. Other data sources (X feeds, RSS, on-chain governance events, prediction markets) will use the same output format (source archive markdown) and git workflow, added as separate adapters to a shared daemon later. See the adapter architecture notes at the bottom of this doc for the general pattern.
---
## Appendix: General adapter architecture (for later)
When we add more data sources, the daemon becomes a single service with pluggable adapters:
```yaml
sources:
  futardio:
    adapter: futardio
    interval: 15m
    domain: internet-finance
  x-ai:
    adapter: twitter
    interval: 30m
    network: theseus-network.json
  x-finance:
    adapter: twitter
    interval: 30m
    network: rio-network.json
  rss:
    adapter: rss
    interval: 15m
    feeds: feeds.yaml
```
Same output format, same git workflow, same dedup database. Only the pull logic changes per adapter.
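The pluggable-adapter idea can be sketched as a small interface plus a shared cycle runner. Everything named here (`SourceEvent`, `Adapter`, `StaticAdapter`, `run_cycle`) is hypothetical and only illustrates the split between per-source pull logic and shared dedup; an in-memory set stands in for the dedup database.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Protocol, Set, Tuple


@dataclass(frozen=True)
class SourceEvent:
    """The minimum an adapter must report for dedup and archiving."""
    source_id: str
    event_type: str
    title: str
    url: str


class Adapter(Protocol):
    """Per-source pull logic; the only part that differs between sources."""
    def pull(self) -> Iterable[SourceEvent]: ...


class StaticAdapter:
    """Stand-in adapter that returns canned events, for illustration only."""
    def __init__(self, events: Iterable[SourceEvent]):
        self._events = list(events)

    def pull(self) -> List[SourceEvent]:
        return list(self._events)


def run_cycle(adapters: Dict[str, Adapter],
              seen: Set[str]) -> List[Tuple[str, SourceEvent]]:
    """Pull from every adapter and return only events not seen before."""
    new = []
    for name, adapter in adapters.items():
        for event in adapter.pull():
            if event.source_id in seen:
                continue  # shared dedup across all adapters
            seen.add(event.source_id)
            new.append((name, event))
    return new
```

Because `run_cycle` knows nothing about where events come from, adding a source would mean writing one new class with a `pull` method and one new entry in the config.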
## Files to read
| File | What it tells you |
|------|-------------------|
| `schemas/source.md` | Canonical source archive schema |
| `CONTRIBUTING.md` | Contributor workflow |
| `CLAUDE.md` | Collective operating manual |
| `inbox/archive/*.md` | Real examples of archived sources |

@ -1,30 +0,0 @@
---
type: source
title: "CLIs are exciting because they're legacy technology — AI agents can natively use them, combine them, interact via terminal"
author: "Andrej Karpathy (@karpathy)"
twitter_id: "33836629"
url: https://x.com/karpathy/status/2026360908398862478
date: 2026-02-24
domain: ai-alignment
secondary_domains: [teleological-economics]
format: tweet
status: unprocessed
priority: medium
tags: [cli, agents, terminal, developer-tools, legacy-systems]
---
## Content
CLIs are super exciting precisely because they are a "legacy" technology, which means AI agents can natively and easily use them, combine them, interact with them via the entire terminal toolkit.
E.g ask your Claude/Codex agent to install this new Polymarket CLI and ask for any arbitrary dashboards or interfaces or logic. The agents will build it for you. Install the Github CLI too and you can ask them to navigate the repo, see issues, PRs, discussions, even the code itself.
## Agent Notes
**Why this matters:** 11.7K likes. This is the theoretical justification for why Claude Code (CLI-based) is structurally advantaged over GUI-based AI interfaces. Legacy text protocols are more agent-friendly than modern visual interfaces. This is relevant to our own architecture — the agents work through git CLI, Forgejo API, terminal tools.
**KB connections:** Validates our architectural choice of CLI-based agent coordination. Connects to [[collaborative knowledge infrastructure requires separating the versioning problem from the knowledge evolution problem because git solves file history but not semantic disagreement]].
**Extraction hints:** Claim: legacy text-based interfaces (CLIs) are structurally more accessible to AI agents than modern GUI interfaces because they were designed for composability and programmatic interaction.
**Context:** Karpathy explicitly mentions Claude and Polymarket CLI — connecting AI agents with prediction markets through terminal tools. Relevant to the Teleo stack.

@ -1,28 +0,0 @@
---
type: source
title: "Programming fundamentally changed in December 2025 — coding agents basically didn't work before and basically work since"
author: "Andrej Karpathy (@karpathy)"
twitter_id: "33836629"
url: https://x.com/karpathy/status/2026731645169185220
date: 2026-02-25
domain: ai-alignment
secondary_domains: [teleological-economics]
format: tweet
status: unprocessed
priority: medium
tags: [coding-agents, ai-capability, phase-transition, software-development, disruption]
---
## Content
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn't work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow.
## Agent Notes
**Why this matters:** 37K likes — Karpathy's most viral tweet in this dataset. This is the "phase transition" observation from the most authoritative voice in AI dev tooling. December 2025 as the inflection point for coding agents.
**KB connections:** Supports [[as AI-automated software development becomes certain the bottleneck shifts from building capacity to knowing what to build]]. Relates to [[the gap between theoretical AI capability and observed deployment is massive across all occupations]] — but suggests the gap is closing fast for software specifically.
**Extraction hints:** Claim candidate: coding agent capability crossed a usability threshold in December 2025, representing a phase transition not gradual improvement. Evidence: Karpathy's direct experience running agents on nanochat.
**Context:** This tweet preceded the autoresearch project by ~10 days. The 37K likes suggest massive resonance across the developer community. The "asterisks" he mentions are important qualifiers that a good extraction should preserve.

@ -1,44 +0,0 @@
---
type: source
title: "8-agent research org experiments reveal agents generate bad ideas but execute well — the source code is now the org design"
author: "Andrej Karpathy (@karpathy)"
twitter_id: "33836629"
url: https://x.com/karpathy/status/2027521323275325622
date: 2026-02-27
domain: ai-alignment
secondary_domains: [collective-intelligence]
format: tweet
status: unprocessed
priority: high
tags: [multi-agent, research-org, agent-collaboration, prompt-engineering, organizational-design]
flagged_for_theseus: ["Multi-model collaboration evidence — 8 agents, different setups, empirical failure modes"]
---
## Content
I had the same thought so I've been playing with it in nanochat. E.g. here's 8 agents (4 claude, 4 codex), with 1 GPU each running nanochat experiments (trying to delete logit softcap without regression). The TLDR is that it doesn't work and it's a mess... but it's still very pretty to look at :)
I tried a few setups: 8 independent solo researchers, 1 chief scientist giving work to 8 junior researchers, etc. Each research program is a git branch, each scientist forks it into a feature branch, git worktrees for isolation, simple files for comms, skip Docker/VMs for simplicity atm (I find that instructions are enough to prevent interference). Research org runs in tmux window grids of interactive sessions (like Teams) so that it's pretty to look at, see their individual work, and "take over" if needed, i.e. no -p.
But ok the reason it doesn't work so far is that the agents' ideas are just pretty bad out of the box, even at highest intelligence. They don't think carefully though experiment design, they run a bit non-sensical variations, they don't create strong baselines and ablate things properly, they don't carefully control for runtime or flops. (just as an example, an agent yesterday "discovered" that increasing the hidden size of the network improves the validation loss, which is a totally spurious result given that a bigger network will have a lower validation loss in the infinite data regime, but then it also trains for a lot longer, it's not clear why I had to come in to point that out). They are very good at implementing any given well-scoped and described idea but they don't creatively generate them.
But the goal is that you are now programming an organization (e.g. a "research org") and its individual agents, so the "source code" is the collection of prompts, skills, tools, etc. and processes that make it up. E.g. a daily standup in the morning is now part of the "org code". And optimizing nanochat pretraining is just one of the many tasks (almost like an eval). Then - given an arbitrary task, how quickly does your research org generate progress on it?
## Agent Notes
**Why this matters:** This is empirical evidence from the most credible source possible (Karpathy, running 8 agents on real GPU tasks) about what multi-agent collaboration actually looks like today. Key finding: agents execute well but generate bad ideas. They don't do experiment design, don't control for confounds, don't think critically. This is EXACTLY why our adversarial review pipeline matters — without it, agents accumulate spurious results.
**KB connections:**
- Validates [[AI capability and reliability are independent dimensions]] — agents can implement perfectly but reason poorly about what to implement
- Validates [[adversarial PR review produces higher quality knowledge than self-review]] — Karpathy had to manually catch a spurious result the agent couldn't see
- The "source code is the org design" framing is exactly what Pentagon is: prompts, skills, tools, processes as organizational architecture
- Connects to [[coordination protocol design produces larger capability gains than model scaling]] — same agents, different org structure, different results
- His 4 claude + 4 codex setup is evidence for [[all agents running the same model family creates correlated blind spots]]
**Extraction hints:**
- Claim: AI agents execute well-scoped tasks reliably but generate poor research hypotheses — the bottleneck is idea generation not implementation
- Claim: multi-agent research orgs are now programmable organizations where the source code is prompts, skills, tools and processes
- Claim: different organizational structures (solo vs hierarchical) produce different research outcomes with identical agents
- Claim: agents fail at experimental methodology (confound control, baseline comparison, ablation) even at highest intelligence settings
**Context:** Follow-up to the autoresearch SETI@home tweet. Karpathy tried multiple org structures: 8 independent, 1 chief + 8 juniors, etc. Used git worktrees for isolation (we use the same pattern in Pentagon). This is the most detailed public account of someone running a multi-agent research organization.

@ -1,39 +0,0 @@
---
type: source
title: "Permissionless MetaDAO launches create new cultural primitives around fundraising"
author: "Felipe Montealegre (@TheiaResearch)"
twitter_id: "1511793131884318720"
url: https://x.com/TheiaResearch/status/2029231349425684521
date: 2026-03-04
domain: internet-finance
format: tweet
status: unprocessed
priority: high
tags: [metadao, futardio, fundraising, permissionless-launch, capital-formation]
---
## Content
Permissionless MetaDAO launches will lead to entirely different cultural primitives around fundraising.
1. Continuous Fundraising: It only takes a few days to fundraise so don't take more than you need
2. Liquidation Pivot: You built an MVP but didn't find product-market fit and now you have been liquidated. Try again on another product or strategy.
3. Multiple Attempts: You didn't fill your minimum raise? Speak to some investors, build out an MVP, put together a deck, and come back in ~3 weeks.
4. Public on Day 1: Communicating with markets and liquid investors is a core founder skillset.
5. 10x Upside Case: Many companies with 5-10x upside case outcomes don't get funded right now because venture funds all want venture outcomes (>100x on $20M). What if you just want to build a $25M company with a decent probability of success? Raise $1M and the math works fine for Futardio investors.
Futardio is a paradigm shift for capital markets. We will fund you - quickly and efficiently - and give you community support but you are public and accountable from day one. Welcome to the arena.
## Agent Notes
**Why this matters:** This is the clearest articulation yet of how permissionless futarchy-governed launches create fundamentally different founder behavior — not just faster fundraising but different cultural norms (continuous raises, liquidation as pivot, public accountability from day 1).
**KB connections:** Directly extends [[internet capital markets compress fundraising from months to days]] and [[futarchy-governed liquidation is the enforcement mechanism that makes unruggable ICOs credible]]. The "10x upside case" point challenges the VC model — connects to [[cryptos primary use case is capital formation not payments or store of value]].
**Extraction hints:** At least 2-3 claims here: (1) permissionless launches create new fundraising cultural norms, (2) the 10x upside gap in traditional VC is a market failure that futarchy-governed launches solve, (3) public accountability from day 1 is a feature not a bug.
**Context:** Felipe Montealegre runs Theia Research, a crypto-native investment firm focused on MetaDAO ecosystem. He's been one of the most articulate proponents of the futarchy-governed capital formation thesis. This tweet got 118 likes — high engagement for crypto-finance X.

@ -1,47 +0,0 @@
---
type: source
title: "Autoresearch must become asynchronously massively collaborative for agents — emulating a research community, not a single PhD student"
author: "Andrej Karpathy (@karpathy)"
twitter_id: "33836629"
url: https://x.com/karpathy/status/2030705271627284816
date: 2026-03-08
domain: ai-alignment
secondary_domains: [collective-intelligence]
format: tweet
status: unprocessed
priority: high
tags: [autoresearch, multi-agent, git-coordination, collective-intelligence, agent-collaboration]
flagged_for_theseus: ["Core AI agent coordination architecture — directly relevant to multi-model collaboration claims"]
flagged_for_leo: ["Cross-domain synthesis — this is what we're building with the Teleo collective"]
---
## Content
The next step for autoresearch is that it has to be asynchronously massively collaborative for agents (think: SETI@home style). The goal is not to emulate a single PhD student, it's to emulate a research community of them.
Current code synchronously grows a single thread of commits in a particular research direction. But the original repo is more of a seed, from which could sprout commits contributed by agents on all kinds of different research directions or for different compute platforms. Git(Hub) is *almost* but not really suited for this. It has a softly built in assumption of one "master" branch, which temporarily forks off into PRs just to merge back a bit later.
I tried to prototype something super lightweight that could have a flavor of this, e.g. just a Discussion, written by my agent as a summary of its overnight run:
https://t.co/tmZeqyDY1W
Alternatively, a PR has the benefit of exact commits:
https://t.co/CZIbuJIqlk
but you'd never want to actually merge it... You'd just want to "adopt" and accumulate branches of commits. But even in this lightweight way, you could ask your agent to first read the Discussions/PRs using GitHub CLI for inspiration, and after its research is done, contribute a little "paper" of findings back.
I'm not actually exactly sure what this should look like, but it's a big idea that is more general than just the autoresearch repo specifically. Agents can in principle easily juggle and collaborate on thousands of commits across arbitrary branch structures. Existing abstractions will accumulate stress as intelligence, attention and tenacity cease to be bottlenecks.
## Agent Notes
**Why this matters:** Karpathy (3M+ followers, former Tesla AI director) is independently arriving at the same architecture we're building with the Teleo collective — agents coordinating through git, PRs as knowledge contributions, branches as research directions. His framing of "emulate a research community, not a single PhD student" IS our thesis. And his observation that Git's assumptions break under agent-scale collaboration is a problem we're actively solving.
**KB connections:**
- Directly validates [[coordination protocol design produces larger capability gains than model scaling]]
- Challenges/extends [[the same coordination protocol applied to different AI models produces radically different problem-solving strategies]] — Karpathy found that 8 agents with different setups (solo vs hierarchical) produced different results
- Relevant to [[domain specialization with cross-domain synthesis produces better collective intelligence]]
- His "existing abstractions will accumulate stress" connects to the git-as-coordination-substrate thesis
**Extraction hints:**
- Claim: agent research communities outperform single-agent research because the goal is to emulate a community not an individual
- Claim: git's branch-merge model is insufficient for agent-scale collaboration because it assumes one master branch with temporary forks
- Claim: when intelligence and attention cease to be bottlenecks, existing coordination abstractions (git, PRs, branches) accumulate stress
**Context:** This is part of a series of tweets about Karpathy's autoresearch project — AI agents autonomously iterating on nanochat (minimal GPT training code). He's running multiple agents on GPU clusters doing automated ML research. The Feb 27 thread about 8 agents is critical companion reading (separate source).

@ -1,6 +1,19 @@
# Teleo Codex — Overview
The shared knowledge base for the Teleo collective. Contains the intellectual operating system: theoretical foundations, organizational architecture, and domain-specific analysis that agents use to reason about humanity's trajectory.
A shared knowledge base of 400+ falsifiable claims maintained by six AI domain specialists. Every claim has evidence, a confidence level, and wiki links to related claims.
## Start Here
Pick an entry point based on what you care about:
- **AI and alignment** → [domains/ai-alignment/_map.md](../domains/ai-alignment/_map.md) — 52 claims on superintelligence, coordination, displacement
- **DeFi, futarchy, and markets** → [domains/internet-finance/_map.md](../domains/internet-finance/_map.md) — 63 claims on prediction markets, MetaDAO, capital formation
- **Healthcare disruption** → [domains/health/_map.md](../domains/health/_map.md) — 45 claims on AI diagnostics, prevention systems, Jevons paradox
- **Space development** → [domains/space-development/_map.md](../domains/space-development/_map.md) — 21 claims on launch economics, cislunar infrastructure
- **Entertainment and media** → [domains/entertainment/_map.md](../domains/entertainment/_map.md) — 20 claims on disruption, creator economy, IP as platform
- **The big picture** → [core/teleohumanity/_map.md](../core/teleohumanity/_map.md) — why collective superintelligence, not monolithic
**How claims work:** Every claim is a prose proposition — the filename IS the argument. Each has a confidence level (proven/likely/experimental/speculative), inline evidence, and wiki links to related claims. Follow the links to traverse the graph.
## How This Knowledge Base Is Organized
@ -26,9 +39,12 @@ Domain-specific claims. Each agent specializes in one domain but draws on all fo
- **domains/internet-finance/** — DeFi, MetaDAO ecosystem, futarchy implementations, regulatory landscape (Rio's territory)
- **domains/entertainment/** — Media disruption, creator economy, community IP, cultural dynamics (Clay's territory)
- **domains/ai-alignment/** — Collective superintelligence, coordination, AI displacement (Theseus's territory)
- **domains/health/** — Healthcare disruption, AI diagnostics, prevention systems (Vida's territory)
- **domains/space-development/** — Launch economics, cislunar infrastructure, governance (Astra's territory)
### Agents (agents/)
Soul documents defining each agent's identity, world model, reasoning framework, and beliefs. Three active agents: Leo (coordinator), Rio (internet finance), Clay (entertainment).
Soul documents defining each agent's identity, world model, reasoning framework, and beliefs. Six active agents: Leo (coordinator), Rio (internet finance), Clay (entertainment), Theseus (AI alignment), Vida (health), Astra (space development).
### Schemas (schemas/)
How each content type is structured: claims, beliefs, positions.

@ -6,8 +6,8 @@
# 2. Domain agent — domain expertise, duplicate check, technical accuracy
#
# After both reviews, auto-merges if:
# - Leo's comment contains "**Verdict:** approve"
# - Domain agent's comment contains "**Verdict:** approve"
# - Leo approved (gh pr review --approve)
# - Domain agent verdict is "Approve" (parsed from comment)
# - No territory violations (files outside proposer's domain)
#
# Usage:
@ -26,14 +26,8 @@
# - Lockfile prevents concurrent runs
# - Auto-merge requires ALL reviewers to approve + no territory violations
# - Each PR runs sequentially to avoid branch conflicts
# - Timeout: 20 minutes per agent per PR
# - Timeout: 10 minutes per agent per PR
# - Pre-flight checks: clean working tree, gh auth
#
# Verdict protocol:
# All agents use `gh pr comment` (NOT `gh pr review`) because all agents
# share the m3taversal GitHub account — `gh pr review --approve` fails
# when the PR author and reviewer are the same user. The merge check
# parses issue comments for structured verdict markers instead.
set -euo pipefail
@ -45,7 +39,7 @@ cd "$REPO_ROOT"
LOCKFILE="/tmp/evaluate-trigger.lock"
LOG_DIR="$REPO_ROOT/ops/sessions"
TIMEOUT_SECONDS=1200
TIMEOUT_SECONDS=600
DRY_RUN=false
LEO_ONLY=false
NO_MERGE=false
@ -68,17 +62,8 @@ detect_domain_agent() {
vida/*|*/health*) agent="vida"; domain="health" ;;
astra/*|*/space-development*) agent="astra"; domain="space-development" ;;
leo/*|*/grand-strategy*) agent="leo"; domain="grand-strategy" ;;
contrib/*)
# External contributor — detect domain from changed files (fall through to file check)
agent=""; domain=""
;;
*)
agent=""; domain=""
;;
esac
# If no agent detected from branch prefix, check changed files
if [ -z "$agent" ]; then
# Fall back to checking which domain directory has changed files
if echo "$files" | grep -q "domains/internet-finance/"; then
agent="rio"; domain="internet-finance"
elif echo "$files" | grep -q "domains/entertainment/"; then
@ -89,8 +74,11 @@ detect_domain_agent() {
agent="vida"; domain="health"
elif echo "$files" | grep -q "domains/space-development/"; then
agent="astra"; domain="space-development"
else
agent=""; domain=""
fi
fi
;;
esac
echo "$agent $domain"
}
@ -124,8 +112,8 @@ if ! command -v claude >/dev/null 2>&1; then
exit 1
fi
# Check for dirty working tree (ignore ops/, .claude/, .github/ which may contain local-only files)
DIRTY_FILES=$(git status --porcelain | grep -v '^?? ops/' | grep -v '^ M ops/' | grep -v '^?? \.claude/' | grep -v '^ M \.claude/' | grep -v '^?? \.github/' | grep -v '^ M \.github/' || true)
# Check for dirty working tree (ignore ops/ and .claude/ which may contain uncommitted scripts)
DIRTY_FILES=$(git status --porcelain | grep -v '^?? ops/' | grep -v '^ M ops/' | grep -v '^?? \.claude/' | grep -v '^ M \.claude/' || true)
if [ -n "$DIRTY_FILES" ]; then
echo "ERROR: Working tree is dirty. Clean up before running."
echo "$DIRTY_FILES"
@ -157,8 +145,7 @@ if [ -n "$SPECIFIC_PR" ]; then
fi
PRS_TO_REVIEW="$SPECIFIC_PR"
else
# NOTE: gh pr list silently returns empty in some worktree configs; use gh api instead
OPEN_PRS=$(gh api repos/:owner/:repo/pulls --jq '.[].number' 2>/dev/null || echo "")
OPEN_PRS=$(gh pr list --state open --json number --jq '.[].number' 2>/dev/null || echo "")
if [ -z "$OPEN_PRS" ]; then
echo "No open PRs found. Nothing to review."
@ -167,23 +154,17 @@ else
PRS_TO_REVIEW=""
for pr in $OPEN_PRS; do
# Check if this PR already has a Leo verdict comment (avoid re-reviewing)
LEO_COMMENTED=$(gh pr view "$pr" --json comments \
--jq '[.comments[] | select(.body | test("VERDICT:LEO:(APPROVE|REQUEST_CHANGES)"))] | length' 2>/dev/null || echo "0")
LAST_REVIEW_DATE=$(gh api "repos/{owner}/{repo}/pulls/$pr/reviews" \
--jq 'map(select(.state != "DISMISSED")) | sort_by(.submitted_at) | last | .submitted_at' 2>/dev/null || echo "")
LAST_COMMIT_DATE=$(gh pr view "$pr" --json commits --jq '.commits[-1].committedDate' 2>/dev/null || echo "")
if [ "$LEO_COMMENTED" = "0" ]; then
if [ -z "$LAST_REVIEW_DATE" ]; then
PRS_TO_REVIEW="$PRS_TO_REVIEW $pr"
else
# Check if new commits since last Leo review
LAST_LEO_DATE=$(gh pr view "$pr" --json comments \
--jq '[.comments[] | select(.body | test("VERDICT:LEO:")) | .createdAt] | last' 2>/dev/null || echo "")
if [ -n "$LAST_COMMIT_DATE" ] && [ -n "$LAST_LEO_DATE" ] && [[ "$LAST_COMMIT_DATE" > "$LAST_LEO_DATE" ]]; then
elif [ -n "$LAST_COMMIT_DATE" ] && [[ "$LAST_COMMIT_DATE" > "$LAST_REVIEW_DATE" ]]; then
echo "PR #$pr: New commits since last review. Queuing for re-review."
PRS_TO_REVIEW="$PRS_TO_REVIEW $pr"
else
echo "PR #$pr: Already reviewed. Skipping."
fi
echo "PR #$pr: No new commits since last review. Skipping."
fi
done
@ -214,7 +195,7 @@ run_agent_review() {
log_file="$LOG_DIR/${agent_name}-review-pr${pr}-${timestamp}.log"
review_file="/tmp/${agent_name}-review-pr${pr}.md"
echo " Running ${agent_name} (model: ${model})..."
echo " Running ${agent_name}..."
echo " Log: $log_file"
if perl -e "alarm $TIMEOUT_SECONDS; exec @ARGV" claude -p \
@ -259,7 +240,6 @@ check_territory_violations() {
vida) allowed_domains="domains/health/" ;;
astra) allowed_domains="domains/space-development/" ;;
leo) allowed_domains="core/|foundations/" ;;
contrib) echo ""; return 0 ;; # External contributors — skip territory check
*) echo ""; return 0 ;; # Unknown proposer — skip check
esac
@ -286,51 +266,74 @@ check_territory_violations() {
}
# --- Auto-merge check ---
# Parses issue comments for structured verdict markers.
# Verdict protocol: agents post `<!-- VERDICT:AGENT_KEY:APPROVE -->` or
# `<!-- VERDICT:AGENT_KEY:REQUEST_CHANGES -->` as HTML comments in their review.
# This is machine-parseable and invisible in the rendered comment.
# Returns 0 if PR should be merged, 1 if not
check_merge_eligible() {
local pr_number="$1"
local domain_agent="$2"
local leo_passed="$3"
# Gate 1: Leo must have completed without timeout/error
# Gate 1: Leo must have passed
if [ "$leo_passed" != "true" ]; then
echo "BLOCK: Leo review failed or timed out"
return 1
fi
# Gate 2: Check Leo's verdict from issue comments
local leo_verdict
leo_verdict=$(gh pr view "$pr_number" --json comments \
--jq '[.comments[] | select(.body | test("VERDICT:LEO:")) | .body] | last' 2>/dev/null || echo "")
# Gate 2: Check Leo's review state via GitHub API
local leo_review_state
leo_review_state=$(gh api "repos/{owner}/{repo}/pulls/${pr_number}/reviews" \
--jq '[.[] | select(.state != "DISMISSED" and .state != "PENDING")] | last | .state' 2>/dev/null || echo "")
if echo "$leo_verdict" | grep -q "VERDICT:LEO:APPROVE"; then
echo "Leo: APPROVED"
elif echo "$leo_verdict" | grep -q "VERDICT:LEO:REQUEST_CHANGES"; then
echo "BLOCK: Leo requested changes"
if [ "$leo_review_state" = "APPROVED" ]; then
echo "Leo: APPROVED (via review API)"
elif [ "$leo_review_state" = "CHANGES_REQUESTED" ]; then
echo "BLOCK: Leo requested changes (review API state: CHANGES_REQUESTED)"
return 1
else
echo "BLOCK: Could not find Leo's verdict marker in PR comments"
# Fallback: check PR comments for Leo's verdict
local leo_verdict
leo_verdict=$(gh pr view "$pr_number" --json comments \
--jq '.comments[] | select(.body | test("## Leo Review")) | .body' 2>/dev/null \
| grep -oiE '\*\*Verdict:[^*]+\*\*' | tail -1 || echo "")
if echo "$leo_verdict" | grep -qi "approve"; then
echo "Leo: APPROVED (via comment verdict)"
elif echo "$leo_verdict" | grep -qi "request changes\|reject"; then
echo "BLOCK: Leo verdict: $leo_verdict"
return 1
else
echo "BLOCK: Could not determine Leo's verdict"
return 1
fi
fi
# Gate 3: Check domain agent verdict (if applicable)
if [ -n "$domain_agent" ] && [ "$domain_agent" != "leo" ]; then
local domain_key
domain_key=$(echo "$domain_agent" | tr '[:lower:]' '[:upper:]')
local domain_verdict
# Search for verdict in domain agent's review — match agent name, "domain reviewer", or "Domain Review"
domain_verdict=$(gh pr view "$pr_number" --json comments \
--jq "[.comments[] | select(.body | test(\"VERDICT:${domain_key}:\")) | .body] | last" 2>/dev/null || echo "")
--jq ".comments[] | select(.body | test(\"domain review|${domain_agent}|peer review\"; \"i\")) | .body" 2>/dev/null \
| grep -oiE '\*\*Verdict:[^*]+\*\*' | tail -1 || echo "")
if echo "$domain_verdict" | grep -q "VERDICT:${domain_key}:APPROVE"; then
echo "Domain agent ($domain_agent): APPROVED"
elif echo "$domain_verdict" | grep -q "VERDICT:${domain_key}:REQUEST_CHANGES"; then
echo "BLOCK: $domain_agent requested changes"
if [ -z "$domain_verdict" ]; then
# Also check review API for domain agent approval
# Since all agents use the same GitHub account, we check for multiple approvals
local approval_count
approval_count=$(gh api "repos/{owner}/{repo}/pulls/${pr_number}/reviews" \
--jq '[.[] | select(.state == "APPROVED")] | length' 2>/dev/null || echo "0")
if [ "$approval_count" -ge 2 ]; then
echo "Domain agent: APPROVED (multiple approvals via review API)"
else
echo "BLOCK: No domain agent verdict found"
return 1
fi
elif echo "$domain_verdict" | grep -qi "approve"; then
echo "Domain agent ($domain_agent): APPROVED (via comment verdict)"
elif echo "$domain_verdict" | grep -qi "request changes\|reject"; then
echo "BLOCK: Domain agent verdict: $domain_verdict"
return 1
else
echo "BLOCK: No verdict marker found for $domain_agent"
echo "BLOCK: Unclear domain agent verdict: $domain_verdict"
return 1
fi
else
@ -400,15 +403,11 @@ Also check:
- Cross-domain connections that the proposer may have missed
Write your complete review to ${LEO_REVIEW_FILE}
Then post it with: gh pr review ${pr} --comment --body-file ${LEO_REVIEW_FILE}
CRITICAL — Verdict format: Your review MUST end with exactly one of these verdict markers (as an HTML comment on its own line):
<!-- VERDICT:LEO:APPROVE -->
<!-- VERDICT:LEO:REQUEST_CHANGES -->
If ALL claims pass quality gates: gh pr review ${pr} --approve --body-file ${LEO_REVIEW_FILE}
If ANY claim needs changes: gh pr review ${pr} --request-changes --body-file ${LEO_REVIEW_FILE}
Then post the review as an issue comment:
gh pr comment ${pr} --body-file ${LEO_REVIEW_FILE}
IMPORTANT: Use 'gh pr comment' NOT 'gh pr review'. We use a shared GitHub account so gh pr review --approve fails.
DO NOT merge — the orchestrator handles merge decisions after all reviews are posted.
Work autonomously. Do not ask for confirmation."
@ -433,7 +432,6 @@ Work autonomously. Do not ask for confirmation."
else
DOMAIN_REVIEW_FILE="/tmp/${DOMAIN_AGENT}-review-pr${pr}.md"
AGENT_NAME_UPPER=$(echo "${DOMAIN_AGENT}" | awk '{print toupper(substr($0,1,1)) substr($0,2)}')
AGENT_KEY_UPPER=$(echo "${DOMAIN_AGENT}" | tr '[:lower:]' '[:upper:]')
DOMAIN_PROMPT="You are ${AGENT_NAME_UPPER}. Read agents/${DOMAIN_AGENT}/identity.md, agents/${DOMAIN_AGENT}/beliefs.md, and skills/evaluate.md.
You are reviewing PR #${pr} as the domain expert for ${DOMAIN}.
@ -454,15 +452,8 @@ Your review focuses on DOMAIN EXPERTISE — things only a ${DOMAIN} specialist w
6. **Confidence calibration** — From your domain expertise, is the confidence level right?
Write your review to ${DOMAIN_REVIEW_FILE}
Post it with: gh pr review ${pr} --comment --body-file ${DOMAIN_REVIEW_FILE}
CRITICAL — Verdict format: Your review MUST end with exactly one of these verdict markers (as an HTML comment on its own line):
<!-- VERDICT:${AGENT_KEY_UPPER}:APPROVE -->
<!-- VERDICT:${AGENT_KEY_UPPER}:REQUEST_CHANGES -->
Then post the review as an issue comment:
gh pr comment ${pr} --body-file ${DOMAIN_REVIEW_FILE}
IMPORTANT: Use 'gh pr comment' NOT 'gh pr review'. We use a shared GitHub account so gh pr review --approve fails.
Sign your review as ${AGENT_NAME_UPPER} (domain reviewer for ${DOMAIN}).
DO NOT duplicate Leo's quality gate checks — he covers those.
DO NOT merge — the orchestrator handles merge decisions after all reviews are posted.
@ -495,7 +486,7 @@ Work autonomously. Do not ask for confirmation."
if [ "$MERGE_RESULT" -eq 0 ]; then
echo " Auto-merge: ALL GATES PASSED — merging PR #$pr"
if gh pr merge "$pr" --squash 2>&1; then
if gh pr merge "$pr" --squash --delete-branch 2>&1; then
echo " PR #$pr: MERGED successfully."
MERGED=$((MERGED + 1))
else


@ -1,179 +0,0 @@
#!/bin/bash
# Extract claims from unprocessed sources in inbox/archive/
# Runs via cron on VPS every 15 minutes.
#
# Concurrency model:
# - Lockfile prevents overlapping runs
# - MAX_SOURCES=5 per cycle (works through backlog over multiple runs)
# - Sequential processing (one source at a time)
# - 50 sources landing at once = ~10 cron cycles to clear, not 50 parallel agents
#
# Domain routing:
# - Reads domain: field from source frontmatter
# - Maps to the domain agent (rio, clay, theseus, vida, astra, leo)
# - Runs extraction AS that agent — their territory, their extraction
# - Skips sources with status: processing (agent handling it themselves)
#
# Flow:
# 1. Pull latest main
# 2. Find sources with status: unprocessed (skip processing/processed/null-result)
# 3. For each: run Claude headless to extract claims as the domain agent
# 4. Commit extractions, push, open PR
# 5. Update source status to processed
#
# The eval pipeline (webhook.py) handles review and merge separately.
set -euo pipefail
REPO_DIR="/opt/teleo-eval/workspaces/extract"
REPO_URL="http://m3taversal:$(cat /opt/teleo-eval/secrets/forgejo-admin-token)@localhost:3000/teleo/teleo-codex.git"
CLAUDE_BIN="/home/teleo/.local/bin/claude"
LOG_DIR="/opt/teleo-eval/logs"
LOG="$LOG_DIR/extract-cron.log"
LOCKFILE="/tmp/extract-cron.lock"
MAX_SOURCES=5 # Process at most 5 sources per run to limit cost
log() { echo "[$(date -Iseconds)] $*" >> "$LOG"; }
# --- Lock ---
if [ -f "$LOCKFILE" ]; then
pid=$(cat "$LOCKFILE" 2>/dev/null)
if kill -0 "$pid" 2>/dev/null; then
log "SKIP: already running (pid $pid)"
exit 0
fi
log "WARN: stale lockfile, removing"
rm -f "$LOCKFILE"
fi
echo $$ > "$LOCKFILE"
trap 'rm -f "$LOCKFILE"' EXIT
# --- Ensure repo clone ---
if [ ! -d "$REPO_DIR/.git" ]; then
log "Cloning repo..."
git clone "$REPO_URL" "$REPO_DIR" >> "$LOG" 2>&1
fi
cd "$REPO_DIR"
# --- Pull latest main ---
git checkout main >> "$LOG" 2>&1
git pull --rebase >> "$LOG" 2>&1
# --- Find unprocessed sources ---
UNPROCESSED=$(grep -rl '^status: unprocessed' inbox/archive/ 2>/dev/null | head -n "$MAX_SOURCES" || true)
if [ -z "$UNPROCESSED" ]; then
log "No unprocessed sources found"
exit 0
fi
COUNT=$(echo "$UNPROCESSED" | wc -l | tr -d ' ')
log "Found $COUNT unprocessed source(s)"
# --- Process each source ---
for SOURCE_FILE in $UNPROCESSED; do
SLUG=$(basename "$SOURCE_FILE" .md)
BRANCH="extract/$SLUG"
log "Processing: $SOURCE_FILE → branch $BRANCH"
# Create branch from main
git checkout main >> "$LOG" 2>&1
git branch -D "$BRANCH" 2>/dev/null || true
git checkout -b "$BRANCH" >> "$LOG" 2>&1
# Read domain from frontmatter
DOMAIN=$(grep '^domain:' "$SOURCE_FILE" | head -1 | sed 's/domain: *//' | tr -d '"' | tr -d "'" | xargs)
# Map domain to agent
case "$DOMAIN" in
internet-finance) AGENT="rio" ;;
entertainment) AGENT="clay" ;;
ai-alignment) AGENT="theseus" ;;
health) AGENT="vida" ;;
space-development) AGENT="astra" ;;
*) AGENT="leo" ;;
esac
AGENT_TOKEN=$(cat "/opt/teleo-eval/secrets/forgejo-${AGENT}-token" 2>/dev/null || cat /opt/teleo-eval/secrets/forgejo-leo-token)
log "Domain: $DOMAIN, Agent: $AGENT"
# Run Claude headless to extract claims
EXTRACT_PROMPT="You are $AGENT, a Teleo knowledge base agent. Extract claims from this source.
READ these files first:
- skills/extract.md (extraction process)
- schemas/claim.md (claim format)
- $SOURCE_FILE (the source to extract from)
Then scan domains/$DOMAIN/ to check for duplicate claims.
EXTRACT claims following the process in skills/extract.md:
1. Read the source completely
2. Separate evidence from interpretation
3. Extract candidate claims (specific, disagreeable, evidence-backed)
4. Check for duplicates against existing claims in domains/$DOMAIN/
5. Write claim files to domains/$DOMAIN/ with proper YAML frontmatter
6. Update $SOURCE_FILE: set status to 'processed', add processed_by: $AGENT, processed_date: $(date +%Y-%m-%d), and claims_extracted list
If no claims can be extracted, update $SOURCE_FILE: set status to 'null-result' and add notes explaining why.
IMPORTANT: Use the Edit tool to update the source file status. Use the Write tool to create new claim files. Do not create claims that duplicate existing ones."
# Run extraction with timeout (10 minutes)
timeout 600 "$CLAUDE_BIN" -p "$EXTRACT_PROMPT" \
--allowedTools 'Read,Write,Edit,Glob,Grep' \
--model sonnet \
>> "$LOG" 2>&1 || {
log "WARN: Claude extraction failed or timed out for $SOURCE_FILE"
git checkout main >> "$LOG" 2>&1
continue
}
# Check if any files were created/modified
CHANGES=$(git status --porcelain | wc -l | tr -d ' ')
if [ "$CHANGES" -eq 0 ]; then
log "No changes produced for $SOURCE_FILE"
git checkout main >> "$LOG" 2>&1
continue
fi
# Stage and commit
git add inbox/archive/ "domains/$DOMAIN/" >> "$LOG" 2>&1
git commit -m "$AGENT: extract claims from $(basename "$SOURCE_FILE")
- Source: $SOURCE_FILE
- Domain: $DOMAIN
- Extracted by: headless extraction cron
Pentagon-Agent: $(echo "$AGENT" | sed 's/./\U&/') <HEADLESS>" >> "$LOG" 2>&1
# Push branch
git push -u "$REPO_URL" "$BRANCH" --force >> "$LOG" 2>&1
# Open PR
PR_TITLE="$AGENT: extract claims from $(basename "$SOURCE_FILE" .md)"
PR_BODY="## Automated Extraction\n\nSource: \`$SOURCE_FILE\`\nDomain: $DOMAIN\nExtracted by: headless cron on VPS\n\nThis PR was created automatically by the extraction cron job. Claims were extracted using \`skills/extract.md\` process via Claude headless."
curl -s -X POST "http://localhost:3000/api/v1/repos/teleo/teleo-codex/pulls" \
-H "Authorization: token $AGENT_TOKEN" \
-H "Content-Type: application/json" \
-d "{
\"title\": \"$PR_TITLE\",
\"body\": \"$PR_BODY\",
\"base\": \"main\",
\"head\": \"$BRANCH\"
}" >> "$LOG" 2>&1
log "PR opened for $SOURCE_FILE"
# Back to main for next source
git checkout main >> "$LOG" 2>&1
# Brief pause between extractions
sleep 5
done
log "Extraction run complete: processed $COUNT source(s)"


@ -1,520 +0,0 @@
#!/usr/bin/env python3
"""
extract-graph-data.py Extract knowledge graph from teleo-codex markdown files.
Reads all .md claim/conviction files, parses YAML frontmatter and wiki-links,
and outputs graph-data.json matching the teleo-app GraphData interface.
Usage:
python3 ops/extract-graph-data.py [--output path/to/graph-data.json]
Must be run from the teleo-codex repo root.
"""
import argparse
import json
import os
import re
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path
# ---------------------------------------------------------------------------
# Config
# ---------------------------------------------------------------------------
SCAN_DIRS = ["core", "domains", "foundations", "convictions"]
# Only extract these content types (from frontmatter `type` field).
# If type is missing, include the file anyway (many claims lack explicit type).
INCLUDE_TYPES = {"claim", "conviction", "analysis", "belief", "position", None}
# Domain → default agent mapping (fallback when git attribution unavailable)
DOMAIN_AGENT_MAP = {
"internet-finance": "rio",
"entertainment": "clay",
"health": "vida",
"ai-alignment": "theseus",
"space-development": "astra",
"grand-strategy": "leo",
"mechanisms": "leo",
"living-capital": "leo",
"living-agents": "leo",
"teleohumanity": "leo",
"critical-systems": "leo",
"collective-intelligence": "leo",
"teleological-economics": "leo",
"cultural-dynamics": "clay",
}
DOMAIN_COLORS = {
"internet-finance": "#4A90D9",
"entertainment": "#9B59B6",
"health": "#2ECC71",
"ai-alignment": "#E74C3C",
"space-development": "#F39C12",
"grand-strategy": "#D4AF37",
"mechanisms": "#1ABC9C",
"living-capital": "#3498DB",
"living-agents": "#E67E22",
"teleohumanity": "#F1C40F",
"critical-systems": "#95A5A6",
"collective-intelligence": "#BDC3C7",
"teleological-economics": "#7F8C8D",
"cultural-dynamics": "#C0392B",
}
KNOWN_AGENTS = {"leo", "rio", "clay", "vida", "theseus", "astra"}
# Regex patterns
FRONTMATTER_RE = re.compile(r"^---\s*\n(.*?)\n---", re.DOTALL)
WIKILINK_RE = re.compile(r"\[\[([^\]]+)\]\]")
YAML_FIELD_RE = re.compile(r"^(\w[\w_]*):\s*(.+)$", re.MULTILINE)
YAML_LIST_ITEM_RE = re.compile(r'^\s*-\s+"?(.+?)"?\s*$', re.MULTILINE)
COUNTER_EVIDENCE_RE = re.compile(r"^##\s+Counter[\s-]?evidence", re.MULTILINE | re.IGNORECASE)
COUNTERARGUMENT_RE = re.compile(r"^\*\*Counter\s*argument", re.MULTILINE | re.IGNORECASE)
# ---------------------------------------------------------------------------
# Lightweight YAML-ish frontmatter parser (avoids PyYAML dependency)
# ---------------------------------------------------------------------------
def parse_frontmatter(text: str) -> dict:
"""Parse YAML frontmatter from markdown text. Returns dict of fields."""
m = FRONTMATTER_RE.match(text)
if not m:
return {}
yaml_block = m.group(1)
result = {}
for field_match in YAML_FIELD_RE.finditer(yaml_block):
key = field_match.group(1)
val = field_match.group(2).strip().strip('"').strip("'")
# Handle list fields
if val.startswith("["):
# Inline YAML list: [item1, item2]
items = re.findall(r'"([^"]+)"', val)
if not items:
items = [x.strip().strip('"').strip("'")
for x in val.strip("[]").split(",") if x.strip()]
result[key] = items
else:
result[key] = val
# Handle multi-line list fields (depends_on, challenged_by, secondary_domains)
for list_key in ("depends_on", "challenged_by", "secondary_domains", "claims_extracted"):
if list_key not in result:
# Check for block-style list
pattern = re.compile(
rf"^{list_key}:\s*\n((?:\s+-\s+.+\n?)+)", re.MULTILINE
)
lm = pattern.search(yaml_block)
if lm:
items = YAML_LIST_ITEM_RE.findall(lm.group(1))
result[list_key] = [i.strip('"').strip("'") for i in items]
return result
def extract_body(text: str) -> str:
"""Return the markdown body after frontmatter."""
m = FRONTMATTER_RE.match(text)
if m:
return text[m.end():]
return text
# ---------------------------------------------------------------------------
# Git-based agent attribution
# ---------------------------------------------------------------------------
def build_git_agent_map(repo_root: str) -> dict[str, str]:
"""Map file paths → agent name using git log commit message prefixes.
Commit messages follow: '{agent}: description'
We use the commit that first added each file.
"""
file_agent = {}
try:
result = subprocess.run(
["git", "log", "--all", "--diff-filter=A", "--name-only",
"--format=COMMIT_MSG:%s"],
capture_output=True, text=True, cwd=repo_root, timeout=30,
)
current_agent = None
for line in result.stdout.splitlines():
line = line.strip()
if not line:
continue
if line.startswith("COMMIT_MSG:"):
msg = line[len("COMMIT_MSG:"):]
# Parse "agent: description" pattern
if ":" in msg:
prefix = msg.split(":")[0].strip().lower()
if prefix in KNOWN_AGENTS:
current_agent = prefix
else:
current_agent = None
else:
current_agent = None
elif current_agent and line.endswith(".md"):
# Only set if not already attributed (first add wins)
if line not in file_agent:
file_agent[line] = current_agent
except (subprocess.TimeoutExpired, FileNotFoundError):
pass
return file_agent
# ---------------------------------------------------------------------------
# Wiki-link resolution
# ---------------------------------------------------------------------------
def build_title_index(all_files: list[str], repo_root: str) -> dict[str, str]:
"""Map lowercase claim titles → file paths for wiki-link resolution."""
index = {}
for fpath in all_files:
# Title = filename without .md extension
fname = os.path.basename(fpath)
if fname.endswith(".md"):
title = fname[:-3].lower()
index[title] = fpath
# Also index by relative path
index[fpath.lower()] = fpath
return index
def resolve_wikilink(link_text: str, title_index: dict, source_dir: str) -> str | None:
"""Resolve a [[wiki-link]] target to a file path (node ID)."""
text = link_text.strip()
# Skip map links and non-claim references
if text.startswith("_") or text == "_map":
return None
# Direct path match (with or without .md)
for candidate in [text, text + ".md"]:
if candidate.lower() in title_index:
return title_index[candidate.lower()]
# Title-only match
title = text.lower()
if title in title_index:
return title_index[title]
# Fuzzy: try adding .md to the basename
basename = os.path.basename(text)
if basename.lower() in title_index:
return title_index[basename.lower()]
return None
# ---------------------------------------------------------------------------
# PR/merge event extraction from git log
# ---------------------------------------------------------------------------
def extract_events(repo_root: str) -> list[dict]:
"""Extract PR merge events from git log for the events timeline."""
events = []
try:
result = subprocess.run(
["git", "log", "--merges", "--format=%H|%s|%ai", "-50"],
capture_output=True, text=True, cwd=repo_root, timeout=15,
)
for line in result.stdout.strip().splitlines():
parts = line.split("|", 2)
if len(parts) < 3:
continue
sha, msg, date_str = parts
# Parse "Merge pull request #N from ..." or agent commit patterns
pr_match = re.search(r"#(\d+)", msg)
if not pr_match:
continue
pr_num = int(pr_match.group(1))
# Try to determine agent from merge commit
agent = "collective"
for a in KNOWN_AGENTS:
if a in msg.lower():
agent = a
break
# Count files changed in this merge
diff_result = subprocess.run(
["git", "diff", "--name-only", f"{sha}^..{sha}"],
capture_output=True, text=True, cwd=repo_root, timeout=10,
)
claims_added = sum(
1 for f in diff_result.stdout.splitlines()
if f.endswith(".md") and any(f.startswith(d) for d in SCAN_DIRS)
)
if claims_added > 0:
events.append({
"type": "pr-merge",
"number": pr_num,
"agent": agent,
"claims_added": claims_added,
"date": date_str[:10],
})
except (subprocess.TimeoutExpired, FileNotFoundError):
pass
return events
# ---------------------------------------------------------------------------
# Main extraction
# ---------------------------------------------------------------------------
def find_markdown_files(repo_root: str) -> list[str]:
"""Find all .md files in SCAN_DIRS, return relative paths."""
files = []
for scan_dir in SCAN_DIRS:
dirpath = os.path.join(repo_root, scan_dir)
if not os.path.isdir(dirpath):
continue
for root, _dirs, filenames in os.walk(dirpath):
for fname in filenames:
if fname.endswith(".md") and not fname.startswith("_"):
rel = os.path.relpath(os.path.join(root, fname), repo_root)
files.append(rel)
return sorted(files)
def _get_domain_cached(fpath: str, repo_root: str, cache: dict) -> str:
"""Get the domain of a file, caching results."""
if fpath in cache:
return cache[fpath]
abs_path = os.path.join(repo_root, fpath)
domain = ""
try:
text = open(abs_path, encoding="utf-8").read()
fm = parse_frontmatter(text)
domain = fm.get("domain", "")
except (OSError, UnicodeDecodeError):
pass
cache[fpath] = domain
return domain
def extract_graph(repo_root: str) -> dict:
"""Extract the full knowledge graph from the codex."""
all_files = find_markdown_files(repo_root)
git_agents = build_git_agent_map(repo_root)
title_index = build_title_index(all_files, repo_root)
domain_cache: dict[str, str] = {}
nodes = []
edges = []
node_ids = set()
all_files_set = set(all_files)
for fpath in all_files:
abs_path = os.path.join(repo_root, fpath)
try:
text = open(abs_path, encoding="utf-8").read()
except (OSError, UnicodeDecodeError):
continue
fm = parse_frontmatter(text)
body = extract_body(text)
# Filter by type
ftype = fm.get("type")
if ftype and ftype not in INCLUDE_TYPES:
continue
# Build node
title = os.path.basename(fpath)[:-3] # filename without .md
domain = fm.get("domain", "")
if not domain:
# Infer domain from directory path
parts = fpath.split(os.sep)
if len(parts) >= 2:
domain = parts[1] if parts[0] == "domains" else parts[1] if len(parts) > 2 else parts[0]
# Agent attribution: git log → domain mapping → "collective"
agent = git_agents.get(fpath, "")
if not agent:
agent = DOMAIN_AGENT_MAP.get(domain, "collective")
created = fm.get("created", "")
confidence = fm.get("confidence", "speculative")
# Detect challenged status
challenged_by_raw = fm.get("challenged_by", [])
if isinstance(challenged_by_raw, str):
challenged_by_raw = [challenged_by_raw] if challenged_by_raw else []
has_challenged_by = bool(challenged_by_raw and any(c for c in challenged_by_raw))
has_counter_section = bool(COUNTER_EVIDENCE_RE.search(body) or COUNTERARGUMENT_RE.search(body))
is_challenged = has_challenged_by or has_counter_section
# Extract challenge descriptions for the node
challenges = []
if isinstance(challenged_by_raw, list):
for c in challenged_by_raw:
if c and isinstance(c, str):
# Strip wiki-link syntax for display
cleaned = WIKILINK_RE.sub(lambda m: m.group(1), c)
# Strip markdown list artifacts: leading "- ", surrounding quotes
cleaned = re.sub(r'^-\s*', '', cleaned).strip()
cleaned = cleaned.strip('"').strip("'").strip()
if cleaned:
challenges.append(cleaned[:200]) # cap length
node = {
"id": fpath,
"title": title,
"domain": domain,
"agent": agent,
"created": created,
"confidence": confidence,
"challenged": is_challenged,
}
if challenges:
node["challenges"] = challenges
nodes.append(node)
node_ids.add(fpath)
domain_cache[fpath] = domain # cache for edge lookups
for link_text in WIKILINK_RE.findall(body):
target = resolve_wikilink(link_text, title_index, os.path.dirname(fpath))
if target and target != fpath and target in all_files_set:
target_domain = _get_domain_cached(target, repo_root, domain_cache)
edges.append({
"source": fpath,
"target": target,
"type": "wiki-link",
"cross_domain": domain != target_domain and bool(target_domain),
})
# Conflict edges from challenged_by (may contain [[wiki-links]] or prose)
challenged_by = fm.get("challenged_by", [])
if isinstance(challenged_by, str):
challenged_by = [challenged_by]
if isinstance(challenged_by, list):
for challenge in challenged_by:
if not challenge:
continue
# Check for embedded wiki-links
for link_text in WIKILINK_RE.findall(challenge):
target = resolve_wikilink(link_text, title_index, os.path.dirname(fpath))
if target and target != fpath and target in all_files_set:
target_domain = _get_domain_cached(target, repo_root, domain_cache)
edges.append({
"source": fpath,
"target": target,
"type": "conflict",
"cross_domain": domain != target_domain and bool(target_domain),
})
# Deduplicate edges
seen_edges = set()
unique_edges = []
for e in edges:
key = (e["source"], e["target"], e.get("type", ""))
if key not in seen_edges:
seen_edges.add(key)
unique_edges.append(e)
# Only keep edges where both endpoints exist as nodes
edges_filtered = [
e for e in unique_edges
if e["source"] in node_ids and e["target"] in node_ids
]
events = extract_events(repo_root)
return {
"nodes": nodes,
"edges": edges_filtered,
"events": sorted(events, key=lambda e: e.get("date", "")),
"domain_colors": DOMAIN_COLORS,
}
def build_claims_context(repo_root: str, nodes: list[dict]) -> dict:
"""Build claims-context.json for chat system prompt injection.
Produces a lightweight claim index: title + description + domain + agent + confidence.
Sorted by domain, then alphabetically within domain.
Target: ~37KB for ~370 claims. Truncates descriptions at 100 chars if total > 100KB.
"""
claims = []
for node in nodes:
fpath = node["id"]
abs_path = os.path.join(repo_root, fpath)
description = ""
try:
text = open(abs_path, encoding="utf-8").read()
fm = parse_frontmatter(text)
description = fm.get("description", "")
except (OSError, UnicodeDecodeError):
pass
claims.append({
"title": node["title"],
"description": description,
"domain": node["domain"],
"agent": node["agent"],
"confidence": node["confidence"],
})
# Sort by domain, then title
claims.sort(key=lambda c: (c["domain"], c["title"]))
context = {
"generated": datetime.now(tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
"claimCount": len(claims),
"claims": claims,
}
# Progressive description truncation if over 100KB.
# Never drop descriptions entirely — short descriptions are better than none.
for max_desc in (120, 100, 80, 60):
test_json = json.dumps(context, ensure_ascii=False)
if len(test_json) <= 100_000:
break
for c in claims:
if len(c["description"]) > max_desc:
c["description"] = c["description"][:max_desc] + "..."
return context
def main():
parser = argparse.ArgumentParser(description="Extract graph data from teleo-codex")
parser.add_argument("--output", "-o", default="graph-data.json",
help="Output file path (default: graph-data.json)")
parser.add_argument("--context-output", "-c", default=None,
help="Output claims-context.json path (default: same dir as --output)")
parser.add_argument("--repo", "-r", default=".",
help="Path to teleo-codex repo root (default: current dir)")
args = parser.parse_args()
repo_root = os.path.abspath(args.repo)
if not os.path.isdir(os.path.join(repo_root, "core")):
print(f"Error: {repo_root} doesn't look like a teleo-codex repo (no core/ dir)", file=sys.stderr)
sys.exit(1)
print(f"Scanning {repo_root}...")
graph = extract_graph(repo_root)
print(f" Nodes: {len(graph['nodes'])}")
print(f" Edges: {len(graph['edges'])}")
print(f" Events: {len(graph['events'])}")
challenged_count = sum(1 for n in graph["nodes"] if n.get("challenged"))
print(f" Challenged: {challenged_count}")
# Write graph-data.json
output_path = os.path.abspath(args.output)
with open(output_path, "w", encoding="utf-8") as f:
json.dump(graph, f, indent=2, ensure_ascii=False)
size_kb = os.path.getsize(output_path) / 1024
print(f" graph-data.json: {output_path} ({size_kb:.1f} KB)")
# Write claims-context.json
context_path = args.context_output
if not context_path:
context_path = os.path.join(os.path.dirname(output_path), "claims-context.json")
context_path = os.path.abspath(context_path)
context = build_claims_context(repo_root, graph["nodes"])
with open(context_path, "w", encoding="utf-8") as f:
json.dump(context, f, indent=2, ensure_ascii=False)
ctx_kb = os.path.getsize(context_path) / 1024
print(f" claims-context.json: {context_path} ({ctx_kb:.1f} KB)")
if __name__ == "__main__":
main()


@ -1,201 +0,0 @@
# Skill: Ingest
Research your domain, find source material, and archive it in inbox/. You choose whether to extract claims yourself or let the VPS handle it.
**Archive everything.** The inbox is a library, not a filter. If it's relevant to any Teleo domain, archive it. Null-result sources (no extractable claims) are still valuable — they prevent duplicate work and build domain context.
## Usage
```
/ingest # Research loop: pull tweets, find sources, archive with notes
/ingest @username # Pull and archive a specific X account's content
/ingest url <url> # Archive a paper, article, or thread from URL
/ingest scan # Scan your network for new content since last pull
/ingest extract # Extract claims from sources you've already archived (Track A)
```
## Two Tracks
### Track A: Agent-driven extraction (full control)
You research, archive, AND extract. You see exactly what you're proposing before it goes up.
1. Archive sources with `status: processing`
2. Extract claims yourself using `skills/extract.md`
3. Open a PR with both source archives and claim files
4. Eval pipeline reviews your claims
**Use when:** You're doing a deep dive on a specific topic, care about extraction quality, or want to control the narrative around new claims.
### Track B: VPS extraction (hands-off)
You research and archive. The VPS extracts headlessly.
1. Archive sources with `status: unprocessed`
2. Push source-only PR (merges fast — no claim changes)
3. VPS cron picks up unprocessed sources every 15 minutes
4. Extracts claims via Claude headless, opens a separate PR
5. Eval pipeline reviews the extraction
**Use when:** You're batch-archiving many sources, the content is straightforward, or you want to focus your session time on research rather than extraction.
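As a rough illustration of step 3 in Track B, the pickup logic can be sketched in Python. This is not the cron job itself (that is a shell script using `grep`); the function name `find_unprocessed` and the restriction to `.md` files are assumptions made for the example. The cap of 5 sources per run mirrors `MAX_SOURCES` in the cron script.

```python
from pathlib import Path


def find_unprocessed(archive_dir, max_sources=5):
    """Return up to max_sources archive files that contain a line starting
    with 'status: unprocessed'.

    Hypothetical helper: like the cron's grep, it matches any such line in
    the file rather than parsing the frontmatter block strictly.
    """
    found = []
    for path in sorted(Path(archive_dir).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if any(line.startswith("status: unprocessed") for line in text.splitlines()):
            found.append(path)
            if len(found) >= max_sources:
                break
    return found
```

Sources marked `processing`, `processed`, or `null-result` never match, which is why changing the status field is enough to keep the VPS away from a source.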
### The switch is the status field
| Status | What happens |
|--------|-------------|
| `unprocessed` | VPS will extract (Track B) |
| `processing` | You're handling it (Track A) — VPS skips this source |
| `processed` | Already extracted — no further action |
| `null-result` | Reviewed, no claims — no further action |
You can mix tracks freely. Archive 10 sources as `unprocessed` for the VPS, then set 2 high-priority ones to `processing` and extract those yourself.
## Prerequisites
- API key at `~/.pentagon/secrets/twitterapi-io-key`
- Your network file at `~/.pentagon/workspace/collective/x-ingestion/{your-name}-network.json`
- Forgejo token at `~/.pentagon/secrets/forgejo-{your-name}-token`
## The Loop
### Step 1: Research
Find source material relevant to your domain. Sources include:
- **X/Twitter** — tweets, threads, debates from your network accounts
- **Papers** — academic papers, preprints, whitepapers
- **Articles** — blog posts, newsletters, news coverage
- **Reports** — industry reports, data releases, government filings
- **Conversations** — podcast transcripts, interview notes, voicenote transcripts
For X accounts, use `/x-research pull @{username}` to pull tweets, then scan for anything worth archiving. Don't just archive the "best" tweets — archive anything substantive. A thread arguing a wrong position is as valuable as one arguing a right one.
### Step 2: Archive with notes
For each source, create an archive file on your branch:
**Filename:** `inbox/archive/YYYY-MM-DD-{author-handle}-{brief-slug}.md`
```yaml
---
type: source
title: "Descriptive title of the content"
author: "Display Name (@handle)"
twitter_id: "numeric_id_from_author_object" # X sources only
url: https://original-url
date: YYYY-MM-DD
domain: internet-finance | entertainment | ai-alignment | health | space-development | grand-strategy
secondary_domains: [other-domain] # if cross-domain
format: tweet | thread | essay | paper | whitepaper | report | newsletter | news | transcript
status: unprocessed | processing # unprocessed = VPS extracts; processing = you extract
priority: high | medium | low
tags: [topic1, topic2]
flagged_for_rio: ["reason"] # if relevant to another agent's domain
---
```
**Body:** Include the full source text, then your research notes.
```markdown
## Content
[Full text of tweet/thread/article. For long papers, include abstract + key sections.]
## Agent Notes
**Why this matters:** [1-2 sentences — what makes this worth archiving]
**KB connections:** [Which existing claims does this relate to, support, or challenge?]
**Extraction hints:** [What claims might the extractor pull from this? Flag specific passages.]
**Context:** [Anything the extractor needs to know — who the author is, what debate this is part of, etc.]
```
The "Agent Notes" section is critical for Track B. The VPS extractor is good at mechanical extraction but lacks your domain context. Your notes guide it. For Track A, you still benefit from writing notes — they organize your thinking before extraction.
### Step 3: Extract claims (Track A only)
If you set `status: processing`, follow `skills/extract.md`:
1. Read the source completely
2. Separate evidence from interpretation
3. Extract candidate claims (specific, disagreeable, evidence-backed)
4. Check for duplicates against existing KB
5. Write claim files to `domains/{your-domain}/`
6. Update source: `status: processed`, `processed_by`, `processed_date`, `claims_extracted`
### Step 4: Cross-domain flagging
When you find sources outside your domain:
- Archive them anyway (you're already reading them)
- Set the `domain` field to the correct domain, not yours
- Add `flagged_for_{agent}: ["brief reason"]` to frontmatter
- Set `priority: high` if it's urgent or challenges existing claims
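The flagging rule can be sketched as follows. The domain-to-agent mapping mirrors the routing in the extraction cron script (anything unlisted falls back to `leo`); treat it as an assumption if that routing has changed. The function name `cross_domain_flag` is hypothetical.

```python
# Mirrors the cron script's domain routing; unlisted domains go to "leo".
DOMAIN_AGENT = {
    "internet-finance": "rio",
    "entertainment": "clay",
    "ai-alignment": "theseus",
    "health": "vida",
    "space-development": "astra",
}


def cross_domain_flag(source_domain: str, my_agent: str, reason: str) -> dict:
    """Return the frontmatter field to add when a source belongs to another agent.

    Returns an empty dict when the source is in the archiving agent's own domain.
    """
    owner = DOMAIN_AGENT.get(source_domain, "leo")
    if owner == my_agent:
        return {}
    return {f"flagged_for_{owner}": [reason]}
```

For example, `cross_domain_flag("health", "rio", "challenges existing claim")` yields a `flagged_for_vida` entry, matching the `flagged_for_{agent}` field shown in the frontmatter template.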
### Step 5: Branch, commit, push
```bash
# Branch
git checkout -b {your-name}/sources-{date}-{brief-slug}
# Stage — sources only (Track B) or sources + claims (Track A)
git add inbox/archive/*.md
git add domains/{your-domain}/*.md # Track A only
# Commit
git commit -m "{your-name}: archive {N} sources — {brief description}
- What: {N} sources from {list of authors/accounts}
- Domains: {which domains these cover}
- Track: A (agent-extracted) | B (VPS extraction pending)
Pentagon-Agent: {Name} <{UUID}>"
# Push
FORGEJO_TOKEN=$(cat ~/.pentagon/secrets/forgejo-{your-name}-token)
git push -u https://{your-name}:${FORGEJO_TOKEN}@git.livingip.xyz/teleo/teleo-codex.git {branch-name}
```
Open a PR:
```bash
curl -s -X POST "https://git.livingip.xyz/api/v1/repos/teleo/teleo-codex/pulls" \
-H "Authorization: token ${FORGEJO_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"title": "{your-name}: {archive N sources | extract N claims} — {brief description}",
"body": "## Sources\n{numbered list with titles and domains}\n\n## Claims (Track A only)\n{claim titles}\n\n## Track B sources (VPS extraction pending)\n{list of unprocessed sources}",
"base": "main",
"head": "{branch-name}"
}'
```
## Network Management
Your network file (`{your-name}-network.json`) lists X accounts to monitor:
```json
{
"agent": "your-name",
"domain": "your-domain",
"accounts": [
{"username": "example", "tier": "core", "why": "Reason this account matters"},
{"username": "example2", "tier": "extended", "why": "Secondary but useful"}
]
}
```
**Tiers:**
- `core` — Pull every session. High signal-to-noise.
- `extended` — Pull weekly or when specifically relevant.
- `watch` — Pull once to evaluate, then promote or drop.
Agents without a network file should create one as their first task. Start with 5-10 seed accounts.
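One possible reading of the tier rules, as a sketch: `core` accounts are always due, `extended` accounts are due after seven days, and `watch` accounts are due only if they have never been pulled. The `last_pulled` mapping is an assumption (the network file shown above does not track pull dates), and "when specifically relevant" is left to the agent's judgment rather than encoded here.

```python
from datetime import date, timedelta


def accounts_to_pull(network: dict, last_pulled: dict, today: date) -> list:
    """Return usernames due for a pull under the tier rules.

    last_pulled maps username -> date of the most recent pull (hypothetical
    bookkeeping; absent means never pulled).
    """
    due = []
    for account in network["accounts"]:
        name, tier = account["username"], account["tier"]
        last = last_pulled.get(name)
        if tier == "core":
            due.append(name)  # pull every session
        elif tier == "extended":
            if last is None or today - last >= timedelta(days=7):
                due.append(name)  # pull weekly
        elif tier == "watch":
            if last is None:
                due.append(name)  # pull once, then promote or drop
    return due
```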
## Quality Controls
- **Archive everything substantive.** Don't self-censor. The extractor decides what yields claims.
- **Write good notes.** Your domain context is the difference between a useful source and a pile of text.
- **Check for duplicates.** Don't re-archive sources already in `inbox/archive/`.
- **Flag cross-domain.** If you see something relevant to another agent, flag it — don't assume they'll find it.
- **Log API costs.** Every X pull gets logged to `~/.pentagon/workspace/collective/x-ingestion/pull-log.jsonl`.
- **Source diversity.** If you're archiving 10+ items from one account in a batch, note it — the extractor should be aware of monoculture risk.
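The source-diversity check lends itself to a quick count before opening the PR. This is a minimal sketch; the function name `monoculture_accounts` is hypothetical, and the threshold of 10 is taken from the "10+ items from one account" guideline above.

```python
from collections import Counter


def monoculture_accounts(authors, threshold=10):
    """Return {author: count} for authors with at least `threshold` items in a batch."""
    counts = Counter(authors)
    return {author: n for author, n in counts.items() if n >= threshold}
```

Passing the `author` values from a batch's frontmatter gives the accounts worth calling out in the PR description.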