# teleo-infrastructure

This repo runs the pipeline that processes contributions into the
[teleo-codex](https://github.com/living-ip/teleo-codex) knowledge base.

Every claim on `main` has been extracted from a source, validated for schema
and duplicates, evaluated by at least two independent reviewers, and merged
through an event-sourced audit log. The whole flow is an async Python daemon
talking to a Forgejo git server, an SQLite WAL state store, OpenRouter (for
most LLM calls), and the Anthropic Claude CLI (for Opus deep reviews).

**Production state** (live):

| Metric | Value |
|---|---|
| Claims merged into `main` | 1,546 across 13 domains |
| PRs merged through the pipeline | 1,975 |
| Merge throughput (last 7d) | 508 PRs (~73/day) |
| Review approval rate | 94% |
| Cost per merged claim (last 30d) | $0.10 incl. extract + triage + multi-tier review |
| Production agents | 6 (rio, theseus, leo, vida, astra, clay) |

## Pipeline

The stages run as concurrent loops in a single daemon (`teleo-pipeline.py`)
and coordinate through SQLite. Circuit breakers cap costs, retry budgets cap
attempts, and merges are serialized per domain to avoid cross-PR conflicts.
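
The three guards above can be sketched roughly as follows. This is a minimal illustration, not the daemon's actual code: `CostBreaker`, `DomainMerger`, and `with_retry_budget` are hypothetical names, and the `asyncio.sleep` call stands in for a real merge against Forgejo.

```python
import asyncio
from collections import defaultdict


class CostBreaker:
    """Hypothetical circuit breaker: refuses spend beyond a fixed cap."""

    def __init__(self, cap_usd):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, usd):
        if self.spent_usd + usd > self.cap_usd:
            raise RuntimeError("cost circuit breaker open")
        self.spent_usd += usd


class DomainMerger:
    """Hypothetical helper: one asyncio.Lock per domain."""

    def __init__(self):
        self._locks = defaultdict(asyncio.Lock)

    async def merge(self, pr_id, domain, log):
        # Same-domain merges queue on the lock; other domains proceed.
        async with self._locks[domain]:
            log.append(f"start {domain} #{pr_id}")
            await asyncio.sleep(0.01)  # stand-in for the real merge call
            log.append(f"end {domain} #{pr_id}")


async def with_retry_budget(make_call, budget=3):
    """Await make_call() up to `budget` times; re-raise the final failure."""
    for attempt in range(1, budget + 1):
        try:
            return await make_call()
        except Exception:
            if attempt == budget:
                raise


async def demo():
    merger, log = DomainMerger(), []
    await asyncio.gather(
        merger.merge(1, "health", log),
        merger.merge(2, "health", log),
        merger.merge(3, "space-development", log),
    )
    return log
```

Running `asyncio.run(demo())` shows the two `health` merges executing back to back while the `space-development` merge overlaps with the first of them.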

```mermaid
flowchart LR
  Inbox["inbox/queue/"] --> Extract
  Extract["Extract<br/>(Sonnet 4.5)"] --> Validate
  Validate["Validate<br/>(tier 0, $0)"] --> Evaluate
  Evaluate["Evaluate<br/>(tiered, multi-model)"] --> Merge
  Merge["Merge<br/>(Forgejo, domain-serial)"] --> Effects
  Effects["Effects<br/>cascade · backlinks · reciprocal edges"]
```

If any reviewer rejects, the PR gets a structured rationale and either
re-extraction guidance (for fixable issues) or a terminal close (for
scope or duplicate problems). Approved merges trigger downstream effects:

- **Cascade** — agents whose beliefs/positions depend on the changed claim get inbox notifications
- **Bidirectional provenance** — `sourced_from:` is stamped on each claim at extraction; the source's `claims_extracted:` list is updated post-merge
- **Reciprocal edges** — when a new claim has `supports: [X]`, X's frontmatter is updated with `supports: [new]`
- **Cross-domain index** — entity mentions across domain boundaries are logged for silo detection
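
As an illustration of the reciprocal-edge effect, the sketch below mirrors a new claim's edges onto the claims it points at. It assumes frontmatter has already been parsed into plain dicts keyed by claim id; `add_reciprocal_edges` and the claim ids are hypothetical, and the mirrored field name follows the description above.

```python
def add_reciprocal_edges(claims, new_id, field="supports"):
    """Mirror each `field` edge of the new claim onto its targets.

    `claims` maps claim id -> frontmatter dict (an assumed in-memory shape).
    Returns the ids of claims whose frontmatter was changed.
    """
    changed = []
    for target_id in claims[new_id].get(field, []):
        target = claims.get(target_id)
        if target is None:
            continue  # dangling reference: nothing to update
        edges = target.setdefault(field, [])
        if new_id not in edges:  # keep the update idempotent
            edges.append(new_id)
            changed.append(target_id)
    return changed


claims = {
    "claim-x": {"title": "Existing claim"},
    "claim-new": {"title": "New claim", "supports": ["claim-x", "claim-missing"]},
}
changed = add_reciprocal_edges(claims, "claim-new")
```

Making the update idempotent means the effect can be re-run after a retry without duplicating edges.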

## Multi-agent review

Reviews aren't free, so every PR is assigned a review tier. Classification
is deterministic where possible (changes to `core/` or `foundations/` always
go Deep) and otherwise picked by Haiku based on PR scope. Last 30d
distribution: 76% Standard, 21% Light, 2% Deep.

```mermaid
flowchart TD
  PR[New PR] --> Classify{Classify}
  Classify -->|"core/, foundations/, challenged"| Deep
  Classify -->|default| Standard
  Classify -->|single claim, low risk| Light
  Light["Light tier<br/>Domain agent only"] --> Result
  Standard["Standard tier<br/>Domain agent + Leo (Sonnet 4.5)"] --> Result
  Deep["Deep tier<br/>Domain agent + Leo (Opus)"] --> Result
  Result{Both approve?}
  Result -->|yes| MergeOK[Merge]
  Result -->|no| Reject[Structured rejection<br/>+ re-extract guidance]
```
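
The routing in the diagram can be approximated in a few lines. This is a sketch under assumptions: `classify`, `reviewers_for`, and `decide` are hypothetical names, `model_pick` stands in for whatever tier Haiku returns, and the example paths are illustrative.

```python
from enum import Enum
from pathlib import PurePosixPath


class Tier(Enum):
    LIGHT = "light"
    STANDARD = "standard"
    DEEP = "deep"


DEEP_ROOTS = {"core", "foundations"}  # top-level dirs that always go Deep


def classify(changed_paths, challenged=False, model_pick=None):
    """Apply deterministic rules first, then fall back to the model's pick."""
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        if parts and parts[0] in DEEP_ROOTS:
            return Tier.DEEP
    if challenged:
        return Tier.DEEP
    return model_pick or Tier.STANDARD  # Standard is the default branch


def reviewers_for(tier, domain_agent):
    """Reviewer line-up per tier, as drawn in the flowchart."""
    if tier is Tier.LIGHT:
        return [domain_agent]
    leo = "leo (Opus)" if tier is Tier.DEEP else "leo (Sonnet 4.5)"
    return [domain_agent, leo]


def decide(verdicts):
    """Merge only if every assigned reviewer approved."""
    return "merge" if verdicts and all(verdicts.values()) else "reject"
```

For example, `classify(["foundations/x.md"], model_pick=Tier.LIGHT)` still returns `Tier.DEEP`, because the path rule runs before the model's suggestion is consulted.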

Domain agents bring domain expertise: **Rio** (internet-finance), **Vida**
(health), **Astra** (space-development), **Clay** (entertainment),
**Theseus** (ai-alignment). **Leo** brings cross-domain consistency on
every PR. Disagreement between the two reviewers surfaces in `audit_log`
and is tracked as a quality signal, not silenced.
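
One way to picture reviewer disagreement surfacing in `audit_log` is an append-only table of review events that is queried afterwards. The schema, column names, and helper functions below are assumptions for illustration only; the real table layout is not documented here.

```python
import json
import sqlite3


def open_store(path):
    """Open the state store in WAL mode and ensure the log table exists."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # ignored for in-memory databases
    conn.execute(
        """CREATE TABLE IF NOT EXISTS audit_log (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               pr_id INTEGER NOT NULL,
               event TEXT NOT NULL,
               payload TEXT NOT NULL
           )"""
    )
    return conn


def record_review(conn, pr_id, reviewer, approved):
    """Append one review event; rows are never updated or deleted."""
    payload = json.dumps({"reviewer": reviewer, "approved": approved})
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO audit_log (pr_id, event, payload) VALUES (?, 'review', ?)",
            (pr_id, payload),
        )


def disagreements(conn):
    """Return PR ids whose reviews include both an approval and a rejection."""
    votes = {}
    rows = conn.execute(
        "SELECT pr_id, payload FROM audit_log WHERE event = 'review' ORDER BY id"
    )
    for pr_id, payload in rows:
        votes.setdefault(pr_id, set()).add(json.loads(payload)["approved"])
    return sorted(pr_id for pr_id, seen in votes.items() if seen == {True, False})
```

Because events are only appended, the disagreement rate can be recomputed at any time from the log itself rather than from a mutable counter.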

Model diversity isn't cosmetic: same-family models share ~60% of their
errors (Kim et al., ICML 2025). The pipeline mixes Haiku for triage, Gemini
2.5 Flash for domain review, Sonnet 4.5 for Leo standard, and Opus for Leo
deep.

## Contributor flow

External contributors submit PRs to
[`living-ip/teleo-codex`](https://github.com/living-ip/teleo-codex) on GitHub.
A mirror sync (every 2 minutes) fast-forwards the PR onto Forgejo, where
the pipeline picks it up. From there it's the same flow as agent-authored
PRs — same tiers, same reviewers, same merge rules.

The contributor-facing guide lives in
[`teleo-codex/CONTRIBUTING.md`](https://github.com/living-ip/teleo-codex/blob/main/CONTRIBUTING.md).

## Repository layout

| Directory       | What it does                                              |
|-----------------|-----------------------------------------------------------|
| `lib/`          | Pipeline modules — config, db, extract, evaluate, merge, cascade |
| `diagnostics/`  | Argus monitoring dashboard (4 pages: ops, health, agents, epistemic) |
| `telegram/`     | Telegram bot that answers from the knowledge base         |
| `research/`     | Nightly autonomous research sessions for domain agents    |
| `agent-state/`  | File-backed state for cross-session agent continuity      |
| `deploy/`       | Auto-deploy pipeline (Forgejo → working dirs → systemd)   |
| `systemd/`      | Service definitions for daemon + dashboard + agents       |
| `scripts/`      | Backfills and one-off migrations                          |
| `tests/`        | pytest suite                                              |
| `docs/`         | Architecture specs and operational protocols              |

## Ownership

Code review authority is enforced by [`CODEOWNERS`](./CODEOWNERS) — every
file has one accountable agent. The high-level map:

- **Ship** — pipeline core, telegram, deploy, agent-state, research, systemd
- **Epimetheus** — extraction (intake, entity processing, pre-screening, post-extract validation)
- **Leo** — evaluation (claim review, analytics, attribution)
- **Argus** — health (diagnostics dashboard, alerting, claim index, search)
- **Ganymede** — tests (pytest suite, integration, code review gate)

For active sprint work and per-agent in-flight items, see each agent's
status report in their Pentagon profile.

## Development

```bash
pip install -e ".[dev]"
pytest
```

## Operations

Production deployment runs on a single VPS. The runbook, restart procedures,
secret rotation, and on-call details live in the private
[`teleo-ops`](https://github.com/living-ip/teleo-ops) repo (request access).

## License

[TBD]