- teleo-pipeline.py: async daemon with 4 stage loops (ingest/validate/evaluate/merge) - lib/: config, db, evaluate, validate, merge, breaker, costs, health, log modules - INFRASTRUCTURE.md: comprehensive deep-dive for onboarding - teleo-pipeline.service: systemd unit file Pentagon-Agent: Leo <294C3CA1-0205-4668-82FA-B984D54F48AD>
447 lines
17 KiB
Markdown
# Teleo Infrastructure Deep Dive

## Overview

Teleo runs a **knowledge extraction and evaluation pipeline** on a single VPS. Six AI domain agents (Rio, Clay, Theseus, Vida, Astra, Leo) continuously extract claims from source material, evaluate them through a multi-stage review process, and merge approved claims into a shared knowledge base.

The system is mid-migration from **7 bash cron scripts** (v1) to a **single Python async daemon** (v2). Pipeline v2 handles validate, evaluate, and merge. Extraction still runs on v1 cron. Ingest (Phase 4) will complete the migration.

```
Source Material → Ingest → Validate → Evaluate → Merge → Knowledge Base
   (cron v1)      (stub)     (v2)       (v2)     (v2)      (git repo)
```

---

## VPS

- **Host**: `77.42.65.182` (Hetzner, Debian)
- **SSH**: `root@77.42.65.182` (key auth)
- **Disk**: 150GB, 19GB used (13%)
- **User**: `teleo` (pipeline runs as this user)
- **Base dir**: `/opt/teleo-eval/`

### Directory Layout

```
/opt/teleo-eval/
├── pipeline/                    # Pipeline v2 daemon
│   ├── teleo-pipeline.py        # Main entry point (4 async stage loops)
│   ├── pipeline.db              # SQLite WAL state store (160KB)
│   ├── .venv/                   # Python virtualenv (aiohttp)
│   └── lib/
│       ├── config.py            # All constants, model assignments, overflow policies
│       ├── db.py                # Schema, migrations, connection management
│       ├── validate.py          # Tier 0 validation (schema, links, duplicates)
│       ├── evaluate.py          # Triage + domain review + Leo review
│       ├── merge.py             # Domain-serialized rebase + Forgejo API merge
│       ├── health.py            # HTTP health API (localhost:8080)
│       ├── breaker.py           # Circuit breaker per stage
│       ├── costs.py             # API cost tracking with daily budgets
│       └── log.py               # JSON structured logging
├── workspaces/
│   ├── teleo-codex.git/         # Bare repo (49MB) — pipeline's git backend
│   └── main/                    # Main branch worktree (for validation checks)
├── mirror/
│   └── teleo-codex.git/         # Separate bare repo for GitHub↔Forgejo sync
├── secrets/
│   ├── forgejo-admin-token      # Admin Forgejo API token
│   ├── forgejo-{agent}-token    # Per-agent tokens (rio, clay, theseus, vida, astra, leo)
│   ├── github-pat               # GitHub mirror push token
│   ├── openrouter-key           # OpenRouter API key
│   ├── twitterapi-io-key        # X/Twitter API key
│   └── x-bearer-token           # X bearer token
├── logs/                        # Log files for cron scripts and pipeline
├── *.sh                         # Legacy cron scripts (being replaced)
└── eval/                        # Legacy eval scripts
```

---

## Services

### Forgejo (Git Forge)

- **Runs in**: Docker container (`codeberg.org/forgejo/forgejo:9`)
- **Port**: 3000 (HTTP), 2222 (SSH)
- **Public URL**: `https://git.livingip.xyz`
- **Repo**: `teleo/teleo-codex`
- **Purpose**: Hosts the knowledge base repo, manages PRs, stores review comments
- **Users**: Per-agent Forgejo accounts (`rio`, `clay`, `theseus`, `vida`, `astra`, `leo`, `teleo`)

### Pipeline v2 Daemon

- **Service**: `teleo-pipeline.service` (systemd)
- **Commands**: `systemctl {start|stop|restart|status} teleo-pipeline`
- **Logs**: `journalctl -u teleo-pipeline -f`
- **Health**: `curl localhost:8080/health`
- **Shutdown**: SIGTERM → 60s drain → force-cancel → kill subprocesses (180s total)

### Active Cron Jobs (teleo user)

| Schedule | Script | Purpose |
|----------|--------|---------|
| `*/3 * * * *` | `extract-cron.sh` | Source extraction (v1, still active) |
| `*/2 * * * *` | `sync-mirror.sh` | Forgejo↔GitHub bidirectional sync |
| `*/2 * * * *` | `fetch-bare.sh` | Fetch latest into bare repo |
| `0 0 * * *` | `pipeline-health-check.sh` | Daily health metrics |
| `0 */2 * * *` | `pipeline-health-check.py` | 2-hourly health report |

### Disabled Cron Jobs (replaced by Pipeline v2)

- `fix-extraction-prs.py` — replaced by `validate.py`
- `eval-dispatcher.sh` — replaced by `evaluate.py`
- `merge-retry.sh` — replaced by `merge.py`
- Research sessions (rio, clay, theseus, vida, astra) — disabled during pipeline migration

### GitHub Mirror

- **Repo**: `github.com/user/teleo-codex` (public mirror)
- **Sync**: Bidirectional, Forgejo authoritative on conflict
- **Frequency**: Every 2 minutes via `sync-mirror.sh`
- **Security**: GitHub→Forgejo path never auto-processes branches. Only PRs trigger pipeline work.

---

## Pipeline v2 Architecture

### Stage Loop

Each stage runs as an async task with its own interval, circuit breaker, and shutdown check:

```python
import asyncio

shutdown_event = asyncio.Event()

async def stage_loop(name, interval, func, conn, breaker):
    while not shutdown_event.is_set():
        if breaker.allow_request():
            succeeded, failed = await func(conn, max_workers=breaker.max_workers())
            # Record success/failure for breaker
        try:
            # Sleep for one interval, waking early if shutdown is requested
            await asyncio.wait_for(shutdown_event.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass  # Interval elapsed without shutdown; run the next cycle
```

| Stage | Interval | Function | Status |
|-------|----------|----------|--------|
| Ingest | 60s | `ingest_cycle()` | **Stub** — Phase 4 |
| Validate | 30s | `validate_cycle()` | **Live** |
| Evaluate | 30s | `evaluate_cycle()` | **Live** |
| Merge | 30s | `merge_cycle()` | **Live** |

### Crash Recovery

On startup, the daemon recovers interrupted state from prior crashes:

1. Sources stuck in `extracting` → increment retry counter → `unprocessed` (or `error` if budget exhausted)
2. PRs stuck in `merging` → `approved` (re-enter merge queue)
3. PRs stuck in `reviewing` → `open` (re-enter eval queue)
4. Orphan git worktrees (`/tmp/teleo-extract-*`, `/tmp/teleo-merge-*`) cleaned up
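
The first three recovery steps can be sketched with plain `sqlite3`. This is a minimal illustration and not the daemon's actual code: the function name `recover_interrupted_state`, the `MAX_TRANSIENT_RETRIES` budget, and the choice of `transient_retries` as the retry counter are assumptions. Table and status names follow the schema described in this document. Worktree cleanup (step 4) is left out.

```python
import sqlite3

MAX_TRANSIENT_RETRIES = 3  # hypothetical retry budget; the real value is not given here


def recover_interrupted_state(conn: sqlite3.Connection) -> None:
    """Reset rows left mid-flight by a crash so they re-enter their queues."""
    with conn:  # single transaction: all resets commit together or not at all
        # 1. Sources stuck in 'extracting': count the retry first...
        conn.execute(
            "UPDATE sources SET transient_retries = transient_retries + 1 "
            "WHERE status = 'extracting'"
        )
        # ...then requeue them, or mark them failed once the budget is spent.
        conn.execute(
            "UPDATE sources SET status = CASE "
            "WHEN transient_retries > ? THEN 'error' ELSE 'unprocessed' END "
            "WHERE status = 'extracting'",
            (MAX_TRANSIENT_RETRIES,),
        )
        # 2. PRs stuck in 'merging' return to the merge queue.
        conn.execute("UPDATE prs SET status = 'approved' WHERE status = 'merging'")
        # 3. PRs stuck in 'reviewing' return to the eval queue.
        conn.execute("UPDATE prs SET status = 'open' WHERE status = 'reviewing'")
```

Running all resets in one transaction means a crash during recovery leaves the database in its pre-recovery state, so the next startup can simply retry.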

---

## Stage 1: Validate (`lib/validate.py`)

Runs Tier 0 structural validation on PRs with `status='open'` and `tier0_pass IS NULL`.

### Checks

1. **Schema validation** — YAML frontmatter has required fields (type, domain, description, confidence, source, created)
2. **Date format** — `created` field is valid YYYY-MM-DD
3. **Title format** — Prose proposition, not a label (heuristic: 8+ words, no bare noun phrases)
4. **Wiki link validity** — `[[links]]` resolve to real files in the repo
5. **Universal quantifier check** — Flags claims using "all", "always", "never", "every" without scoping
6. **Domain-directory match** — Claim's `domain` field matches its file path
7. **Description quality** — Description adds info beyond the title (not a substring)
8. **Near-duplicate detection** — Trigram similarity against existing claims
9. **Proposition heuristic** — Title passes the claim test ("This note argues that [title]" works)
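
Check 8 can be illustrated with a small trigram comparison. The document only says "trigram similarity", so the details here are assumptions: character trigrams, Jaccard overlap as the metric, and a hypothetical `DUPLICATE_THRESHOLD`. None of these names come from `lib/validate.py`.

```python
def trigrams(text: str) -> set[str]:
    """Character trigrams of the lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + 3] for i in range(len(normalized) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two trigram sets, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0  # strings shorter than three characters have no trigrams
    return len(ta & tb) / len(ta | tb)


DUPLICATE_THRESHOLD = 0.8  # hypothetical cutoff, not taken from the pipeline config


def find_near_duplicates(title: str, existing_titles: list[str]) -> list[str]:
    """Return existing titles whose similarity to `title` reaches the threshold."""
    return [
        other for other in existing_titles
        if trigram_similarity(title, other) >= DUPLICATE_THRESHOLD
    ]
```

Normalizing case and whitespace before comparing keeps trivially reformatted titles from slipping past the check.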

### Output

- Posts Tier 0 validation comment on Forgejo PR (with SHA-based idempotency marker)
- Sets `tier0_pass = 1` (pass) or `tier0_pass = 0` (fail)
- Failing PRs remain `status='open'` but are excluded from eval queue
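
One way a SHA-based idempotency marker could work is to embed the PR head commit in a hidden HTML comment and skip posting when a comment for that commit already exists. The marker format `<!-- TIER0:<sha> -->` and both function names below are hypothetical; the actual marker used by `validate.py` is not shown in this document.

```python
import re

# Hypothetical marker format: a hidden HTML comment carrying the validated commit SHA.
MARKER_RE = re.compile(r"<!-- TIER0:([0-9a-f]{7,40}) -->")


def make_marker(head_sha: str) -> str:
    """Build the marker to append to a Tier 0 validation comment."""
    return f"<!-- TIER0:{head_sha} -->"


def already_validated(head_sha: str, comment_bodies: list[str]) -> bool:
    """True if some existing comment carries a marker for this exact commit."""
    return any(
        match.group(1) == head_sha
        for body in comment_bodies
        for match in MARKER_RE.finditer(body)
    )
```

Keying the marker on the commit SHA means a new push to the branch is validated again, while a re-run against the same commit does not post a duplicate comment.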

---

## Stage 2: Evaluate (`lib/evaluate.py`)

The core intelligence stage. Domain-first, Leo-last architecture.

### PR Flow

```
PR (open, tier0_pass=1)
  │
  ├─ Triage (Haiku/OpenRouter) → DEEP / STANDARD / LIGHT
  │
  ├─ Domain Review (Sonnet/Claude Max → overflow GPT-4o/OpenRouter)
  │   ├─ REJECT → status='open', feedback stored, Leo skipped
  │   └─ APPROVE → continue to Leo
  │
  ├─ Leo Review (Opus/Claude Max → overflow: queue only)
  │   ├─ REJECT → status='open', feedback stored
  │   └─ APPROVE → continue
  │
  ├─ LIGHT tier: Leo skipped, domain-only gate
  │
  ├─ Both approve → formal Forgejo approvals (2 agent tokens) → status='approved'
  │
  └─ Musings bypass: PRs touching only agents/*/musings/ auto-approve
```

### Model Routing

| Stage | Primary | Overflow | Policy |
|-------|---------|----------|--------|
| Triage | Haiku (OpenRouter) | — | Always API |
| Domain review | Sonnet (Claude Max) | GPT-4o (OpenRouter) | `overflow` |
| Leo review | Opus (Claude Max) | — | `queue` (never overflow) |
| DEEP cross-family | GPT-4o (OpenRouter) | — | Always API (not yet implemented) |

**Claude Max** is a subscription — free but rate-limited. When rate-limited, the CLI returns `"You've hit your limit"` on **stdout** (not stderr) with exit code 1. The pipeline detects this and applies the overflow policy.
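
The detection and policy step described above might look like the following sketch. The function names and the returned action strings are illustrative assumptions; only the message text, the exit code, the use of stdout, and the `overflow` and `queue` policy names come from this document.

```python
RATE_LIMIT_TEXT = "You've hit your limit"


def is_rate_limited(returncode: int, stdout: str) -> bool:
    """Rate limiting appears as exit code 1 with the limit message on stdout."""
    return returncode == 1 and RATE_LIMIT_TEXT in stdout


def next_action(policy: str, returncode: int, stdout: str) -> str:
    """Map a CLI result to 'done', 'overflow', 'queue', or 'error'."""
    if returncode == 0:
        return "done"
    if is_rate_limited(returncode, stdout):
        # 'overflow' retries on the OpenRouter model; 'queue' waits for Claude Max.
        return "overflow" if policy == "overflow" else "queue"
    return "error"  # any other non-zero exit is treated as a real failure
```

Checking stdout rather than stderr matters here: a handler that only inspects stderr would classify a rate limit as a generic failure and could trip the circuit breaker unnecessarily.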

**Key design principle**: Opus is the scarce resource. Domain review (Sonnet) filters first — high volume, catches most issues. Leo review (Opus) only sees pre-filtered PRs. This maximizes value per scarce Opus call.

### Domain Routing

Domain detection reads diff file paths (`domains/`, `entities/`, `core/`, `foundations/`) and maps to the responsible agent:

| Domain | Agent |
|--------|-------|
| internet-finance, mechanisms, living-capital, teleological-economics | Rio |
| entertainment, cultural-dynamics | Clay |
| ai-alignment, living-agents, critical-systems, collective-intelligence | Theseus |
| health | Vida |
| space-development | Astra |
| teleohumanity, grand-strategy | Leo |

### Backoff and Resume

- **10-minute backoff**: PRs attempted within the last 10 minutes are skipped (prevents retry storms during rate limits)
- **Domain review resume**: If domain review completed but Leo review was rate-limited, domain review is skipped on retry (no wasted OpenRouter calls)
- **`last_attempt` tracking**: Set at the start of `evaluate_pr`, persists through status revert

### Review Attribution

- Domain review comments post from the domain agent's Forgejo account (e.g., Rio posts Rio's review)
- Leo review comments post from Leo's Forgejo account
- Formal approvals come from 2 agent tokens (not the PR author)

### Verdict Parsing

Reviews end with HTML comment tags:
```
<!-- VERDICT:RIO:APPROVE -->
<!-- VERDICT:LEO:REQUEST_CHANGES -->
<!-- ISSUES: broken_wiki_links, confidence_miscalibration -->
```

---

## Stage 3: Merge (`lib/merge.py`)

Domain-serialized priority queue with rebase-before-merge.

### Design

- **Domain serialization**: Same-domain merges are serial (prevents `_map.md` conflicts). Cross-domain merges are parallel.
- **Two-layer locking**: `asyncio.Lock` per domain (fast path, lost on crash) + `prs.status='merging'` in SQLite (durable, crash recovery)
- **NOT EXISTS subquery**: SQL defense-in-depth prevents two PRs in the same domain from merging simultaneously
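
The durable half of the locking scheme can be sketched as a single guarded `UPDATE`. This is an illustration only: the SQL text and the `claim_next_pr` name are assumptions, ordering by PR number stands in for the real priority ordering, and the sketch returns a boolean instead of using `RETURNING` so that it also runs on older SQLite builds.

```python
import sqlite3

# Claim at most one approved PR, and only if no PR in that domain is already merging.
CLAIM_SQL = """
UPDATE prs SET status = 'merging'
WHERE number = (
    SELECT p.number FROM prs p
    WHERE p.status = 'approved' AND p.domain = ?
      AND NOT EXISTS (
          SELECT 1 FROM prs m
          WHERE m.domain = p.domain AND m.status = 'merging'
      )
    ORDER BY p.number
    LIMIT 1
)
"""


def claim_next_pr(conn: sqlite3.Connection, domain: str) -> bool:
    """Try to claim one approved PR for the domain; True if a row was claimed."""
    with conn:  # commit the claim atomically
        cursor = conn.execute(CLAIM_SQL, (domain,))
    return cursor.rowcount == 1
```

Because the guard lives in the statement itself, a second worker that somehow bypasses the in-memory `asyncio.Lock` still cannot claim a PR in a domain that already has one in `merging`.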

### Merge Flow

```
1. Discover external PRs (pagination over Forgejo API)
   - Detect origin: pipeline vs human (by author login)
   - Human PRs: priority='high', ack comment posted

2. For each domain with approved PRs:
   a. Claim next PR (atomic UPDATE...RETURNING with priority queue)
   b. Create git worktree at /tmp/teleo-merge-{branch}
   c. Capture expected SHA (pin for force-with-lease)
   d. Fetch origin/main, check if rebase needed
   e. Rebase onto main (abort on conflict → status='conflict')
   f. Force-push with --force-with-lease={branch}:{expected_sha}
   g. Merge via Forgejo API
   h. Delete remote branch
   i. Cleanup worktree
```

### Priority Queue

```sql
COALESCE(p.priority, s.priority, 'medium')
-- PR-level priority > source-level priority > default 'medium'
-- NULL falls to ELSE 4 (intentionally below explicit medium)
```

| Priority | Value | Use |
|----------|-------|-----|
| critical | 0 | Reserved for explicit human override |
| high | 1 | Human-submitted PRs |
| medium | 2 | Standard pipeline PRs |
| low | 3 | Explicitly deprioritized |
| NULL | 4 | Unclassified (below medium) |
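
The priority table maps onto a `CASE` expression in the `ORDER BY` clause. The query below is a sketch, not the query in `lib/merge.py`. It makes one interpretive choice worth flagging: the `'medium'` default is left out of `COALESCE` so that an unclassified PR actually reaches `ELSE 4`, as the NULL row of the table describes. The join on `source_path` and the tie-break on PR number are also assumptions.

```python
import sqlite3

PRIORITY_ORDER_SQL = """
SELECT p.number
FROM prs p
LEFT JOIN sources s ON s.path = p.source_path
WHERE p.status = 'approved'
ORDER BY CASE COALESCE(p.priority, s.priority)  -- PR-level beats source-level
             WHEN 'critical' THEN 0
             WHEN 'high'     THEN 1
             WHEN 'medium'   THEN 2
             WHEN 'low'      THEN 3
             ELSE 4                             -- NULL: unclassified
         END,
         p.number                               -- oldest PR first within a level
"""


def merge_order(conn: sqlite3.Connection) -> list[int]:
    """Return approved PR numbers in the order they would be claimed."""
    return [row[0] for row in conn.execute(PRIORITY_ORDER_SQL)]
```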

### Timeouts

- **Merge timeout**: 5 minutes per PR. Exceeding → `status='conflict'`
- **Rebase timeout**: 2 minutes
- **Push timeout**: 30 seconds
- **API merge failure**: Sets `status='conflict'` (not `approved` — prevents infinite retry)

---

## Database Schema

SQLite WAL mode. Schema version 2.

### Tables

**`sources`** — Source material pipeline
- `path` (PK), `status`, `priority`, `extraction_model`, `claims_count`, `pr_number`
- `transient_retries`, `substantive_retries`, `last_error`, `feedback`

**`prs`** — Pull request lifecycle
- `number` (PK), `source_path`, `branch`, `status`, `domain`, `tier`
- `tier0_pass`, `leo_verdict`, `domain_verdict`, `domain_agent`, `domain_model`
- `priority`, `origin` (pipeline/human), `last_attempt`

**`costs`** — API spend tracking
- `(date, model, stage)` (composite PK), `calls`, `input_tokens`, `output_tokens`, `cost_usd`

**`circuit_breakers`** — Per-stage health
- `name` (PK), `state` (closed/open/halfopen), `failures`, `successes`, `last_success_at`

**`audit_log`** — Event log
- `id`, `timestamp`, `stage`, `event`, `detail` (JSON)

### PR Status Lifecycle

```
open → validating → open (tier0_pass set)
  → reviewing → approved → merging → merged
  → open (rejected, feedback stored)
  → conflict (rebase/merge failed)
  → zombie (stuck, manual intervention)
```

---

## Health API

`GET localhost:8080/health` returns:

```json
{
  "status": "healthy|degraded|stalled",
  "breakers": {
    "ingest": {"state": "closed", "failures": 0},
    "validate": {"state": "closed", "failures": 0, "last_success_age_s": 30, "stalled": false},
    "evaluate": {"state": "closed", "failures": 0, "last_success_age_s": 45, "stalled": false},
    "merge": {"state": "closed", "failures": 0}
  },
  "sources": {"unprocessed": 10, "extracting": 2},
  "prs": {"open": 117, "approved": 5, "merging": 1},
  "merge_queue_by_domain": {"internet-finance": 3, "health": 2},
  "budget": {"ok": true, "spend": 1.23, "budget": 20.0, "pct": 6.2},
  "metabolic": {
    "null_result_rate_24h": 0.05,
    "domain_approval_rate_24h": 0.96,
    "leo_approval_rate_24h": 0.85
  }
}
```

**Stall detection**: If `now() - last_success_at > 2 * interval`, the stage is stalled.

---

## Circuit Breakers

Each stage has an independent circuit breaker:

- **Closed** (normal): All requests pass
- **Open** (tripped): Requests blocked for `BREAKER_COOLDOWN` (15 min)
- **Half-open**: One test request allowed; success → closed, failure → open

Triggers: 5 consecutive failures trip the breaker. Worker count reduces under pressure.
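
The three states can be modeled in a few lines. The class below is a simplified sketch rather than the contents of `lib/breaker.py`: method names other than `allow_request` are assumptions, state is kept in memory instead of the `circuit_breakers` table, and worker-count reduction is not modeled. The injectable `clock` exists only to make the cooldown testable.

```python
import time


class CircuitBreaker:
    """Closed -> open after N consecutive failures -> half-open after a cooldown."""

    def __init__(self, threshold=5, cooldown_s=15 * 60, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = None

    def allow_request(self) -> bool:
        if self.state == "open":
            if self.clock() - self.opened_at < self.cooldown_s:
                return False  # still cooling down
            self.state = "halfopen"  # cooldown elapsed: let one probe through
        return True

    def record_success(self) -> None:
        self.state = "closed"
        self.failures = 0

    def record_failure(self) -> None:
        self.failures += 1
        # A failed probe reopens immediately; otherwise wait for the threshold.
        if self.state == "halfopen" or self.failures >= self.threshold:
            self.state = "open"
            self.opened_at = self.clock()
```

Since each stage loop runs one cycle at a time, the half-open state admits a single probe cycle in practice: the loop records its result before asking `allow_request()` again.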

---

## Cost Management

- **Daily budget**: $20 USD (OpenRouter)
- **Warning threshold**: 80% of budget
- **Claude Max**: Free (tracked for volume, cost = $0)
- **Budget check**: Health API reports spend, pipeline can pause extraction when budget exhausted
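
A budget check matching the `budget` object in the health response could be computed as follows. The function and constant names are assumptions, and the `warn` field is an illustrative addition that does not appear in the documented response.

```python
DAILY_BUDGET_USD = 20.0
WARN_FRACTION = 0.8  # warning threshold: 80% of the daily budget


def budget_status(spend_usd: float, budget_usd: float = DAILY_BUDGET_USD) -> dict:
    """Summarize today's OpenRouter spend against the daily budget."""
    return {
        "ok": spend_usd < budget_usd,                     # False once exhausted
        "warn": spend_usd >= WARN_FRACTION * budget_usd,  # illustrative extra field
        "spend": spend_usd,
        "budget": budget_usd,
        "pct": round(100 * spend_usd / budget_usd, 1),
    }


# Usage: Claude Max calls cost $0, so only OpenRouter spend is passed in.
status = budget_status(5.0)
```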

---

## Known Issues and Deferred Work

### Active Issues

1. **PR #702 in `conflict`**: Archive-only PR, Forgejo returned 500 on merge API. Likely needs manual merge or close.
2. **36 PRs failed Tier 0**: Will not enter eval. Need either re-extraction or closure.
3. **Domain-rejected PR limbo** (Ganymede warning #4): PRs rejected by domain review have `status='open'` but exit the eval queue. No path to re-extraction or closure. Needs `domain_rejected` status or auto-close mechanism.
4. **DEEP cross-family review not implemented** (Ganymede warning #5): Docstring promises GPT-4o adversarial review for DEEP PRs after both domain and Leo approve. Not in code.
5. **Sonnet leniency tracking**: 96% domain approval rate. Need to measure Opus disagreement rate when it comes online (Mar 13, 5pm UTC). If Opus rejects >15% of domain-approved PRs, domain prompt needs tightening.

### Deferred Nits

- `entity_diff` from `_filter_diff()` is returned but unused
- Formal approvals use hardcoded agent order instead of actual reviewers
- `aiohttp.ClientSession` created per API call (should be one per cycle)

### Phase 4: Ingest Module (`lib/ingest.py`)

Not yet built. Will port `extract-cron.sh` + `extract-worker.sh`. When complete, the remaining v1 cron scripts can be disabled.

### Phase 5: Integration + Cutover

Full pipeline test with all 4 stages. Disable remaining cron scripts. Re-enable research sessions.

---

## Operational Runbook

### Check pipeline health
```bash
ssh root@77.42.65.182 'curl -s localhost:8080/health | python3 -m json.tool'
```

### View logs
```bash
ssh root@77.42.65.182 'journalctl -u teleo-pipeline -f'        # live
ssh root@77.42.65.182 'journalctl -u teleo-pipeline -n 50'     # recent
ssh root@77.42.65.182 'journalctl -u teleo-pipeline --since "1 hour ago"'
```

### Restart pipeline
```bash
ssh root@77.42.65.182 'systemctl restart teleo-pipeline'
```

### Query database
```bash
ssh root@77.42.65.182 'sqlite3 /opt/teleo-eval/pipeline/pipeline.db "SELECT status, count(*) FROM prs GROUP BY status"'
```

### Deploy code changes
```bash
scp lib/evaluate.py root@77.42.65.182:/opt/teleo-eval/pipeline/lib/evaluate.py
ssh root@77.42.65.182 'chown teleo:teleo /opt/teleo-eval/pipeline/lib/evaluate.py && systemctl restart teleo-pipeline'
```

### Reset a stuck PR
```bash
# SQL string literals use single quotes; double-quoted values are parsed as identifiers first
ssh root@77.42.65.182 "sqlite3 /opt/teleo-eval/pipeline/pipeline.db \"UPDATE prs SET status = 'open', leo_verdict = 'pending', domain_verdict = 'pending' WHERE number = 702\""
```

### Check circuit breakers
```bash
ssh root@77.42.65.182 'sqlite3 /opt/teleo-eval/pipeline/pipeline.db "SELECT * FROM circuit_breakers"'
```

### View cost breakdown
```bash
ssh root@77.42.65.182 "sqlite3 /opt/teleo-eval/pipeline/pipeline.db \"SELECT model, stage, calls, cost_usd FROM costs WHERE date = date('now') ORDER BY cost_usd DESC\""
```