# Teleo Infrastructure Deep Dive

## Overview
Teleo runs a knowledge extraction and evaluation pipeline on a single VPS. Six AI domain agents (Rio, Clay, Theseus, Vida, Astra, Leo) continuously extract claims from source material, evaluate them through a multi-stage review process, and merge approved claims into a shared knowledge base.
The system is mid-migration from 7 bash cron scripts (v1) to a single Python async daemon (v2). Pipeline v2 handles validate, evaluate, and merge. Extraction still runs on v1 cron. Ingest (Phase 4) will complete the migration.
```
Source Material → Ingest → Validate → Evaluate → Merge → Knowledge Base
   (cron v1)       (stub)     (v2)       (v2)      (v2)    (git repo)
```

## VPS
- Host: `77.42.65.182` (Hetzner, Debian)
- SSH: `root@77.42.65.182` (key auth)
- Disk: 150 GB, 19 GB used (13%)
- User: `teleo` (pipeline runs as this user)
- Base dir: `/opt/teleo-eval/`
## Directory Layout

```
/opt/teleo-eval/
├── pipeline/ # Pipeline v2 daemon
│ ├── teleo-pipeline.py # Main entry point (4 async stage loops)
│ ├── pipeline.db # SQLite WAL state store (160KB)
│ ├── .venv/ # Python virtualenv (aiohttp)
│ └── lib/
│ ├── config.py # All constants, model assignments, overflow policies
│ ├── db.py # Schema, migrations, connection management
│ ├── validate.py # Tier 0 validation (schema, links, duplicates)
│ ├── evaluate.py # Triage + domain review + Leo review
│ ├── merge.py # Domain-serialized rebase + Forgejo API merge
│ ├── health.py # HTTP health API (localhost:8080)
│ ├── breaker.py # Circuit breaker per stage
│ ├── costs.py # API cost tracking with daily budgets
│ └── log.py # JSON structured logging
├── workspaces/
│ ├── teleo-codex.git/ # Bare repo (49MB) — pipeline's git backend
│ └── main/ # Main branch worktree (for validation checks)
├── mirror/
│ └── teleo-codex.git/ # Separate bare repo for GitHub↔Forgejo sync
├── secrets/
│ ├── forgejo-admin-token # Admin Forgejo API token
│ ├── forgejo-{agent}-token # Per-agent tokens (rio, clay, theseus, vida, astra, leo)
│ ├── github-pat # GitHub mirror push token
│ ├── openrouter-key # OpenRouter API key
│ ├── twitterapi-io-key # X/Twitter API key
│ └── x-bearer-token # X bearer token
├── logs/ # Log files for cron scripts and pipeline
├── *.sh # Legacy cron scripts (being replaced)
└── eval/ # Legacy eval scripts
```

## Services

### Forgejo (Git Forge)
- Runs in: Docker container (`codeberg.org/forgejo/forgejo:9`)
- Port: 3000 (HTTP), 2222 (SSH)
- Public URL: https://git.livingip.xyz
- Repo: `teleo/teleo-codex`
- Purpose: Hosts the knowledge base repo, manages PRs, stores review comments
- Users: Per-agent Forgejo accounts (`rio`, `clay`, `theseus`, `vida`, `astra`, `leo`, `teleo`)
### Pipeline v2 Daemon

- Service: `teleo-pipeline.service` (systemd)
- Commands: `systemctl {start|stop|restart|status} teleo-pipeline`
- Logs: `journalctl -u teleo-pipeline -f`
- Health: `curl localhost:8080/health`
- Shutdown: SIGTERM → 60s drain → force-cancel → kill subprocesses (180s total)
### Active Cron Jobs (teleo user)

| Schedule | Script | Purpose |
|---|---|---|
| `*/3 * * * *` | `extract-cron.sh` | Source extraction (v1, still active) |
| `*/2 * * * *` | `sync-mirror.sh` | Forgejo↔GitHub bidirectional sync |
| `*/2 * * * *` | `fetch-bare.sh` | Fetch latest into bare repo |
| `0 0 * * *` | `pipeline-health-check.sh` | Daily health metrics |
| `0 */2 * * *` | `pipeline-health-check.py` | 2-hourly health report |
### Disabled Cron Jobs (replaced by Pipeline v2)

- `fix-extraction-prs.py` — replaced by `validate.py`
- `eval-dispatcher.sh` — replaced by `evaluate.py`
- `merge-retry.sh` — replaced by `merge.py`
- Research sessions (rio, clay, theseus, vida, astra) — disabled during pipeline migration
### GitHub Mirror

- Repo: `github.com/user/teleo-codex` (public mirror)
- Sync: Bidirectional, Forgejo authoritative on conflict
- Frequency: Every 2 minutes via `sync-mirror.sh`
- Security: GitHub→Forgejo path never auto-processes branches. Only PRs trigger pipeline work.
## Pipeline v2 Architecture

### Stage Loop
Each stage runs as an async task with its own interval, circuit breaker, and shutdown check:
```python
async def stage_loop(name, interval, func, conn, breaker):
    while not shutdown_event.is_set():
        if breaker.allow_request():
            succeeded, failed = await func(conn, max_workers=breaker.max_workers())
            # Record success/failure for breaker
        try:
            # Sleep for the stage interval, but wake immediately on shutdown
            await asyncio.wait_for(shutdown_event.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass
```
| Stage | Interval | Function | Status |
|---|---|---|---|
| Ingest | 60s | `ingest_cycle()` | Stub — Phase 4 |
| Validate | 30s | `validate_cycle()` | Live |
| Evaluate | 30s | `evaluate_cycle()` | Live |
| Merge | 30s | `merge_cycle()` | Live |
### Crash Recovery
On startup, the daemon recovers interrupted state from prior crashes:
- Sources stuck in `extracting` → increment retry counter → `unprocessed` (or `error` if budget exhausted)
- PRs stuck in `merging` → `approved` (re-enter merge queue)
- PRs stuck in `reviewing` → `open` (re-enter eval queue)
- Orphan git worktrees (`/tmp/teleo-extract-*`, `/tmp/teleo-merge-*`) cleaned up
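A minimal sketch of what that startup recovery pass could look like, using the table and column names from the schema section below. The retry budget of 3 and the choice of `transient_retries` as the counter are assumptions; the real logic lives in the daemon's startup path.

```python
import glob
import shutil
import sqlite3

def recover_interrupted_state(conn: sqlite3.Connection) -> None:
    """Roll rows interrupted by a crash back to a re-queueable state."""
    # PRs caught mid-merge or mid-review re-enter their queues.
    conn.execute("UPDATE prs SET status = 'approved' WHERE status = 'merging'")
    conn.execute("UPDATE prs SET status = 'open' WHERE status = 'reviewing'")
    # Sources caught mid-extraction retry until the (assumed) budget of 3 is spent.
    conn.execute("""
        UPDATE sources
           SET transient_retries = transient_retries + 1,
               status = CASE WHEN transient_retries + 1 >= 3
                             THEN 'error' ELSE 'unprocessed' END
         WHERE status = 'extracting'
    """)
    conn.commit()
    # Orphan worktrees from interrupted extract/merge jobs are removed.
    for path in glob.glob("/tmp/teleo-extract-*") + glob.glob("/tmp/teleo-merge-*"):
        shutil.rmtree(path, ignore_errors=True)
```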
## Stage 1: Validate (lib/validate.py)

Runs Tier 0 structural validation on PRs with `status='open'` and `tier0_pass IS NULL`.

### Checks
- Schema validation — YAML frontmatter has required fields (type, domain, description, confidence, source, created)
- Date format — `created` field is valid YYYY-MM-DD
- Title format — Prose proposition, not a label (heuristic: 8+ words, no bare noun phrases)
- Wiki link validity — `[[links]]` resolve to real files in the repo
- Universal quantifier check — Flags claims using "all", "always", "never", "every" without scoping
- Domain-directory match — Claim's `domain` field matches its file path
- Description quality — Description adds info beyond the title (not a substring)
- Near-duplicate detection — Trigram similarity against existing claims
- Proposition heuristic — Title passes the claim test ("This note argues that [title]" works)
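As an illustration of the near-duplicate check, a minimal trigram-similarity sketch. The normalization, Jaccard metric, and 0.75 threshold are assumptions, not necessarily what validate.py does.

```python
def trigrams(text: str) -> set[str]:
    """Character trigrams of a whitespace-normalized, lowercased title."""
    t = " ".join(text.lower().split())
    return {t[i:i + 3] for i in range(len(t) - 2)}

def trigram_similarity(a: str, b: str) -> float:
    """Jaccard similarity over character trigrams, in [0.0, 1.0]."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def is_near_duplicate(title: str, existing: list[str], threshold: float = 0.75) -> bool:
    # Threshold is illustrative; the pipeline's actual cutoff may differ.
    return any(trigram_similarity(title, other) >= threshold for other in existing)
```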
### Output

- Posts Tier 0 validation comment on Forgejo PR (with SHA-based idempotency marker)
- Sets `tier0_pass = 1` (pass) or `tier0_pass = 0` (fail)
- Failing PRs remain `status='open'` but are excluded from the eval queue
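A sketch of how the SHA-based idempotency marker can keep the Tier 0 comment from being double-posted. The marker format and helper name are assumptions; the endpoint follows Forgejo's Gitea-compatible issue comments API.

```python
import aiohttp

MARKER_TEMPLATE = "<!-- TIER0:{sha} -->"  # hypothetical marker format

async def post_tier0_comment_once(session: aiohttp.ClientSession, base: str,
                                  repo: str, pr: int, head_sha: str,
                                  body: str, token: str) -> bool:
    """Post the Tier 0 comment unless one for this head SHA already exists."""
    marker = MARKER_TEMPLATE.format(sha=head_sha)
    headers = {"Authorization": f"token {token}"}
    url = f"{base}/api/v1/repos/{repo}/issues/{pr}/comments"
    async with session.get(url, headers=headers) as resp:
        comments = await resp.json()
    if any(marker in (c.get("body") or "") for c in comments):
        return False  # already posted for this revision
    async with session.post(url, headers=headers,
                            json={"body": body + "\n" + marker}) as resp:
        resp.raise_for_status()
    return True
```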
## Stage 2: Evaluate (lib/evaluate.py)

The core intelligence stage. Domain-first, Leo-last architecture.

### PR Flow
```
PR (open, tier0_pass=1)
│
├─ Triage (Haiku/OpenRouter) → DEEP / STANDARD / LIGHT
│
├─ Domain Review (Sonnet/Claude Max → overflow GPT-4o/OpenRouter)
│ ├─ REJECT → status='open', feedback stored, Leo skipped
│ └─ APPROVE → continue to Leo
│
├─ Leo Review (Opus/Claude Max → overflow: queue only)
│ ├─ REJECT → status='open', feedback stored
│ └─ APPROVE → continue
│
├─ LIGHT tier: Leo skipped, domain-only gate
│
├─ Both approve → formal Forgejo approvals (2 agent tokens) → status='approved'
│
└─ Musings bypass: PRs touching only agents/*/musings/ auto-approve
```

### Model Routing
| Stage | Primary | Overflow | Policy |
|---|---|---|---|
| Triage | Haiku (OpenRouter) | — | Always API |
| Domain review | Sonnet (Claude Max) | GPT-4o (OpenRouter) | overflow |
| Leo review | Opus (Claude Max) | — | queue (never overflow) |
| DEEP cross-family | GPT-4o (OpenRouter) | — | Always API (not yet implemented) |
Claude Max is a subscription — free but rate-limited. When rate-limited, the CLI returns "You've hit your limit" on stdout (not stderr) with exit code 1. The pipeline detects this and applies the overflow policy.
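A minimal sketch of detecting that rate-limit signal, assuming a hypothetical wrapper around the Claude CLI; the exact flags and invocation in evaluate.py may differ.

```python
import asyncio

LIMIT_MARKER = "You've hit your limit"  # appears on stdout, not stderr

async def run_claude(prompt: str, model: str) -> tuple[str | None, bool]:
    """Run the Claude CLI; return (output, rate_limited)."""
    # "-p" (print mode) and "--model" are assumed CLI flags for this sketch.
    proc = await asyncio.create_subprocess_exec(
        "claude", "--model", model, "-p", prompt,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()
    text = stdout.decode(errors="replace")
    if proc.returncode == 1 and LIMIT_MARKER in text:
        return None, True   # caller applies the overflow policy (or queues)
    return text, False
```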
Key design principle: Opus is the scarce resource. Domain review (Sonnet) filters first — high volume, catches most issues. Leo review (Opus) only sees pre-filtered PRs. This maximizes value per scarce Opus call.
### Domain Routing
Domain detection reads diff file paths (domains/, entities/, core/, foundations/) and maps to the responsible agent:
| Domain | Agent |
|---|---|
| internet-finance, mechanisms, living-capital, teleological-economics | Rio |
| entertainment, cultural-dynamics | Clay |
| ai-alignment, living-agents, critical-systems, collective-intelligence | Theseus |
| health | Vida |
| space-development | Astra |
| teleohumanity, grand-strategy | Leo |
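A minimal sketch of the path-to-agent mapping implied by the table above. The helper names and the exact path parsing are assumptions.

```python
DOMAIN_AGENTS = {
    "internet-finance": "rio", "mechanisms": "rio",
    "living-capital": "rio", "teleological-economics": "rio",
    "entertainment": "clay", "cultural-dynamics": "clay",
    "ai-alignment": "theseus", "living-agents": "theseus",
    "critical-systems": "theseus", "collective-intelligence": "theseus",
    "health": "vida",
    "space-development": "astra",
    "teleohumanity": "leo", "grand-strategy": "leo",
}

def detect_domain(diff_paths: list[str]) -> str | None:
    """Return the first known domain found in the diff's file paths."""
    for path in diff_paths:
        parts = path.split("/")
        # Claims live under domains/<domain>/..., entities/, core/, foundations/.
        if len(parts) >= 2 and parts[1] in DOMAIN_AGENTS:
            return parts[1]
    return None

def responsible_agent(domain: str) -> str | None:
    return DOMAIN_AGENTS.get(domain)
```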
### Backoff and Resume
- 10-minute backoff: PRs attempted within the last 10 minutes are skipped (prevents retry storms during rate limits)
- Domain review resume: If domain review completed but Leo review was rate-limited, domain review is skipped on retry (no wasted OpenRouter calls)
- `last_attempt` tracking: Set at the start of `evaluate_pr`, persists through status revert
### Review Attribution
- Domain review comments post from the domain agent's Forgejo account (e.g., Rio posts Rio's review)
- Leo review comments post from Leo's Forgejo account
- Formal approvals come from 2 agent tokens (not the PR author)
### Verdict Parsing
Reviews end with HTML comment tags:
```html
<!-- VERDICT:RIO:APPROVE -->
<!-- VERDICT:LEO:REQUEST_CHANGES -->
<!-- ISSUES: broken_wiki_links, confidence_miscalibration -->
```
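A minimal sketch of parsing these tags; the regexes and return shape are assumptions about how evaluate.py reads them.

```python
import re

VERDICT_RE = re.compile(r"<!--\s*VERDICT:([A-Z]+):([A-Z_]+)\s*-->")
ISSUES_RE = re.compile(r"<!--\s*ISSUES:\s*([^>]*?)\s*-->")

def parse_review(body: str) -> tuple[dict[str, str], list[str]]:
    """Extract {agent: verdict} plus any flagged issue slugs from a review body."""
    verdicts = {agent.lower(): verdict for agent, verdict in VERDICT_RE.findall(body)}
    issues: list[str] = []
    m = ISSUES_RE.search(body)
    if m:
        issues = [s.strip() for s in m.group(1).split(",") if s.strip()]
    return verdicts, issues

# parse_review("<!-- VERDICT:LEO:REQUEST_CHANGES -->\n<!-- ISSUES: broken_wiki_links -->")
# -> ({'leo': 'REQUEST_CHANGES'}, ['broken_wiki_links'])
```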
## Stage 3: Merge (lib/merge.py)

Domain-serialized priority queue with rebase-before-merge.

### Design
- Domain serialization: Same-domain merges are serial (prevents `_map.md` conflicts). Cross-domain merges are parallel.
- Two-layer locking: `asyncio.Lock` per domain (fast path, lost on crash) + `prs.status='merging'` in SQLite (durable, crash recovery)
- NOT EXISTS subquery: SQL defense-in-depth prevents two PRs in the same domain from merging simultaneously (see the sketch below)
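A sketch of an atomic per-domain claim combining UPDATE...RETURNING, the NOT EXISTS guard, and the priority COALESCE. Column names follow the schema section; the exact SQL in merge.py may differ, and RETURNING requires SQLite 3.35+.

```python
import sqlite3

CLAIM_SQL = """
UPDATE prs SET status = 'merging'
 WHERE number = (
   SELECT p.number
     FROM prs p LEFT JOIN sources s ON s.pr_number = p.number
    WHERE p.status = 'approved' AND p.domain = :domain
      AND NOT EXISTS (            -- defense in depth: one merge per domain
        SELECT 1 FROM prs q WHERE q.domain = p.domain AND q.status = 'merging')
    ORDER BY CASE COALESCE(p.priority, s.priority, 'medium')
               WHEN 'critical' THEN 0 WHEN 'high' THEN 1
               WHEN 'medium' THEN 2 WHEN 'low' THEN 3 ELSE 4 END,
             p.number
    LIMIT 1)
RETURNING number, branch
"""

def claim_next_pr(conn: sqlite3.Connection, domain: str):
    """Atomically mark the highest-priority approved PR in a domain as merging."""
    row = conn.execute(CLAIM_SQL, {"domain": domain}).fetchone()
    conn.commit()
    return row  # None if nothing is ready to merge in this domain
```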
### Merge Flow

```
1. Discover external PRs (pagination over Forgejo API)
   - Detect origin: pipeline vs human (by author login)
   - Human PRs: priority='high', ack comment posted
2. For each domain with approved PRs:
   a. Claim next PR (atomic UPDATE...RETURNING with priority queue)
   b. Create git worktree at /tmp/teleo-merge-{branch}
   c. Capture expected SHA (pin for force-with-lease)
   d. Fetch origin/main, check if rebase needed
   e. Rebase onto main (abort on conflict → status='conflict')
   f. Force-push with --force-with-lease={branch}:{expected_sha}
   g. Merge via Forgejo API
   h. Delete remote branch
   i. Cleanup worktree
```
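A sketch of steps c through f: pin the expected SHA, rebase onto main, and force-push with a lease so a branch updated behind the pipeline's back is never clobbered. The helper names are assumptions; the timeouts follow the Timeouts section below.

```python
import subprocess

def rebase_and_push(worktree: str, branch: str, expected_sha: str) -> bool:
    """Rebase the PR branch onto origin/main and force-push with a lease."""
    def git(*args: str, timeout: int) -> subprocess.CompletedProcess:
        return subprocess.run(["git", "-C", worktree, *args],
                              capture_output=True, text=True, timeout=timeout)

    git("fetch", "origin", "main", timeout=30)
    rebase = git("rebase", "origin/main", timeout=120)   # 2-minute rebase timeout
    if rebase.returncode != 0:
        git("rebase", "--abort", timeout=30)             # conflict → status='conflict'
        return False
    # Only push if the remote branch still points at the SHA captured earlier.
    push = git("push", "origin", f"HEAD:{branch}",
               f"--force-with-lease={branch}:{expected_sha}", timeout=30)
    return push.returncode == 0
```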
### Priority Queue

```sql
COALESCE(p.priority, s.priority, 'medium')
-- PR-level priority > source-level priority > default 'medium'
-- NULL falls to ELSE 4 (intentionally below explicit medium)
```
| Priority | Value | Use |
|---|---|---|
| critical | 0 | Reserved for explicit human override |
| high | 1 | Human-submitted PRs |
| medium | 2 | Standard pipeline PRs |
| low | 3 | Explicitly deprioritized |
| NULL | 4 | Unclassified (below medium) |
### Timeouts

- Merge timeout: 5 minutes per PR. Exceeding → `status='conflict'`
- Rebase timeout: 2 minutes
- Push timeout: 30 seconds
- API merge failure: Sets `status='conflict'` (not `approved` — prevents infinite retry)
## Database Schema

SQLite WAL mode. Schema version 2.

### Tables

**sources** — Source material pipeline
`path` (PK), `status`, `priority`, `extraction_model`, `claims_count`, `pr_number`, `transient_retries`, `substantive_retries`, `last_error`, `feedback`

**prs** — Pull request lifecycle
`number` (PK), `source_path`, `branch`, `status`, `domain`, `tier`, `tier0_pass`, `leo_verdict`, `domain_verdict`, `domain_agent`, `domain_model`, `priority`, `origin` (pipeline/human), `last_attempt`

**costs** — API spend tracking
`(date, model, stage)` (composite PK), `calls`, `input_tokens`, `output_tokens`, `cost_usd`

**circuit_breakers** — Per-stage health
`name` (PK), `state` (closed/open/halfopen), `failures`, `successes`, `last_success_at`

**audit_log** — Event log
`id`, `timestamp`, `stage`, `event`, `detail` (JSON)
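A condensed DDL sketch of the `prs` table reconstructed from the column list above; types, defaults, and constraints are assumptions, since the authoritative schema lives in lib/db.py.

```python
PRS_DDL = """
CREATE TABLE IF NOT EXISTS prs (
  number          INTEGER PRIMARY KEY,
  source_path     TEXT,
  branch          TEXT,
  status          TEXT NOT NULL DEFAULT 'open',
  domain          TEXT,
  tier            TEXT,     -- DEEP / STANDARD / LIGHT
  tier0_pass      INTEGER,  -- NULL = not yet validated, 1 = pass, 0 = fail
  leo_verdict     TEXT,
  domain_verdict  TEXT,
  domain_agent    TEXT,
  domain_model    TEXT,
  priority        TEXT,     -- critical / high / medium / low / NULL
  origin          TEXT,     -- 'pipeline' or 'human'
  last_attempt    TEXT      -- timestamp of the last eval attempt
);
"""
```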
### PR Status Lifecycle

```
open → validating → open (tier0_pass set)
     → reviewing → approved → merging → merged
                 → open (rejected, feedback stored)
                 → conflict (rebase/merge failed)
                 → zombie (stuck, manual intervention)
```
## Health API

`GET localhost:8080/health` returns:
```json
{
"status": "healthy|degraded|stalled",
"breakers": {
"ingest": {"state": "closed", "failures": 0},
"validate": {"state": "closed", "failures": 0, "last_success_age_s": 30, "stalled": false},
"evaluate": {"state": "closed", "failures": 0, "last_success_age_s": 45, "stalled": false},
"merge": {"state": "closed", "failures": 0}
},
"sources": {"unprocessed": 10, "extracting": 2},
"prs": {"open": 117, "approved": 5, "merging": 1},
"merge_queue_by_domain": {"internet-finance": 3, "health": 2},
"budget": {"ok": true, "spend": 1.23, "budget": 20.0, "pct": 6.2},
"metabolic": {
"null_result_rate_24h": 0.05,
"domain_approval_rate_24h": 0.96,
"leo_approval_rate_24h": 0.85
}
}
```
Stall detection: if `now() - last_success_at > 2 * interval`, the stage is stalled.
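A minimal sketch of that check, assuming `last_success_at` is a Unix timestamp; the health module may store it differently.

```python
import time

def is_stalled(last_success_at: float | None, interval_s: float) -> bool:
    """A stage is stalled if it hasn't succeeded within two full intervals."""
    if last_success_at is None:
        return False  # never succeeded yet; breaker state covers startup
    return (time.time() - last_success_at) > 2 * interval_s

# Example: a validate stage (30s interval) whose last success was 75s ago is stalled.
# is_stalled(time.time() - 75, 30)  -> True
```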
## Circuit Breakers
Each stage has an independent circuit breaker:
- Closed (normal): All requests pass
- Open (tripped): Requests blocked for `BREAKER_COOLDOWN` (15 min)
- Half-open: One test request allowed; success → closed, failure → open
Triggers: 5 consecutive failures trip the breaker. Worker count reduces under pressure.
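A minimal sketch of the per-stage breaker described above (5 consecutive failures trip it, 15-minute cooldown, one half-open probe). lib/breaker.py also scales worker counts under pressure, which is omitted here.

```python
import time

class CircuitBreaker:
    """closed → (5 consecutive failures) → open → (15 min) → halfopen → closed/open."""

    def __init__(self, failure_threshold: int = 5, cooldown_s: float = 15 * 60):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state == "open":
            if time.time() - self.opened_at >= self.cooldown_s:
                self.state = "halfopen"   # one probe request allowed
                return True
            return False
        return True  # closed or halfopen

    def record_success(self) -> None:
        self.failures = 0
        self.state = "closed"

    def record_failure(self) -> None:
        self.failures += 1
        if self.state == "halfopen" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = time.time()
            self.failures = 0
```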
## Cost Management
- Daily budget: $20 USD (OpenRouter)
- Warning threshold: 80% of budget
- Claude Max: Free (tracked for volume, cost = $0)
- Budget check: Health API reports spend; the pipeline can pause extraction when the budget is exhausted
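A minimal sketch of the budget check against the `costs` table (80% of $20 = $16 warning line); the function name and return shape are assumptions.

```python
import sqlite3

DAILY_BUDGET_USD = 20.0
WARN_FRACTION = 0.8  # warn at $16 of the $20 daily budget

def budget_status(conn: sqlite3.Connection) -> dict:
    """Summarize today's OpenRouter spend against the daily budget."""
    row = conn.execute(
        "SELECT COALESCE(SUM(cost_usd), 0) FROM costs WHERE date = date('now')"
    ).fetchone()
    spend = row[0]
    return {
        "ok": spend < DAILY_BUDGET_USD,
        "warning": spend >= WARN_FRACTION * DAILY_BUDGET_USD,
        "spend": round(spend, 2),
        "budget": DAILY_BUDGET_USD,
        "pct": round(100 * spend / DAILY_BUDGET_USD, 1),
    }
```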
## Known Issues and Deferred Work

### Active Issues

- PR #702 in `conflict`: Archive-only PR, Forgejo returned 500 on the merge API. Likely needs manual merge or close.
- 36 PRs failed Tier 0: Will not enter eval. Need either re-extraction or closure.
- Domain-rejected PR limbo (Ganymede warning #4): PRs rejected by domain review have `status='open'` but exit the eval queue. No path to re-extraction or closure. Needs a `domain_rejected` status or an auto-close mechanism.
- DEEP cross-family review not implemented (Ganymede warning #5): Docstring promises GPT-4o adversarial review for DEEP PRs after both domain and Leo approve. Not in code.
- Sonnet leniency tracking: 96% domain approval rate. Need to measure the Opus disagreement rate when it comes online (Mar 13, 5pm UTC). If Opus rejects >15% of domain-approved PRs, the domain prompt needs tightening.
### Deferred Nits

- `entity_diff` from `_filter_diff()` is returned but unused
- Formal approvals use hardcoded agent order instead of actual reviewers
- `aiohttp.ClientSession` created per API call (should be one per cycle)
## Phase 4: Ingest Module (lib/ingest.py)

Not yet built. Will port `extract-cron.sh` + `extract-worker.sh`. When complete, the remaining v1 cron scripts can be disabled.

## Phase 5: Integration + Cutover

Full pipeline test with all 4 stages. Disable remaining cron scripts. Re-enable research sessions.

## Operational Runbook
### Check pipeline health

```bash
ssh root@77.42.65.182 'curl -s localhost:8080/health | python3 -m json.tool'
```

### View logs

```bash
ssh root@77.42.65.182 'journalctl -u teleo-pipeline -f'               # live
ssh root@77.42.65.182 'journalctl -u teleo-pipeline -n 50'            # recent
ssh root@77.42.65.182 'journalctl -u teleo-pipeline --since "1 hour ago"'
```

### Restart pipeline

```bash
ssh root@77.42.65.182 'systemctl restart teleo-pipeline'
```

### Query database

```bash
ssh root@77.42.65.182 'sqlite3 /opt/teleo-eval/pipeline/pipeline.db "SELECT status, count(*) FROM prs GROUP BY status"'
```

### Deploy code changes

```bash
scp lib/evaluate.py root@77.42.65.182:/opt/teleo-eval/pipeline/lib/evaluate.py
ssh root@77.42.65.182 'chown teleo:teleo /opt/teleo-eval/pipeline/lib/evaluate.py && systemctl restart teleo-pipeline'
```

### Reset a stuck PR

```bash
ssh root@77.42.65.182 'sqlite3 /opt/teleo-eval/pipeline/pipeline.db "UPDATE prs SET status = \"open\", leo_verdict = \"pending\", domain_verdict = \"pending\" WHERE number = 702"'
```

### Check circuit breakers

```bash
ssh root@77.42.65.182 'sqlite3 /opt/teleo-eval/pipeline/pipeline.db "SELECT * FROM circuit_breakers"'
```

### View cost breakdown

```bash
ssh root@77.42.65.182 'sqlite3 /opt/teleo-eval/pipeline/pipeline.db "SELECT model, stage, calls, cost_usd FROM costs WHERE date = date(\"now\") ORDER BY cost_usd DESC"'
```