|
reweave.py and ingestion run as the operator Forgejo token, so the prior
opener-based classifier set submitted_by=m3taversal for every system
maintenance PR. backfill_submitted_by.py never overrides non-NULL rows,
so this misattribution accumulated: ~2,748 reweave/ingestion PRs and
~3,706 <agent>/ research/entity PRs were credited to the operator on
the leaderboard and contribution_events table.
Two parts:
1. lib/merge.py: at PR discovery, classify by branch prefix first.
     reweave/, ingestion/         -> submitted_by = 'pipeline'
     <agent>/ (per _AGENT_NAMES)  -> submitted_by = '<agent>'
     otherwise human              -> submitted_by = author.lower()
     otherwise pipeline           -> submitted_by = None
                                     (extract.py sets from proposed_by)
Origin flag updated so domain detection and priority still fire for
branch-classified pipeline PRs. Human PRs lowercased to maintain the
canonical-handle contract enforced in PR #9.
2. scripts/reattribute-by-branch-prefix.py: historical cleanup.
   Per affected PR (atomic):
     - UPDATE prs.submitted_by -> target
     - UPDATE sources.submitted_by where source_path matches
     - UPDATE contribution_events handle ('m3taversal', role='author')
       -> target, kind='agent'. Collision (target already has author
       event for PR) deletes the m3ta row; target wins.
Scope is deliberately conservative: extract/ branches stay attributed
to m3taversal because proposed_by-missing legitimately defaults to the
operator (telegram drops). Only reweave/, ingestion/, and <agent>/.
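The prefix rules from part 1 can be sketched as a small pure function. This is a minimal illustration, not the code in lib/merge.py: the function name, the `author_is_human` flag, and the contents of `_AGENT_NAMES` (here assumed to be the six production agents) are assumptions.

```python
from typing import Optional

# Assumption: stand-in for the _AGENT_NAMES set referenced in lib/merge.py.
_AGENT_NAMES = frozenset({"rio", "theseus", "leo", "vida", "astra", "clay"})
_PIPELINE_PREFIXES = ("reweave/", "ingestion/")


def classify_submitted_by(branch: str, author: str, author_is_human: bool) -> Optional[str]:
    """Return the submitted_by value for a newly discovered PR (hypothetical helper)."""
    # System maintenance branches are credited to the pipeline itself.
    if branch.startswith(_PIPELINE_PREFIXES):
        return "pipeline"
    # Agent branches look like "<agent>/..." and are credited to that agent.
    prefix, sep, _rest = branch.partition("/")
    if sep and prefix in _AGENT_NAMES:
        return prefix
    # Human PRs keep the canonical lowercase handle.
    if author_is_human:
        return author.lower()
    # Remaining pipeline PRs stay NULL; extract.py fills them from proposed_by.
    return None
```

Checking the branch prefix before the PR opener is what keeps operator-token PRs from being credited to the operator.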
Dry-run shows 6,454 PRs + 284 events to move. Pre-flight collision
query returns 0; pre-flight kind check confirms m3ta has only role=author
events on this set (no challenger/synthesizer/evaluator).
Idempotent. Dry-run by default. Run with --apply after deploy + DB
snapshot.
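The per-PR cleanup described in part 2 might look roughly like the sketch below. The table layouts, column names (`number`, `pr_number`, `handle`, `role`, `kind`), and function names are assumptions for illustration; the `sources` update is left out to keep the example short.

```python
import sqlite3

OPERATOR = "m3taversal"


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database; the schema here is assumed, not the real one."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE prs (number INTEGER PRIMARY KEY, branch TEXT, submitted_by TEXT);
        CREATE TABLE contribution_events (
            pr_number INTEGER, handle TEXT, role TEXT, kind TEXT
        );
        """
    )
    return conn


def reattribute_pr(conn: sqlite3.Connection, pr_number: int, target: str,
                   apply: bool = False) -> bool:
    """Move one PR's author credit from the operator to `target`.

    Returns True if the target already had an author event (a collision).
    Nothing is written unless apply=True (dry-run by default).
    """
    collision = conn.execute(
        "SELECT 1 FROM contribution_events "
        "WHERE pr_number = ? AND handle = ? AND role = 'author'",
        (pr_number, target),
    ).fetchone() is not None
    if not apply:
        return collision

    with conn:  # one transaction per PR: commit on success, roll back on error
        conn.execute(
            "UPDATE prs SET submitted_by = ? WHERE number = ?", (target, pr_number)
        )
        if collision:
            # Target wins: drop the operator's duplicate author row.
            conn.execute(
                "DELETE FROM contribution_events "
                "WHERE pr_number = ? AND handle = ? AND role = 'author'",
                (pr_number, OPERATOR),
            )
        else:
            conn.execute(
                "UPDATE contribution_events SET handle = ?, kind = 'agent' "
                "WHERE pr_number = ? AND handle = ? AND role = 'author'",
                (target, pr_number, OPERATOR),
            )
    return collision
```

Running it a second time on the same PR matches no operator rows, so the sketch is idempotent in the same sense as the script.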
|
| Path |
|---|
| .forgejo/workflows |
| agent-state |
| deploy |
| diagnostics |
| docs |
| hermes-agent |
| lib |
| ops |
| research |
| scripts |
| systemd |
| telegram |
| tests |
| .gitignore |
| CODEOWNERS |
| fetch_coins.py |
| pyproject.toml |
| README.md |
| reweave.py |
| teleo-pipeline.py |
teleo-infrastructure
This repo runs the pipeline that processes contributions into the teleo-codex knowledge base.
Every claim on main has been extracted from a source, validated for schema
and duplicates, evaluated by at least two independent reviewers, and merged
through an event-sourced audit log. The whole flow is an async Python daemon
talking to a Forgejo git server, an SQLite WAL state store, OpenRouter (for
most LLM calls), and the Anthropic Claude CLI (for Opus deep reviews).
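For readers unfamiliar with SQLite's WAL mode, a state store of this kind could be opened as sketched below. The function name and the busy timeout value are assumptions; the pipeline's actual db module may configure the connection differently.

```python
import sqlite3


def open_state_store(path: str) -> sqlite3.Connection:
    """Open an SQLite file in write-ahead-log mode (hypothetical helper)."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        raise RuntimeError(f"could not enable WAL, journal mode is {mode!r}")
    # Assumed value: wait up to 5 s on a locked database instead of failing at once.
    conn.execute("PRAGMA busy_timeout=5000")
    return conn
```

WAL lets readers proceed while a single writer appends, which suits several stage loops sharing one database file.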
Production state (live):
| Metric | Value |
|---|---|
| Claims merged into main | 1,546 across 13 domains |
| PRs merged through the pipeline | 1,975 |
| Merge throughput (last 7d) | 508 PRs (~73/day) |
| Review approval rate | 94% |
| Cost per merged claim (last 30d) | $0.10 incl. extract + triage + multi-tier review |
| Production agents | 6 (rio, theseus, leo, vida, astra, clay) |
Pipeline
The stages run as concurrent loops in a single daemon (teleo-pipeline.py),
coordinated through SQLite. Circuit breakers cap costs, retry budgets cap
attempts, and merges are serialized per domain to avoid cross-PR conflicts.
    flowchart LR
        Inbox["inbox/queue/"] --> Extract
        Extract["Extract<br/>(Sonnet 4.5)"] --> Validate
        Validate["Validate<br/>(tier 0, $0)"] --> Evaluate
        Evaluate["Evaluate<br/>(tiered, multi-model)"] --> Merge
        Merge["Merge<br/>(Forgejo, domain-serial)"] --> Effects
        Effects["Effects<br/>cascade · backlinks · reciprocal edges"]
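One way to picture domain-serial merging inside an async daemon is a lock per domain, as in the sketch below. The class and method names are hypothetical, and `asyncio.sleep(0)` stands in for the real Forgejo merge call; the actual daemon coordinates through SQLite and may not use in-process locks at all.

```python
import asyncio
from collections import defaultdict


class DomainSerialMerger:
    """Serialize merges within a domain while letting domains run concurrently."""

    def __init__(self) -> None:
        # One lock per domain, created lazily on first use.
        self._locks = defaultdict(asyncio.Lock)
        self.log = []

    async def merge(self, domain: str, pr: int) -> None:
        async with self._locks[domain]:
            self.log.append(("start", domain, pr))
            await asyncio.sleep(0)  # placeholder for the real merge work
            self.log.append(("end", domain, pr))


async def demo() -> list:
    merger = DomainSerialMerger()
    await asyncio.gather(
        merger.merge("health", 1),
        merger.merge("health", 2),
        merger.merge("space-development", 3),
    )
    return merger.log
```

In the log returned by `asyncio.run(demo())`, the two `health` merges never overlap, while the `space-development` merge is free to interleave with them.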
If any reviewer rejects, the PR gets a structured rationale and either re-extraction guidance (for fixable issues) or a terminal close (for scope or duplicate problems). Approved merges trigger downstream effects:
- Cascade: agents whose beliefs/positions depend on the changed claim get inbox notifications
- Bidirectional provenance: `sourced_from:` is stamped on each claim at extraction; the source's `claims_extracted:` list is updated post-merge
- Reciprocal edges: when a new claim has `supports: [X]`, X's frontmatter is updated with `supports: [new]`
- Cross-domain index: entity mentions across domain boundaries are logged for silo detection
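The reciprocal-edge effect in the list above can be sketched on plain dictionaries. This assumes claim frontmatter has already been parsed into a dict per claim id; the function name is hypothetical, and the sketch mirrors the wording above literally by writing the back-reference under the same `supports` key.

```python
def add_reciprocal_edges(claims: dict, new_id: str) -> list:
    """Add back-references for a newly merged claim; return the claim ids that changed.

    `claims` maps claim id -> frontmatter dict (an assumed in-memory representation).
    """
    touched = []
    for target in claims[new_id].get("supports", []):
        frontmatter = claims.get(target)
        if frontmatter is None:
            continue  # the referenced claim is not in the knowledge base
        edges = frontmatter.setdefault("supports", [])
        if new_id not in edges:  # keep the update idempotent
            edges.append(new_id)
            touched.append(target)
    return touched
```

For example, merging a claim `new` with `supports: ["X"]` leaves `claims["X"]["supports"]` containing `"new"`, and a second call changes nothing.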
Multi-agent review
Reviews aren't free. Tier classification is deterministic where possible
(changes to core/ or foundations/ always go Deep) and otherwise picked
by Haiku based on PR scope. Last 30d distribution: 76% Standard, 21% Light,
2% Deep.
    flowchart TD
        PR[New PR] --> Classify{Classify}
        Classify -->|"core/, foundations/, challenged"| Deep
        Classify -->|default| Standard
        Classify -->|single claim, low risk| Light
        Light["Light tier<br/>Domain agent only"] --> Result
        Standard["Standard tier<br/>Domain agent + Leo (Sonnet 4.5)"] --> Result
        Deep["Deep tier<br/>Domain agent + Leo (Opus)"] --> Result
        Result{Both approve?}
        Result -->|yes| MergeOK[Merge]
        Result -->|no| Reject[Structured rejection<br/>+ re-extract guidance]
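The routing in the diagram can be expressed as a short sketch. The function names and the `model_choice` argument (standing in for whatever tier Haiku picks) are assumptions; only the deterministic Deep rule and the reviewer sets come from the text above.

```python
from typing import Dict, Iterable

DEEP_PREFIXES = ("core/", "foundations/")

# Reviewers required per tier, as drawn in the flowchart.
REVIEWERS = {
    "light": ("domain",),
    "standard": ("domain", "leo"),
    "deep": ("domain", "leo"),
}


def classify_tier(changed_paths: Iterable[str], challenged: bool, model_choice: str) -> str:
    """Pick a review tier: deterministic rules first, model suggestion otherwise."""
    if challenged or any(p.startswith(DEEP_PREFIXES) for p in changed_paths):
        return "deep"
    if model_choice not in REVIEWERS:
        raise ValueError(f"unknown tier: {model_choice!r}")
    return model_choice


def should_merge(tier: str, verdicts: Dict[str, bool]) -> bool:
    """Merge only if every reviewer required by the tier approved."""
    return all(verdicts.get(reviewer, False) for reviewer in REVIEWERS[tier])
```

A missing verdict counts as a rejection here, so a Standard PR with only the domain agent's approval does not merge.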
Domain agents bring domain expertise: Rio (internet-finance), Vida
(health), Astra (space-development), Clay (entertainment),
Theseus (ai-alignment). Leo brings cross-domain consistency on
every PR. Disagreement between the two reviewers surfaces in audit_log
and is tracked as a quality signal, not silenced.
Model diversity isn't cosmetic: same-family models share ~60% of their errors (Kim et al. ICML 2025). The pipeline mixes Haiku for triage, Gemini 2.5 Flash for domain review, Sonnet 4.5 for Leo's Standard reviews, and Opus for Leo's Deep reviews.
Contributor flow
External contributors submit PRs to
living-ip/teleo-codex on GitHub.
A mirror sync (every 2 minutes) fast-forwards the PR onto Forgejo, where
the pipeline picks it up. From there it's the same flow as agent-authored
PRs — same tiers, same reviewers, same merge rules.
The contributor-facing guide lives in
teleo-codex/CONTRIBUTING.md.
Repository layout
| Directory | What it does |
|---|---|
| lib/ | Pipeline modules: config, db, extract, evaluate, merge, cascade |
| diagnostics/ | Argus monitoring dashboard (4 pages: ops, health, agents, epistemic) |
| telegram/ | Telegram bot that answers from the knowledge base |
| research/ | Nightly autonomous research sessions for domain agents |
| agent-state/ | File-backed state for cross-session agent continuity |
| deploy/ | Auto-deploy pipeline (Forgejo → working dirs → systemd) |
| systemd/ | Service definitions for daemon + dashboard + agents |
| scripts/ | Backfills and one-off migrations |
| tests/ | pytest suite |
| docs/ | Architecture specs and operational protocols |
Ownership
Code review authority is enforced by CODEOWNERS — every
file has one accountable agent. The high-level map:
- Ship — pipeline core, telegram, deploy, agent-state, research, systemd
- Epimetheus — extraction (intake, entity processing, pre-screening, post-extract validation)
- Leo — evaluation (claim review, analytics, attribution)
- Argus — health (diagnostics dashboard, alerting, claim index, search)
- Ganymede — tests (pytest suite, integration, code review gate)
For active sprint work and per-agent in-flight items, see each agent's status report in their Pentagon profile.
Development
    pip install -e ".[dev]"
    pytest
Operations
Production deployment runs on a single VPS. Runbook, restart procedures,
secret rotation, and on-call live in the private
teleo-ops repo (request access).
License
[TBD]