Teleo evaluation pipeline infrastructure — Python async daemon for claim extraction, validation, evaluation, and merge
Closes the systematic external-PR attribution gap diagnosed on FwazB PR #4066. The Forgejo PR was being created via admin-token (m3taversal), so prs.submitted_by ended up as the bot identity. record_contributor_attribution treats m3taversal as a bot and finds no other signal for fork PRs (commit authors are bot-rewritten by sync-mirror), so zero author events emit.

Mechanism, for gh-pr-* branches in the auto-create path:
1. After deriving GH_PR_NUM from branch name, GET /pulls/{N} from GitHub API
2. Extract user.login (e.g. "FwazB") from the PR's head author
3. Validate against GitHub username regex: ^[a-zA-Z0-9]([a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$
4. Lowercase to match contributor.py canonical handle storage
5. Include submitted_by in the same UPDATE that sets github_pr + source_channel

The regex doubles as the SQL-injection safety boundary (no parametric binding in bash sqlite3). 14 cases tested locally including SQL injection probes: all rejected, all real handles pass.

Failure modes:
- API call fails or returns no user.login → fall back to link-only UPDATE (existing behavior). Better than failing the whole step.
- Regex rejects malformed login → same fall-back. Preserves audit trail.

Smoke-tested against real FwazB PR #90 on VPS: extracts "FwazB", lowercases to "fwazb", regex passes, would write submitted_by='fwazb'.

Once deployed, record_contributor_attribution will trust submitted_by as fallback when no agent trailer found, emit author event with weight 0.30 → FwazB-class contributors auto-attributed end-to-end with zero manual backfill.

Architectural decisions per Ship's Apr 28 sign-off:
- (a) Sweep stays zero-API: no per-row API calls in Step 0 self-heal. This Step 4.5 fix is create-time only. Existing rows (FwazB) already manually backfilled, so there is no further population to backfill.
- (b) Skip _BOT_AUTHORS exception (#2 in original ticket): once submitted_by is correct, the bot filter doesn't fire on external PRs anymore.
- (c) Defer frontmatter rewriting: convenience, not load-bearing.
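
The validate, lowercase, and fall-back steps described above can be sketched in Python. This is a minimal illustration, not the deployed bash step: the function names, the `branch` column used in the WHERE clause, the in-memory schema, the `gh-pr-90` branch name, and the `"github"` channel value are all assumptions made for the example. Only the regex, the `prs` table, and the `github_pr`, `source_channel`, and `submitted_by` columns come from the commit message.

```python
import re
import sqlite3
from typing import Optional

# Username pattern quoted in the commit message.
GITHUB_LOGIN_RE = re.compile(r"^[a-zA-Z0-9]([a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$")


def canonical_login(pr_payload: object) -> Optional[str]:
    """Return the lowercased user.login, or None if it is missing or malformed."""
    user = pr_payload.get("user") if isinstance(pr_payload, dict) else None
    login = user.get("login") if isinstance(user, dict) else None
    if not isinstance(login, str):
        return None
    # fullmatch, not match: in Python '$' also matches just before a trailing newline.
    if GITHUB_LOGIN_RE.fullmatch(login) is None:
        return None
    return login.lower()


def link_github_pr(conn, branch, gh_pr_num, source_channel, pr_payload):
    """Link a prs row to its GitHub PR (hypothetical helper).

    submitted_by is written only when the login validates; otherwise the
    link-only UPDATE runs, mirroring the fall-back in the commit message.
    """
    login = canonical_login(pr_payload)
    if login is None:
        conn.execute(
            "UPDATE prs SET github_pr = ?, source_channel = ? WHERE branch = ?",
            (gh_pr_num, source_channel, branch),
        )
    else:
        conn.execute(
            "UPDATE prs SET github_pr = ?, source_channel = ?, submitted_by = ? "
            "WHERE branch = ?",
            (gh_pr_num, source_channel, login, branch),
        )
    return login


# Demo against an in-memory table with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prs (branch TEXT PRIMARY KEY, github_pr INTEGER, "
    "source_channel TEXT, submitted_by TEXT)"
)
conn.execute("INSERT INTO prs (branch, submitted_by) VALUES ('gh-pr-90', 'm3taversal')")
link_github_pr(conn, "gh-pr-90", 90, "github", {"user": {"login": "FwazB"}})
print(conn.execute("SELECT github_pr, source_channel, submitted_by FROM prs").fetchone())
# → (90, 'github', 'fwazb')
```

Python's sqlite3 module supports bound parameters, so the sketch uses them. The regex check is kept anyway, because in the bash path it is the only barrier between the API response and the SQL text.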
teleo-infrastructure
Pipeline infrastructure for the Teleo collective knowledge base. Async Python daemon that extracts, validates, evaluates, and merges claims via Forgejo PRs.
Directory Structure
teleo-infrastructure/
├── teleo-pipeline.py # Daemon entry point
├── reweave.py # Reciprocal edge maintenance
├── lib/ # Pipeline modules (Python package)
├── diagnostics/ # Monitoring dashboard (port 8081)
├── telegram/ # Telegram bot interface
├── deploy/ # Deployment + mirror scripts
├── systemd/ # Service definitions
├── agent-state/ # Cross-session agent state
├── research/ # Nightly research orchestration
├── hermes-agent/ # Hermes agent setup
├── scripts/ # One-off backfills + migrations
├── tests/ # Test suite
└── docs/ # Operational documentation
Ownership
Each directory has one owning agent. The owner is accountable for correctness and reviews all changes to their section. See CODEOWNERS for per-file detail.
| Directory | Owner | What it does |
|---|---|---|
| lib/ (core) | Ship | Config, DB, merge, cascade, validation, LLM calls |
| lib/ (extraction) | Epimetheus | Source extraction, entity processing, pre-screening |
| lib/ (evaluation) | Leo | Claim evaluation, analytics, attribution |
| lib/ (health) | Argus | Health checks, search, claim index |
| diagnostics/ | Argus | 4-page dashboard, alerting, vitality metrics |
| telegram/ | Ship | Telegram bot, X integration, retrieval |
| deploy/ | Ship | rsync deploy, GitHub-Forgejo mirror |
| systemd/ | Ship | teleo-pipeline, teleo-diagnostics, teleo-agent@ |
| agent-state/ | Ship | Bootstrap, state library, cascade inbox processor |
| research/ | Ship | Nightly research sessions, prompt templates |
| scripts/ | Ship | Backfills, migrations, one-off maintenance |
| tests/ | Ganymede | pytest suite, integration tests |
| docs/ | Shared | Architecture, specs, protocols |
VPS Layout
Runs on Hetzner CAX31 (77.42.65.182) as user teleo.
| VPS Path | Repo Source | Service |
|---|---|---|
| /opt/teleo-eval/pipeline/ | lib/, teleo-pipeline.py, reweave.py | teleo-pipeline |
| /opt/teleo-eval/diagnostics/ | diagnostics/ | teleo-diagnostics |
| /opt/teleo-eval/telegram/ | telegram/ | (manual) |
| /opt/teleo-eval/agent-state/ | agent-state/ | (used by research-session.sh) |
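
As a rough illustration of how the layout table can be read, the sketch below resolves a repo path to its VPS destination and systemd service. The `Target` type and `target_for` function are hypothetical and exist only for this example; the paths and service names are copied from the table, and rows without a systemd service are represented as `None`. How deploy.sh actually places files is not shown here.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    vps_path: str
    service: Optional[str]  # None where the table lists no systemd service


# Rows copied from the VPS Layout table: (repo sources, target).
LAYOUT = [
    (("lib/", "teleo-pipeline.py", "reweave.py"),
     Target("/opt/teleo-eval/pipeline/", "teleo-pipeline")),
    (("diagnostics/",), Target("/opt/teleo-eval/diagnostics/", "teleo-diagnostics")),
    (("telegram/",), Target("/opt/teleo-eval/telegram/", None)),
    (("agent-state/",), Target("/opt/teleo-eval/agent-state/", None)),
]


def target_for(repo_path: str) -> Optional[Target]:
    """Return the VPS target for a repo-relative path, or None if it is not deployed."""
    for sources, target in LAYOUT:
        for src in sources:
            if src.endswith("/"):
                # Directory source: match anything beneath it.
                if repo_path.startswith(src):
                    return target
            elif repo_path == src:
                # File source: exact match only.
                return target
    return None


print(target_for("lib/merge.py"))
print(target_for("docs/architecture.md"))
```

A lookup like this could answer which service to restart after a change. Paths outside the table, such as docs/, resolve to `None`.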
Quick Start
# Run tests
pip install -e ".[dev]"
pytest
# Deploy to VPS
./deploy/deploy.sh --dry-run # preview
./deploy/deploy.sh # deploy