Teleo evaluation pipeline infrastructure — Python async daemon for claim extraction, validation, evaluation, and merge
- Claim-shape detector: if YAML has type: claim, force STANDARD minimum (Theseus)
- Random pre-merge promotion: 15% of LIGHT → STANDARD before eval (Rio)
- LIGHT_SKIP_LLM config flag: skip domain+Leo review for LIGHT (Rhea: env var rollback)
- Updated both_approve: domain_verdict=skipped is valid for LIGHT auto-approve
- Cost recording: only charge for reviews that actually ran
- SAMPLE_AUDIT_RATE bumped 0.10 → 0.15, audit model = Opus (Leo: different family from Haiku)

Multi-agent design review: Rio (gaming vectors, model diversity), Theseus (correlated blindspots, claim-shape guard), Rhea (shadow mode, config flag, deployment), Leo (approval).

Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>
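The routing rules listed in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function names `is_claim_shaped`, `route_tier`, `light_skip_llm`, and `review_cost`, the `"approve"` verdict string, the accepted truthy values for the environment variable, and the choice to let the Leo verdict be `skipped` as well as the domain verdict are all assumptions. Only `LIGHT`, `STANDARD`, `LIGHT_SKIP_LLM`, `both_approve`, `domain_verdict`, and the 15% rate come from the commit message.

```python
import os
import random

PROMOTE_RATE = 0.15  # "15% of LIGHT -> STANDARD" from the commit message


def is_claim_shaped(front_matter: dict) -> bool:
    """Claim-shape detector: parsed YAML front matter declares type: claim."""
    return front_matter.get("type") == "claim"


def route_tier(tier: str, front_matter: dict, rng=random) -> str:
    """Return the tier to evaluate at; only LIGHT is ever raised here."""
    if tier != "LIGHT":
        return tier
    if is_claim_shaped(front_matter):
        return "STANDARD"  # claims never stay below STANDARD
    if rng.random() < PROMOTE_RATE:
        return "STANDARD"  # random promotion before eval
    return "LIGHT"


def light_skip_llm() -> bool:
    """Read LIGHT_SKIP_LLM from the environment (accepted values are assumed)."""
    value = os.environ.get("LIGHT_SKIP_LLM", "0")
    return value.strip().lower() in {"1", "true", "yes"}


def both_approve(tier: str, domain_verdict: str, leo_verdict: str) -> bool:
    """Treat 'skipped' as acceptable only for LIGHT (assumed for both reviewers)."""
    ok = {"approve", "skipped"} if tier == "LIGHT" else {"approve"}
    return domain_verdict in ok and leo_verdict in ok


def review_cost(reviews: list[tuple[str, float]]) -> float:
    """Sum costs only for reviews that actually ran; skipped ones are free."""
    return sum(cost for verdict, cost in reviews if verdict != "skipped")
```

Passing `rng` explicitly keeps the promotion step testable: any object with a `random()` method can stand in for the module, so a fixed value can exercise both branches without relying on a seed.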
| Name |
|---|
| .forgejo/workflows |
| lib |
| tests |
| .gitignore |
| deploy.sh |
| INFRASTRUCTURE.md |
| pyproject.toml |
| teleo-pipeline.py |
| teleo-pipeline.service |