Phase 1 Step 3 — migrate research-session.sh and pipeline-health-check.py off Forgejo onto GitHub living-ip/decision-engine. eval-dispatcher.sh / eval-worker.sh documented as dead code (replaced by daemon).
Phase 1 Step 3: Script Migration to GitHub
Summary
Migrated critical-path scripts from Forgejo (git.livingip.xyz / teleo/teleo-codex) to GitHub (living-ip/decision-engine). Audit found two of the four planned scripts are dead code; scope reduced from 4 scripts to 2.
| Script | Status | Action |
|---|---|---|
| research/research-session.sh | live (cron paused 2026-05-12 pending Hermes) | migrated this PR |
| pipeline-health-check.py (VPS root, unversioned) | live, cron every 2h | migrated, deploy notes below |
| eval/eval-dispatcher.sh | dead since 2026-03-12 | deprecated, see handoff/deprecated/eval-scripts.md |
| eval/eval-worker.sh | dead since 2026-03-12 | deprecated, see handoff/deprecated/eval-scripts.md |
What changed in research/research-session.sh
Forgejo → GitHub rewire. Same control flow, same Claude invocation, same agent-state hooks. Only external integrations swapped.
| Change | Before | After |
|---|---|---|
| API base | http://localhost:3000 (Forgejo) | https://api.github.com |
| Repo | teleo/teleo-codex | living-ip/decision-engine |
| Token file | /opt/teleo-eval/secrets/forgejo-${AGENT}-token (per-agent), fallback to admin | /opt/teleo-eval/secrets/github-admin-token (single livingIPbot, per Option A) |
| REST API auth | ?token=<pat> query or Authorization: token <pat> header | Authorization: Bearer <pat> + GitHub API version header |
| Git auth | http.extraHeader: Authorization: token <pat> | url.<base>.insteadOf rewrite injecting x-access-token:<pat>@github.com |
| PR list query | pulls?state=open then jq filter | pulls?state=open&head=living-ip:<branch> (server-side filter) |
| PR create | POST /api/v1/repos/.../pulls | POST /repos/.../pulls + GitHub API version header |
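
The REST rows above correspond roughly to the request shapes sketched here. This is an illustration, not the code in research-session.sh: the function names pr_list_url and gh_api, the GITHUB_TOKEN variable, and the API version value 2022-11-28 are assumptions (the table only says "GitHub API version header").

```shell
#!/usr/bin/env bash
# Sketch only: function names, GITHUB_TOKEN, and API_VERSION are illustrative assumptions.
set -euo pipefail

GITHUB_API="https://api.github.com"
REPO="living-ip/decision-engine"
API_VERSION="2022-11-28"   # assumed value for the X-GitHub-Api-Version header

# Build the URL that lists open PRs for one branch, filtered server-side
# via head=<owner>:<branch> instead of a client-side jq filter.
pr_list_url() {
  local branch="$1"
  printf '%s/repos/%s/pulls?state=open&head=living-ip:%s\n' "$GITHUB_API" "$REPO" "$branch"
}

# Wrap curl with Bearer auth plus the API version header.
# Requires GITHUB_TOKEN in the environment at call time.
gh_api() {
  curl -fsS \
    -H "Authorization: Bearer ${GITHUB_TOKEN}" \
    -H "Accept: application/vnd.github+json" \
    -H "X-GitHub-Api-Version: ${API_VERSION}" \
    "$@"
}

# Examples (need network and a valid token; branch name and base are hypothetical):
#   gh_api "$(pr_list_url "some-agent-branch")"
#   gh_api -X POST "$GITHUB_API/repos/$REPO/pulls" \
#     -d '{"title":"...","head":"some-agent-branch","base":"main"}'
```

Keeping the URL builder separate from the curl wrapper makes the query string checkable without touching the network.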
Per-agent identity (deferred)
Phase 1 uses Option A: single livingIPbot PAT for all agents. The AGENT_TOKEN variable remains as a placeholder so per-agent elevation in Phase 2 is a one-line change.
When Billy elevates: generate github-${AGENT}-token files at /opt/teleo-eval/secrets/, then switch the PR-creation curl to use AGENT_TOKEN. Git operations stay on the bot token (it is the one with push access to all agent branches). Per-agent VERDICT comments and PR opens then show up in the PR history under separate authors.
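One way the Phase 2 switch could look, as a minimal sketch. The function name agent_token_file and the SECRETS_DIR override are hypothetical, and falling back to the shared bot token when no per-agent file exists is an assumption carried over from the Forgejo-era behaviour, not something this PR implements.

```shell
#!/usr/bin/env bash
# Sketch only: agent_token_file and SECRETS_DIR are illustrative assumptions.
set -euo pipefail

SECRETS_DIR="${SECRETS_DIR:-/opt/teleo-eval/secrets}"

# Print the path of the token file to use for PR creation:
# the per-agent file if it exists and is non-empty, otherwise the shared bot token.
agent_token_file() {
  local agent="$1"
  local per_agent="${SECRETS_DIR}/github-${agent}-token"
  if [ -s "$per_agent" ]; then
    printf '%s\n' "$per_agent"
  else
    printf '%s\n' "${SECRETS_DIR}/github-admin-token"
  fi
}

# Usage inside the script (AGENT is set by the caller):
#   AGENT_TOKEN="$(cat "$(agent_token_file "$AGENT")")"
```

With this shape, agents can be elevated one at a time by dropping a token file into the secrets directory; agents without a file keep using the bot identity.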
Security note: token in URL rewrite
The insteadOf rewrite injects the PAT into the URL only at command-execution time. It does NOT persist in .git/config or git remote -v. Verified: post-push remote -v shows the clean https://github.com/living-ip/decision-engine.git URL.
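A minimal sketch of a per-command rewrite of this kind, assuming it is passed with git -c (which would match the observation that the token never lands in .git/config). The function names rewrite_config and git_authed are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: rewrite_config and git_authed are illustrative assumptions.
set -euo pipefail

# Build the key=value pair that rewrites https://github.com/ URLs so they
# carry the token. Passed with `git -c`, it applies to one process only.
rewrite_config() {
  local token="$1"
  printf 'url.https://x-access-token:%s@github.com/.insteadOf=https://github.com/\n' "$token"
}

# Run a single git command with the rewrite applied.
git_authed() {
  local token="$1"; shift
  git -c "$(rewrite_config "$token")" "$@"
}

# Example (needs network and a valid token):
#   git_authed "$(cat /opt/teleo-eval/secrets/github-admin-token)" \
#     clone --depth=1 https://github.com/living-ip/decision-engine.git
#   git -C decision-engine remote -v   # expected to show the clean URL
```

Because the pair is a command-line argument, it is visible in the process list for as long as the git command runs.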
Risk surfaces that remain:
- ps auxf during the git command shows the rewrite arg with the token
- If the script's log file gets verbose enough, token could appear in error output
Mitigation for Billy: switch to a git credential helper (git-credential-store or a custom helper that reads from the secrets file) to remove the in-flight exposure entirely. Out of scope for Phase 1.
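A custom helper of the kind mentioned above could look roughly like this. It is a sketch under assumptions: the function name credential_helper and the TOKEN_FILE override are hypothetical, and only the "get" operation of git's credential helper protocol is handled.

```shell
#!/usr/bin/env bash
# Sketch only: credential_helper and TOKEN_FILE are illustrative assumptions.
set -euo pipefail

TOKEN_FILE="${TOKEN_FILE:-/opt/teleo-eval/secrets/github-admin-token}"

# Minimal git credential helper: answers "get" requests for github.com by
# reading the PAT from the secrets file; "store" and "erase" are ignored.
credential_helper() {
  local op="${1:-}" host="" line
  [ "$op" = "get" ] || return 0
  # Git sends key=value attributes on stdin, ended by a blank line or EOF.
  while IFS= read -r line && [ -n "$line" ]; do
    case "$line" in
      host=*) host="${line#host=}" ;;
    esac
  done
  [ "$host" = "github.com" ] || return 0
  printf 'username=x-access-token\npassword=%s\n' "$(cat "$TOKEN_FILE")"
}

# To use it, save this as an executable script whose last line is
#   credential_helper "$@"
# and point git at it by absolute path:
#   git config credential.helper /path/to/that/script
```

The token then travels over the helper's stdout rather than through a command-line argument, so it does not appear in ps output.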
Smoke test results
Performed against living-ip/decision-engine end-to-end, without invoking Claude:
✅ git clone (depth=1) via insteadOf rewrite
✅ branch create + commit
✅ git push (authenticated)
✅ PR list API (server-side head= filter)
✅ remote -v shows clean URL (token not persisted)
✅ branch cleanup
Static checks: bash -n passes, no residual Forgejo references in the file.
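The static checks can be repeated with something like the following sketch. The function name static_check and the exact search patterns are assumptions; the patterns are taken from the Forgejo-side values listed earlier in this note.

```shell
#!/usr/bin/env bash
# Sketch only: static_check and its pattern list are illustrative assumptions.
set -euo pipefail

# Return 0 when the script parses cleanly and has no Forgejo-era references.
static_check() {
  local script="$1"
  # bash -n parses the file without executing it.
  bash -n "$script" || return 1
  # -E: extended regex, -i: case-insensitive, -q: quiet. A match is a leftover.
  if grep -Eiq 'forgejo|git\.livingip\.xyz|teleo/teleo-codex|localhost:3000' "$script"; then
    return 1
  fi
  return 0
}

# Example:
#   static_check research/research-session.sh && echo "clean"
```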
pipeline-health-check.py — deploy notes (NOT auto-deployed)
This script lives at /opt/teleo-eval/pipeline-health-check.py on the VPS — NOT in this repo. It was never added to teleo-infrastructure; lives only as a VPS-local script.
The migrated version is at /tmp/pipeline-health-check.py.new on the VPS. To go live:
# Backup current
cp /opt/teleo-eval/pipeline-health-check.py /opt/teleo-eval/pipeline-health-check.py.bak-pre-github
# Promote new version
cp /tmp/pipeline-health-check.py.new /opt/teleo-eval/pipeline-health-check.py
chmod +x /opt/teleo-eval/pipeline-health-check.py
# Cron continues to run it every 2h; no cron change needed.
Before promoting: confirm with Fwaz/m3ta whether the script should also be added to this repo for versioning. Recommended yes; out of scope for this PR.
Until promoted, the live VPS script keeps reading from Forgejo. Fine during cutover window. Will produce empty/stale metrics once Forgejo is decommissioned (Step 7) if not promoted by then.
Auto-deploy of research-session.sh
research/research-session.sh is in the repo's research/ directory. The auto-deploy job (teleo-auto-deploy.timer) rsyncs the repo into /opt/teleo-eval/pipeline/. Check whether research/ is in the rsync manifest. If it is not, the migrated script won't reach the runtime path that cron used to invoke (/opt/teleo-eval/research-session.sh).
If research/ is NOT in the rsync manifest (or the runtime path differs from pipeline/research/research-session.sh), Billy should add it during productionization. Until then, the migrated script needs a manual cp to /opt/teleo-eval/research-session.sh.
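Whether the migrated script actually reached the runtime path can be checked without reading the rsync manifest, by comparing the two copies. A minimal sketch; the function name deploy_status is hypothetical, and the repo checkout location in the example is whatever the caller has on the VPS.

```shell
#!/usr/bin/env bash
# Sketch only: deploy_status is an illustrative assumption.
set -euo pipefail

# Compare the repo copy of a script with the copy at the runtime path.
# Prints "missing", "in-sync", or "stale".
deploy_status() {
  local repo_copy="$1" runtime_copy="$2"
  if [ ! -f "$runtime_copy" ]; then
    echo "missing"
  elif cmp -s "$repo_copy" "$runtime_copy"; then
    echo "in-sync"
  else
    echo "stale"
  fi
}

# Example, run from the repo checkout on the VPS:
#   deploy_status research/research-session.sh /opt/teleo-eval/research-session.sh
```

A result of "stale" or "missing" means the manual cp described above is still needed.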
This was a pre-existing topology issue; not introduced by this PR.
When the cron gets re-enabled
The research-session crons were paused 2026-05-12 with comment PAUSED 2026-05-12 (architecture change). They should stay paused until Phase 1 Step 4 (Leo on Hermes) is verified — Hermes-Leo's research loop replaces this script for Leo.
For the other 5 agents (Theseus, Rio, Vida, Clay, Astra): this script remains the fallback path during the Hermes rollout. Billy uses Leo as the pattern and can either re-enable cron or invoke from Hermes per agent.
Hermes runtime note (Step 4 preview)
While auditing the repo, found hermes-agent/ directory in teleo-infrastructure root. Not investigated as part of Step 3. Will audit during Step 4.
Files changed in this PR
- research/research-session.sh: migrated (+29 / −14 lines)
- handoff/phase1-step3-script-migration.md: this file (new)
- handoff/deprecated/eval-scripts.md: deprecation notes (new)