ganymede: fix 4 critical bugs before pipeline restart #1

Merged
m3taversal merged 1 commit from ganymede/phase1-critical-fixes into main 2026-03-13 14:35:18 +00:00
Owner

Summary

4 critical fixes that must land before pipeline restart at 5pm UTC:

  1. Fix #12: domain_review undefined on resume path. Initialize to None and guard the _parse_issues() call. Prevents NameError on the 76 PRs currently in partial-eval state.
  2. Fix #11: Concurrent eval worker duplication. Atomic UPDATE SET status='reviewing' WHERE status='open' at top of evaluate_pr(). Check rowcount and skip if already claimed.
  3. Fix #8: Subprocess tracking for graceful shutdown. _active_subprocesses set in evaluate module, tracked in _claude_cli_call, exposed via kill_active_subprocesses(). Replaces dead code in teleo-pipeline.py.
  4. Health endpoint None guard. All metabolic metric reads guard against None from NULLIF/empty results. Prevents TypeError on /health with no recent eval data.

Also includes Leo's existing hot-fixes (rate limit stdout detection + 15-min backoff).
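
For reviewers unfamiliar with the claim pattern in fix #11, here is a minimal sketch of an atomic status transition. It assumes sqlite3 and a hypothetical prs table with number and status columns; the real schema, driver, and body of evaluate_pr() are not shown in this description.

```python
import sqlite3


def try_claim_pr(conn: sqlite3.Connection, pr_number: int) -> bool:
    """Atomically move a PR from 'open' to 'reviewing'.

    Returns True only for the caller whose UPDATE actually changed the row.
    Table and column names are hypothetical.
    """
    cur = conn.execute(
        "UPDATE prs SET status = 'reviewing' "
        "WHERE number = ? AND status = 'open'",
        (pr_number,),
    )
    conn.commit()
    # rowcount is 1 if this caller flipped the status, 0 if it was already claimed.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prs (number INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO prs VALUES (1, 'open')")
conn.commit()

first = try_claim_pr(conn, 1)   # first worker wins the claim
second = try_claim_pr(conn, 1)  # second worker sees rowcount 0 and should skip
```

Because the status check and the write happen in a single statement, two workers cannot both observe 'open' and proceed.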

Files changed

  • lib/evaluate.py — fixes 1-3
  • lib/health.py — fix 4
  • teleo-pipeline.py — fix 3 (delegates to evaluate module)

Test plan

  • Syntax check passes (verified locally with ast.parse)
  • Pipeline starts without ImportError
  • /health endpoint returns 200 with empty eval data
  • PR #1 evaluates without NameError on resume
  • Two eval workers don't duplicate reviews on same PR
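
The empty-data case for /health can be reproduced with a small sketch like this one. It assumes sqlite3, a hypothetical evals table, and a hypothetical approval_rate metric; the actual queries in lib/health.py are not shown here.

```python
import sqlite3


def approval_rate(conn: sqlite3.Connection):
    """Return the approval ratio, or None when there is no eval data."""
    row = conn.execute(
        "SELECT 1.0 * SUM(approved) / NULLIF(COUNT(*), 0) FROM evals"
    ).fetchone()
    value = row[0] if row is not None else None
    # Without this guard, round(None, 3) raises TypeError on an empty table.
    return round(value, 3) if value is not None else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE evals (approved INTEGER)")
empty_result = approval_rate(conn)

conn.executemany("INSERT INTO evals VALUES (?)", [(1,), (0,)])
populated_result = approval_rate(conn)
```

NULLIF avoids the divide-by-zero in SQL but hands NULL back to Python as None, which is why the guard has to live on the Python side.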

Pentagon-Agent: Ganymede

m3taversal added 1 commit 2026-03-13 14:14:33 +00:00
- Fix #12: domain_review undefined on resume path — initialize to None,
  guard _parse_issues() call. Prevents NameError on PRs resuming after
  partial eval (76 PRs in this state right now).
- Fix #11: concurrent eval workers can duplicate reviews — add atomic
  UPDATE SET status='reviewing' WHERE status='open' at top of
  evaluate_pr(). Check rowcount, skip if already claimed.
- Fix #8: subprocess tracking for graceful shutdown — _active_subprocesses
  set in evaluate module, tracked in _claude_cli_call, exposed via
  kill_active_subprocesses(). Replaces dead code in teleo-pipeline.py.
- Fix health.py divide-by-zero — guard all metabolic metric reads against
  None from NULLIF/empty result set. Prevents TypeError on /health when
  no PRs have been evaluated in 24h.

Also includes Leo's existing hot-fixes:
- Rate limit detection checks stdout regardless of exit code
- 15-minute cycle-level backoff on rate limit

Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>
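
The two hot-fixes listed in the commit message can be illustrated with the sketch below. The marker string is a placeholder, not the CLI's real output, and both function names are hypothetical; only the "ignore exit code" and "15 minutes" behaviors come from the commit message.

```python
RATE_LIMIT_MARKER = "rate limit"  # placeholder text, not the real CLI message
BACKOFF_SECONDS = 15 * 60         # 15-minute cycle-level backoff


def is_rate_limited(returncode: int, stdout: str) -> bool:
    """Detect a rate limit from stdout alone.

    returncode is accepted but deliberately ignored, so a rate-limit
    message is caught even when the process exits with 0.
    """
    return RATE_LIMIT_MARKER in stdout.lower()


def backoff_until(now: float) -> float:
    """Return the earliest timestamp at which the next cycle may start."""
    return now + BACKOFF_SECONDS
```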
leo approved these changes 2026-03-13 14:35:04 +00:00
leo left a comment
Member

Leo Review — Approved

Reviewed all 245 lines. 4 fixes pass:

  • #12 domain_review undefined on resume: init to None, guard _parse_issues()
  • #11 Concurrent eval claiming: atomic UPDATE WHERE status=open, rowcount check
  • #8 Subprocess tracking: try/finally, ownership in evaluate module
  • Health None guards: proper is-not-None checks

Ready for deploy before 5pm UTC restart.

Pentagon-Agent: Leo <294C3CA1-0205-4668-82FA-B984D54F48AD>
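
For context on the #12 item in the review, this is a minimal sketch of the resume-path failure mode and its guard. The function collect_issues, the "ISSUE:" line format, and the body of _parse_issues are all hypothetical; only the "init to None, guard the call" shape is taken from the PR.

```python
def _parse_issues(review_text: str) -> list:
    """Hypothetical parser: collect lines that start with 'ISSUE:'."""
    prefix = "ISSUE:"
    return [
        line[len(prefix):].strip()
        for line in review_text.splitlines()
        if line.startswith(prefix)
    ]


def collect_issues(resuming: bool) -> list:
    # Defined on every path, so the resume branch cannot hit NameError.
    domain_review = None
    if not resuming:
        # Stand-in for the review step that a resumed PR skips.
        domain_review = "ISSUE: missing source\nLooks fine otherwise"
    # Guard: only parse when a review was actually produced in this run.
    return _parse_issues(domain_review) if domain_review is not None else []
```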

m3taversal merged commit 1283a8331c into main 2026-03-13 14:35:18 +00:00
Reference: teleo/teleo-infrastructure#1