twentyOne2x 7390e1e843 Implement phase 1b agent routing

2026-05-29 14:00:13 +02:00


Phase 1b Child Spec: GitHub Identity And Bot Posture

Created: 2026-05-29
Status: active draft
Parent spec: docs/phase1b-agent-routing-spec.md

Product Outcome Contract

Phase 1b must post agent-specific verdicts for decision-engine PRs without requiring six separate GitHub accounts. Agent identity is represented in the comment content and verdict tags, while a single master bot account owns transport.

Goal

Define and implement the minimum GitHub identity and comment transport posture for Phase 1b:

canonical target is living-ip/decision-engine;
one master bot token is acceptable;
verdict comments preserve VERDICT:AGENT:*;
duplicate comments are prevented;
old Forgejo or mirror behavior remains rollback-safe until staging proof.

Non-Goals

Do not create separate GitHub users for all agents.
Do not require GitHub branch protection to count separate formal reviewers in Phase 1b.
Do not rewrite every Forgejo-named helper unless needed for Phase 1b comments.
Do not redesign contributor credit.
Do not revive deprecated eval shell scripts.

Current Implementation Audit

Current truth:

pipeline-health-check.py targets https://api.github.com/repos/living-ip/decision-engine.
research/research-session.sh targets living-ip/decision-engine on GitHub and reads github-admin-token.
handoff/phase1-step3-script-migration.md documents Phase 1 single livingIPbot posture and defers per-agent identities.
lib/config.py still defaults to Forgejo teleo/teleo-codex.
lib/github_feedback.py hardcodes living-ip/teleo-codex and reads github-pat, rather than living-ip/decision-engine and github-admin-token.
lib/evaluate.py posts review comments through Forgejo helpers and per-agent Forgejo tokens.
lib/github_feedback.py is a mirror feedback channel keyed by prs.github_pr, not the canonical review transport.
deploy/sync-mirror.sh still references living-ip/teleo-codex.
Fwaz confirmed that separate GitHub identities are ideal but blocked on GitHub/PAT setup. Phase 1b implementation should not wait on six distinct accounts if the pipeline can post parseable VERDICT:AGENT:* comments through the pipeline bot.

Existing-Spec Inventory

| Existing doc | Relevance | Decision |
| --- | --- | --- |
| `docs/phase1b-agent-routing-spec.md` | Parent identity posture. | Reuse. |
| `handoff/phase1-step3-script-migration.md` | Documents single bot token and GitHub `decision-engine` migration for scripts. | Reuse. |
| `handoff/deprecated/eval-scripts.md` | Confirms old eval scripts should not be revived. | Reuse. |

Goal-Vs-Repo-Truth Diff

Goal:

One canonical GitHub target for Phase 1b: living-ip/decision-engine.
One master bot token for Phase 1b comments.
Agent identity lives in verdict tags and comment headings.
Comment posting supports idempotency by PR, head SHA, and agent.

Repo truth:

GitHub target and token names are split across files.
Eval comments still use Forgejo helpers.
GitHub feedback is non-fatal mirror feedback, not agent review transport.

Completion Percent And Remaining Delta

Current completion: 15 percent.

Remaining delta:

Add explicit GitHub target config with staging override.
Normalize token file selection or document compatibility.
Add Phase 1b comment posting helper for GitHub decision-engine.
Add idempotency marker.
Add tests for URL target, token path, missing token, and duplicate prevention.
Decide direct GitHub mode versus Forgejo-mirror mode before staging.

Closure, Endpoint, And Deployment Truth

Local closure:

Tests prove comments target living-ip/decision-engine and token material is not logged.

Staging closure:

Sandbox PR comments are posted by master bot with agent verdict tags.

Production closure:

Live decision-engine PR comments are posted by master bot without duplicates.

Critical Assumptions And Invalidators

Assumptions:

One bot account is enough for Phase 1b.
Agent identity in verdict content satisfies acceptance.
Formal GitHub reviews from distinct accounts are not required now.
Per-agent PATs can be added later without changing the route contract.

Invalidators:

Branch protection requires distinct GitHub reviewer identities.
GitHub org disallows the selected PAT or bot account.
Production daemon must remain Forgejo-first for the cutover window.
Direct GitHub PRs lack the DB linkage used by existing github_feedback.

State And Truth Contract

Comment idempotency marker:

<!-- PHASE1B_REVIEW:PR=123:SHA=abc123:AGENT=RIO -->

Verdict marker remains:

<!-- VERDICT:RIO:APPROVE -->
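
A minimal sketch of how the two markers above could be read back out of a comment body. The function names, the `ReviewMarker` type, and the character classes in the patterns (hex-like SHA, uppercase agent and verdict names) are assumptions for illustration; the spec only fixes the marker shapes.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Patterns mirror the two marker shapes in this spec. The character classes
# are assumptions, not requirements stated by the spec.
REVIEW_MARKER_RE = re.compile(
    r"<!-- PHASE1B_REVIEW:PR=(\d+):SHA=([0-9a-fA-F]+):AGENT=([A-Z0-9_]+) -->"
)
VERDICT_MARKER_RE = re.compile(r"<!-- VERDICT:([A-Z0-9_]+):([A-Z_]+) -->")


class ReviewMarker(NamedTuple):
    """Hypothetical container for the idempotency key (PR, head SHA, agent)."""
    pr: int
    sha: str
    agent: str


def parse_review_marker(body: str) -> Optional[ReviewMarker]:
    """Return the first idempotency marker found in a comment body, if any."""
    match = REVIEW_MARKER_RE.search(body)
    if match is None:
        return None
    return ReviewMarker(int(match.group(1)), match.group(2), match.group(3))


def parse_verdict(body: str) -> Optional[Tuple[str, str]]:
    """Return (agent, verdict) from the first verdict marker, if any."""
    match = VERDICT_MARKER_RE.search(body)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Keeping the two markers separate means the existing verdict parser does not need to know about the idempotency key, and the duplicate check does not depend on the verdict value.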

Required config:

GITHUB_OWNER = "living-ip"
GITHUB_REPO = "decision-engine"
GITHUB_TOKEN_FILE = SECRETS_DIR / "github-admin-token"

Staging must override repo or owner without code changes.
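
One way the override requirement could be met is to resolve owner and repo from the environment and fall back to the canonical values. This is a sketch only: the environment variable names and the helper names are hypothetical, and the real change would live in lib/config.py. The comments URL uses GitHub's issue comments endpoint, which is also how general PR comments are posted.

```python
import os

# Canonical defaults from the required config in this spec.
DEFAULT_OWNER = "living-ip"
DEFAULT_REPO = "decision-engine"
GITHUB_API_BASE = "https://api.github.com"


def github_target(env=None):
    """Resolve (owner, repo); hypothetical env vars override the defaults."""
    env = os.environ if env is None else env
    owner = env.get("GITHUB_OWNER") or DEFAULT_OWNER
    repo = env.get("GITHUB_REPO") or DEFAULT_REPO
    return owner, repo


def repo_api_url(env=None):
    """Build the repository API URL for the resolved target."""
    owner, repo = github_target(env)
    return f"{GITHUB_API_BASE}/repos/{owner}/{repo}"


def pr_comments_url(pr_number, env=None):
    """PR conversation comments go through the issue comments endpoint."""
    return f"{repo_api_url(env)}/issues/{pr_number}/comments"
```

Passing `env` explicitly keeps the URL builder testable without mutating process state.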

Measurement Contract

Minimum tests:

URL builder targets https://api.github.com/repos/living-ip/decision-engine.
Staging override changes target.
Missing token returns non-fatal failure and audit detail.
Token value is never logged.
Duplicate marker prevents repeat comment for same PR, SHA, and agent.
Six agent verdict tags remain parseable.
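
The duplicate-prevention test could be exercised against a pure function like the one below, which takes comment bodies already read back from the PR. The function names are hypothetical, and fetching the existing comments is deliberately left out of the sketch.

```python
from typing import Iterable


def review_marker(pr: int, sha: str, agent: str) -> str:
    """Build the idempotency marker for one (PR, head SHA, agent) triple."""
    return f"<!-- PHASE1B_REVIEW:PR={pr}:SHA={sha}:AGENT={agent.upper()} -->"


def already_posted(existing_bodies: Iterable[str], pr: int, sha: str, agent: str) -> bool:
    """True if any existing comment body already carries the same marker."""
    marker = review_marker(pr, sha, agent)
    return any(marker in body for body in existing_bodies)
```

Because the head SHA is part of the key, a new push to the same PR produces a new marker, so a fresh review is not suppressed by the comment on the previous commit.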

Backend Work Required

Owned files:

lib/github_feedback.py or a new lib/github_reviews.py.
lib/config.py.
lib/evaluate.py only where the eval integration calls the comment helper.
tests/test_github_identity.py or equivalent.

Implementation steps:

Add canonical GitHub target config.
Add token lookup that prefers github-admin-token for Phase 1b and can fall back only if explicitly configured.
Add comment helper for agent verdict comments.
Add idempotency marker and readback check.
Add tests.
Wire eval integration to the helper under Phase 1b flag.

Forbidden files:

Deprecated eval shell scripts.
Production secrets.
Broad deploy rewrite.

Frontend Work Required

None.

Expected Runtime And User-Visible Behavior

PR comment example:

## Rio review

<review text>

<!-- PHASE1B_REVIEW:PR=123:SHA=abc123:AGENT=RIO -->
<!-- VERDICT:RIO:APPROVE -->

The GitHub account may be a master bot. The comment content must show which agent reviewed.
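
A comment body in the shape of the example above could be assembled as follows. The function name and parameters are hypothetical; the heading and both markers follow the example in this spec.

```python
def build_review_comment(agent: str, review_text: str, pr: int, sha: str, verdict: str) -> str:
    """Compose a PR comment whose content identifies the reviewing agent."""
    tag = agent.upper()
    lines = [
        f"## {agent.capitalize()} review",  # visible agent identity
        "",
        review_text.strip(),
        "",
        f"<!-- PHASE1B_REVIEW:PR={pr}:SHA={sha}:AGENT={tag} -->",  # idempotency key
        f"<!-- VERDICT:{tag}:{verdict.upper()} -->",  # existing verdict tag
    ]
    return "\n".join(lines) + "\n"
```

For instance, `build_review_comment("rio", "<review text>", 123, "abc123", "approve")` yields the example comment shown above, regardless of which GitHub account posts it.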

Validation And Test Matrix

Commands:

python3 -m pytest tests/test_github_identity.py tests/test_eval_parse.py
python3 -m ruff check lib/github_feedback.py lib/config.py tests/test_github_identity.py
git diff --check

Test cases:

canonical target
staging override
missing token
no token logging
idempotent comment marker
all six verdict tags parse

CI/CD, Release, And Pre-Push Gate Contract

Before PR:

Local tests prove target and idempotency.

Before staging:

Sandbox repo token exists.
Production token is not used.

Before production:

Bot account has comment permissions on decision-engine.
Rollback path is the old Forgejo transport or disabling the Phase 1b flag.

Independent CLI Audit Contract

Reviewer checks:

rg -n "teleo-codex|decision-engine|github-admin-token|github-pat|VERDICT|PHASE1B_REVIEW" lib tests pipeline-health-check.py research deploy

Audit questions:

Which files still target teleo-codex?
Are those files in the Phase 1b runtime path?
Does any log path expose token values?
Does idempotency prevent duplicate comments?

Outside-The-Box Fix Paths

If direct GitHub comments are not safe in the first PR:

Keep Forgejo review transport and post GitHub mirror feedback only in staging.
Add a dry-run comment mode that writes the planned body into audit logs.
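
The dry-run option could be as small as a switch in front of the sender. This is a sketch, not the pipeline's actual audit mechanism: `post_or_plan`, the `send` callable, the record fields, and the logger name are all hypothetical.

```python
import json
import logging
from typing import Callable, Dict

audit_logger = logging.getLogger("phase1b.audit")  # hypothetical logger name


def post_or_plan(url: str, body: str, dry_run: bool, send: Callable[[str, str], None]) -> Dict[str, object]:
    """Post the comment, or in dry-run mode only record what would be posted."""
    record: Dict[str, object] = {"url": url, "body": body, "dry_run": dry_run}
    if dry_run:
        record["action"] = "planned"  # send() is never called in dry-run
    else:
        send(url, body)
        record["action"] = "posted"
    # The audit record holds the planned body and target only, never a token.
    audit_logger.info("phase1b_comment %s", json.dumps(record, sort_keys=True))
    return record
```

Injecting `send` keeps the transport choice (direct GitHub or mirror feedback) outside this function, so the same audit record is produced in either mode.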

If GitHub PAT remains blocked:

Use a GitHub App only for comment posting.
Keep the master bot for git push but use the app token for PR comments.

Maintenance Capture

Beneficial now:

Name GitHub target config clearly.
Avoid proliferating github-pat versus github-admin-token.

Avoid now:

Separate agent GitHub users.
Full mirror rewrite.
Contributor identity overhaul.

Parallelization And Fanout

Classification: ready_now after the implementer explicitly chooses direct GitHub comments or Forgejo-mirror compatibility for the Phase 1b flag path.

Worker-ready prompt:

implement phase 1b github review comment posture. use one master bot token, target living-ip/decision-engine with staging override support, add agent-specific verdict comment helper with idempotency marker, and prove no token leakage. do not create separate agent accounts or rewrite deploy/mirror broadly.

Acceptance Criteria

Phase 1b comment helper targets decision-engine.
Master bot can post agent verdict tags.
Duplicate comments are prevented.
Missing token is non-fatal and auditable.
Existing old transport remains rollback-safe.

Readiness And Claim Boundaries

Allowed claim:

"Master-bot GitHub verdict comment posture is locally specified/tested."

Forbidden claim:

"Separate agent GitHub identities are solved."

Spec Quality Self-Audit

All required execution-grade headings are present. The exact direct-GitHub versus Forgejo-mirror cutover remains a deliberate implementation decision because current daemon code is Forgejo-first.

Assistant-Added Caveats

The repo has real target drift between teleo-codex and decision-engine. Do not hide that drift in the eval implementation. The Phase 1b PR should either fix the runtime path it uses or explicitly leave non-runtime references for a later migration.
