# Phase 1b Child Spec: GitHub Identity And Bot Posture

Created: 2026-05-29
Status: active draft
Parent spec: `docs/phase1b-agent-routing-spec.md`

## Product Outcome Contract

Phase 1b must post agent-specific verdicts for `decision-engine` PRs without requiring six separate GitHub accounts. Agent identity is represented in the comment content and verdict tags, while a single master bot account owns transport.

## Goal

Define and implement the minimum GitHub identity and comment transport posture for Phase 1b:

- canonical target is `living-ip/decision-engine`;
- one master bot token is acceptable;
- verdict comments preserve `VERDICT:AGENT:*`;
- duplicate comments are prevented;
- existing Forgejo or mirror behavior remains rollback-safe until staging is proven.

## Non-Goals

- Do not create separate GitHub users for all agents.
- Do not require GitHub branch protection to count separate formal reviewers in Phase 1b.
- Do not rewrite every Forgejo-named helper unless needed for Phase 1b comments.
- Do not redesign contributor credit.
- Do not revive deprecated eval shell scripts.

## Current Implementation Audit

Current truth:

- `pipeline-health-check.py` targets `https://api.github.com/repos/living-ip/decision-engine`.
- `research/research-session.sh` targets GitHub `living-ip/decision-engine` and `github-admin-token`.
- `handoff/phase1-step3-script-migration.md` documents the Phase 1 single-bot (`livingIPbot`) posture and defers per-agent identities.
- `lib/config.py` still defaults to Forgejo `teleo/teleo-codex`.
- `lib/github_feedback.py` hardcodes `living-ip/teleo-codex` and reads `github-pat`, rather than `decision-engine` and `github-admin-token`.
- `lib/evaluate.py` posts review comments through Forgejo helpers and per-agent Forgejo tokens.
- `lib/github_feedback.py` is a mirror feedback channel keyed by `prs.github_pr`, not the canonical review transport.
- `deploy/sync-mirror.sh` still references `living-ip/teleo-codex`.
- Fwaz confirmed that separate GitHub identities are ideal but blocked on GitHub/PAT setup. Phase 1b implementation should not wait on six distinct accounts if the pipeline can post parseable `VERDICT:AGENT:*` comments through the pipeline bot.

## Existing-Spec Inventory

| Existing doc | Relevance | Decision |
| --- | --- | --- |
| `docs/phase1b-agent-routing-spec.md` | Parent identity posture. | Reuse. |
| `handoff/phase1-step3-script-migration.md` | Documents single bot token and GitHub `decision-engine` migration for scripts. | Reuse. |
| `handoff/deprecated/eval-scripts.md` | Confirms old eval scripts should not be revived. | Reuse. |

## Goal-Vs-Repo-Truth Diff

Goal:

- One canonical GitHub target for Phase 1b: `living-ip/decision-engine`.
- One master bot token for Phase 1b comments.
- Agent identity lives in verdict tags and comment headings.
- Comment posting supports idempotency by PR, head SHA, and agent.

Repo truth:

- GitHub target and token names are split across files.
- Eval comments still use Forgejo helpers.
- GitHub feedback is non-fatal mirror feedback, not agent review transport.

## Completion Percent And Remaining Delta

Current completion: 15 percent.

Remaining delta:

1. Add explicit GitHub target config with staging override.
2. Normalize token file selection or document compatibility.
3. Add Phase 1b comment posting helper for GitHub `decision-engine`.
4. Add idempotency marker.
5. Add tests for URL target, token path, missing token, and duplicate prevention.
6. Decide direct GitHub mode versus Forgejo-mirror mode before staging.

## Closure, Endpoint, And Deployment Truth

Local closure:

- Tests prove comments target `living-ip/decision-engine` and token material is not logged.

Staging closure:

- Sandbox PR comments are posted by master bot with agent verdict tags.

Production closure:

- Live `decision-engine` PR comments are posted by master bot without duplicates.

## Critical Assumptions And Invalidators

Assumptions:

- One bot account is enough for Phase 1b.
- Agent identity in verdict content satisfies acceptance.
- Formal GitHub reviews from distinct accounts are not required now.
- Per-agent PATs can be added later without changing the route contract.

Invalidators:

- Branch protection requires distinct GitHub reviewer identities.
- GitHub org disallows the selected PAT or bot account.
- Production daemon must remain Forgejo-first for the cutover window.
- Direct GitHub PRs lack the DB linkage used by existing `github_feedback`.

## State And Truth Contract

Comment idempotency marker:

```text
<!-- PHASE1B_REVIEW:PR=123:SHA=abc123:AGENT=RIO -->
```

Verdict marker remains:

```text
<!-- VERDICT:RIO:APPROVE -->
```
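
The two markers above could be produced and checked with a few small helpers. This is a minimal sketch, not the repo's implementation: the function names `review_marker`, `verdict_marker`, and `already_posted` are hypothetical, and it assumes the caller has already read back the existing comment bodies for the PR.

```python
from typing import Iterable


def review_marker(pr: int, sha: str, agent: str) -> str:
    """Hidden marker that identifies one review per PR, head SHA, and agent."""
    return f"<!-- PHASE1B_REVIEW:PR={pr}:SHA={sha}:AGENT={agent.upper()} -->"


def verdict_marker(agent: str, verdict: str) -> str:
    """Verdict tag in the existing VERDICT:AGENT:* shape."""
    return f"<!-- VERDICT:{agent.upper()}:{verdict.upper()} -->"


def already_posted(existing_bodies: Iterable[str], pr: int, sha: str, agent: str) -> bool:
    """True if any existing comment body already carries the exact review marker."""
    marker = review_marker(pr, sha, agent)
    return any(marker in body for body in existing_bodies)
```

Because the marker includes the head SHA, a new push to the same PR yields a different marker, so a fresh review for the new commit would not be treated as a duplicate.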

Required config:

```python
# SECRETS_DIR is assumed to be defined earlier in lib/config.py.
GITHUB_OWNER = "living-ip"
GITHUB_REPO = "decision-engine"
GITHUB_TOKEN_FILE = SECRETS_DIR / "github-admin-token"
```

Staging must override repo or owner without code changes.
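
One way to satisfy the override requirement is to read the owner and repo from the environment and fall back to the canonical values. The sketch below is an assumption, not the existing `lib/config.py` behavior: the environment variable names `PHASE1B_GITHUB_OWNER` and `PHASE1B_GITHUB_REPO` and the `GitHubTarget` class are hypothetical. The comments URL uses GitHub's issue-comments endpoint, which is where PR conversation comments are posted.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class GitHubTarget:
    owner: str
    repo: str

    @property
    def api_base(self) -> str:
        return f"https://api.github.com/repos/{self.owner}/{self.repo}"

    def comments_url(self, pr_number: int) -> str:
        # PR conversation comments go through the issue-comments endpoint.
        return f"{self.api_base}/issues/{pr_number}/comments"


def load_github_target(env: Mapping[str, str] = os.environ) -> GitHubTarget:
    """Canonical target by default; staging overrides via (hypothetical) env vars."""
    return GitHubTarget(
        owner=env.get("PHASE1B_GITHUB_OWNER", "living-ip"),
        repo=env.get("PHASE1B_GITHUB_REPO", "decision-engine"),
    )
```

Passing the environment as a mapping keeps the override testable without mutating the process environment.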

## Measurement Contract

Minimum tests:

- URL builder targets `https://api.github.com/repos/living-ip/decision-engine`.
- Staging override changes target.
- Missing token returns non-fatal failure and audit detail.
- Token value is never logged.
- Duplicate marker prevents repeat comment for same PR, SHA, and agent.
- Six agent verdict tags remain parseable.
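
The parseability test could be expressed against a small regex-based parser. This is a sketch under assumptions: it is not the parser exercised by `tests/test_eval_parse.py`, the name `parse_verdicts` is hypothetical, and only `RIO` and `APPROVE` come from this spec. Any other agent or verdict name used with it is a placeholder.

```python
import re
from typing import List, Tuple

# Matches <!-- VERDICT:AGENT:VERDICT --> with upper-case agent and verdict names.
VERDICT_RE = re.compile(r"<!--\s*VERDICT:([A-Z][A-Z0-9_]*):([A-Z_]+)\s*-->")


def parse_verdicts(body: str) -> List[Tuple[str, str]]:
    """Return every (agent, verdict) pair found in a comment body, in order."""
    return VERDICT_RE.findall(body)
```

The idempotency marker does not start with `VERDICT:`, so it is ignored by this pattern and the two markers can share a comment body.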

## Backend Work Required

Owned files:

- `lib/github_feedback.py` or a new `lib/github_reviews.py`.
- `lib/config.py`.
- `lib/evaluate.py` only where the eval integration calls the comment helper.
- `tests/test_github_identity.py` or equivalent.

Implementation steps:

1. Add canonical GitHub target config.
2. Add token lookup that prefers `github-admin-token` for Phase 1b and can fall back only if explicitly configured.
3. Add comment helper for agent verdict comments.
4. Add idempotency marker and readback check.
5. Add tests.
6. Wire eval integration to the helper under Phase 1b flag.

Forbidden files:

- Deprecated eval shell scripts.
- Production secrets.
- Broad deploy rewrite.
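
Implementation step 2 (token lookup) might look like the following. It is a minimal sketch with a hypothetical `read_token` helper: it assumes tokens are stored as plain-text files, treats a missing or empty file as a non-fatal result, and records only the file name in the audit detail, never the token value.

```python
import logging
from pathlib import Path
from typing import Optional, Tuple

log = logging.getLogger("phase1b.github")


def read_token(primary: Path, fallback: Optional[Path] = None) -> Tuple[Optional[str], str]:
    """Return (token, audit_detail); never raises for a missing file."""
    # The fallback is consulted only when the caller passes it explicitly.
    candidates = [primary] + ([fallback] if fallback is not None else [])
    for path in candidates:
        try:
            token = path.read_text(encoding="utf-8").strip()
        except OSError:
            continue  # missing or unreadable file: try the next candidate
        if token:
            detail = f"token loaded from {path.name}"
            log.info(detail)  # file name only, not the token value
            return token, detail
    detail = "missing token: tried " + ", ".join(p.name for p in candidates)
    log.warning(detail)
    return None, detail
```

A caller would pass `github-admin-token` as `primary` and supply `github-pat` as `fallback` only when compatibility mode is explicitly configured.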

## Frontend Work Required

None.

## Expected Runtime And User-Visible Behavior

PR comment example:

```text
## Rio review

<review text>

<!-- PHASE1B_REVIEW:PR=123:SHA=abc123:AGENT=RIO -->
<!-- VERDICT:RIO:APPROVE -->
```

The GitHub account may be a master bot. The comment content must show which agent reviewed.
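
A comment body in that shape could be assembled as follows. This is an illustrative sketch: `render_review_comment` is a hypothetical name, and the heading format is inferred from the single example in this section.

```python
def render_review_comment(agent: str, review_text: str, verdict: str, pr: int, sha: str) -> str:
    """Build the visible heading and review text, followed by the two hidden markers."""
    tag = agent.upper()
    return "\n".join(
        [
            f"## {agent.capitalize()} review",  # visible agent identity
            "",
            review_text.strip(),
            "",
            f"<!-- PHASE1B_REVIEW:PR={pr}:SHA={sha}:AGENT={tag} -->",  # idempotency marker
            f"<!-- VERDICT:{tag}:{verdict.upper()} -->",  # verdict tag
        ]
    )
```

Keeping the agent name in both the heading and the markers means a human reader and the verdict parser each see which agent reviewed, even though one bot account posts every comment.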

## Validation And Test Matrix

Commands:

```bash
python3 -m pytest tests/test_github_identity.py tests/test_eval_parse.py
python3 -m ruff check lib/github_feedback.py lib/config.py tests/test_github_identity.py
git diff --check
```

Test cases:

- canonical target
- staging override
- missing token
- no token logging
- idempotent comment marker
- all six verdict tags parse

## CI/CD, Release, And Pre-Push Gate Contract

Before PR:

- Local tests prove target and idempotency.

Before staging:

- Sandbox repo token exists.
- Production token is not used.

Before production:

- Bot account has comment permissions on `decision-engine`.
- Rollback path is the old Forgejo transport or disabling the Phase 1b flag.

## Independent CLI Audit Contract

Reviewer checks:

```bash
rg -n "teleo-codex|decision-engine|github-admin-token|github-pat|VERDICT|PHASE1B_REVIEW" lib tests pipeline-health-check.py research deploy
```

Audit questions:

- Which files still target `teleo-codex`?
- Are those files in the Phase 1b runtime path?
- Does any log path expose token values?
- Does idempotency prevent duplicate comments?

## Outside-The-Box Fix Paths

If direct GitHub comments are not safe in the first PR:

- Keep Forgejo review transport and post GitHub mirror feedback only in staging.
- Add a dry-run comment mode that writes the planned body into audit logs.

If GitHub PAT remains blocked:

- Use a GitHub App only for comment posting.
- Keep the master bot for git push but use the app token for PR comments.

## Maintenance Capture

Beneficial now:

- Name GitHub target config clearly.
- Avoid proliferating `github-pat` versus `github-admin-token`.

Avoid now:

- Separate agent GitHub users.
- Full mirror rewrite.
- Contributor identity overhaul.

## Parallelization And Fanout

Classification: ready_now after the implementer explicitly chooses direct GitHub comments or Forgejo-mirror compatibility for the Phase 1b flag path.

Worker-ready prompt:

```text
implement phase 1b github review comment posture. use one master bot token, target living-ip/decision-engine with staging override support, add agent-specific verdict comment helper with idempotency marker, and prove no token leakage. do not create separate agent accounts or rewrite deploy/mirror broadly.
```

## Acceptance Criteria

- Phase 1b comment helper targets `decision-engine`.
- Master bot can post agent verdict tags.
- Duplicate comments are prevented.
- Missing token is non-fatal and auditable.
- Existing old transport remains rollback-safe.

## Readiness And Claim Boundaries

Allowed claim:

- "Master-bot GitHub verdict comment posture is locally specified/tested."

Forbidden claim:

- "Separate agent GitHub identities are solved."

## Spec Quality Self-Audit

All required execution-grade headings are present. The exact direct-GitHub versus Forgejo-mirror cutover remains a deliberate implementation decision because current daemon code is Forgejo-first.

## Assistant-Added Caveats

The repo has real target drift between `teleo-codex` and `decision-engine`. Do not hide that drift in the eval implementation. The Phase 1b PR should either fix the runtime path it uses or explicitly leave non-runtime references for a later migration.