Add phase 1b local review guide

Commit `cdb0b1498d` (parent `ca96f5f8e3`)
1 changed file with 125 additions and 0 deletions: `docs/phase1b/local-review-guide.md` (normal file)

@@ -0,0 +1,125 @@

# Phase 1b Local Review Guide

Status: local-only review artifact
Branch: `phase1b-agent-routing-local`

## What This Repo Is

`teleo-infrastructure` is the pipeline/runtime repo. For Phase 1b, it owns the evaluation daemon logic that watches PRs, fetches diffs, runs reviewers, posts verdict comments, and moves PR state toward merge or feedback.

Canonical split for this phase:

- KB repo: `decision-engine`
- implementation/runtime repo: `teleo-infrastructure`
- production runtime: VPS under `/opt/teleo-eval`, not currently accessible from this workspace

## What This Branch Changes

Local code changes:

- `lib/agent_routing.py`: new pure router that maps a PR diff to one or two Hermes agents.
- `lib/config.py`: adds `PHASE1B_AGENT_ROUTING_ENABLED`, default `false`.
- `lib/evaluate.py`: adds a feature-flagged Phase 1b eval path.
- `lib/llm.py`: adds `run_agent_review`.
- `tests/test_agent_routing.py`: router tests.
- `tests/test_evaluate_agent_routing.py`: mocked eval tests.
- `tests/test_eval_parse.py`: parser coverage for all six `VERDICT:AGENT:*` variants.
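
To make "pure router" concrete, here is a minimal sketch of the shape such a function could take. It is not the code in `lib/agent_routing.py`: the path prefixes, agent names, fallback agent, and function names are all hypothetical, and the real routing rules may look nothing like a prefix table. The only properties taken from the description above are that the input is a PR diff, the output is one or two agents, and the function has no side effects.

```python
from collections import Counter
from typing import Iterable

# HYPOTHETICAL prefix -> agent table. Neither the prefixes nor the agent
# names come from the branch; they only illustrate the shape of a router.
PREFIX_TO_AGENT = {
    "domains/alpha/": "agent-alpha",
    "domains/beta/": "agent-beta",
    "domains/gamma/": "agent-gamma",
}
FALLBACK_AGENT = "agent-generalist"  # hypothetical default reviewer
MAX_AGENTS = 2  # "one or two agents", per the description above


def changed_paths(diff_text: str) -> list[str]:
    """Extract post-image paths from the '+++ b/<path>' lines of a unified diff."""
    marker = "+++ b/"
    return [
        line[len(marker):]
        for line in diff_text.splitlines()
        if line.startswith(marker)
    ]


def route_changed_paths(paths: Iterable[str]) -> list[str]:
    """Pure function: the same paths always yield the same one or two agents."""
    hits = Counter()
    for path in paths:
        for prefix, agent in PREFIX_TO_AGENT.items():
            if path.startswith(prefix):
                hits[agent] += 1
                break
    if not hits:
        return [FALLBACK_AGENT]
    # Most-touched agents first; ties are broken by name so output is stable.
    ranked = sorted(hits, key=lambda agent: (-hits[agent], agent))
    return ranked[:MAX_AGENTS]


def route_diff(diff_text: str) -> list[str]:
    """Map a unified diff to the agents that should review it."""
    return route_changed_paths(changed_paths(diff_text))
```

Because the sketch touches no network, filesystem, or global state, it can be tested with plain string inputs, which is presumably what makes a dedicated `tests/test_agent_routing.py` cheap to run.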

Spec/docs changes:

- `docs/phase1b-agent-routing-spec.md`
- `docs/phase1b/README.md`
- child specs under `docs/phase1b/`
- `docs/phase1b/staging-blocker.json`

## What It Does Not Change

- It does not enable Phase 1b in production.
- It does not touch the VPS.
- It does not create or require six GitHub identities.
- It does not solve the Forgejo-vs-GitHub cutover.
- It does not fix unrelated full-suite failures.

## Current Safety Posture

The feature flag defaults off:

```text
PHASE1B_AGENT_ROUTING_ENABLED=false
```

With the flag off, the legacy eval path remains available. The Phase 1b path should only run in staging or a controlled daemon after explicit env config.
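
A default-off flag of this kind is commonly read as shown in the sketch below. How `lib/config.py` actually parses the variable is not shown in this guide, so the accepted truthy spellings and both function names here are assumptions; only the variable name and its `false` default come from the branch.

```python
import os
from typing import Mapping, Optional

# ASSUMPTION: which spellings count as "on". The real parser may be stricter.
_TRUTHY = {"1", "true", "yes", "on"}


def phase1b_agent_routing_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when the flag is explicitly set to a truthy value."""
    env = os.environ if environ is None else environ
    raw = env.get("PHASE1B_AGENT_ROUTING_ENABLED", "false")
    return raw.strip().lower() in _TRUTHY


def choose_eval_path(environ: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical dispatcher: pick the Phase 1b path only behind the flag."""
    return "phase1b" if phase1b_agent_routing_enabled(environ) else "legacy"
```

With this shape, an unset, empty, or misspelled value falls back to the legacy path, so a configuration mistake fails safe rather than enabling Phase 1b by accident.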

The local review hardening pass removed changes to `lib/domains.py` so the legacy domain map is not changed by this branch.

## Local Proof

Focused proof that currently passes:

```bash
.venv/bin/python -m pytest tests/test_agent_routing.py tests/test_evaluate_agent_routing.py tests/test_eval_parse.py
.venv/bin/ruff check lib/agent_routing.py lib/domains.py lib/evaluate.py lib/llm.py lib/config.py tests/test_agent_routing.py tests/test_evaluate_agent_routing.py
git diff --check
```

Latest focused result:

```text
61 passed
ruff: all checks passed
git diff --check: passed
```
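
For reviewers who prefer a single entry point, the three focused-proof commands can be wrapped in a small script that stops at the first failure. This is only a convenience sketch: no such script exists on the branch, the function names are hypothetical, and the command lists are copied from the focused proof above. The injectable `runner` parameter exists so the wrapper itself can be exercised without a virtualenv.

```python
import subprocess
from typing import Callable, Sequence

# Commands copied from the focused proof; run them from the repo root.
FOCUSED_PROOF: list[list[str]] = [
    [".venv/bin/python", "-m", "pytest",
     "tests/test_agent_routing.py",
     "tests/test_evaluate_agent_routing.py",
     "tests/test_eval_parse.py"],
    [".venv/bin/ruff", "check",
     "lib/agent_routing.py", "lib/domains.py", "lib/evaluate.py",
     "lib/llm.py", "lib/config.py",
     "tests/test_agent_routing.py",
     "tests/test_evaluate_agent_routing.py"],
    ["git", "diff", "--check"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code without raising on failure."""
    return subprocess.run(list(cmd), check=False).returncode


def run_focused_proof(runner: Callable[[Sequence[str]], int] = _run) -> bool:
    """Run each command in order; report and stop at the first non-zero exit."""
    for cmd in FOCUSED_PROOF:
        if runner(cmd) != 0:
            print("FAILED:", " ".join(cmd))
            return False
    print("focused proof passed")
    return True
```

Calling `run_focused_proof()` from the repository root runs the real commands. A green result from this wrapper says nothing about the full suite, which is tracked separately.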

Full-suite status:

```text
406 passed, 12 failed, 3 errors
```

Known full-suite failure groups:

- `db.migrate` fresh-fixture rebuild error: `prs_new has no column named auto_merge`
- contributor test fixture missing `submitted_by`
- date/frontmatter expectations in `test_post_extract.py`
- search threshold expectation in `test_search.py`
- missing `python-telegram-bot` imports for X content tests

Those failures mean this branch should not be called repo-green or PR-ready.

## How To Review Locally

Stay local:

```bash
git switch phase1b-agent-routing-local
git status --short --branch
git diff main...HEAD --stat
git diff main...HEAD -- lib/agent_routing.py lib/evaluate.py lib/llm.py lib/config.py
```

Review the behavior in this order:

1. `lib/agent_routing.py`
2. `tests/test_agent_routing.py`
3. `lib/evaluate.py`
4. `tests/test_evaluate_agent_routing.py`
5. `docs/phase1b/staging-blocker.json`

## Before Any PR

Do not open a PR until at least one of these is true:

- full-suite failures are fixed, or triaged into accepted unrelated failures with issue links;
- staging access is available and a sandbox proof path is ready;
- m3taversal/Fwaz explicitly accept a local-only draft review without staging proof.

## Before Production

Production requires:

- staging proof against sandbox `decision-engine`;
- exact reviewed SHA;
- Leo signoff;
- no direct VPS self-upgrades;
- `PHASE1B_AGENT_ROUTING_ENABLED` enabled only after the cutover plan is written;
- rollback path to flag-off behavior.
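
The production list is an all-of gate, which a checklist object can make explicit. The sketch below is illustrative only: the class, field names, and messages are hypothetical and nothing like it exists on the branch. "No direct VPS self-upgrades" is a process rule rather than a checkable fact, so it appears only as a comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductionGate:
    """Hypothetical record of the production preconditions listed above."""
    staging_proof_passed: bool   # proof against sandbox decision-engine
    reviewed_sha: str            # SHA that reviewers actually signed off on
    candidate_sha: str           # SHA proposed for deployment
    leo_signed_off: bool
    cutover_plan_written: bool   # must precede enabling the flag
    rollback_verified: bool      # flag-off behavior confirmed as fallback
    # Not modeled: "no direct VPS self-upgrades" is enforced by process.

    def blockers(self) -> list[str]:
        """Return every unmet requirement; an empty list means all are met."""
        found = []
        if not self.staging_proof_passed:
            found.append("staging proof missing")
        if self.candidate_sha != self.reviewed_sha:
            found.append("candidate SHA differs from reviewed SHA")
        if not self.leo_signed_off:
            found.append("Leo signoff missing")
        if not self.cutover_plan_written:
            found.append("cutover plan not written")
        if not self.rollback_verified:
            found.append("rollback path not verified")
        return found
```

Returning the full list of blockers, rather than a single boolean, keeps the gate useful as a status report while any requirement is still open.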