338 lines
9.9 KiB
Markdown
338 lines
9.9 KiB
Markdown
# Phase 1b Child Spec: Agent Identity Router
|
|
|
|
Created: 2026-05-29
|
|
Status: active draft
|
|
Parent spec: `docs/phase1b-agent-routing-spec.md`
|
|
|
|
## Product Outcome Contract
|
|
|
|
The router decides which Hermes agent identity should review a `decision-engine` KB PR. It must route by agent ownership, with file paths as strong evidence but not the only source of truth.
|
|
|
|
## Goal
|
|
|
|
Implement a pure, deterministic, evidence-bearing route scorer that returns one or two required reviewer agents for a PR.
|
|
|
|
## Non-Goals
|
|
|
|
- Do not call paid LLMs for routing.
|
|
- Do not post PR comments.
|
|
- Do not mutate pipeline DB state.
|
|
- Do not deploy to VPS.
|
|
- Do not implement general user-input routing outside PR evaluation.
|
|
|
|
## Current Implementation Audit
|
|
|
|
Current relevant code:
|
|
|
|
- `lib/domains.py` contains `DOMAIN_AGENT_MAP`, `agent_for_domain`, `detect_domain_from_diff`, and `detect_domain_from_branch`.
|
|
- `lib/agent_routing.py` now owns the Phase 1b identity-scored route contract.
|
|
- The obsolete local `DomainRoute` folder-first draft and its draft tests were removed before this branch was committed.
|
|
- Cross-domain PRs now require the top 2 routed agents locally, with `route_kind="escalated"` when more than two agents scored.
|
|
|
|
Existing implementation truth:
|
|
|
|
- The repo already has domain detection that can be reused for path signals.
|
|
- The new route tests cover six primary agents, broadened ownership domains, top-2 cross-domain routing, fallback, and deterministic repeat behavior.
|
|
- The existing map includes adjacent domains such as `mechanisms`, `living-capital`, `living-agents`, `critical-systems`, `collective-intelligence`, `teleological-economics`, and `cultural-dynamics`.
|
|
- The product owner clarified that Phase 1b should use agent identities to route, not only folder names.
|
|
|
|
## Existing-Spec Inventory
|
|
|
|
| Existing doc | Relevance | Decision |
|
|
| --- | --- | --- |
|
|
| `docs/phase1b-agent-routing-spec.md` | Umbrella source of truth. | Reuse. |
|
|
| `docs/queue.md` | Notes `ai-alignment` domain evolution. | Reuse as a signal for Theseus ownership. |
|
|
| `docs/ARCHITECTURE.md` | Describes eval stage shape. | Context only. |
|
|
|
|
## Goal-Vs-Repo-Truth Diff
|
|
|
|
Goal:
|
|
|
|
- Return `AgentRoute` with `primary_agent`, `required_agents`, `route_kind`, `scores`, and `evidence`.
|
|
- Cap cross-domain routes at top 2 agents.
|
|
- Treat folders as evidence, not the complete classifier.
|
|
- Be testable without network, DB, GitHub, or LLM calls.
|
|
|
|
Repo truth:
|
|
|
|
- Existing classifier returns one folder-domain string or `None`.
|
|
- No scores, evidence, or top-2 agent set exist.
|
|
- Existing tests do not cover identity-broadened ownership.
|
|
|
|
## Completion Percent And Remaining Delta
|
|
|
|
Current completion on this branch: 100 percent for local route logic, 0 percent for staging route calibration.
|
|
|
|
Remaining delta:
|
|
|
|
1. Review the route weights against real recent `decision-engine` PRs.
|
|
2. Calibrate ambiguous keyword cases from staging evidence.
|
|
3. Decide whether escalated routes should remain top-2 total or become Leo plus top-2 later.
|
|
|
|
## Closure, Endpoint, And Deployment Truth
|
|
|
|
Local closure:
|
|
|
|
- Route tests pass.
|
|
- No network or DB dependency exists in route tests.
|
|
|
|
Staging closure:
|
|
|
|
- Staging proof artifact records route scores and evidence for seven sandbox PRs.
|
|
|
|
Production closure:
|
|
|
|
- Live PR audit rows show route evidence and required agents.
|
|
|
|
This child spec alone cannot prove staging or production behavior.
|
|
|
|
## Critical Assumptions And Invalidators
|
|
|
|
Assumptions:
|
|
|
|
- `decision-engine` file layout is close enough to current local clone for path signals to apply.
|
|
- Agent identity ownership from m3taversal is authoritative.
|
|
- Top-2 cap is acceptable for cross-domain cases.
|
|
|
|
Invalidators:
|
|
|
|
- Product owner changes cross-domain rule from top 2 to all touched agents.
|
|
- Agent ownership boundaries change materially.
|
|
- Production PR metadata lacks branch or changed-file data.
|
|
|
|
## State And Truth Contract
|
|
|
|
Route output schema:
|
|
|
|
```python
|
|
AgentRoute(
|
|
primary_agent="Rio",
|
|
required_agents=("Rio",),
|
|
route_kind="single",
|
|
scores={"Leo": 0, "Theseus": 1, "Rio": 9, "Vida": 0, "Clay": 0, "Astra": 0},
|
|
evidence=[
|
|
{"agent": "Rio", "signal": "path", "weight": 8, "value": "domains/internet-finance/foo.md"}
|
|
],
|
|
fallback=False,
|
|
)
|
|
```
|
|
|
|
`route_kind` values:
|
|
|
|
- `single`
|
|
- `multi`
|
|
- `fallback`
|
|
- `escalated`
|
|
|
|
`required_agents` must never contain more than two agents in Phase 1b.
|
|
|
|
## Measurement Contract
|
|
|
|
Required route fixture cases:
|
|
|
|
| Fixture | Expected |
|
|
| --- | --- |
|
|
| `domains/grand-strategy/foo.md` | Leo |
|
|
| `domains/ai-alignment/foo.md` | Theseus |
|
|
| `domains/internet-finance/foo.md` | Rio |
|
|
| `domains/health/foo.md` | Vida |
|
|
| `domains/entertainment/foo.md` | Clay |
|
|
| `domains/space-development/foo.md` | Astra |
|
|
| `domains/energy/foo.md` | Astra |
|
|
| `domains/robotics/foo.md` | Astra |
|
|
| `domains/manufacturing/foo.md` | Astra |
|
|
| `core/living-capital/foo.md` | Rio |
|
|
| `core/living-agents/foo.md` | Theseus |
|
|
| `foundations/cultural-dynamics/foo.md` | Clay |
|
|
| AI plus x402 diff | Theseus and Rio |
|
|
| collective AI goals diff | Leo and Theseus |
|
|
|
|
Minimum quality metrics:
|
|
|
|
- `route_fixture_pass_rate = 100 percent`
|
|
- `fallback_count = 0` for known fixtures
|
|
- deterministic repeat count: same input returns same result 100 times
|
|
|
|
## Backend Work Required
|
|
|
|
Owned files:
|
|
|
|
- `lib/agent_routing.py`
|
|
- `lib/domains.py`
|
|
- `tests/test_agent_routing.py`
|
|
|
|
Implementation steps:
|
|
|
|
1. Move new identity routing into `lib/agent_routing.py`.
|
|
2. Keep `lib/domains.py` as compatibility for domain-oriented callers.
|
|
3. Define `AGENT_ORDER = ("Leo", "Theseus", "Rio", "Vida", "Clay", "Astra")`.
|
|
4. Define identity signals per agent.
|
|
5. Add path signal extraction for `domains`, `entities`, `core`, `foundations`, and `agents`.
|
|
6. Add branch prefix signal extraction.
|
|
7. Add capped keyword scoring from filenames and diff text.
|
|
8. Add top-2 selection rule.
|
|
9. Add fallback to Leo.
|
|
10. Add tests.
|
|
|
|
Forbidden files:
|
|
|
|
- `lib/evaluate.py`
|
|
- `lib/llm.py`
|
|
- deploy scripts
|
|
- secrets or runtime config outside route feature flag wiring
|
|
|
|
## Frontend Work Required
|
|
|
|
None.
|
|
|
|
## Expected Runtime And User-Visible Behavior
|
|
|
|
The router itself has no user-visible UI. Its behavior becomes visible through audit logs, PR comment reviewer selection, and proof artifacts.
|
|
|
|
Example:
|
|
|
|
```text
|
|
input: domains/internet-finance/x402-agent-payments.md
|
|
output: required_agents = ["Rio"]
|
|
```
|
|
|
|
Cross-domain example:
|
|
|
|
```text
|
|
input: ai systems claim plus x402 payment claim
|
|
output: required_agents = ["Theseus", "Rio"]
|
|
```
|
|
|
|
## Validation And Test Matrix
|
|
|
|
Commands:
|
|
|
|
```bash
|
|
python3 -m pytest tests/test_agent_routing.py
|
|
python3 -m ruff check lib/agent_routing.py lib/domains.py tests/test_agent_routing.py
|
|
git diff --check
|
|
```
|
|
|
|
Test classes:
|
|
|
|
- primary ownership routes
|
|
- broadened ownership routes
|
|
- branch fallback routes
|
|
- keyword routes
|
|
- top-2 cross-domain routes
|
|
- fallback routes
|
|
- deterministic tie-breaking
|
|
- compatibility wrapper behavior
|
|
|
|
## CI/CD, Release, And Pre-Push Gate Contract
|
|
|
|
Before PR:
|
|
|
|
- Route tests pass locally.
|
|
- No production config defaults change.
|
|
- No network dependency enters route tests.
|
|
|
|
Before staging:
|
|
|
|
- Eval integration spec consumes the route result without modifying route internals.
|
|
|
|
Before production:
|
|
|
|
- Route evidence appears in staging proof artifact.
|
|
|
|
## Independent CLI Audit Contract
|
|
|
|
Reviewer commands:
|
|
|
|
```bash
|
|
git diff -- lib/agent_routing.py lib/domains.py tests/test_agent_routing.py
|
|
python3 -m pytest tests/test_agent_routing.py
|
|
```
|
|
|
|
Reviewer checks:
|
|
|
|
- Route function is pure.
|
|
- Scores are explainable.
|
|
- Top-2 cap is enforced.
|
|
- Folder paths are not the only signal.
|
|
- Old callers still work or have a clear migration path.
|
|
|
|
## Outside-The-Box Fix Paths
|
|
|
|
If keyword scoring is noisy:
|
|
|
|
- Disable diff keyword scoring and use path plus branch only.
|
|
- Use LLM classifier in shadow mode only.
|
|
- Add explicit PR label or frontmatter hint later.
|
|
|
|
If identity boundaries are ambiguous:
|
|
|
|
- Prefer top-2 over fallback when two agents have meaningful scores.
|
|
- Log route evidence for later calibration.
|
|
|
|
## Maintenance Capture
|
|
|
|
Beneficial now:
|
|
|
|
- Keep route logic out of `lib/evaluate.py`.
|
|
- Keep compatibility wrappers narrow.
|
|
|
|
Avoid now:
|
|
|
|
- Large domain taxonomy rewrite.
|
|
- Dashboard UI changes.
|
|
- Paid classifier calls.
|
|
|
|
## Parallelization And Fanout
|
|
|
|
Classification: local_owner.
|
|
|
|
Do not fan out implementation. This module is a root contract consumed by eval integration.
|
|
|
|
Worker-ready prompt:
|
|
|
|
```text
|
|
implement the phase 1b agent identity router in teleo-infrastructure. own lib/agent_routing.py, lib/domains.py compatibility wrappers, and route tests only. make the route function pure, deterministic, evidence-bearing, and capped at top 2 required agents. do not touch eval integration or deploy code.
|
|
```
|
|
|
|
## Acceptance Criteria
|
|
|
|
- All required route fixtures pass.
|
|
- Route result includes primary agent, required agents, route kind, scores, evidence, and fallback status.
|
|
- Cross-domain route never requires more than two agents.
|
|
- No LLM, network, DB, or GitHub calls occur in the router.
|
|
|
|
## Readiness And Claim Boundaries
|
|
|
|
Allowed claim:
|
|
|
|
- "Agent identity routing is locally implemented and unit-tested."
|
|
|
|
Forbidden claim:
|
|
|
|
- "Phase 1b eval is complete."
|
|
|
|
## Spec Quality Self-Audit
|
|
|
|
Required headings present:
|
|
|
|
- Current Implementation Audit: present.
|
|
- Goal-Vs-Repo-Truth Diff: present.
|
|
- Completion Percent And Remaining Delta: present.
|
|
- Closure, Endpoint, And Deployment Truth: present.
|
|
- Critical Assumptions And Invalidators: present.
|
|
- State And Truth Contract: present.
|
|
- Measurement Contract: present.
|
|
- Backend Work Required: present.
|
|
- Frontend Work Required: present.
|
|
- Expected Runtime And User-Visible Behavior: present.
|
|
- Validation And Test Matrix: present.
|
|
- CI/CD, Release, And Pre-Push Gate Contract: present.
|
|
- Independent CLI Audit Contract: present.
|
|
- Outside-The-Box Fix Paths: present.
|
|
- Maintenance Capture: present.
|
|
- Parallelization And Fanout: present.
|
|
|
|
## Assistant-Added Caveats
|
|
|
|
This child spec intentionally keeps routing deterministic and no-spend. That may be less semantically smart than an LLM classifier, but it is the right first implementation for Phase 1b because it is testable, cheap, and auditable.
|