teleo-infrastructure/docs/phase1b/agent-identity-router-spec.md

# Phase 1b Child Spec: Agent Identity Router

Created: 2026-05-29
Status: active draft
Parent spec: `docs/phase1b-agent-routing-spec.md`

## Product Outcome Contract

The router decides which Hermes agent identity should review a `decision-engine` KB PR. It must route by agent ownership, with file paths as strong evidence but not the only source of truth.

## Goal

Implement a pure, deterministic, evidence-bearing route scorer that returns one or two required reviewer agents for a PR.

## Non-Goals

- Do not call paid LLMs for routing.
- Do not post PR comments.
- Do not mutate pipeline DB state.
- Do not deploy to VPS.
- Do not implement general user-input routing outside PR evaluation.

## Current Implementation Audit

Current relevant code:

- `lib/domains.py` contains `DOMAIN_AGENT_MAP`, `agent_for_domain`, `detect_domain_from_diff`, and `detect_domain_from_branch`.
- `lib/agent_routing.py` now owns the Phase 1b identity-scored route contract.
- The obsolete local `DomainRoute` folder-first draft and its draft tests were removed before this branch was committed.
- In the local implementation, cross-domain PRs now require the top two scored agents, with `route_kind="escalated"` when more than two agents scored.

Existing implementation truth:

- The repo already has domain detection that can be reused for path signals.
- The new route tests cover six primary agents, broadened ownership domains, top-2 cross-domain routing, fallback, and deterministic repeat behavior.
- The existing map includes adjacent domains such as `mechanisms`, `living-capital`, `living-agents`, `critical-systems`, `collective-intelligence`, `teleological-economics`, and `cultural-dynamics`.
- The product owner clarified that Phase 1b should use agent identities to route, not only folder names.

## Existing-Spec Inventory

| Existing doc | Relevance | Decision |
| --- | --- | --- |
| `docs/phase1b-agent-routing-spec.md` | Umbrella source of truth. | Reuse. |
| `docs/queue.md` | Notes `ai-alignment` domain evolution. | Reuse as a signal for Theseus ownership. |
| `docs/ARCHITECTURE.md` | Describes eval stage shape. | Context only. |

## Goal-Vs-Repo-Truth Diff

Goal:

- Return `AgentRoute` with `primary_agent`, `required_agents`, `route_kind`, `scores`, and `evidence`.
- Cap cross-domain routes at top 2 agents.
- Treat folders as evidence, not the complete classifier.
- Be testable without network, DB, GitHub, or LLM calls.

Repo truth (baseline before this branch):

- The existing classifier in `lib/domains.py` returns one folder-domain string or `None`.
- It produces no scores, evidence, or top-2 agent set.
- The pre-existing tests do not cover identity-broadened ownership.

## Completion Percent And Remaining Delta

Current completion on this branch: 100 percent for local route logic, 0 percent for staging route calibration.

Remaining delta:

1. Review the route weights against real recent `decision-engine` PRs.
2. Calibrate ambiguous keyword cases from staging evidence.
3. Decide whether escalated routes should remain top-2 total or become Leo plus top-2 later.

## Closure, Endpoint, And Deployment Truth

Local closure:

- Route tests pass.
- No network or DB dependency exists in route tests.

Staging closure:

- Staging proof artifact records route scores and evidence for seven sandbox PRs.

Production closure:

- Live PR audit rows show route evidence and required agents.

This child spec alone cannot prove staging or production behavior.

## Critical Assumptions And Invalidators

Assumptions:

- The `decision-engine` file layout is close enough to the current local clone for path signals to apply.
- Agent identity ownership from m3taversal is authoritative.
- Top-2 cap is acceptable for cross-domain cases.

Invalidators:

- Product owner changes cross-domain rule from top 2 to all touched agents.
- Agent ownership boundaries change materially.
- Production PR metadata lacks branch or changed-file data.

## State And Truth Contract

Route output schema:

```python
AgentRoute(
    primary_agent="Rio",
    required_agents=("Rio",),
    route_kind="single",
    scores={"Leo": 0, "Theseus": 1, "Rio": 9, "Vida": 0, "Clay": 0, "Astra": 0},
    evidence=[
        {"agent": "Rio", "signal": "path", "weight": 8, "value": "domains/internet-finance/foo.md"}
    ],
    fallback=False,
)
```

`route_kind` values:

- `single`
- `multi`
- `fallback`
- `escalated`

`required_agents` must never contain more than two agents in Phase 1b.
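
The schema and the two-agent cap could be enforced at construction time. The following is a minimal sketch, assuming a frozen dataclass. The field names follow the schema above; `ROUTE_KINDS`, `MAX_REQUIRED_AGENTS`, and the rule that `primary_agent` must appear in `required_agents` are illustrative assumptions, not confirmed details of `lib/agent_routing.py`.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Sequence, Tuple

ROUTE_KINDS = ("single", "multi", "fallback", "escalated")
MAX_REQUIRED_AGENTS = 2  # Phase 1b cap (constant name is hypothetical)


@dataclass(frozen=True)
class AgentRoute:
    primary_agent: str
    required_agents: Tuple[str, ...]
    route_kind: str
    scores: Mapping[str, int]
    evidence: Sequence[Mapping[str, Any]] = ()
    fallback: bool = False

    def __post_init__(self) -> None:
        # Reject route kinds outside the four values listed in this spec.
        if self.route_kind not in ROUTE_KINDS:
            raise ValueError(f"unknown route_kind: {self.route_kind!r}")
        # Enforce the Phase 1b cap of one or two required agents.
        if not 1 <= len(self.required_agents) <= MAX_REQUIRED_AGENTS:
            raise ValueError("required_agents must contain one or two agents")
        # Assumption: the primary agent is always one of the required agents.
        if self.primary_agent not in self.required_agents:
            raise ValueError("primary_agent must be in required_agents")
```

Validating in `__post_init__` means a route that violates the cap cannot be constructed at all, so downstream consumers do not need to re-check it.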

## Measurement Contract

Required route fixture cases:

| Fixture | Expected |
| --- | --- |
| `domains/grand-strategy/foo.md` | Leo |
| `domains/ai-alignment/foo.md` | Theseus |
| `domains/internet-finance/foo.md` | Rio |
| `domains/health/foo.md` | Vida |
| `domains/entertainment/foo.md` | Clay |
| `domains/space-development/foo.md` | Astra |
| `domains/energy/foo.md` | Astra |
| `domains/robotics/foo.md` | Astra |
| `domains/manufacturing/foo.md` | Astra |
| `core/living-capital/foo.md` | Rio |
| `core/living-agents/foo.md` | Theseus |
| `foundations/cultural-dynamics/foo.md` | Clay |
| AI plus x402 diff | Theseus and Rio |
| collective AI goals diff | Leo and Theseus |

Minimum quality metrics:

- `route_fixture_pass_rate = 100 percent`
- `fallback_count = 0` for known fixtures
- deterministic repeat count: the same input returns an identical result across 100 consecutive calls

## Backend Work Required

Owned files:

- `lib/agent_routing.py`
- `lib/domains.py`
- `tests/test_agent_routing.py`

Implementation steps:

1. Move new identity routing into `lib/agent_routing.py`.
2. Keep `lib/domains.py` as a compatibility layer for domain-oriented callers.
3. Define `AGENT_ORDER = ("Leo", "Theseus", "Rio", "Vida", "Clay", "Astra")`.
4. Define identity signals per agent.
5. Add path signal extraction for `domains`, `entities`, `core`, `foundations`, and `agents`.
6. Add branch prefix signal extraction.
7. Add capped keyword scoring from filenames and diff text.
8. Add top-2 selection rule.
9. Add fallback to Leo.
10. Add tests.
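
Steps 3, 5, 8, and 9 can be sketched as follows. This is an illustration, not the contents of `lib/agent_routing.py`: `PATH_OWNERS` is built only from the fixture paths in this spec, `PATH_WEIGHT` reuses the weight from the schema example, and `MIN_REQUIRED_SCORE` plus the tie-break by `AGENT_ORDER` are assumptions.

```python
AGENT_ORDER = ("Leo", "Theseus", "Rio", "Vida", "Clay", "Astra")

# Assumed ownership table, taken from the fixture paths in this spec.
PATH_OWNERS = {
    "domains/grand-strategy/": "Leo",
    "domains/ai-alignment/": "Theseus",
    "domains/internet-finance/": "Rio",
    "domains/health/": "Vida",
    "domains/entertainment/": "Clay",
    "domains/space-development/": "Astra",
    "domains/energy/": "Astra",
    "domains/robotics/": "Astra",
    "domains/manufacturing/": "Astra",
    "core/living-capital/": "Rio",
    "core/living-agents/": "Theseus",
    "foundations/cultural-dynamics/": "Clay",
}
PATH_WEIGHT = 8         # weight used in the schema example
MIN_REQUIRED_SCORE = 3  # hypothetical threshold for a meaningful score


def score_paths(changed_files):
    """Return (scores, evidence) from path signals only."""
    scores = {agent: 0 for agent in AGENT_ORDER}
    evidence = []
    for path in sorted(changed_files):  # sorted for a stable evidence order
        for prefix, agent in PATH_OWNERS.items():
            if path.startswith(prefix):
                scores[agent] += PATH_WEIGHT
                evidence.append(
                    {"agent": agent, "signal": "path",
                     "weight": PATH_WEIGHT, "value": path}
                )
    return scores, evidence


def select_required(scores):
    """Apply the top-2 rule, falling back to Leo when nothing qualifies."""
    qualified = [a for a in AGENT_ORDER
                 if scores.get(a, 0) >= MIN_REQUIRED_SCORE]
    if not qualified:
        return ("Leo",), "fallback"
    # sorted() is stable, so equal scores keep AGENT_ORDER precedence.
    ranked = sorted(qualified, key=lambda a: -scores[a])
    if len(qualified) > 2:
        kind = "escalated"
    elif len(qualified) == 2:
        kind = "multi"
    else:
        kind = "single"
    return tuple(ranked[:2]), kind
```

Both functions depend only on their arguments and module constants, which keeps the route pure and repeatable. Branch and keyword signals would add to the same `scores` dictionary before selection.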

Forbidden files:

- `lib/evaluate.py`
- `lib/llm.py`
- deploy scripts
- secrets or runtime config outside route feature flag wiring

## Frontend Work Required

None.

## Expected Runtime And User-Visible Behavior

The router itself has no user-visible UI. Its behavior becomes visible through audit logs, PR comment reviewer selection, and proof artifacts.

Example:

```text
input: domains/internet-finance/x402-agent-payments.md
output: required_agents = ["Rio"]
```

Cross-domain example:

```text
input: ai systems claim plus x402 payment claim
output: required_agents = ["Theseus", "Rio"]
```

## Validation And Test Matrix

Commands:

```bash
python3 -m pytest tests/test_agent_routing.py
python3 -m ruff check lib/agent_routing.py lib/domains.py tests/test_agent_routing.py
git diff --check
```

Test classes:

- primary ownership routes
- broadened ownership routes
- branch fallback routes
- keyword routes
- top-2 cross-domain routes
- fallback routes
- deterministic tie-breaking
- compatibility wrapper behavior

## CI/CD, Release, And Pre-Push Gate Contract

Before PR:

- Route tests pass locally.
- No production config defaults change.
- No network dependency enters route tests.
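
One way to make the no-network gate checkable is a guard that fails any test attempting to open a socket. The sketch below uses only the standard library; `no_network` is a hypothetical helper name, and the guard covers socket creation only, not file-based database access.

```python
import socket
from contextlib import contextmanager
from unittest import mock


@contextmanager
def no_network():
    """Fail loudly if code inside the block tries to open a socket."""

    def _blocked(*args, **kwargs):
        raise AssertionError("router attempted network access")

    # Replace the socket constructors for the duration of the block;
    # mock.patch.object restores the originals on exit.
    with mock.patch.object(socket, "socket", side_effect=_blocked), \
            mock.patch.object(socket, "create_connection", side_effect=_blocked):
        yield
```

Wrapping each route test body in `with no_network():` would turn an accidental HTTP or GitHub call into an immediate test failure.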

Before staging:

- Eval integration spec consumes the route result without modifying route internals.

Before production:

- Route evidence appears in staging proof artifact.

## Independent CLI Audit Contract

Reviewer commands:

```bash
git diff -- lib/agent_routing.py lib/domains.py tests/test_agent_routing.py
python3 -m pytest tests/test_agent_routing.py
```

Reviewer checks:

- Route function is pure.
- Scores are explainable.
- Top-2 cap is enforced.
- Folder paths are not the only signal.
- Old callers still work or have a clear migration path.

## Outside-The-Box Fix Paths

If keyword scoring is noisy:

- Disable diff keyword scoring and use path plus branch only.
- Use an LLM classifier in shadow mode only.
- Add an explicit PR label or frontmatter hint later.
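
Capping keyword contributions and gating them behind a switch keeps the first fix path cheap. The following is a minimal sketch: the keyword lists, `KEYWORD_WEIGHT`, `KEYWORD_CAP`, and the `enabled` parameter are all hypothetical and would need calibration against staging evidence.

```python
import re

# Hypothetical keyword table; real identity signals may differ.
KEYWORDS = {
    "Rio": ("x402", "payment"),
    "Theseus": ("alignment", "ai systems"),
}
KEYWORD_WEIGHT = 1
KEYWORD_CAP = 3  # maximum score any agent can gain from keywords


def keyword_scores(text, enabled=True):
    """Return capped per-agent keyword scores, or {} when disabled."""
    if not enabled:
        return {}
    lowered = text.lower()
    scores = {}
    for agent, words in KEYWORDS.items():
        # Count whole-word occurrences of each keyword.
        hits = sum(
            len(re.findall(r"\b" + re.escape(word) + r"\b", lowered))
            for word in words
        )
        if hits:
            scores[agent] = min(hits * KEYWORD_WEIGHT, KEYWORD_CAP)
    return scores
```

With the cap below the path weight, a keyword-heavy diff cannot outvote a single path signal, and passing `enabled=False` reduces routing to path plus branch signals only.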

If identity boundaries are ambiguous:

- Prefer top-2 over fallback when two agents have meaningful scores.
- Log route evidence for later calibration.

## Maintenance Capture

Beneficial now:

- Keep route logic out of `lib/evaluate.py`.
- Keep compatibility wrappers narrow.

Avoid now:

- Large domain taxonomy rewrite.
- Dashboard UI changes.
- Paid classifier calls.

## Parallelization And Fanout

Classification: local_owner.

Do not fan out implementation. This module is a root contract consumed by eval integration.

Worker-ready prompt:

```text
implement the phase 1b agent identity router in teleo-infrastructure. own lib/agent_routing.py, lib/domains.py compatibility wrappers, and route tests only. make the route function pure, deterministic, evidence-bearing, and capped at top 2 required agents. do not touch eval integration or deploy code.
```

## Acceptance Criteria

- All required route fixtures pass.
- Route result includes primary agent, required agents, route kind, scores, evidence, and fallback status.
- Cross-domain route never requires more than two agents.
- No LLM, network, DB, or GitHub calls occur in the router.

## Readiness And Claim Boundaries

Allowed claim:

- "Agent identity routing is locally implemented and unit-tested."

Forbidden claim:

- "Phase 1b eval is complete."

## Spec Quality Self-Audit

Required headings present:

- Current Implementation Audit: present.
- Goal-Vs-Repo-Truth Diff: present.
- Completion Percent And Remaining Delta: present.
- Closure, Endpoint, And Deployment Truth: present.
- Critical Assumptions And Invalidators: present.
- State And Truth Contract: present.
- Measurement Contract: present.
- Backend Work Required: present.
- Frontend Work Required: present.
- Expected Runtime And User-Visible Behavior: present.
- Validation And Test Matrix: present.
- CI/CD, Release, And Pre-Push Gate Contract: present.
- Independent CLI Audit Contract: present.
- Outside-The-Box Fix Paths: present.
- Maintenance Capture: present.
- Parallelization And Fanout: present.

## Assistant-Added Caveats

This child spec intentionally keeps routing deterministic and no-spend. That may be less semantically smart than an LLM classifier, but it is the right first implementation for Phase 1b because it is testable, cheap, and auditable.