Gate 3 in batch-extract-50.sh: query pipeline.db for closed PRs before
re-extracting. Sources with >=3 closed PRs are skipped (zombie protection).
Cost tracking: openrouter_call() now returns (text, usage) tuple with
prompt_tokens and completion_tokens from the OpenRouter API response.
All callers updated to unpack and pass tokens to costs.record_usage().
Added missing triage cost recording. Fixed batch domain review recording
cost once per batch instead of once per PR.
Pentagon-Agent: Epimetheus <0144398e-4ed3-4fe2-95a3-3d72e1abf887>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Opus was ignoring the valid tag list and generating custom tags like
schema-enrichment-slug-mismatch, which fall through to 'unknown' in
disposition logic. All three prompts (domain, Leo standard, Leo deep)
now explicitly say "do not invent new tags" alongside the valid tag list.
Pentagon-Agent: Leo <294C3CA1-0205-4668-82FA-B984D54F48AD>
- Domain review → GPT-4o (OpenRouter), Leo STANDARD → Sonnet (OpenRouter),
Leo DEEP → Opus (Claude Max). Two model families = no correlated blind spots.
- Opus reserved for DEEP eval only — protects rate limit for overnight research.
- Review prompts calibrated: require per-criterion evidence, blocking-vs-observation
verdict rules. Moved from 100% rubber-stamp approval to 12% pass rate.
- OpenRouter failures classified as openrouter_failed (not rate_limited) to avoid
spurious 15-min Opus backoff.
- merge.py: pre-check PR state before merge API call (prevents 405 on re-merge).
Pentagon-Agent: Leo <294C3CA1-0205-4668-82FA-B984D54F48AD>