Add research_tracking.py to diagnostics (Phase 1 consolidation)

Argus's research lifecycle tracking module. It existed only in the root diagnostics/ directory and was missing from both repos. Completes the Phase 1 file inventory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add vitality modules + upgrade alerting with SQL injection protection
2026-04-13 10:15:27 +02:00 · 2026-04-13 10:12:53 +02:00
112 changed files with 4363 additions and 13413 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -30,6 +30,3 @@ build/

 # OS
 .DS_Store
-
-# Hermes session artifacts
-ops/sessions/
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
--- a/79
+++ b/79
@@ -1,79 +0,0 @@
-# teleo-infrastructure ownership map
-# Each path has ONE owning agent. Owner = accountable for correctness + reviews changes.
-# Format: <pattern> <owner>
-
-# Pipeline daemon — entry points
-/teleo-pipeline.py          @ship
-/reweave.py                 @ship
-
-# Pipeline library — shared Python package
-/lib/config.py              @ship
-/lib/db.py                  @ship
-/lib/connect.py             @ship
-/lib/log.py                 @ship
-/lib/forgejo.py             @ship
-/lib/breaker.py             @ship
-/lib/worktree_lock.py       @ship
-/lib/domains.py             @ship
-/lib/costs.py               @ship
-/lib/llm.py                 @ship
-/lib/merge.py               @ship
-/lib/cascade.py             @ship
-/lib/cross_domain.py        @ship
-/lib/validate.py            @ship
-/lib/stale_pr.py            @ship
-/lib/watchdog.py            @ship
-/lib/feedback.py            @ship
-/lib/fixer.py               @ship
-/lib/substantive_fixer.py   @ship
-/lib/dedup.py               @ship
-
-/lib/extract.py             @epimetheus
-/lib/extraction_prompt.py   @epimetheus
-/lib/post_extract.py        @epimetheus
-/lib/pre_screen.py          @epimetheus
-/lib/entity_batch.py        @epimetheus
-/lib/entity_queue.py        @epimetheus
-
-/lib/evaluate.py            @leo
-/lib/analytics.py           @leo
-/lib/attribution.py         @leo
-
-/lib/health.py              @argus
-/lib/search.py              @argus
-/lib/claim_index.py         @argus
-/lib/digest.py              @argus
-
-# Diagnostics — monitoring dashboard
-/diagnostics/               @argus
-
-# Telegram bot
-/telegram/                  @ship
-
-# Deployment automation
-/deploy/                    @ship
-
-# Systemd service definitions
-/systemd/                   @ship
-
-# Agent state management
-/agent-state/               @ship
-
-# Research orchestration
-/research/                  @ship
-
-# Hermes agent
-/hermes-agent/              @ship
-
-# One-off scripts and migrations
-/scripts/                   @ship
-
-# Test suite
-/tests/                     @ganymede
-
-# Documentation
-/docs/                      shared
-
-# Config
-/pyproject.toml             @ship
-/.gitignore                 @ship
--- a/docs/DIAGNOSTICS-AGENT-SPEC.md
+++ b/docs/DIAGNOSTICS-AGENT-SPEC.md
--- a/docs/INFRASTRUCTURE.md
+++ b/docs/INFRASTRUCTURE.md
--- a/docs/PIPELINE-AGENT-SPEC.md
+++ b/docs/PIPELINE-AGENT-SPEC.md
--- a/README.md
+++ b/README.md
@@ -1,65 +0,0 @@
-# teleo-infrastructure
-
-Pipeline infrastructure for the Teleo collective knowledge base. Async Python daemon that extracts, validates, evaluates, and merges claims via Forgejo PRs.
-
-## Directory Structure
-
-```
-teleo-infrastructure/
-├── teleo-pipeline.py        # Daemon entry point
-├── reweave.py               # Reciprocal edge maintenance
-├── lib/                     # Pipeline modules (Python package)
-├── diagnostics/             # Monitoring dashboard (port 8081)
-├── telegram/                # Telegram bot interface
-├── deploy/                  # Deployment + mirror scripts
-├── systemd/                 # Service definitions
-├── agent-state/             # Cross-session agent state
-├── research/                # Nightly research orchestration
-├── hermes-agent/            # Hermes agent setup
-├── scripts/                 # One-off backfills + migrations
-├── tests/                   # Test suite
-└── docs/                    # Operational documentation
-```
-
-## Ownership
-
-Each directory has one owning agent. The owner is accountable for correctness and reviews all changes to their section. See `CODEOWNERS` for per-file detail.
-
-| Directory | Owner | What it does |
-|-----------|-------|-------------|
-| `lib/` (core) | **Ship** | Config, DB, merge, cascade, validation, LLM calls |
-| `lib/` (extraction) | **Epimetheus** | Source extraction, entity processing, pre-screening |
-| `lib/` (evaluation) | **Leo** | Claim evaluation, analytics, attribution |
-| `lib/` (health) | **Argus** | Health checks, search, claim index |
-| `diagnostics/` | **Argus** | 4-page dashboard, alerting, vitality metrics |
-| `telegram/` | **Ship** | Telegram bot, X integration, retrieval |
-| `deploy/` | **Ship** | rsync deploy, GitHub-Forgejo mirror |
-| `systemd/` | **Ship** | teleo-pipeline, teleo-diagnostics, teleo-agent@ |
-| `agent-state/` | **Ship** | Bootstrap, state library, cascade inbox processor |
-| `research/` | **Ship** | Nightly research sessions, prompt templates |
-| `scripts/` | **Ship** | Backfills, migrations, one-off maintenance |
-| `tests/` | **Ganymede** | pytest suite, integration tests |
-| `docs/` | Shared | Architecture, specs, protocols |
-
-## VPS Layout
-
-Runs on Hetzner CAX31 (77.42.65.182) as user `teleo`.
-
-| VPS Path | Repo Source | Service |
-|----------|-------------|---------|
-| `/opt/teleo-eval/pipeline/` | `lib/`, `teleo-pipeline.py`, `reweave.py` | teleo-pipeline |
-| `/opt/teleo-eval/diagnostics/` | `diagnostics/` | teleo-diagnostics |
-| `/opt/teleo-eval/telegram/` | `telegram/` | (manual) |
-| `/opt/teleo-eval/agent-state/` | `agent-state/` | (used by research-session.sh) |
-
-## Quick Start
-
-```bash
-# Run tests
-pip install -e ".[dev]"
-pytest
-
-# Deploy to VPS
-./deploy/deploy.sh --dry-run   # preview
-./deploy/deploy.sh             # deploy
-```
--- a/scripts/backfill-ci.py
+++ b/scripts/backfill-ci.py
--- a/scripts/backfill-domains.py
+++ b/scripts/backfill-domains.py
--- a/scripts/backfill-source-authors.py
+++ b/scripts/backfill-source-authors.py
--- a/scripts/backfill-sources.py
+++ b/scripts/backfill-sources.py
@@ -104,22 +104,14 @@ def main():
                claims_count = 0

            if rel_path in existing:
-                # Update status if different — but never regress from terminal states.
-                # If DB says 'extracted' or 'null_result' and file happens to be in queue/
-                # (e.g., failed archive push, zombie file), the DB is authoritative.
-                # Downgrading to 'unprocessed' triggers the runaway re-extraction loop.
+                # Update status if different
                current = conn.execute("SELECT status FROM sources WHERE path = ?", (rel_path,)).fetchone()
-                TERMINAL_STATUSES = {"extracted", "null_result", "error", "ghost_no_file"}
                if current and current["status"] != status:
-                    if current["status"] in TERMINAL_STATUSES and status == "unprocessed":
-                        # Don't regress terminal → unprocessed. DB wins.
-                        pass
-                    else:
-                        conn.execute(
-                            "UPDATE sources SET status = ?, updated_at = datetime('now') WHERE path = ?",
-                            (status, rel_path),
-                        )
-                        updated += 1
+                    conn.execute(
+                        "UPDATE sources SET status = ?, updated_at = datetime('now') WHERE path = ?",
+                        (status, rel_path),
+                    )
+                    updated += 1
            else:
                conn.execute(
                    """INSERT INTO sources (path, status, priority, claims_count, created_at, updated_at)
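The guard removed from `scripts/backfill-sources.py` in the hunk above can be isolated as a small predicate, which makes it easier to see what behavior the commit drops. This is a minimal sketch under stated assumptions: the function names `should_update_status` and `sync_status` are hypothetical and do not appear in the repository, while the status set and both SQL statements are copied from the removed and retained lines.

```python
import sqlite3

# Terminal states, copied from the lines removed in the hunk above.
TERMINAL_STATUSES = {"extracted", "null_result", "error", "ghost_no_file"}


def should_update_status(current: str, new: str) -> bool:
    """Hypothetical helper: True if the sources row should be rewritten."""
    if current == new:
        return False
    # Never regress a terminal state back to 'unprocessed'; the DB is authoritative.
    if current in TERMINAL_STATUSES and new == "unprocessed":
        return False
    return True


def sync_status(conn: sqlite3.Connection, rel_path: str, new: str) -> bool:
    """Hypothetical helper: apply the guarded update, True if a row changed."""
    row = conn.execute(
        "SELECT status FROM sources WHERE path = ?", (rel_path,)
    ).fetchone()
    if row is None or not should_update_status(row[0], new):
        return False
    conn.execute(
        "UPDATE sources SET status = ?, updated_at = datetime('now') WHERE path = ?",
        (new, rel_path),
    )
    return True
```

With the guard in place, a source recorded as `extracted` keeps that status even if its file is still present in `queue/`. After this commit the update is unconditional, so the same row would be set back to `unprocessed`.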
--- a/batch-extract-50.sh
+++ b/batch-extract-50.sh
@@ -0,0 +1,283 @@
+#!/bin/bash
+# Batch extract sources from inbox/queue/ (v3, gate-based skip logic)
+#
+# Uses a separate extract/ worktree (not main/) to prevent a race with the daemon.
+# Skip logic checks four gates instead of local marker files (Ganymede v3 review):
+#   Gate 1: Is source already in archive/{domain}/? → already processed, dedup
+#   Gate 2: Does extraction branch exist on Forgejo? → extraction in progress
+#   Gate 3: Does pipeline.db show ≥3 closed PRs for this source? → zombie, skip
+#   Gate 4: Does pipeline.db show active OR recently closed PR? → skip (4h cooldown)
+#   All gates pass → extract
+#
+# Architecture: Ganymede (two-gate) + Rhea (separate worktrees)
+
+REPO=/opt/teleo-eval/workspaces/extract
+MAIN_REPO=/opt/teleo-eval/workspaces/main
+EXTRACT=/opt/teleo-eval/openrouter-extract-v2.py
+CLEANUP=/opt/teleo-eval/post-extract-cleanup.py
+LOG=/opt/teleo-eval/logs/batch-extract-50.log
+DB=/opt/teleo-eval/pipeline/pipeline.db
+TOKEN=$(cat /opt/teleo-eval/secrets/forgejo-leo-token)
+FORGEJO_URL="http://localhost:3000"
+MAX=50
+MAX_CLOSED=3  # zombie retry limit: skip source after this many closed PRs
+COUNT=0
+SUCCESS=0
+FAILED=0
+SKIPPED=0
+
+# Lockfile to prevent concurrent runs
+LOCKFILE="/tmp/batch-extract.lock"
+if [ -f "$LOCKFILE" ]; then
+    pid=$(cat "$LOCKFILE" 2>/dev/null)
+    if kill -0 "$pid" 2>/dev/null; then
+        echo "[$(date)] SKIP: batch extract already running (pid $pid)" >> $LOG
+        exit 0
+    fi
+    rm -f "$LOCKFILE"
+fi
+echo $$ > "$LOCKFILE"
+trap 'rm -f "$LOCKFILE"' EXIT
+
+echo "[$(date)] Starting batch extraction of $MAX sources" >> $LOG
+
+cd $REPO || exit 1
+
+# Bug fix: don't swallow errors on critical git commands (Ganymede review)
+git fetch origin main >> $LOG 2>&1 || { echo "[$(date)] FATAL: fetch origin main failed" >> $LOG; exit 1; }
+git checkout -f main >> $LOG 2>&1 || { echo "[$(date)] FATAL: checkout main failed" >> $LOG; exit 1; }
+git reset --hard origin/main >> $LOG 2>&1 || { echo "[$(date)] FATAL: reset --hard failed" >> $LOG; exit 1; }
+
+# SHA canary: verify extract worktree matches origin/main (Ganymede review)
+LOCAL_SHA=$(git rev-parse HEAD)
+REMOTE_SHA=$(git rev-parse origin/main)
+if [ "$LOCAL_SHA" != "$REMOTE_SHA" ]; then
+    echo "[$(date)] FATAL: extract worktree diverged from main ($LOCAL_SHA vs $REMOTE_SHA)" >> $LOG
+    exit 1
+fi
+
+# Pre-extraction cleanup: remove queue files that already exist in archive
+# This runs on the MAIN worktree (not extract/) so deletions are committed to git.
+# Prevents the "queue duplicate reappears after reset --hard" problem.
+CLEANED=0
+for qfile in $MAIN_REPO/inbox/queue/*.md; do
+    [ -f "$qfile" ] || continue
+    qbase=$(basename "$qfile")
+    if find "$MAIN_REPO/inbox/archive" -name "$qbase" 2>/dev/null | grep -q .; then
+        rm -f "$qfile"
+        CLEANED=$((CLEANED + 1))
+    fi
+done
+if [ "$CLEANED" -gt 0 ]; then
+    echo "[$(date)] Cleaned $CLEANED stale queue duplicates" >> $LOG
+    cd $MAIN_REPO
+    git add -A inbox/queue/ 2>/dev/null
+    git commit -m "pipeline: clean $CLEANED stale queue duplicates
+
+Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>" 2>/dev/null
+    # Push with retry
+    for attempt in 1 2 3; do
+        git pull --rebase origin main 2>/dev/null
+        git push origin main 2>/dev/null && break
+        sleep 2
+    done
+    cd $REPO
+    git fetch origin main 2>/dev/null
+    git reset --hard origin/main 2>/dev/null
+fi
+
+# Get sources in queue
+SOURCES=$(ls inbox/queue/*.md 2>/dev/null | head -$MAX)
+
+# Batch fetch all remote branches once (Ganymede: 1 call instead of 84)
+REMOTE_BRANCHES=$(git ls-remote --heads origin 2>/dev/null)
+if [ $? -ne 0 ]; then
+    echo "[$(date)] ABORT: git ls-remote failed — remote unreachable, skipping cycle" >> $LOG
+    exit 0
+fi
+
+for SOURCE in $SOURCES; do
+    COUNT=$((COUNT + 1))
+    BASENAME=$(basename "$SOURCE" .md)
+    BRANCH="extract/$BASENAME"
+
+    # Skip conversation archives — valuable content enters through standalone sources,
+    # inline tags (SOURCE:/CLAIM:), and transcript review. Raw conversations produce
+    # low-quality claims with schema failures. (Epimetheus session 4)
+    if grep -q "^format: conversation" "$SOURCE" 2>/dev/null; then
+        # Move to archive instead of leaving in queue (prevents re-processing)
+        mv "$SOURCE" "$MAIN_REPO/inbox/archive/telegram/" 2>/dev/null
+        echo "[$(date)] [$COUNT/$MAX] ARCHIVE $BASENAME (conversation — skipped extraction)" >> $LOG
+        SKIPPED=$((SKIPPED + 1))
+        continue
+    fi
+
+    # Gate 1: Already in archive? Source was already processed — dedup (Ganymede)
+    if find "$MAIN_REPO/inbox/archive" -name "$BASENAME.md" 2>/dev/null | grep -q .; then
+        echo "[$(date)] [$COUNT/$MAX] SKIP $BASENAME (already in archive)" >> $LOG
+        # Delete the queue duplicate
+        rm -f "$MAIN_REPO/inbox/queue/$BASENAME.md" 2>/dev/null
+        SKIPPED=$((SKIPPED + 1))
+        continue
+    fi
+
+    # Gate 2: Branch exists on Forgejo? Extraction already in progress (cached lookup)
+    # Enhancement: 2-hour staleness check (Ganymede review) — if branch is >2h old
+    # and PR is unmergeable, close PR + delete branch and re-extract
+    if echo "$REMOTE_BRANCHES" | grep -q "refs/heads/$BRANCH$"; then
+        # Check branch age
+        BRANCH_SHA=$(echo "$REMOTE_BRANCHES" | grep "refs/heads/$BRANCH$" | awk '{print $1}')
+        BRANCH_AGE_EPOCH=$(git log -1 --format='%ct' "$BRANCH_SHA" 2>/dev/null || echo 0)
+        NOW_EPOCH=$(date +%s)
+        AGE_HOURS=$(( (NOW_EPOCH - BRANCH_AGE_EPOCH) / 3600 ))
+
+        if [ "$AGE_HOURS" -ge 2 ]; then
+            # Branch is stale — check if PR is mergeable
+            # Note: Forgejo head= filter is unreliable. Fetch all open PRs and filter locally.
+            PR_NUM=$(curl -sf "$FORGEJO_URL/api/v1/repos/teleo/teleo-codex/pulls?state=open&limit=50" \
+                -H "Authorization: token $TOKEN" | python3 -c "
+import sys,json
+prs=json.load(sys.stdin)
+branch='$BRANCH'
+matches=[p for p in prs if p['head']['ref']==branch]
+print(matches[0]['number'] if matches else '')
+" 2>/dev/null)
+            if [ -n "$PR_NUM" ]; then
+                PR_MERGEABLE=$(curl -sf "$FORGEJO_URL/api/v1/repos/teleo/teleo-codex/pulls/$PR_NUM" \
+                    -H "Authorization: token $TOKEN" | python3 -c 'import sys,json; print(json.load(sys.stdin).get("mergeable","true"))' 2>/dev/null)
+                if [ "$PR_MERGEABLE" = "False" ] || [ "$PR_MERGEABLE" = "false" ]; then
+                    echo "[$(date)] [$COUNT/$MAX] STALE: $BASENAME (${AGE_HOURS}h old, unmergeable PR #$PR_NUM) — closing + re-extracting" >> $LOG
+                    # Close PR with audit comment
+                    curl -sf -X POST "$FORGEJO_URL/api/v1/repos/teleo/teleo-codex/issues/$PR_NUM/comments" \
+                        -H "Authorization: token $TOKEN" -H "Content-Type: application/json" \
+                        -d '{"body":"Auto-closed: extraction branch stale >2h, conflict unresolvable. Source will be re-extracted from current main."}' > /dev/null 2>&1
+                    curl -sf -X PATCH "$FORGEJO_URL/api/v1/repos/teleo/teleo-codex/pulls/$PR_NUM" \
+                        -H "Authorization: token $TOKEN" -H "Content-Type: application/json" \
+                        -d '{"state":"closed"}' > /dev/null 2>&1
+                    # Delete remote branch
+                    git push origin --delete "$BRANCH" 2>/dev/null
+                    # Fall through to extraction below
+                else
+                    echo "[$(date)] [$COUNT/$MAX] SKIP $BASENAME (branch exists ${AGE_HOURS}h, PR #$PR_NUM mergeable — waiting)" >> $LOG
+                    SKIPPED=$((SKIPPED + 1))
+                    continue
+                fi
+            else
+                # No PR found but branch exists — orphan branch, clean up
+                echo "[$(date)] [$COUNT/$MAX] STALE: $BASENAME (orphan branch ${AGE_HOURS}h, no PR) — deleting" >> $LOG
+                git push origin --delete "$BRANCH" 2>/dev/null
+                # Fall through to extraction
+            fi
+        else
+            echo "[$(date)] [$COUNT/$MAX] SKIP $BASENAME (branch exists — in progress, ${AGE_HOURS}h old)" >> $LOG
+            SKIPPED=$((SKIPPED + 1))
+            continue
+        fi
+    fi
+
+    # Gate 3: Check pipeline.db for zombie sources — too many closed PRs means
+    # the source keeps failing eval. Skip after MAX_CLOSED rejections. (Epimetheus)
+    if [ -f "$DB" ]; then
+        CLOSED_COUNT=$(sqlite3 "$DB" "SELECT COUNT(*) FROM prs WHERE branch = 'extract/$BASENAME' AND status = 'closed'" 2>/dev/null || echo 0)
+        if [ "$CLOSED_COUNT" -ge "$MAX_CLOSED" ]; then
+            echo "[$(date)] [$COUNT/$MAX] SKIP $BASENAME (zombie: $CLOSED_COUNT closed PRs >= $MAX_CLOSED limit)" >> $LOG
+            SKIPPED=$((SKIPPED + 1))
+            continue
+        fi
+    fi
+
+    # Gate 4: Check pipeline.db for active or recently closed PRs — prevents
+    # re-extraction waste when eval closes a PR and batch-extract runs again
+    # before the source is manually reviewed. 4h cooldown after closure.
+    if [ -f "$DB" ]; then
+        ACTIVE_COUNT=$(sqlite3 "$DB" "SELECT COUNT(*) FROM prs WHERE branch = 'extract/$BASENAME' AND status IN ('extracting','approved','merging')" 2>/dev/null || echo 0)
+        if [ "$ACTIVE_COUNT" -ge 1 ]; then
+            echo "[$(date)] [$COUNT/$MAX] SKIP $BASENAME (active PR exists)" >> $LOG
+            SKIPPED=$((SKIPPED + 1))
+            continue
+        fi
+        RECENT_CLOSED=$(sqlite3 "$DB" "SELECT COUNT(*) FROM prs WHERE branch = 'extract/$BASENAME' AND status = 'closed' AND created_at > datetime('now', '-4 hours')" 2>/dev/null || echo 0)
+        if [ "$RECENT_CLOSED" -ge 1 ]; then
+            echo "[$(date)] [$COUNT/$MAX] SKIP $BASENAME (recently closed PR — 4h cooldown)" >> $LOG
+            SKIPPED=$((SKIPPED + 1))
+            continue
+        fi
+    fi
+
+    echo "[$(date)] [$COUNT/$MAX] Processing $BASENAME" >> $LOG
+
+    # Reset to main (log errors — don't swallow)
+    git checkout -f main >> $LOG 2>&1 || { echo "  -> SKIP (checkout main failed)" >> $LOG; SKIPPED=$((SKIPPED + 1)); continue; }
+    git fetch origin main >> $LOG 2>&1
+    git reset --hard origin/main >> $LOG 2>&1 || { echo "  -> SKIP (reset failed)" >> $LOG; SKIPPED=$((SKIPPED + 1)); continue; }
+
+    # Clean stale remote branch (Leo's catch — prevents checkout conflicts)
+    git push origin --delete "$BRANCH" 2>/dev/null
+
+    # Create fresh branch
+    git branch -D "$BRANCH" 2>/dev/null
+    git checkout -b "$BRANCH" 2>/dev/null
+    if [ $? -ne 0 ]; then
+        echo "  -> SKIP (branch creation failed)" >> $LOG
+        SKIPPED=$((SKIPPED + 1))
+        continue
+    fi
+
+    # Run extraction
+    python3 $EXTRACT "$SOURCE" --no-review >> $LOG 2>&1
+    EXTRACT_RC=$?
+
+
+
+    if [ $EXTRACT_RC -ne 0 ]; then
+        FAILED=$((FAILED + 1))
+        echo "  -> FAILED (extract rc=$EXTRACT_RC)" >> $LOG
+        continue
+    fi
+
+    # Post-extraction cleanup
+    python3 $CLEANUP $REPO >> $LOG 2>&1
+
+    # Check if any files were created/modified
+    CHANGED=$(git status --porcelain | wc -l | tr -d " ")
+    if [ "$CHANGED" -eq 0 ]; then
+        echo "  -> No changes (enrichment/null-result only)" >> $LOG
+        continue
+    fi
+
+    # Commit
+    git add -A
+    git commit -m "extract: $BASENAME
+
+Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>" >> $LOG 2>&1
+
+    # Push
+    git push "http://leo:${TOKEN}@localhost:3000/teleo/teleo-codex.git" "$BRANCH" --force >> $LOG 2>&1
+
+    # Create PR (include prior art sidecar if available)
+    PRIOR_ART_FILE="${SOURCE}.prior-art"
+    PR_BODY=""
+    if [ -f "$PRIOR_ART_FILE" ]; then
+        # Escape JSON special chars in prior art content
+        PR_BODY=$(cat "$PRIOR_ART_FILE" | python3 -c 'import sys,json; print(json.dumps(sys.stdin.read()))')
+        PR_BODY=${PR_BODY:1:-1}  # Strip outer quotes from json.dumps
+    fi
+    curl -sf -X POST "http://localhost:3000/api/v1/repos/teleo/teleo-codex/pulls" \
+        -H "Authorization: token $TOKEN" \
+        -H "Content-Type: application/json" \
+        -d "{\"title\":\"extract: $BASENAME\",\"head\":\"$BRANCH\",\"base\":\"main\",\"body\":\"$PR_BODY\"}" >> /dev/null 2>&1
+
+    SUCCESS=$((SUCCESS + 1))
+    echo "  -> SUCCESS ($CHANGED files)" >> $LOG
+
+    # Back to main
+    git checkout -f main >> $LOG 2>&1
+
+    # Rate limit
+    sleep 2
+done
+
+echo "[$(date)] Batch complete: $SUCCESS success, $FAILED failed, $SKIPPED skipped (already attempted)" >> $LOG
+
+git checkout -f main >> $LOG 2>&1
+git reset --hard origin/main >> $LOG 2>&1
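Gates 3 and 4 in `batch-extract-50.sh` interpolate `$BASENAME` directly into the SQL text passed to `sqlite3`. A file name containing a single quote would make the query fail, and because of the `|| echo 0` fallback the gate would then pass silently. The following is a minimal sketch of the same two gates with bound parameters. The table and column names (`prs`, `branch`, `status`, `created_at`), the status values, the limit of 3, and the 4-hour window come from the script; the function name `skip_reason` and its return values are hypothetical.

```python
import sqlite3

MAX_CLOSED = 3          # zombie retry limit, as in the script
COOLDOWN = "-4 hours"   # cooldown window used by Gate 4
ACTIVE = ("extracting", "approved", "merging")


def skip_reason(conn, basename):
    """Return a skip reason for Gates 3 and 4, or None if both gates pass."""
    branch = f"extract/{basename}"

    def count(where, *params):
        # Only the fixed WHERE fragment is formatted in; values are bound.
        sql = f"SELECT COUNT(*) FROM prs WHERE branch = ? AND {where}"
        return conn.execute(sql, (branch, *params)).fetchone()[0]

    # Gate 3: too many closed PRs means the source keeps failing eval.
    if count("status = 'closed'") >= MAX_CLOSED:
        return "zombie"
    # Gate 4a: a PR for this branch is still in flight.
    if count("status IN (?, ?, ?)", *ACTIVE) >= 1:
        return "active"
    # Gate 4b: a PR for this branch was created and closed within the window.
    if count("status = 'closed' AND created_at > datetime('now', ?)", COOLDOWN) >= 1:
        return "cooldown"
    return None
```

Like the shell version, the cooldown compares against `created_at`, so it measures time since the PR row was created rather than time since closure.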
--- a/scripts/bootstrap-contributors.py
+++ b/scripts/bootstrap-contributors.py
--- a/docs/deploy-manifest.md
+++ b/docs/deploy-manifest.md
--- a/deploy/deploy.sh
+++ b/deploy/deploy.sh
@@ -41,7 +41,7 @@ echo ""
 # Syntax check all Python files before deploying
 echo "=== Pre-deploy syntax check ==="
 ERRORS=0
-for f in "$REPO_ROOT/lib/"*.py "$REPO_ROOT/"*.py "$REPO_ROOT/diagnostics/"*.py "$REPO_ROOT/telegram/"*.py; do
+for f in "$REPO_ROOT/ops/pipeline-v2/lib/"*.py "$REPO_ROOT/ops/pipeline-v2/"*.py "$REPO_ROOT/ops/diagnostics/"*.py; do
  [ -f "$f" ] || continue
  if ! python3 -c "import ast, sys; ast.parse(open(sys.argv[1]).read())" "$f" 2>/dev/null; then
    echo "SYNTAX ERROR: $f"
@@ -55,41 +55,33 @@ fi
 echo "All files pass syntax check."
 echo ""

-RSYNC_OPTS=(-avz --exclude __pycache__ --exclude '*.pyc' --exclude '*.bak*')
+RSYNC_FLAGS="-avz --exclude='__pycache__' --exclude='*.pyc' --exclude='*.bak*'"
 if $DRY_RUN; then
-  RSYNC_OPTS+=(--dry-run)
+  RSYNC_FLAGS="$RSYNC_FLAGS --dry-run"
  echo "=== DRY RUN ==="
 fi

 echo "=== Pipeline lib/ ==="
-rsync "${RSYNC_OPTS[@]}" "$REPO_ROOT/lib/" "$VPS_HOST:$VPS_PIPELINE/lib/"
+rsync $RSYNC_FLAGS "$REPO_ROOT/ops/pipeline-v2/lib/" "$VPS_HOST:$VPS_PIPELINE/lib/"
 echo ""

 echo "=== Pipeline top-level ==="
-for f in teleo-pipeline.py reweave.py fetch_coins.py; do
-  [ -f "$REPO_ROOT/$f" ] || continue
-  rsync "${RSYNC_OPTS[@]}" "$REPO_ROOT/$f" "$VPS_HOST:$VPS_PIPELINE/$f"
+for f in teleo-pipeline.py reweave.py batch-extract-50.sh; do
+  [ -f "$REPO_ROOT/ops/pipeline-v2/$f" ] || continue
+  rsync $RSYNC_FLAGS "$REPO_ROOT/ops/pipeline-v2/$f" "$VPS_HOST:$VPS_PIPELINE/$f"
 done
 echo ""

-echo "=== Telegram bot ==="
-rsync "${RSYNC_OPTS[@]}" "$REPO_ROOT/telegram/" "$VPS_HOST:$VPS_PIPELINE/telegram/"
-echo ""
-
-echo "=== Tests ==="
-rsync "${RSYNC_OPTS[@]}" "$REPO_ROOT/tests/" "$VPS_HOST:$VPS_PIPELINE/tests/"
-echo ""
-
 echo "=== Diagnostics ==="
-rsync "${RSYNC_OPTS[@]}" "$REPO_ROOT/diagnostics/" "$VPS_HOST:$VPS_DIAGNOSTICS/"
+rsync $RSYNC_FLAGS "$REPO_ROOT/ops/diagnostics/" "$VPS_HOST:$VPS_DIAGNOSTICS/"
 echo ""

 echo "=== Agent state ==="
-rsync "${RSYNC_OPTS[@]}" "$REPO_ROOT/agent-state/" "$VPS_HOST:$VPS_AGENT_STATE/"
+rsync $RSYNC_FLAGS "$REPO_ROOT/ops/agent-state/" "$VPS_HOST:$VPS_AGENT_STATE/"
 echo ""

 echo "=== Research session ==="
-rsync "${RSYNC_OPTS[@]}" "$REPO_ROOT/research/research-session.sh" "$VPS_HOST:/opt/teleo-eval/research-session.sh"
+rsync $RSYNC_FLAGS "$REPO_ROOT/ops/research-session.sh" "$VPS_HOST:/opt/teleo-eval/research-session.sh"
 echo ""

 if $DRY_RUN; then
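One detail of the `deploy.sh` hunk above is worth checking: the `RSYNC_OPTS` array is replaced by a single string, `RSYNC_FLAGS`, that is later expanded unquoted. When the shell expands an unquoted variable it splits the value on whitespace but does not remove quote characters that came from the expansion, so the single quotes inside the string would reach rsync as part of each exclude pattern. The sketch below models this in Python rather than running a shell. It assumes the default `IFS` and that no file in the working directory matches the glob-bearing words; `str.split` stands in for word splitting and `shlex.split` for how the same text would be parsed as shell source.

```python
import shlex

# The flag string exactly as assigned to RSYNC_FLAGS in the hunk above.
RSYNC_FLAGS = "-avz --exclude='__pycache__' --exclude='*.pyc' --exclude='*.bak*'"

# Model of unquoted $RSYNC_FLAGS: whitespace splitting only, quotes stay literal.
as_unquoted_expansion = RSYNC_FLAGS.split()

# Model of the same text parsed as shell source: quotes are removed.
as_shell_source = shlex.split(RSYNC_FLAGS)

for got, wanted in zip(as_unquoted_expansion, as_shell_source):
    note = "" if got == wanted else "  <-- quotes would reach rsync literally"
    print(f"{got}  |  {wanted}{note}")
```

If this model holds, patterns such as `'__pycache__'` (with the quote characters included) would not match the intended directories. The array form on the removed lines avoids the problem because each element is passed through as one argument.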
--- a/deploy/auto-deploy.sh
+++ b/deploy/auto-deploy.sh
@@ -1,144 +0,0 @@
-#!/usr/bin/env bash
-# auto-deploy.sh — Pull from Forgejo, sync to working dirs, restart if needed.
-# Runs as systemd timer (teleo-auto-deploy.timer) every 2 minutes.
-# Exits silently when nothing has changed.
-set -euo pipefail
-
-LOCK_FILE="/tmp/teleo-auto-deploy.lock"
-exec 9>"$LOCK_FILE"
-if ! flock -n 9; then
-  logger -t "auto-deploy" "Another deploy is already running. Skipping."
-  exit 0
-fi
-
-DEPLOY_CHECKOUT="/opt/teleo-eval/workspaces/deploy-infra"
-PIPELINE_DIR="/opt/teleo-eval/pipeline"
-DIAGNOSTICS_DIR="/opt/teleo-eval/diagnostics"
-AGENT_STATE_DIR="/opt/teleo-eval/ops/agent-state"
-STAMP_FILE="/opt/teleo-eval/.last-deploy-sha"
-LOG_TAG="auto-deploy"
-
-log() { logger -t "$LOG_TAG" "$1"; echo "$(date '+%Y-%m-%d %H:%M:%S') $1"; }
-
-if [ ! -d "$DEPLOY_CHECKOUT/.git" ]; then
-  log "ERROR: Deploy checkout not found at $DEPLOY_CHECKOUT. Run setup first."
-  exit 1
-fi
-
-cd "$DEPLOY_CHECKOUT"
-if ! git fetch origin main --quiet 2>&1; then
-  log "ERROR: git fetch failed"
-  exit 1
-fi
-
-NEW_SHA=$(git rev-parse origin/main)
-OLD_SHA=$(cat "$STAMP_FILE" 2>/dev/null || echo "none")
-
-if [ "$NEW_SHA" = "$OLD_SHA" ]; then
-  exit 0
-fi
-
-log "New commits: ${OLD_SHA:0:8} -> ${NEW_SHA:0:8}"
-
-if ! git checkout main --quiet 2>&1; then
-  log "ERROR: git checkout main failed — dirty tree or corrupted index"
-  exit 1
-fi
-if ! git pull --ff-only --quiet 2>&1; then
-  log "ERROR: git pull --ff-only failed. Manual intervention needed."
-  exit 1
-fi
-
-# Syntax check all Python files before copying
-ERRORS=0
-for f in lib/*.py *.py diagnostics/*.py telegram/*.py tests/*.py; do
-  [ -f "$f" ] || continue
-  if ! python3 -c "import ast, sys; ast.parse(open(sys.argv[1]).read())" "$f" 2>&1; then
-    log "SYNTAX ERROR: $f"
-    ERRORS=$((ERRORS + 1))
-  fi
-done
-if [ "$ERRORS" -gt 0 ]; then
-  log "ERROR: $ERRORS syntax errors. Deploy aborted. Fix and push again."
-  exit 1
-fi
-log "Syntax check passed"
-
-# Sync to working directories
-RSYNC_OPTS=(-az --exclude __pycache__ --exclude '*.pyc' --exclude '*.bak*')
-
-rsync "${RSYNC_OPTS[@]}" lib/ "$PIPELINE_DIR/lib/"
-
-for f in teleo-pipeline.py reweave.py fetch_coins.py; do
-  [ -f "$f" ] && rsync "${RSYNC_OPTS[@]}" "$f" "$PIPELINE_DIR/$f"
-done
-
-rsync "${RSYNC_OPTS[@]}" telegram/ "$PIPELINE_DIR/telegram/"
-rsync "${RSYNC_OPTS[@]}" diagnostics/ "$DIAGNOSTICS_DIR/"
-rsync "${RSYNC_OPTS[@]}" agent-state/ "$AGENT_STATE_DIR/"
-rsync "${RSYNC_OPTS[@]}" tests/ "$PIPELINE_DIR/tests/"
-[ -f research/research-session.sh ] && rsync "${RSYNC_OPTS[@]}" research/research-session.sh /opt/teleo-eval/research-session.sh
-
-# Safety net: ensure all .sh files are executable after rsync
-find /opt/teleo-eval -maxdepth 3 -name '*.sh' -not -perm -u+x -exec chmod +x {} +
-
-log "Files synced"
-
-# Restart services only if Python files changed
-RESTART=""
-if [ "$OLD_SHA" != "none" ]; then
-  if git diff --name-only "$OLD_SHA" "$NEW_SHA" -- lib/ teleo-pipeline.py reweave.py telegram/ 2>/dev/null | grep -q '\.py$'; then
-    RESTART="$RESTART teleo-pipeline"
-  fi
-  if git diff --name-only "$OLD_SHA" "$NEW_SHA" -- diagnostics/ 2>/dev/null | grep -q '\.py$'; then
-    RESTART="$RESTART teleo-diagnostics"
-  fi
-else
-  RESTART="teleo-pipeline teleo-diagnostics"
-fi
-
-if [ -n "$RESTART" ]; then
-  log "Restarting:$RESTART"
-  sudo systemctl restart $RESTART
-  sleep 30
-
-  FAIL=0
-  for svc in $RESTART; do
-    if systemctl is-active --quiet "$svc"; then
-      log "$svc: active"
-    else
-      log "ERROR: $svc failed to start"
-      journalctl -u "$svc" -n 5 --no-pager 2>/dev/null || true
-      FAIL=1
-    fi
-  done
-
-  if echo "$RESTART" | grep -q "teleo-pipeline"; then
-    HEALTH_CODE=$(curl -s -o /dev/null -w '%{http_code}' --connect-timeout 3 http://localhost:8080/health 2>/dev/null || echo "000")
-    if [ "$HEALTH_CODE" = "200" ] || [ "$HEALTH_CODE" = "503" ]; then
-      log "pipeline health: OK (HTTP $HEALTH_CODE)"
-    else
-      log "WARNING: pipeline health check failed (HTTP $HEALTH_CODE)"
-      FAIL=1
-    fi
-  fi
-
-  if echo "$RESTART" | grep -q "teleo-diagnostics"; then
-    if curl -sf --connect-timeout 3 http://localhost:8081/ops > /dev/null 2>&1; then
-      log "diagnostics health: OK"
-    else
-      log "WARNING: diagnostics health check failed"
-      FAIL=1
-    fi
-  fi
-
-  if [ "$FAIL" -gt 0 ]; then
-    log "WARNING: Smoke test failures. NOT updating stamp. Will retry next cycle. Push a fix."
-    exit 1
-  fi
-else
-  log "No Python changes — services not restarted"
-fi
-
-echo "$NEW_SHA" > "$STAMP_FILE"
-log "Deploy complete: $(git log --oneline -1 "$NEW_SHA")"
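The restart rule in the removed `deploy/auto-deploy.sh` is simple to state on its own: restart a service only when a `.py` file under one of its paths changed between the stamped SHA and the new one, and restart both when no stamp exists. A minimal sketch of that rule as a pure function follows. The path lists and service names are taken from the script; the function names and the `first_deploy` flag are hypothetical, and the prefix match is a simplification of git pathspec matching.

```python
# Paths watched per service, as passed to `git diff --name-only` in the script.
PIPELINE_PATHS = ("lib/", "teleo-pipeline.py", "reweave.py", "telegram/")
DIAGNOSTICS_PATHS = ("diagnostics/",)


def _touches(changed, pathspecs):
    """True if any changed .py file falls under one of the pathspecs."""
    for path in changed:
        if not path.endswith(".py"):
            continue  # mirrors the `grep -q '\.py$'` filter
        for spec in pathspecs:
            if path == spec or (spec.endswith("/") and path.startswith(spec)):
                return True
    return False


def services_to_restart(changed, first_deploy=False):
    """Return the services to restart for a list of changed repo paths."""
    if first_deploy:
        # No previous stamp: the script restarts everything.
        return ["teleo-pipeline", "teleo-diagnostics"]
    services = []
    if _touches(changed, PIPELINE_PATHS):
        services.append("teleo-pipeline")
    if _touches(changed, DIAGNOSTICS_PATHS):
        services.append("teleo-diagnostics")
    return services
```

Under this rule a change limited to `tests/` or to non-Python files is still synced but triggers no restart, which matches the script's "No Python changes" branch.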
--- a/deploy/sync-mirror.sh
+++ b/deploy/sync-mirror.sh
@@ -1,282 +0,0 @@
-#!/bin/bash
-# Bidirectional sync: Forgejo (authoritative) <-> GitHub (public mirror)
-# Forgejo wins on conflict. Runs every 2 minutes via cron.
-#
-# Security note: GitHub->Forgejo path is for external contributor convenience.
-# Never auto-process branches arriving via this path without a PR.
-# Eval pipeline and extract cron only act on PRs, not raw branches.
-
-set -euo pipefail
-
-REPO_DIR="/opt/teleo-eval/mirror/teleo-codex.git"
-LOG="/opt/teleo-eval/logs/sync.log"
-LOCKFILE="/tmp/sync-mirror.lock"
-PIPELINE_DB="/opt/teleo-eval/pipeline/pipeline.db"
-GITHUB_PAT_FILE="/opt/teleo-eval/secrets/github-pat"
-GITHUB_REPO="living-ip/teleo-codex"
-
-log() { echo "[$(date -Iseconds)] $1" >> "$LOG"; }
-
-# Lockfile — prevent concurrent runs
-if [ -f "$LOCKFILE" ]; then
-    pid=$(cat "$LOCKFILE" 2>/dev/null)
-    if kill -0 "$pid" 2>/dev/null; then
-        exit 0
-    fi
-    rm -f "$LOCKFILE"
-fi
-echo $$ > "$LOCKFILE"
-trap 'rm -f "$LOCKFILE"' EXIT
-
-# Pre-flight: fix permissions if another user touched the mirror dir (Rhea)
-BAD_PERMS=$(find "$REPO_DIR" ! -user teleo 2>/dev/null | head -1 || true)
-if [ -n "$BAD_PERMS" ]; then
-    log "Fixing mirror permissions (found: $BAD_PERMS)"
-    chown -R teleo:teleo "$REPO_DIR" 2>/dev/null
-fi
-cd "$REPO_DIR" || { log "ERROR: cannot cd to $REPO_DIR"; exit 1; }
-
-# Step 1: Fetch from Forgejo (must succeed — it's authoritative)
-log "Fetching from Forgejo..."
-if ! git fetch forgejo --prune >> "$LOG" 2>&1; then
-    log "ERROR: Forgejo fetch failed — aborting"
-    exit 1
-fi
-
-# Step 2: Fetch from GitHub (warn on failure, don't abort)
-log "Fetching from GitHub..."
-git fetch origin --prune >> "$LOG" 2>&1 || log "WARN: GitHub fetch failed"
-
-# Step 2.1: Fetch GitHub fork PR refs
-# Fork-based PRs don't create branches on origin — they create refs/pull/N/head
-# Fetch these so we can push them to Forgejo for evaluation
-GITHUB_PAT_STEP2=$(cat "$GITHUB_PAT_FILE" 2>/dev/null | tr -d '[:space:]')
-if [ -n "$GITHUB_PAT_STEP2" ]; then
-    OPEN_PRS=$(curl -sf "https://api.github.com/repos/$GITHUB_REPO/pulls?state=open&per_page=100" \
-        -H "Authorization: token $GITHUB_PAT_STEP2" 2>/dev/null || echo "[]")
-    echo "$OPEN_PRS" | python3 -c "
-import sys, json
-prs = json.load(sys.stdin)
-for pr in prs:
-    head = pr.get('head', {})
-    # Only process fork PRs (repo differs from base repo)
-    base_repo = pr.get('base', {}).get('repo', {}).get('full_name', '')
-    head_repo = head.get('repo', {}) or {}
-    head_full = head_repo.get('full_name', '')
-    if head_full and head_full != base_repo:
-        print(f\"{pr['number']} {head.get('ref', '')} {head.get('sha', '')}\")
-" 2>/dev/null | while read pr_num branch_name head_sha; do
-        if [ -z "$pr_num" ] || [ -z "$branch_name" ]; then continue; fi
-        PR_BRANCH="gh-pr-${pr_num}/${branch_name}"
-        # Check if we already have this ref at the right SHA
-        EXISTING=$(git rev-parse "refs/heads/$PR_BRANCH" 2>/dev/null || true)
-        if [ "$EXISTING" = "$head_sha" ]; then continue; fi
-        # Fetch the PR ref and create a local branch
-        git fetch origin "refs/pull/${pr_num}/head:refs/heads/$PR_BRANCH" >> "$LOG" 2>&1 && \
-            log "Fetched fork PR #$pr_num -> $PR_BRANCH" || \
-            log "WARN: Failed to fetch fork PR #$pr_num"
-    done
-fi
-
-# Step 2.5: GitHub main -> Forgejo main (ff-only)
-# If a PR was merged on GitHub, GitHub main is ahead of Forgejo main.
-# Fast-forward Forgejo main to match — safe because ff-only guarantees no divergence.
-GITHUB_MAIN_FF=$(git rev-parse refs/remotes/origin/main 2>/dev/null || true)
-FORGEJO_MAIN_FF=$(git rev-parse refs/remotes/forgejo/main 2>/dev/null || true)
-if [ -n "$GITHUB_MAIN_FF" ] && [ -n "$FORGEJO_MAIN_FF" ]; then
-    if [ "$GITHUB_MAIN_FF" != "$FORGEJO_MAIN_FF" ]; then
-        if git merge-base --is-ancestor "$FORGEJO_MAIN_FF" "$GITHUB_MAIN_FF"; then
-            log "GitHub main ($GITHUB_MAIN_FF) ahead of Forgejo main ($FORGEJO_MAIN_FF) — fast-forwarding"
-            git push forgejo "refs/remotes/origin/main:refs/heads/main" >> "$LOG" 2>&1 && \
-                log "Forgejo main fast-forwarded to $GITHUB_MAIN_FF" || \
-                log "WARN: Failed to fast-forward Forgejo main"
-        fi
-    fi
-fi
-
-# Step 3: Forgejo -> GitHub (primary direction)
-# Update local refs from Forgejo remote refs using process substitution (avoids subshell)
-log "Syncing Forgejo -> GitHub..."
-while read branch; do
-    [ "$branch" = "HEAD" ] && continue
-    git update-ref "refs/heads/$branch" "refs/remotes/forgejo/$branch" 2>/dev/null || \
-        log "WARN: Failed to update ref $branch"
-done < <(git for-each-ref --format="%(refname:lstrip=3)" refs/remotes/forgejo/)
-
-# Safety: verify Forgejo main descends from GitHub main before force-pushing
-GITHUB_MAIN=$(git rev-parse refs/remotes/origin/main 2>/dev/null || true)
-FORGEJO_MAIN=$(git rev-parse refs/remotes/forgejo/main 2>/dev/null || true)
-PUSH_MAIN=true
-if [ -n "$GITHUB_MAIN" ] && [ -n "$FORGEJO_MAIN" ]; then
-    if ! git merge-base --is-ancestor "$GITHUB_MAIN" "$FORGEJO_MAIN"; then
-        log "CRITICAL: Forgejo main is NOT a descendant of GitHub main — skipping main push"
-        log "CRITICAL: GitHub main: $GITHUB_MAIN, Forgejo main: $FORGEJO_MAIN"
-        PUSH_MAIN=false
-    fi
-fi
-
-if [ "$PUSH_MAIN" = true ]; then
-    git push origin --all --force >> "$LOG" 2>&1 || log "WARN: Push to GitHub failed"
-else
-    # Push all branches except main
-    while read branch; do
-        [ "$branch" = "main" ] && continue
-        [ "$branch" = "HEAD" ] && continue
-        git push origin --force "refs/heads/$branch:refs/heads/$branch" >> "$LOG" 2>&1 || \
-            log "WARN: Failed to push $branch to GitHub"
-    done < <(git for-each-ref --format="%(refname:lstrip=2)" refs/heads/)
-fi
-git push origin --tags --force >> "$LOG" 2>&1 || log "WARN: Tag push to GitHub failed"
-
-# Step 4: GitHub -> Forgejo (external contributions only)
-# Only push branches that exist on GitHub but NOT on Forgejo
-log "Checking GitHub-only branches..."
-GITHUB_ONLY=$(comm -23 \
-    <(git for-each-ref --format="%(refname:lstrip=3)" refs/remotes/origin/ | grep -v HEAD | sort) \
-    <(git for-each-ref --format="%(refname:lstrip=3)" refs/remotes/forgejo/ | grep -v HEAD | sort))
-
-if [ -n "$GITHUB_ONLY" ]; then
-    FORGEJO_TOKEN=$(cat /opt/teleo-eval/secrets/forgejo-admin-token 2>/dev/null)
-    for branch in $GITHUB_ONLY; do
-        log "New from GitHub: $branch -> Forgejo"
-        # Fork PR branches live as local refs (from Step 2.1), not on origin remote
-        if [[ "$branch" == gh-pr-* ]]; then
-            git push forgejo "refs/heads/$branch:refs/heads/$branch" >> "$LOG" 2>&1 || {
-                log "WARN: Failed to push fork PR branch $branch to Forgejo"
-                continue
-            }
-        else
-            git push forgejo "refs/remotes/origin/$branch:refs/heads/$branch" >> "$LOG" 2>&1 || {
-                log "WARN: Failed to push $branch to Forgejo"
-                continue
-            }
-        fi
-        # Auto-create PR on Forgejo for mirrored branches (external contributor path)
-        # Skip pipeline-internal branches
-        case "$branch" in
-            extract/*|ingestion/*) continue ;;
-        esac
-        if [ -n "$FORGEJO_TOKEN" ]; then
-            # Check if PR already exists for this branch (open or closed)
-            # NOTE: Forgejo ?head= filter is broken (ignores head value, returns all PRs).
-            # Workaround: fetch open+closed PRs, pipe to Python, check head.ref.
-            HAS_PR=$( {
-                curl -sf "http://localhost:3000/api/v1/repos/teleo/teleo-codex/pulls?state=open&limit=50" \
-                    -H "Authorization: token $FORGEJO_TOKEN" 2>/dev/null || echo "[]"
-                echo ""
-                curl -sf "http://localhost:3000/api/v1/repos/teleo/teleo-codex/pulls?state=closed&sort=created&limit=50" \
-                    -H "Authorization: token $FORGEJO_TOKEN" 2>/dev/null || echo "[]"
-            } | python3 -c "
-import sys, json
-branch = sys.argv[1]
-for line in sys.stdin:
-    line = line.strip()
-    if not line or line == '[]': continue
-    try:
-        for pr in json.loads(line):
-            if pr.get('head', {}).get('ref') == branch:
-                print('yes'); sys.exit(0)
-    except: pass
-print('no')
-" "$branch" 2>/dev/null || echo "no")
-            if [ "$HAS_PR" = "no" ]; then
-                # Build PR title — for fork PRs, use the GitHub PR title
-                if [[ "$branch" == gh-pr-* ]]; then
-                    FORK_GH_NUM=$(echo "$branch" | sed 's|gh-pr-\([0-9]*\)/.*|\1|')
-                    GITHUB_PAT_T=$(cat "$GITHUB_PAT_FILE" 2>/dev/null | tr -d '[:space:]')
-                    PR_TITLE=$(curl -sf "https://api.github.com/repos/$GITHUB_REPO/pulls/$FORK_GH_NUM" \
-                        -H "Authorization: token $GITHUB_PAT_T" 2>/dev/null | \
-                        python3 -c "import sys,json; print(json.load(sys.stdin).get('title',''))" 2>/dev/null || true)
-                    [ -z "$PR_TITLE" ] && PR_TITLE=$(echo "$branch" | sed 's|/|: |;s/-/ /g')
-                else
-                    PR_TITLE=$(echo "$branch" | sed 's|/|: |;s/-/ /g')
-                fi
-                PAYLOAD=$(python3 -c "import sys,json; print(json.dumps({'title':sys.argv[1],'head':sys.argv[2],'base':'main'}))" "$PR_TITLE" "$branch")
-                RESULT=$(curl -sf -X POST "http://localhost:3000/api/v1/repos/teleo/teleo-codex/pulls" \
-                    -H "Authorization: token $FORGEJO_TOKEN" \
-                    -H "Content-Type: application/json" \
-                    -d "$PAYLOAD" 2>/dev/null || echo "")
-                PR_NUM=$(echo "$RESULT" | grep -o '"number":[0-9]*' | head -1 | grep -o "[0-9]*" || true)
-                if [ -n "$PR_NUM" ]; then
-                    log "Auto-created PR #$PR_NUM on Forgejo for $branch"
-                    # Step 4.5: Link GitHub PR to Forgejo PR in pipeline DB
-                    if [[ "$branch" == gh-pr-* ]]; then
-                        GH_PR_NUM=$(echo "$branch" | sed 's|gh-pr-\([0-9]*\)/.*|\1|')
-                    else
-                        GITHUB_PAT=$(cat "$GITHUB_PAT_FILE" 2>/dev/null | tr -d '[:space:]')
-                        GH_PR_NUM=""
-                        if [ -n "$GITHUB_PAT" ]; then
-                            GH_PR_NUM=$(curl -sf "https://api.github.com/repos/$GITHUB_REPO/pulls?head=living-ip:$branch&state=all" \
-                                -H "Authorization: token $GITHUB_PAT" 2>/dev/null | \
-                                python3 -c "import sys,json; prs=json.load(sys.stdin); print(prs[0]['number'] if prs else '')" 2>/dev/null || true)
-                        fi
-                    fi
-                    if [[ "$GH_PR_NUM" =~ ^[0-9]+$ ]] && [[ "$PR_NUM" =~ ^[0-9]+$ ]]; then
-                        sqlite3 "$PIPELINE_DB" "UPDATE prs SET github_pr = $GH_PR_NUM WHERE number = $PR_NUM;" 2>/dev/null && \
-                            log "Linked GitHub PR #$GH_PR_NUM -> Forgejo PR #$PR_NUM" || \
-                            log "WARN: Failed to link GitHub PR #$GH_PR_NUM to Forgejo PR #$PR_NUM in DB"
-                    fi
-                else
-                    log "WARN: Failed to auto-create PR for $branch"
-                fi
-            fi
-        fi
-    done
-else
-    log "No new GitHub-only branches"
-fi
-
-# Step 6: Divergence alerting
-# After all sync steps, check if GitHub and Forgejo main still differ.
-# 2 consecutive divergent cycles (4 min) triggers a one-shot Telegram alert.
-DIVERGENCE_FILE="/opt/teleo-eval/logs/.divergence-count"
-git fetch forgejo main --quiet 2>/dev/null || true
-git fetch origin main --quiet 2>/dev/null || true
-GH_MAIN_FINAL=$(git rev-parse refs/remotes/origin/main 2>/dev/null || true)
-FG_MAIN_FINAL=$(git rev-parse refs/remotes/forgejo/main 2>/dev/null || true)
-
-if [ -n "$GH_MAIN_FINAL" ] && [ -n "$FG_MAIN_FINAL" ] && [ "$GH_MAIN_FINAL" != "$FG_MAIN_FINAL" ]; then
-    PREV=$(cat "$DIVERGENCE_FILE" 2>/dev/null || echo "0")
-    if [ "$PREV" = "alerted" ]; then
-        log "DIVERGENCE: still diverged (already alerted)"
-    else
-        COUNT=$((PREV + 1))
-        echo "$COUNT" > "$DIVERGENCE_FILE"
-        log "DIVERGENCE: cycle $COUNT — GitHub=$GH_MAIN_FINAL Forgejo=$FG_MAIN_FINAL"
-        if [ "$COUNT" -ge 2 ]; then
-            BOT_TOKEN=$(cat /opt/teleo-eval/secrets/telegram-bot-token 2>/dev/null || true)
-            ADMIN_CHAT=$(cat /opt/teleo-eval/secrets/admin-chat-id 2>/dev/null || true)
-            if [ -n "$BOT_TOKEN" ] && [ -n "$ADMIN_CHAT" ]; then
-                ALERT_MSG=$(python3 -c "
-import json, sys
-msg = '⚠️ Mirror divergence detected\\n\\n'
-msg += f'GitHub main: {sys.argv[1][:8]}\\n'
-msg += f'Forgejo main: {sys.argv[2][:8]}\\n'
-msg += f'Diverged for {sys.argv[3]} consecutive cycles ({int(sys.argv[3])*2} min)\\n\\n'
-msg += 'Check sync-mirror.sh logs: /opt/teleo-eval/logs/sync.log'
-print(json.dumps({'chat_id': sys.argv[4], 'text': msg, 'parse_mode': 'HTML'}))
-" "$GH_MAIN_FINAL" "$FG_MAIN_FINAL" "$COUNT" "$ADMIN_CHAT")
-                if curl -sf -X POST "https://api.telegram.org/bot${BOT_TOKEN}/sendMessage" \
-                    -H "Content-Type: application/json" \
-                    -d "$ALERT_MSG" >> "$LOG" 2>&1; then
-                    log "DIVERGENCE: alert sent to admin"
-                    echo "alerted" > "$DIVERGENCE_FILE"
-                else
-                    log "WARN: Failed to send divergence alert (will retry next cycle)"
-                fi
-            else
-                log "WARN: Cannot send divergence alert — missing bot token or admin chat ID"
-            fi
-        fi
-    fi
-else
-    if [ -f "$DIVERGENCE_FILE" ]; then
-        PREV=$(cat "$DIVERGENCE_FILE" 2>/dev/null || echo "0")
-        if [ "$PREV" != "0" ]; then
-            log "DIVERGENCE: resolved — repos back in sync"
-        fi
-        rm -f "$DIVERGENCE_FILE"
-    fi
-fi
-
-log "Sync complete"
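Reviewer note: the Step 6 divergence counter removed above is a small state machine kept in a single file. The following is a minimal Python sketch of that logic, not the script itself. The function name `step` and the `send_alert` callback are hypothetical; the threshold of 2 and the `alerted` marker come from the shell code.

```python
from typing import Callable, Optional

ALERT_THRESHOLD = 2  # consecutive divergent cycles before alerting (from Step 6)


def step(prev: Optional[str], diverged: bool,
         send_alert: Callable[[int], bool]) -> Optional[str]:
    """Return the next content of the counter file, or None to delete it.

    prev       -- current file content ("1", "2", "alerted") or None if absent
    diverged   -- True when the two main refs differ after syncing
    send_alert -- hypothetical callback; returns True if the alert went out
    """
    if not diverged:
        return None                 # back in sync: remove the counter file
    if prev == "alerted":
        return "alerted"            # one-shot: never alert twice per incident
    count = int(prev or "0") + 1
    if count >= ALERT_THRESHOLD and send_alert(count):
        return "alerted"
    return str(count)               # below threshold, or send failed: retry later


if __name__ == "__main__":
    state = None
    for diverged in (True, True, True, False):
        state = step(state, diverged, lambda n: True)
        print(diverged, state)
```

A failed send leaves a numeric count in place, so the next cycle tries again, which matches the "will retry next cycle" log message in the removed script.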
--- a/diagnostics/CONSOLIDATION-DIFF-LOG.md
+++ b/diagnostics/CONSOLIDATION-DIFF-LOG.md
@@ -0,0 +1,47 @@
+# Diagnostics Consolidation Diff Log
+# Branch: epimetheus/consolidate-infra
+# Date: 2026-04-13
+
+## Files with multiple copies — resolution
+
+### alerting.py
+- ROOT diagnostics/alerting.py (22320 bytes) — KEPT (newer: has _ALLOWED_DIM_EXPRS SQL injection protection, stricter dim_expr validation)
+- ops/diagnostics/alerting.py (22039 bytes) — OVERWRITTEN (missing SQL injection guards)
+- VPS /opt/teleo-eval/diagnostics/alerting.py (22039 bytes) — matches ops/ version, needs deploy
+
+### alerting_routes.py
+- ROOT diagnostics/alerting_routes.py (4216 bytes) — KEPT (newer: proper try/finally/conn.close, ValueError catch on hours param)
+- ops/diagnostics/alerting_routes.py (4043 bytes) — OVERWRITTEN (missing error handling, missing conn.close)
+- VPS /opt/teleo-eval/diagnostics/alerting_routes.py (4043 bytes) — matches ops/ version, needs deploy
+
+### vitality.py
+- ROOT diagnostics/vitality.py (25548 bytes) — KEPT (only copy in repo, larger than VPS)
+- VPS /opt/teleo-eval/diagnostics/vitality.py (18539 bytes) — older version, needs deploy
+- MOVED TO: ops/diagnostics/vitality.py
+
+### vitality_routes.py
+- ROOT diagnostics/vitality_routes.py (10824 bytes) — KEPT (only copy in repo, larger than VPS)
+- VPS /opt/teleo-eval/diagnostics/vitality_routes.py (9729 bytes) — older version, needs deploy
+- MOVED TO: ops/diagnostics/vitality_routes.py
+
+## Files moved
+
+| From | To | Reason |
+|------|-----|--------|
+| diagnostics/vitality.py | ops/diagnostics/vitality.py | Consolidate to canonical location |
+| diagnostics/vitality_routes.py | ops/diagnostics/vitality_routes.py | Consolidate to canonical location |
+| diagnostics/alerting.py | ops/diagnostics/alerting.py | Newer version overwrites older |
+| diagnostics/alerting_routes.py | ops/diagnostics/alerting_routes.py | Newer version overwrites older |
+
+## Root diagnostics/ after consolidation
+- PATCH_INSTRUCTIONS.md — kept (documentation, not code)
+- evolution.md — kept (documentation)
+- weekly/2026-03-25-week3.md — kept (report)
+- ops/sessions/*.json — kept (session data)
+- All .py files REMOVED from root diagnostics/
+
+## VPS .bak files inventory (30+ files)
+All in /opt/teleo-eval/diagnostics/. Git is the backup now. Safe to delete after consolidation verified.
+
+## VPS deploy needed after merge
+alerting.py, alerting_routes.py, vitality.py, vitality_routes.py — all local versions are newer than VPS.
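Reviewer note: the log above compares copies by byte size. A content hash is a stricter check when verifying that two locations really hold the same file. The sketch below is one way to do that, under assumptions: the function names and the directory arguments are hypothetical and are not part of this change.

```python
import hashlib
from pathlib import Path


def digest(path: Path) -> str:
    """SHA-256 of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def compare_copies(root: Path, canonical: Path, pattern: str = "*.py") -> dict:
    """Classify every file matching `pattern` found in either directory."""
    names = {p.name for p in root.glob(pattern)}
    names |= {p.name for p in canonical.glob(pattern)}
    report = {}
    for name in sorted(names):
        a, b = root / name, canonical / name
        if not a.exists():
            report[name] = "only in canonical"
        elif not b.exists():
            report[name] = "only in root"
        elif digest(a) == digest(b):
            report[name] = "identical"
        else:
            report[name] = "differs"
    return report


if __name__ == "__main__":
    # Example paths mirror the layout named in the log; adjust as needed.
    for name, status in compare_copies(Path("diagnostics"),
                                       Path("ops/diagnostics")).items():
        print(f"{name}: {status}")
```

Two files of equal size can still differ, so "identical" here is a stronger statement than the byte counts recorded in the log.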
--- a/diagnostics/activity_endpoint.py
+++ b/diagnostics/activity_endpoint.py
@@ -28,9 +28,12 @@ import sqlite3
 import json


-# Non-merged statuses map directly to operation — no semantic classification yet.
-NON_MERGED_STATUS_TO_OPERATION = {
-    'approved': 'new',         # about to become knowledge
+# Map PR status to Clay's operation color palette
+# extract (cyan), new (green), enrich (amber), challenge (red-orange),
+# decision (violet), infra (grey)
+STATUS_TO_OPERATION = {
+    'merged': 'new',           # green — new knowledge merged
+    'approved': 'enrich',      # amber — approved, enriching KB
    'open': 'extract',         # cyan — new extraction in progress
    'validating': 'extract',   # cyan — being validated
    'reviewing': 'extract',    # cyan — under review
@@ -40,51 +43,6 @@ NON_MERGED_STATUS_TO_OPERATION = {
    'conflict': 'challenge',   # red-orange — conflict detected
 }

-# Maintenance commit_types that land on main but don't represent new knowledge.
-_MAINTENANCE_COMMIT_TYPES = {'fix', 'pipeline', 'reweave'}
-
-
-def classify_pr_operation(status, commit_type, branch, description=None):
-    """Derive a Timeline operation from a PR row.
-
-    Priority order for MERGED PRs (commit_type wins over branch prefix —
-    extract/* branches with commit_type='enrich' or 'challenge' classify
-    by commit_type, matching the contributor-role wiring fix):
-      1. commit_type == 'challenge' OR branch.startswith('challenge/') OR
-         description contains 'challenged_by' → 'challenge'
-      2. commit_type == 'enrich' OR branch.startswith('enrich/' | 'reweave/')
-         → 'enrich'
-      3. commit_type in _MAINTENANCE_COMMIT_TYPES → 'infra'
-      4. default (commit_type='knowledge'|'extract'|'research'|'entity' or
-         anything else) → 'new'
-
-    For non-merged PRs, falls back to NON_MERGED_STATUS_TO_OPERATION.
-    """
-    commit_type = (commit_type or '').lower()
-    branch = branch or ''
-    description_lower = (description or '').lower()
-
-    if status != 'merged':
-        return NON_MERGED_STATUS_TO_OPERATION.get(status, 'infra')
-
-    # Challenge takes precedence — the signal is inherently more specific.
-    if (commit_type == 'challenge'
-            or branch.startswith('challenge/')
-            or 'challenged_by' in description_lower):
-        return 'challenge'
-
-    if (commit_type == 'enrich'
-            or branch.startswith('enrich/')
-            or branch.startswith('reweave/')):
-        return 'enrich'
-
-    if commit_type in _MAINTENANCE_COMMIT_TYPES:
-        return 'infra'
-
-    # Default: legacy 'knowledge', new 'extract', 'research', 'entity',
-    # unknown/null commit_type → treat as new knowledge.
-    return 'new'
-
 # Map audit_log stage to operation type
 STAGE_TO_OPERATION = {
    'ingest': 'extract',
@@ -160,8 +118,6 @@ async def handle_activity(request):
    Query params:
        limit (int, default 100, max 500): number of events to return
        cursor (ISO timestamp): return events older than this timestamp
-        type (str, optional): comma-separated operation types to include
-            (extract|new|enrich|challenge|infra). If absent, returns all types.

    Derives events from two sources:
        1. prs table — per-PR events with domain, agent, status
@@ -175,13 +131,6 @@ async def handle_activity(request):
        limit = 100

    cursor = request.query.get('cursor')
-    type_param = request.query.get('type', '').strip()
-    allowed_ops = None
-    if type_param:
-        allowed_ops = {t.strip() for t in type_param.split(',') if t.strip()}
-        if not allowed_ops:
-            allowed_ops = None
-
    db_path = request.app['db_path']

    try:
@@ -194,27 +143,22 @@ async def handle_activity(request):
        # Each PR generates events at created_at and merged_at timestamps
        pr_query = """
            SELECT number, status, domain, agent, branch, source_path,
-                   created_at, merged_at, source_channel, commit_type,
-                   description
+                   created_at, merged_at
            FROM prs
            WHERE {where_clause}
            ORDER BY COALESCE(merged_at, created_at) DESC
            LIMIT ?
        """

-        # Over-fetch when filtering by type so we have enough matching rows after
-        # post-build filtering. Cap at 2000 to avoid runaway queries.
-        fetch_limit = min(2000, limit * 5) if allowed_ops else limit + 1
-
        if cursor:
            rows = conn.execute(
                pr_query.format(where_clause="COALESCE(merged_at, created_at) < ?"),
-                (cursor, fetch_limit)
+                (cursor, limit + 1)
            ).fetchall()
        else:
            rows = conn.execute(
                pr_query.format(where_clause="1=1"),
-                (fetch_limit,)
+                (limit + 1,)
            ).fetchall()

        # Known knowledge agents for branch-prefix inference
@@ -222,14 +166,7 @@ async def handle_activity(request):

        for row in rows:
            row_dict = dict(row)
-            operation = classify_pr_operation(
-                row_dict['status'],
-                row_dict.get('commit_type'),
-                row_dict.get('branch'),
-                row_dict.get('description'),
-            )
-            if allowed_ops and operation not in allowed_ops:
-                continue
+            operation = STATUS_TO_OPERATION.get(row_dict['status'], 'infra')
            description = pr_description(row_dict)

            # Use merged_at if available (more interesting event), else created_at
@@ -252,7 +189,6 @@ async def handle_activity(request):
                'description': description,
                'status': row_dict['status'],
                'pr_number': row_dict['number'],
-                'source_channel': row_dict.get('source_channel') or 'unknown',
            })

        # Source 2: Audit log events (secondary — pipeline-level)
@@ -281,8 +217,6 @@ async def handle_activity(request):
            for row in audit_rows:
                row_dict = dict(row)
                operation = STAGE_TO_OPERATION.get(row_dict['stage'], 'infra')
-                if allowed_ops and operation not in allowed_ops:
-                    continue
                description = audit_description(row_dict)

                events.append({
@@ -294,7 +228,6 @@ async def handle_activity(request):
                    'description': description,
                    'status': None,
                    'pr_number': None,
-                    'source_channel': None,  # audit events not tied to a PR
                })

        conn.close()
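Reviewer note: in this version of activity_endpoint.py the operation is a plain dictionary lookup on the PR status, with `infra` as the fallback. The sketch below reproduces only the entries visible in the hunks above (the lines elided between hunks are left out), and the helper name `operation_for` is hypothetical.

```python
# Partial copy of STATUS_TO_OPERATION: only the entries shown in the diff.
STATUS_TO_OPERATION = {
    "merged": "new",            # green
    "approved": "enrich",       # amber
    "open": "extract",          # cyan
    "validating": "extract",    # cyan
    "reviewing": "extract",     # cyan
    "conflict": "challenge",    # red-orange
}


def operation_for(status):
    """Hypothetical helper mirroring STATUS_TO_OPERATION.get(status, 'infra')."""
    return STATUS_TO_OPERATION.get(status, "infra")


if __name__ == "__main__":
    for status in ("merged", "approved", "open", "conflict", "some-other-status"):
        print(status, "->", operation_for(status))
```

With this mapping every merged PR is reported as `new`; commit_type, branch prefix, and description no longer influence the result, which is the behavioral difference from the removed `classify_pr_operation`.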
--- a/diagnostics/activity_feed_api.py
+++ b/diagnostics/activity_feed_api.py
@@ -1,214 +0,0 @@
-"""Activity feed API — serves contribution events from pipeline.db."""
-import re
-import sqlite3
-import math
-import time
-from aiohttp import web
-
-DB_PATH = "/opt/teleo-eval/pipeline/pipeline.db"
-_cache = {"data": None, "ts": 0}
-CACHE_TTL = 60  # 1 minute — activity should feel fresh
-
-
-def _get_conn():
-    conn = sqlite3.connect(DB_PATH)
-    conn.row_factory = sqlite3.Row
-    conn.execute("PRAGMA busy_timeout = 10000")
-    return conn
-
-
-def _classify_event(branch, description, commit_type):
-    if commit_type != "knowledge":
-        return None
-    if branch and branch.startswith("extract/"):
-        return "create"
-    if branch and branch.startswith("reweave/"):
-        return "enrich"
-    if branch and branch.startswith("challenge/"):
-        return "challenge"
-    if description and "challenged_by" in description.lower():
-        return "challenge"
-    if branch and branch.startswith("enrich/"):
-        return "enrich"
-    return "create"
-
-
-def _normalize_contributor(submitted_by, agent):
-    if submitted_by and submitted_by.strip():
-        name = submitted_by.strip().lstrip("@")
-        return name
-    if agent and agent.strip() and agent != "pipeline":
-        return agent.strip()
-    return "pipeline"
-
-
-def _summary_from_branch(branch):
-    if not branch:
-        return ""
-    parts = branch.split("/", 1)
-    if len(parts) < 2:
-        return ""
-    slug = parts[1]
-    slug = re.sub(r"^[\d-]+-", "", slug)  # strip date prefix
-    slug = re.sub(r"-[a-f0-9]{4}$", "", slug)  # strip hash suffix
-    return slug.replace("-", " ").strip().capitalize()
-
-
-def _extract_claim_slugs(description, branch=None):
-    if not description:
-        if branch:
-            parts = branch.split("/", 1)
-            if len(parts) > 1:
-                return [parts[1][:120]]
-        return []
-    titles = [t.strip() for t in description.split("|") if t.strip()]
-    slugs = []
-    for title in titles:
-        slug = title.lower().strip()
-        slug = "".join(c if c.isalnum() or c in (" ", "-") else "" for c in slug)
-        slug = slug.replace(" ", "-").strip("-")
-        if len(slug) > 10:
-            slugs.append(slug[:120])
-    return slugs
-
-
-def _hot_score(challenge_count, enrich_count, signal_count, hours_since):
-    numerator = challenge_count * 3 + enrich_count * 2 + signal_count
-    denominator = max(hours_since, 0.5) ** 1.5
-    return numerator / denominator
-
-
-def _build_events():
-    conn = _get_conn()
-    try:
-        rows = conn.execute("""
-            SELECT p.number, p.branch, p.domain, p.agent, p.submitted_by,
-                   p.merged_at, p.description, p.commit_type, p.cost_usd,
-                   p.source_channel
-            FROM prs p
-            WHERE p.status = 'merged'
-              AND p.commit_type = 'knowledge'
-              AND p.merged_at IS NOT NULL
-            ORDER BY p.merged_at DESC
-            LIMIT 2000
-        """).fetchall()
-
-        events = []
-        claim_activity = {}  # slug -> {challenges, enriches, signals, first_seen}
-
-        for row in rows:
-            event_type = _classify_event(row["branch"], row["description"], row["commit_type"])
-            if not event_type:
-                continue
-
-            contributor = _normalize_contributor(row["submitted_by"], row["agent"])
-            slugs = _extract_claim_slugs(row["description"], row["branch"])
-            merged_at = row["merged_at"] or ""
-
-            ci_map = {"create": 0.35, "enrich": 0.25, "challenge": 0.40}
-            ci_earned = ci_map.get(event_type, 0)
-
-            for slug in slugs:
-                if slug not in claim_activity:
-                    claim_activity[slug] = {
-                        "challenges": 0, "enriches": 0, "signals": 0,
-                        "first_seen": merged_at,
-                    }
-                if event_type == "challenge":
-                    claim_activity[slug]["challenges"] += 1
-                elif event_type == "enrich":
-                    claim_activity[slug]["enriches"] += 1
-                else:
-                    claim_activity[slug]["signals"] += 1
-
-            summary_text = ""
-            if row["description"]:
-                first_title = row["description"].split("|")[0].strip()
-                if len(first_title) > 120:
-                    first_title = first_title[:117] + "..."
-                summary_text = first_title
-            elif row["branch"]:
-                summary_text = _summary_from_branch(row["branch"])
-
-            for slug in (slugs[:1] if slugs else [""]):
-                events.append({
-                    "type": event_type,
-                    "claim_slug": slug,
-                    "domain": row["domain"] or "unknown",
-                    "contributor": contributor,
-                    "timestamp": merged_at,
-                    "ci_earned": round(ci_earned, 2),
-                    "summary": summary_text,
-                    "pr_number": row["number"],
-                    "source_channel": row["source_channel"] or "unknown",
-                })
-
-        return events, claim_activity
-    finally:
-        conn.close()
-
-
-def _sort_events(events, claim_activity, sort_mode, now_ts):
-    if sort_mode == "recent":
-        events.sort(key=lambda e: e["timestamp"], reverse=True)
-    elif sort_mode == "hot":
-        def hot_key(e):
-            slug = e["claim_slug"]
-            ca = claim_activity.get(slug, {"challenges": 0, "enriches": 0, "signals": 0})
-            try:
-                from datetime import datetime
-                evt_time = datetime.fromisoformat(e["timestamp"].replace("Z", "+00:00"))
-                hours = (now_ts - evt_time.timestamp()) / 3600
-            except (ValueError, AttributeError):
-                hours = 9999
-            return _hot_score(ca["challenges"], ca["enriches"], ca["signals"], hours)
-        events.sort(key=hot_key, reverse=True)
-    elif sort_mode == "important":
-        type_rank = {"challenge": 0, "enrich": 1, "create": 2}
-        events.sort(key=lambda e: (type_rank.get(e["type"], 3), -len(e["summary"])))
-    return events
-
-
-async def handle_activity_feed(request):
-    sort_mode = request.query.get("sort", "recent")
-    if sort_mode not in ("hot", "recent", "important"):
-        sort_mode = "recent"
-    domain = request.query.get("domain", "")
-    contributor = request.query.get("contributor", "")
-    try:
-        limit = min(int(request.query.get("limit", "20")), 100)
-    except ValueError:
-        limit = 20
-    try:
-        offset = max(int(request.query.get("offset", "0")), 0)
-    except ValueError:
-        offset = 0
-
-    now = time.time()
-    if _cache["data"] is None or (now - _cache["ts"]) > CACHE_TTL:
-        _cache["data"] = _build_events()
-        _cache["ts"] = now
-
-    events, claim_activity = _cache["data"]
-
-    filtered = events
-    if domain:
-        filtered = [e for e in filtered if e["domain"] == domain]
-    if contributor:
-        filtered = [e for e in filtered if e["contributor"] == contributor]
-
-    sorted_events = _sort_events(list(filtered), claim_activity, sort_mode, now)
-    total = len(sorted_events)
-    page = sorted_events[offset:offset + limit]
-
-    return web.json_response({
-        "events": page,
-        "total": total,
-        "sort": sort_mode,
-        "offset": offset,
-        "limit": limit,
-    }, headers={"Access-Control-Allow-Origin": "*"})
-
-
-def register(app):
-    app.router.add_get("/api/activity-feed", handle_activity_feed)
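Reviewer note: the `hot` sort in the removed module ranks events by a weighted activity count divided by a power of the event's age. The sketch below restates `_hot_score` under a public name with two worked values. The weights (3, 2, 1), the 0.5-hour floor, and the 1.5 exponent are taken from the removed code; the sample inputs are made up for illustration.

```python
def hot_score(challenges: int, enriches: int, signals: int,
              hours_since: float) -> float:
    """Weighted activity divided by age**1.5, with age floored at 0.5 hours."""
    numerator = challenges * 3 + enriches * 2 + signals
    denominator = max(hours_since, 0.5) ** 1.5
    return numerator / denominator


if __name__ == "__main__":
    # One challenge, four hours old: 3 / 4**1.5 = 3 / 8
    print(hot_score(1, 0, 0, 4.0))   # 0.375
    # Two enrichments, one hour old: 4 / 1**1.5 = 4
    print(hot_score(0, 2, 0, 1.0))   # 4.0
    # Anything younger than 30 minutes scores as if it were 30 minutes old.
    print(hot_score(0, 0, 1, 0.1) == hot_score(0, 0, 1, 0.5))   # True
```

The floor keeps very recent events from dividing by a value near zero and dominating the ranking.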
--- a/diagnostics/alerting.py
+++ b/diagnostics/alerting.py
@@ -67,8 +67,6 @@ def check_agent_health(conn: sqlite3.Connection) -> list[dict]:
    now = datetime.now(timezone.utc)
    for r in rows:
        agent = r["agent"]
-        if agent in ("unknown", None):
-            continue
        latest = r["latest"]
        if not latest:
            continue
@@ -268,22 +266,24 @@ def check_rejection_spike(conn: sqlite3.Connection) -> list[dict]:
    """Detect single rejection reason exceeding REJECTION_SPIKE_RATIO of recent rejections."""
    alerts = []

-    # Total rejected PRs in 24h (prs.eval_issues is the canonical source — Epimetheus 2026-04-02)
+    # Total rejections in 24h
    total = conn.execute(
-        """SELECT COUNT(*) as n FROM prs
-           WHERE eval_issues IS NOT NULL AND eval_issues != '[]'
-           AND created_at > datetime('now', '-24 hours')"""
+        """SELECT COUNT(*) as n FROM audit_log
+           WHERE stage='evaluate'
+           AND event IN ('changes_requested','domain_rejected','tier05_rejected')
+           AND timestamp > datetime('now', '-24 hours')"""
    ).fetchone()["n"]

    if total < 10:
        return alerts  # Not enough data

-    # Count by rejection tag from prs.eval_issues
+    # Count by rejection tag
    tags = conn.execute(
        """SELECT value as tag, COUNT(*) as cnt
-           FROM prs, json_each(prs.eval_issues)
-           WHERE eval_issues IS NOT NULL AND eval_issues != '[]'
-           AND created_at > datetime('now', '-24 hours')
+           FROM audit_log, json_each(json_extract(detail, '$.issues'))
+           WHERE stage='evaluate'
+           AND event IN ('changes_requested','domain_rejected','tier05_rejected')
+           AND timestamp > datetime('now', '-24 hours')
           GROUP BY tag ORDER BY cnt DESC"""
    ).fetchall()

@@ -315,13 +315,16 @@ def check_stuck_loops(conn: sqlite3.Connection) -> list[dict]:
    """Detect agents repeatedly failing on the same rejection reason."""
    alerts = []

-    # Agent + rejection reason from prs table directly (Epimetheus correction 2026-04-02)
+    # COALESCE: rejection events use $.agent, eval events use $.domain_agent (Epimetheus 2026-03-28)
    rows = conn.execute(
-        """SELECT agent, value as tag, COUNT(*) as cnt
-           FROM prs, json_each(prs.eval_issues)
-           WHERE eval_issues IS NOT NULL AND eval_issues != '[]'
-           AND agent IS NOT NULL
-           AND created_at > datetime('now', '-6 hours')
+        """SELECT COALESCE(json_extract(detail, '$.agent'), json_extract(detail, '$.domain_agent')) as agent,
+                  value as tag,
+                  COUNT(*) as cnt
+           FROM audit_log, json_each(json_extract(detail, '$.issues'))
+           WHERE stage='evaluate'
+           AND event IN ('changes_requested','domain_rejected','tier05_rejected')
+           AND timestamp > datetime('now', '-6 hours')
+           AND COALESCE(json_extract(detail, '$.agent'), json_extract(detail, '$.domain_agent')) IS NOT NULL
           GROUP BY agent, tag
           HAVING cnt > ?""",
        (STUCK_LOOP_THRESHOLD,),
@@ -409,13 +412,16 @@ def check_domain_rejection_patterns(conn: sqlite3.Connection) -> list[dict]:
    """Track rejection reason shift per domain — surfaces domain maturity issues."""
    alerts = []

-    # Per-domain rejection breakdown in 24h from prs table (Epimetheus correction 2026-04-02)
+    # Per-domain rejection breakdown in 24h
    rows = conn.execute(
-        """SELECT domain, value as tag, COUNT(*) as cnt
-           FROM prs, json_each(prs.eval_issues)
-           WHERE eval_issues IS NOT NULL AND eval_issues != '[]'
-           AND domain IS NOT NULL
-           AND created_at > datetime('now', '-24 hours')
+        """SELECT json_extract(detail, '$.domain') as domain,
+                  value as tag,
+                  COUNT(*) as cnt
+           FROM audit_log, json_each(json_extract(detail, '$.issues'))
+           WHERE stage='evaluate'
+           AND event IN ('changes_requested','domain_rejected','tier05_rejected')
+           AND timestamp > datetime('now', '-24 hours')
+           AND json_extract(detail, '$.domain') IS NOT NULL
           GROUP BY domain, tag
           ORDER BY domain, cnt DESC"""
    ).fetchall()
@@ -467,11 +473,12 @@ def generate_failure_report(conn: sqlite3.Connection, agent: str, hours: int = 2
    hours = int(hours)  # defensive — callers should pass int, but enforce it
    rows = conn.execute(
        """SELECT value as tag, COUNT(*) as cnt,
-                  GROUP_CONCAT(DISTINCT number) as pr_numbers
-           FROM prs, json_each(prs.eval_issues)
-           WHERE eval_issues IS NOT NULL AND eval_issues != '[]'
-           AND agent = ?
-           AND created_at > datetime('now', ? || ' hours')
+                  GROUP_CONCAT(DISTINCT json_extract(detail, '$.pr')) as pr_numbers
+           FROM audit_log, json_each(json_extract(detail, '$.issues'))
+           WHERE stage='evaluate'
+           AND event IN ('changes_requested','domain_rejected','tier05_rejected')
+           AND json_extract(detail, '$.agent') = ?
+           AND timestamp > datetime('now', ? || ' hours')
           GROUP BY tag ORDER BY cnt DESC
           LIMIT 5""",
        (agent, f"-{hours}"),
--- a/diagnostics/app.py
+++ b/diagnostics/app.py
@@ -42,7 +42,7 @@ API_KEY_FILE = Path(os.environ.get("ARGUS_API_KEY_FILE", "/opt/teleo-eval/secret

 # Endpoints that skip auth (dashboard is public for now, can lock later)
 _PUBLIC_PATHS = frozenset({"/", "/prs", "/ops", "/health", "/agents", "/epistemic", "/legacy", "/audit", "/api/metrics", "/api/snapshots", "/api/vital-signs",
-                           "/api/contributors", "/api/domains", "/api/audit", "/api/yield", "/api/cost-per-claim", "/api/fix-rates", "/api/compute-profile", "/api/review-queue", "/api/daily-digest", "/api/search"})
+                           "/api/contributors", "/api/domains", "/api/audit", "/api/yield", "/api/cost-per-claim", "/api/fix-rates", "/api/compute-profile", "/api/review-queue", "/api/daily-digest"})


 def _get_db() -> sqlite3.Connection:
@@ -663,115 +663,38 @@ async def handle_api_domains(request):
    return web.json_response({"domains": breakdown})


-def _qdrant_hits_to_results(hits, include_expanded=False):
-    """Shape raw Qdrant hits into Ship's chat-API contract."""
-    results = []
-    for h in hits:
-        payload = h.get("payload", {}) or {}
-        path = payload.get("claim_path", "") or ""
-        slug = path.rsplit("/", 1)[-1]
-        if slug.endswith(".md"):
-            slug = slug[:-3]
-        results.append({
-            "slug": slug,
-            "path": path,
-            "title": payload.get("claim_title", ""),
-            "domain": payload.get("domain"),
-            "confidence": payload.get("confidence"),
-            "score": round(float(h.get("score", 0.0) or 0.0), 4),
-            "body_excerpt": payload.get("snippet", "") or "",
-        })
-    return results
-
-
 async def handle_api_search(request):
-    """Semantic search over claims via Qdrant.
+    """GET /api/search — semantic search over claims via Qdrant + graph expansion.

-    POST contract (Ship's chat API):
-      body: {"query": str, "limit": int, "min_score": float?, "domain": str?, "confidence": str?, "exclude": [str]?}
-      response: {"query": str, "results": [{"slug","path","title","domain","confidence","score","body_excerpt"}], "total": int}
-
-    GET (legacy + hackathon debug):
-      q: search query (required)
-      limit, domain, confidence, exclude, expand
-      min_score: if set, bypasses two-pass lib threshold (default lib behavior otherwise)
+    Query params:
+      q:          search query (required)
+      domain:     filter by domain (optional)
+      confidence: filter by confidence level (optional)
+      limit:      max results, default 10 (optional)
+      exclude:    comma-separated claim paths to exclude (optional)
+      expand:     enable graph expansion, default true (optional)
    """
-    if request.method == "POST":
-        try:
-            body = await request.json()
-        except Exception:
-            return web.json_response({"error": "invalid JSON body"}, status=400)
-
-        query = (body.get("query") or "").strip()
-        if not query:
-            return web.json_response({"error": "query required"}, status=400)
-
-        try:
-            limit = min(int(body.get("limit") or 5), 50)
-        except (TypeError, ValueError):
-            return web.json_response({"error": "limit must be int"}, status=400)
-        try:
-            min_score = float(body.get("min_score") if body.get("min_score") is not None else 0.25)
-        except (TypeError, ValueError):
-            return web.json_response({"error": "min_score must be float"}, status=400)
-
-        domain = body.get("domain")
-        confidence = body.get("confidence")
-        exclude = body.get("exclude") or None
-
-        vector = embed_query(query)
-        if vector is None:
-            return web.json_response({"error": "embedding failed"}, status=502)
-
-        hits = search_qdrant(vector, limit=limit, domain=domain,
-                             confidence=confidence, exclude=exclude,
-                             score_threshold=min_score)
-        results = _qdrant_hits_to_results(hits)
-        return web.json_response({"query": query, "results": results, "total": len(results)})
-
-    # GET path
    query = request.query.get("q", "").strip()
    if not query:
        return web.json_response({"error": "q parameter required"}, status=400)

    domain = request.query.get("domain")
    confidence = request.query.get("confidence")
-    try:
-        limit = min(int(request.query.get("limit", "10")), 50)
-    except ValueError:
-        return web.json_response({"error": "limit must be int"}, status=400)
+    limit = min(int(request.query.get("limit", "10")), 50)
    exclude_raw = request.query.get("exclude", "")
    exclude = [p.strip() for p in exclude_raw.split(",") if p.strip()] if exclude_raw else None
    expand = request.query.get("expand", "true").lower() != "false"
-    min_score_raw = request.query.get("min_score")

-    if min_score_raw is not None:
-        try:
-            min_score = float(min_score_raw)
-        except ValueError:
-            return web.json_response({"error": "min_score must be float"}, status=400)
-        vector = embed_query(query)
-        if vector is None:
-            return web.json_response({"error": "embedding failed"}, status=502)
-        hits = search_qdrant(vector, limit=limit, domain=domain,
-                             confidence=confidence, exclude=exclude,
-                             score_threshold=min_score)
-        direct = _qdrant_hits_to_results(hits)
-        return web.json_response({
-            "query": query,
-            "direct_results": direct,
-            "expanded_results": [],
-            "total": len(direct),
-        })
-
-    # Default GET: Layer 1 + Layer 2 via lib
+    # Use shared search library (Layer 1 + Layer 2)
    result = kb_search(query, expand=expand,
                       domain=domain, confidence=confidence, exclude=exclude)
+
    if "error" in result:
        error = result["error"]
        if error == "embedding_failed":
            return web.json_response({"error": "embedding failed"}, status=502)
        return web.json_response({"error": error}, status=500)
+
    return web.json_response(result)
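
Review note: this hunk drops the `try/except ValueError` around the `limit` query parameter, so a non-numeric value such as `?limit=abc` now raises inside the handler instead of producing the earlier `{"error": "limit must be int"}` 400 response. A minimal sketch of a helper that would keep the validation while leaving the handler short is below. `parse_limit` and its return convention are hypothetical and not part of this commit.

```python
def parse_limit(raw, default=10, maximum=50):
    """Parse a `limit` query value; return (limit, error_message).

    `raw` is the string from request.query.get("limit"), or None when absent.
    Exactly one of the two returned values is None.
    """
    if raw is None:
        return default, None
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return None, "limit must be int"
    # Same upper bound as the handler in this diff; no lower bound is added here.
    return min(value, maximum), None


if __name__ == "__main__":
    for candidate in (None, "7", "500", "abc"):
        print(repr(candidate), "->", parse_limit(candidate))
```

In the handler, the caller would return a 400 JSON response when the error string is set and pass the integer through otherwise.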


@@ -2345,7 +2268,6 @@ def create_app() -> web.Application:
    app.router.add_get("/api/contributors", handle_api_contributors)
    app.router.add_get("/api/domains", handle_api_domains)
    app.router.add_get("/api/search", handle_api_search)
-    app.router.add_post("/api/search", handle_api_search)
    app.router.add_get("/api/audit", handle_api_audit)
    app.router.add_get("/audit", handle_audit_page)
    app.router.add_post("/api/usage", handle_api_usage)
@@ -2355,24 +2277,9 @@ def create_app() -> web.Application:
    register_dashboard_routes(app, lambda: _conn_from_app(app))
    register_review_queue_routes(app)
    register_daily_digest_routes(app, db_path=str(DB_PATH))
-    # Portfolio
-    from dashboard_portfolio import register_portfolio_routes
-    register_portfolio_routes(app, lambda: _conn_from_app(app))
    # Response audit - cost tracking + reasoning traces
    app["db_path"] = str(DB_PATH)
    register_response_audit_routes(app)
-    # Timeline activity feed (per-PR + audit_log events for dashboard v2)
-    from activity_endpoint import handle_activity
-    app.router.add_get("/api/activity", handle_activity)
-    # Gamification activity feed (hot/recent/important sort)
-    from activity_feed_api import register as register_activity_feed
-    register_activity_feed(app)
-    # Claims browser + detail
-    from claims_api import register_claims_routes
-    register_claims_routes(app)
-    # Contributor profile (handle lookup, leaderboard with action CI)
-    from contributor_profile_api import register_contributor_routes
-    register_contributor_routes(app)
    app.on_cleanup.append(_cleanup)
    return app

--- a/diagnostics/claims_api.py
+++ b/diagnostics/claims_api.py
@@ -1,161 +0,0 @@
-"""Claims API endpoint — serves claim data from the codex filesystem."""
-import os
-import re
-import time
-import yaml
-from pathlib import Path
-from aiohttp import web
-
-CODEX_ROOT = Path("/opt/teleo-eval/workspaces/main/domains")
-_cache = {"data": None, "ts": 0}
-CACHE_TTL = 300  # 5 minutes
-
-def _parse_frontmatter(filepath):
-    try:
-        text = filepath.read_text(encoding="utf-8")
-        if not text.startswith("---"):
-            return None
-        end = text.index("---", 3)
-        fm = yaml.safe_load(text[3:end])
-        if not fm or fm.get("type") != "claim":
-            return None
-        body = text[end+3:].strip()
-        # Count wiki-links
-        links = re.findall(r"\[\[([^\]]+)\]\]", body)
-        # Extract first paragraph as summary
-        paragraphs = [p.strip() for p in body.split("\n\n") if p.strip() and not p.strip().startswith("#")]
-        summary = paragraphs[0][:300] if paragraphs else ""
-        return {
-            "slug": filepath.stem,
-            "title": fm.get("title", filepath.stem.replace("-", " ")),
-            "domain": fm.get("domain", "unknown"),
-            "confidence": fm.get("confidence", "unknown"),
-            "agent": fm.get("agent"),
-            "scope": fm.get("scope"),
-            "created": str(fm.get("created", "")),
-            "source": fm.get("source", "") if isinstance(fm.get("source"), str) else "",
-            "sourcer": fm.get("sourcer", ""),
-            "wiki_link_count": len(links),
-            "summary": summary,
-            "challenged_by": fm.get("challenged_by"),
-            "related_claims": fm.get("related_claims", []),
-        }
-    except Exception:
-        return None
-
-
-def _load_all_claims():
-    now = time.time()
-    if _cache["data"] and now - _cache["ts"] < CACHE_TTL:
-        return _cache["data"]
-
-    claims = []
-    for domain_dir in sorted(CODEX_ROOT.iterdir()):
-        if not domain_dir.is_dir():
-            continue
-        for f in sorted(domain_dir.glob("*.md")):
-            if f.name == "_map.md":
-                continue
-            c = _parse_frontmatter(f)
-            if c:
-                claims.append(c)
-
-    _cache["data"] = claims
-    _cache["ts"] = now
-    return claims
-
-
-async def handle_claims(request):
-    claims = _load_all_claims()
-
-    # Filters
-    domain = request.query.get("domain")
-    search = request.query.get("q", "").lower()
-    confidence = request.query.get("confidence")
-    agent = request.query.get("agent")
-    sort = request.query.get("sort", "recent")  # recent, alpha, domain
-
-    filtered = claims
-    if domain:
-        filtered = [c for c in filtered if c["domain"] == domain]
-    if confidence:
-        filtered = [c for c in filtered if c["confidence"] == confidence]
-    if agent:
-        filtered = [c for c in filtered if c["agent"] == agent]
-    if search:
-        filtered = [c for c in filtered if search in c["title"].lower() or search in c["summary"].lower()]
-
-    # Sort
-    if sort == "recent":
-        filtered.sort(key=lambda c: c["created"], reverse=True)
-    elif sort == "alpha":
-        filtered.sort(key=lambda c: c["title"].lower())
-    elif sort == "domain":
-        filtered.sort(key=lambda c: (c["domain"], c["title"].lower()))
-
-    # Pagination
-    limit = min(int(request.query.get("limit", "50")), 200)
-    offset = int(request.query.get("offset", "0"))
-    page = filtered[offset:offset+limit]
-
-    # Domain counts for sidebar
-    domain_counts = {}
-    for c in claims:
-        domain_counts[c["domain"]] = domain_counts.get(c["domain"], 0) + 1
-
-    return web.json_response({
-        "claims": page,
-        "total": len(filtered),
-        "offset": offset,
-        "limit": limit,
-        "domains": dict(sorted(domain_counts.items(), key=lambda x: -x[1])),
-        "confidence_levels": sorted(set(c["confidence"] for c in claims)),
-        "agents": sorted(set(c["agent"] for c in claims if c["agent"])),
-    }, headers={"Access-Control-Allow-Origin": "*"})
-
-
-async def handle_claim_detail(request):
-    slug = request.match_info["slug"]
-    claims = _load_all_claims()
-    for c in claims:
-        if c["slug"] == slug:
-            # Read full body for detail view
-            for domain_dir in CODEX_ROOT.iterdir():
-                if not domain_dir.is_dir():
-                    continue
-                f = domain_dir / f"{slug}.md"
-                if f.exists():
-                    text = f.read_text(encoding="utf-8")
-                    end = text.index("---", 3)
-                    body = text[end+3:].strip()
-                    c["body"] = body
-                    break
-            return web.json_response(c, headers={"Access-Control-Allow-Origin": "*"})
-    return web.json_response({"error": "claim not found"}, status=404)
-
-
-async def handle_domains(request):
-    claims = _load_all_claims()
-    domains = {}
-    for c in claims:
-        d = c["domain"]
-        if d not in domains:
-            domains[d] = {"name": d, "count": 0, "agents": set(), "confidence_dist": {}}
-        domains[d]["count"] += 1
-        if c["agent"]:
-            domains[d]["agents"].add(c["agent"])
-        conf = c["confidence"]
-        domains[d]["confidence_dist"][conf] = domains[d]["confidence_dist"].get(conf, 0) + 1
-
-    result = []
-    for d in sorted(domains.values(), key=lambda x: -x["count"]):
-        d["agents"] = sorted(d["agents"])
-        result.append(d)
-
-    return web.json_response(result, headers={"Access-Control-Allow-Origin": "*"})
-
-
-def register_claims_routes(app):
-    app.router.add_get("/api/claims", handle_claims)
-    app.router.add_get("/api/claims/{slug}", handle_claim_detail)
-    app.router.add_get("/api/domains", handle_domains)
--- a/diagnostics/contributor_profile_api.py
+++ b/diagnostics/contributor_profile_api.py
@@ -1,365 +0,0 @@
-"""Contributor profile API — GET /api/contributors/{handle}"""
-
-import sqlite3
-import json
-import os
-import re
-import subprocess
-from datetime import datetime
-
-DB_PATH = os.environ.get("PIPELINE_DB", "/opt/teleo-eval/pipeline/pipeline.db")
-SYSTEM_ACCOUNTS = {"pipeline", "unknown", "teleo-agents", "teleo pipeline"}
-CODEX_PATH = "/opt/teleo-eval/workspaces/main"
-
-CI_WEIGHTS = {
-    "sourcer": 0.15,
-    "extractor": 0.05,
-    "challenger": 0.35,
-    "synthesizer": 0.25,
-    "reviewer": 0.20,
-}
-
-FOUNDING_CUTOFF = "2026-03-15"
-
-BADGE_DEFS = {
-    "FOUNDING CONTRIBUTOR": {"rarity": "limited", "desc": "Contributed during pre-launch phase"},
-    "BELIEF MOVER": {"rarity": "rare", "desc": "Challenge that led to a claim revision"},
-    "KNOWLEDGE SOURCER": {"rarity": "uncommon", "desc": "Source that generated 3+ claims"},
-    "DOMAIN SPECIALIST": {"rarity": "rare", "desc": "Top 3 CI contributor in a domain"},
-    "VETERAN": {"rarity": "uncommon", "desc": "10+ accepted contributions"},
-    "FIRST BLOOD": {"rarity": "common", "desc": "First contribution of any kind"},
-    "CONTRIBUTOR": {"rarity": "common", "desc": "Account created + first accepted contribution"},
-}
-
-
-def _get_conn():
-    conn = sqlite3.connect(DB_PATH)
-    conn.row_factory = sqlite3.Row
-    return conn
-
-
-def _compute_ci(row):
-    total = 0
-    for role, weight in CI_WEIGHTS.items():
-        total += (row.get(f"{role}_count", 0) or 0) * weight
-    return round(total, 2)
-
-
-def _compute_badges(handle, row, domain_breakdown, conn):
-    badges = []
-    first = row.get("first_contribution", "")
-
-    if first and first <= FOUNDING_CUTOFF:
-        badges.append("FOUNDING CONTRIBUTOR")
-
-    claims = row.get("claims_merged", 0) or 0
-    if claims > 0:
-        badges.append("CONTRIBUTOR")
-        badges.append("FIRST BLOOD")
-
-    if claims >= 10:
-        badges.append("VETERAN")
-
-    challenger = row.get("challenger_count", 0) or 0
-    challenge_ci = row.get("_challenge_count_from_scores", 0)
-    if challenger > 0 or challenge_ci > 0:
-        badges.append("BELIEF MOVER")
-
-    sourcer = row.get("sourcer_count", 0) or 0
-    if sourcer >= 3:
-        badges.append("KNOWLEDGE SOURCER")
-
-    return badges
-
-
-def _get_domain_breakdown(handle, conn):
-    rows = conn.execute("""
-        SELECT domain, COUNT(*) as cnt
-        FROM prs
-        WHERE status='merged' AND (LOWER(agent)=LOWER(?) OR LOWER(submitted_by)=LOWER(?))
-        AND domain IS NOT NULL
-        GROUP BY domain ORDER BY cnt DESC
-    """, (handle, handle)).fetchall()
-    return {r["domain"]: r["cnt"] for r in rows}
-
-
-def _get_contribution_timeline(handle, conn, limit=20):
-    rows = conn.execute("""
-        SELECT number, domain, status, created_at, description, commit_type, source_path
-        FROM prs
-        WHERE status='merged' AND (LOWER(agent)=LOWER(?) OR LOWER(submitted_by)=LOWER(?))
-        ORDER BY created_at DESC LIMIT ?
-    """, (handle, handle, limit)).fetchall()
-
-    timeline = []
-    for r in rows:
-        desc = r["description"] or ""
-        if not desc and r["source_path"]:
-            desc = os.path.basename(r["source_path"]).replace("-", " ").replace(".md", "")
-        timeline.append({
-            "pr_number": r["number"],
-            "domain": r["domain"],
-            "date": r["created_at"][:10] if r["created_at"] else None,
-            "type": _classify_commit(r["commit_type"]),
-            "summary": desc[:200] if desc else None,
-        })
-    return timeline
-
-
-def _classify_commit(commit_type):
-    if not commit_type:
-        return "create"
-    ct = commit_type.lower()
-    if "challenge" in ct:
-        return "challenge"
-    if "enrich" in ct or "update" in ct or "reweave" in ct:
-        return "enrich"
-    return "create"
-
-
-def _get_review_stats(handle, conn):
-    rows = conn.execute("""
-        SELECT outcome, COUNT(*) as cnt
-        FROM review_records
-        WHERE LOWER(agent) = LOWER(?)
-        GROUP BY outcome
-    """, (handle,)).fetchall()
-    stats = {}
-    for r in rows:
-        stats[r["outcome"]] = r["cnt"]
-    return stats
-
-
-def _get_action_ci(handle, conn):
-    """Get action-type CI from contribution_scores table.
-
-    Checks both exact handle and common variants (with/without suffix).
-    """
-    h = handle.lower()
-    base = re.sub(r"[-_]\w+\d+$", "", h)
-    variants = list({h, base}) if base and base != h else [h]
-    try:
-        placeholders = ",".join("?" for _ in variants)
-        rows = conn.execute(f"""
-            SELECT event_type, SUM(ci_earned) as total, COUNT(*) as cnt
-            FROM contribution_scores
-            WHERE LOWER(contributor) IN ({placeholders})
-            GROUP BY event_type
-        """, variants).fetchall()
-    except Exception:
-        return None
-
-    if not rows:
-        return None
-
-    breakdown = {}
-    total = 0.0
-    for r in rows:
-        breakdown[r["event_type"]] = {
-            "count": r["cnt"],
-            "ci": round(r["total"], 4),
-        }
-        total += r["total"]
-
-    return {
-        "total": round(total, 4),
-        "breakdown": breakdown,
-    }
-
-
-def _get_git_contributor(handle):
-    """Fallback: check git log for contributors not in pipeline.db."""
-    try:
-        result = subprocess.run(
-            ["git", "log", "--all", "--format=%H|%an|%ae|%aI", "--diff-filter=A", "--", "domains/"],
-            capture_output=True, text=True, cwd=CODEX_PATH, timeout=30
-        )
-        if result.returncode != 0:
-            return None
-
-        claims = []
-        for line in result.stdout.strip().split("\n"):
-            if not line:
-                continue
-            parts = line.split("|", 3)
-            if len(parts) < 4:
-                continue
-            sha, name, email, date = parts
-            if handle.lower() in name.lower() or handle.lower() in email.lower():
-                claims.append({"sha": sha, "author": name, "email": email, "date": date[:10]})
-
-        if not claims:
-            return None
-
-        return {
-            "handle": handle,
-            "display_name": claims[0]["author"],
-            "email": claims[0]["email"],
-            "first_contribution": min(c["date"] for c in claims),
-            "last_contribution": max(c["date"] for c in claims),
-            "claims_merged": len(claims),
-            "sourcer_count": 0,
-            "extractor_count": 0,
-            "challenger_count": 0,
-            "synthesizer_count": 0,
-            "reviewer_count": 0,
-        }
-    except Exception:
-        return None
-
-
-def get_contributor_profile(handle):
-    conn = _get_conn()
-    try:
-        row = conn.execute(
-            "SELECT * FROM contributors WHERE LOWER(handle) = LOWER(?)", (handle,)
-        ).fetchone()
-
-        if row:
-            data = dict(row)
-        else:
-            git_data = _get_git_contributor(handle)
-            if git_data:
-                data = git_data
-            else:
-                return None
-
-        ci_score = _compute_ci(data)
-        action_ci = _get_action_ci(handle, conn)
-        domain_breakdown = _get_domain_breakdown(handle, conn)
-        timeline = _get_contribution_timeline(handle, conn)
-        review_stats = _get_review_stats(handle, conn)
-        if action_ci and "challenge" in action_ci.get("breakdown", {}):
-            data["_challenge_count_from_scores"] = action_ci["breakdown"]["challenge"]["count"]
-        badges = _compute_badges(handle, data, domain_breakdown, conn)
-
-        # For git-only contributors, build domain breakdown from git
-        if not domain_breakdown and not row:
-            domain_breakdown = _git_domain_breakdown(handle)
-
-        hero_badge = None
-        rarity_order = ["limited", "rare", "uncommon", "common"]
-        for rarity in rarity_order:
-            for b in badges:
-                if BADGE_DEFS.get(b, {}).get("rarity") == rarity:
-                    hero_badge = b
-                    break
-            if hero_badge:
-                break
-
-        role_breakdown = {
-            "sourcer": data.get("sourcer_count", 0) or 0,
-            "extractor": data.get("extractor_count", 0) or 0,
-            "challenger": data.get("challenger_count", 0) or 0,
-            "synthesizer": data.get("synthesizer_count", 0) or 0,
-            "reviewer": data.get("reviewer_count", 0) or 0,
-        }
-        total_roles = sum(role_breakdown.values())
-        role_pct = {}
-        for k, v in role_breakdown.items():
-            role_pct[k] = round(v / total_roles * 100) if total_roles > 0 else 0
-
-        return {
-            "handle": data.get("handle", handle),
-            "display_name": data.get("display_name"),
-            "ci_score": ci_score,
-            "action_ci": action_ci,
-            "primary_ci": action_ci["total"] if action_ci else ci_score,
-            "hero_badge": hero_badge,
-            "badges": [{"name": b, **BADGE_DEFS.get(b, {})} for b in badges],
-            "joined": data.get("first_contribution"),
-            "last_active": data.get("last_contribution"),
-            "claims_merged": data.get("claims_merged", 0) or 0,
-            "principal": data.get("principal"),
-            "role_breakdown": role_breakdown,
-            "role_percentages": role_pct,
-            "domain_breakdown": domain_breakdown,
-            "review_stats": review_stats,
-            "contribution_timeline": timeline,
-            "active_domains": list(domain_breakdown.keys()),
-        }
-    finally:
-        conn.close()
-
-
-def _git_domain_breakdown(handle):
-    """For git-only contributors, count claims by domain from file paths."""
-    try:
-        result = subprocess.run(
-            ["git", "log", "--all", "--name-only", "--format=COMMIT|%an", "--diff-filter=A", "--", "domains/"],
-            capture_output=True, text=True, cwd=CODEX_PATH, timeout=30
-        )
-        if result.returncode != 0:
-            return {}
-
-        domains = {}
-        current_match = False
-        for line in result.stdout.strip().split("\n"):
-            if line.startswith("COMMIT|"):
-                author = line.split("|", 1)[1]
-                current_match = handle.lower() in author.lower()
-            elif current_match and line.startswith("domains/"):
-                parts = line.split("/")
-                if len(parts) >= 2:
-                    domain = parts[1]
-                    domains[domain] = domains.get(domain, 0) + 1
-
-        return domains
-    except Exception:
-        return {}
-
-
-async def handle_contributor_profile(request):
-    from aiohttp import web
-    handle = request.match_info["handle"]
-    profile = get_contributor_profile(handle)
-    if profile is None:
-        return web.json_response({"error": f"Contributor '{handle}' not found"}, status=404)
-    return web.json_response(profile)
-
-
-async def handle_contributors_list(request):
-    from aiohttp import web
-    conn = _get_conn()
-    try:
-        min_claims = int(request.query.get("min_claims", "1"))
-        rows = conn.execute("""
-            SELECT handle, display_name, first_contribution, last_contribution, 
-                   sourcer_count, extractor_count, challenger_count, synthesizer_count,
-                   reviewer_count, claims_merged, principal
-            FROM contributors
-            WHERE claims_merged >= ?
-            ORDER BY claims_merged DESC
-        """, (min_claims,)).fetchall()
-
-        contributors = []
-        for r in rows:
-            data = dict(r)
-            if data["handle"].lower() in SYSTEM_ACCOUNTS:
-                continue
-            ci = _compute_ci(data)
-            action_ci = _get_action_ci(data["handle"], conn)
-            action_total = action_ci["total"] if action_ci else 0.0
-            contributors.append({
-                "handle": data["handle"],
-                "display_name": data["display_name"],
-                "ci_score": ci,
-                "action_ci": action_total,
-                "primary_ci": action_total if action_total > 0 else ci,
-                "claims_merged": data["claims_merged"],
-                "first_contribution": data["first_contribution"],
-                "last_contribution": data["last_contribution"],
-                "principal": data["principal"],
-            })
-
-        return web.json_response({
-            "contributors": contributors,
-            "total": len(contributors),
-        })
-    finally:
-        conn.close()
-
-
-def register_contributor_routes(app):
-    app.router.add_get("/api/contributors/list", handle_contributors_list)
-    app.router.add_get("/api/contributors/{handle}", handle_contributor_profile)
--- a/diagnostics/dashboard_epistemic.py
+++ b/diagnostics/dashboard_epistemic.py
@@ -74,7 +74,7 @@ def render_epistemic_page(vital_signs: dict, now: datetime) -> str:
    <div style="font-size:40px;margin-bottom:12px;opacity:0.3">&#9881;</div>
    <div style="color:#8b949e">
      Multi-model agreement rate requires the <code>model_evals</code> table.<br>
-      <span style="font-size:12px">Blocked on: model_evals table creation (Ship Phase 3)</span>
+      <span style="font-size:12px">Blocked on: model_evals table creation (Theseus 2 Phase 3)</span>
    </div>
    <div style="margin-top:16px;font-size:12px;color:#8b949e">
      Current eval models: Haiku (triage), GPT-4o (domain), Sonnet/Opus (Leo).<br>
@@ -194,6 +194,12 @@ fetch('/api/review-summary?days=30')
    reasonRows += '<tr><td><code>' + esc(r.reason) + '</code></td><td>' + r.count + '</td></tr>';
  }}

+  // Disagreement types
+  let disagreeRows = '';
+  for (const d of (data.disagreement_types || [])) {{
+    disagreeRows += '<tr><td>' + esc(d.type) + '</td><td>' + d.count + '</td></tr>';
+  }}
+
  el.innerHTML = `
    <div class="grid">
      <div class="card"><div class="label">Total Reviews</div><div class="hero-value">${{data.total}}</div></div>
@@ -209,6 +215,13 @@ fetch('/api/review-summary?days=30')
          ${{reasonRows || '<tr><td colspan="2" style="color:#8b949e">No rejections</td></tr>'}}
        </table>
      </div>
+      <div class="card">
+        <div style="font-weight:600;margin-bottom:8px">Disagreement Types</div>
+        <table>
+          <tr><th>Type</th><th>Count</th></tr>
+          ${{disagreeRows || '<tr><td colspan="2" style="color:#8b949e">No disagreements</td></tr>'}}
+        </table>
+      </div>
    </div>`;
 }}).catch(() => {{
  document.getElementById('review-container').innerHTML =
--- a/diagnostics/dashboard_portfolio.py
+++ b/diagnostics/dashboard_portfolio.py
@@ -1,408 +0,0 @@
-"""Portfolio dashboard — fixes empty chart by:
-1. Computing NAV server-side in the history API (not client-side from nulls)
-2. Only returning dates with valid NAV data
-3. Showing data points when sparse
-"""
-
-import json
-import sqlite3
-import logging
-from html import escape as esc
-from datetime import datetime, timezone
-
-from aiohttp import web
-from shared_ui import render_page
-
-logger = logging.getLogger("argus.portfolio")
-
-CSS = """
-  .hero-chart { background: #161b22; border: 1px solid #30363d; border-radius: 8px; padding: 20px; margin-bottom: 20px; }
-  .hero-chart h2 { color: #c9d1d9; font-size: 18px; margin-bottom: 12px; }
-  .range-btns { display: flex; gap: 4px; margin-bottom: 12px; }
-  .range-btn { background: #21262d; border: 1px solid #30363d; color: #8b949e; padding: 5px 14px;
-               border-radius: 4px; cursor: pointer; font-size: 12px; }
-  .range-btn.active { background: #1f6feb33; border-color: #58a6ff; color: #58a6ff; }
-  .ptable-wrap { overflow-x: auto; margin-top: 20px; }
-  .ptable { width: 100%; border-collapse: collapse; font-size: 13px; }
-  .ptable th { background: #161b22; color: #8b949e; font-size: 11px; text-transform: uppercase;
-    letter-spacing: 0.5px; padding: 10px 12px; text-align: right; border-bottom: 1px solid #30363d;
-    cursor: pointer; user-select: none; white-space: nowrap; }
-  .ptable th:first-child { text-align: left; position: sticky; left: 0; background: #161b22; z-index: 1; }
-  .ptable th:hover { color: #c9d1d9; }
-  .ptable th.sorted-asc::after { content: ' \\25B2'; font-size: 9px; }
-  .ptable th.sorted-desc::after { content: ' \\25BC'; font-size: 9px; }
-  .ptable td { padding: 10px 12px; text-align: right; border-bottom: 1px solid #21262d; color: #c9d1d9; }
-  .ptable td:first-child { text-align: left; position: sticky; left: 0; background: #0d1117; z-index: 1; font-weight: 600; }
-  .ptable tr:hover td { background: #161b22; }
-  .ptable tr:hover td:first-child { background: #161b22; }
-  .summary-row td { font-weight: 700; border-top: 2px solid #30363d; background: #161b22 !important; }
-  .premium { color: #f85149; }
-  .discount { color: #3fb950; }
-  .near-nav { color: #d29922; }
-"""
-
-
-def _fmt_usd(v):
-    if v is None:
-        return '\u2014'
-    if abs(v) >= 1_000_000:
-        return f'${v / 1_000_000:.1f}M'
-    if abs(v) >= 1_000:
-        return f'${v / 1_000:.0f}K'
-    return f'${v:,.0f}'
-
-
-def _fmt_price(v):
-    if v is None:
-        return '\u2014'
-    if v >= 100:
-        return f'${v:,.0f}'
-    if v >= 1:
-        return f'${v:.2f}'
-    if v >= 0.01:
-        return f'${v:.4f}'
-    return f'${v:.6f}'
-
-
-def _fmt_ratio(v):
-    if v is None or v == 0:
-        return '\u2014'
-    return f'{v:.2f}x'
-
-
-def _ratio_class(v):
-    if v is None or v == 0:
-        return ''
-    if v > 1.5:
-        return 'premium'
-    if v < 0.9:
-        return 'discount'
-    if v <= 1.1:
-        return 'near-nav'
-    return ''
-
-
-def render_portfolio_page(coins: list[dict], now: datetime) -> str:
-    if not coins:
-        body = '<div style="padding:40px;text-align:center;color:#8b949e;">No coin data yet.</div>'
-        return render_page("Portfolio", "Ownership coin portfolio", "/portfolio", body,
-                           extra_css=CSS, timestamp=now.strftime("%Y-%m-%d %H:%M UTC"))
-
-    total_mcap = sum(c.get('market_cap_usd') or 0 for c in coins)
-    total_treasury = sum(c.get('treasury_usd') or 0 for c in coins)
-
-    hero_chart = """
-    <div class="hero-chart">
-      <h2>Price / NAV per Token</h2>
-      <div class="range-btns">
-        <button class="range-btn" onclick="setRange(this, 30)">30d</button>
-        <button class="range-btn active" onclick="setRange(this, 90)">90d</button>
-        <button class="range-btn" onclick="setRange(this, 180)">180d</button>
-        <button class="range-btn" onclick="setRange(this, 365)">All</button>
-      </div>
-      <canvas id="ratio-chart" height="320" style="max-height:320px"></canvas>
-    </div>
-    """
-
-    header = """<div class="ptable-wrap"><table class="ptable" id="coin-table">
-    <thead><tr>
-        <th data-col="name">Coin</th>
-        <th data-col="price">Price</th>
-        <th data-col="nav">NAV / Token</th>
-        <th data-col="ratio">Price / NAV</th>
-        <th data-col="treasury">Treasury</th>
-        <th data-col="mcap">Market Cap</th>
-    </tr></thead><tbody>"""
-
-    rows = ''
-    for c in coins:
-        name = c.get('name', '?')
-        ticker = c.get('ticker', '')
-        price = c.get('price_usd')
-        nav = c.get('nav_per_token')
-        ratio = c.get('price_nav_ratio')
-        treasury = c.get('treasury_usd')
-        mcap = c.get('market_cap_usd')
-
-        label = esc(name)
-        if ticker:
-            label += f' <span style="color:#8b949e;font-size:11px;">{esc(ticker)}</span>'
-
-        rows += f"""<tr>
-            <td>{label}</td>
-            <td>{_fmt_price(price)}</td>
-            <td>{_fmt_price(nav)}</td>
-            <td class="{_ratio_class(ratio)}">{_fmt_ratio(ratio)}</td>
-            <td>{_fmt_usd(treasury)}</td>
-            <td>{_fmt_usd(mcap)}</td>
-        </tr>"""
-
-    rows += f"""<tr class="summary-row">
-        <td>Total ({len(coins)})</td>
-        <td></td><td></td><td></td>
-        <td>{_fmt_usd(total_treasury)}</td>
-        <td>{_fmt_usd(total_mcap)}</td>
-    </tr>"""
-
-    table = header + rows + '</tbody></table></div>'
-
-    scripts = """<script>
-const COLORS = ['#58a6ff','#3fb950','#f0883e','#d29922','#f85149','#bc8cff','#39d353','#79c0ff','#ff7b72','#a5d6ff'];
-let chart = null;
-
-function setRange(btn, days) {
-    document.querySelectorAll('.range-btn').forEach(b => b.classList.remove('active'));
-    btn.classList.add('active');
-    loadChart(days);
-}
-
-function loadChart(days) {
-    fetch('/api/portfolio/nav-ratios?days=' + days)
-        .then(r => r.json())
-        .then(data => {
-            const dates = data.dates || [];
-            const series = data.series || {};
-
-            if (dates.length === 0) {
-                if (chart) chart.destroy();
-                chart = null;
-                const ctx = document.getElementById('ratio-chart').getContext('2d');
-                ctx.fillStyle = '#8b949e';
-                ctx.font = '14px sans-serif';
-                ctx.textAlign = 'center';
-                ctx.fillText('No NAV data yet — accumulating daily snapshots', ctx.canvas.width / 2, 160);
-                return;
-            }
-
-            const sparse = dates.length <= 10;
-            const datasets = [];
-            let i = 0;
-            for (const [name, ratios] of Object.entries(series)) {
-                const hasData = ratios.some(v => v !== null);
-                if (!hasData) { i++; continue; }
-                datasets.push({
-                    label: name,
-                    data: ratios,
-                    borderColor: COLORS[i % COLORS.length],
-                    backgroundColor: COLORS[i % COLORS.length] + '33',
-                    borderWidth: 2,
-                    tension: 0.3,
-                    spanGaps: true,
-                    pointRadius: sparse ? 4 : 0,
-                    pointHoverRadius: 6,
-                    fill: false,
-                });
-                i++;
-            }
-
-            if (chart) chart.destroy();
-            const ctx = document.getElementById('ratio-chart').getContext('2d');
-            chart = new Chart(ctx, {
-                type: 'line',
-                data: { labels: dates, datasets },
-                options: {
-                    responsive: true,
-                    maintainAspectRatio: false,
-                    interaction: { mode: 'index', intersect: false },
-                    plugins: {
-                        legend: { labels: { color: '#8b949e', font: { size: 11 }, usePointStyle: true, boxWidth: 8 }, position: 'top' },
-                        tooltip: { mode: 'index', intersect: false,
-                            callbacks: { label: ctx => ctx.dataset.label + ': ' + (ctx.parsed.y != null ? ctx.parsed.y.toFixed(2) + 'x' : 'n/a') }
-                        },
-                        annotation: {
-                            annotations: {
-                                navLine: {
-                                    type: 'line',
-                                    yMin: 1, yMax: 1,
-                                    borderColor: '#3fb95088',
-                                    borderWidth: 2,
-                                    borderDash: [6, 4],
-                                    label: {
-                                        display: true,
-                                        content: '1.0x = NAV',
-                                        position: 'end',
-                                        backgroundColor: '#3fb95033',
-                                        color: '#3fb950',
-                                        font: { size: 10 },
-                                    }
-                                }
-                            }
-                        }
-                    },
-                    scales: {
-                        x: { ticks: { color: '#8b949e', maxTicksLimit: 12 }, grid: { display: false } },
-                        y: { ticks: { color: '#8b949e', callback: v => v.toFixed(1) + 'x' }, grid: { color: '#21262d' },
-                             suggestedMin: 0 }
-                    }
-                }
-            });
-        });
-}
-
-// Table sorting
-function sortTable(col) {
-    const table = document.getElementById('coin-table');
-    const tbody = table.querySelector('tbody');
-    const rows = Array.from(tbody.querySelectorAll('tr:not(.summary-row)'));
-    const summaryRow = tbody.querySelector('.summary-row');
-    const th = table.querySelectorAll('th')[col];
-    const asc = th.classList.contains('sorted-asc');
-    table.querySelectorAll('th').forEach(h => h.classList.remove('sorted-asc','sorted-desc'));
-    th.classList.add(asc ? 'sorted-desc' : 'sorted-asc');
-    rows.sort((a, b) => {
-        let va = a.cells[col].textContent.replace(/[$,+%x\\u2014]/g,'').trim();
-        let vb = b.cells[col].textContent.replace(/[$,+%x\\u2014]/g,'').trim();
-        const na = parseFloat(va) || 0, nb = parseFloat(vb) || 0;
-        if (col === 0) return asc ? vb.localeCompare(va) : va.localeCompare(vb);
-        return asc ? na - nb : nb - na;
-    });
-    rows.forEach(r => tbody.appendChild(r));
-    if (summaryRow) tbody.appendChild(summaryRow);
-}
-document.querySelectorAll('#coin-table th').forEach((th, i) => {
-    th.addEventListener('click', () => sortTable(i));
-});
-
-loadChart(90);
-</script>"""
-
-    body = hero_chart + table
-    return render_page("Portfolio", "Ownership coin portfolio", "/portfolio", body,
-                       scripts=scripts, extra_css=CSS,
-                       timestamp=now.strftime("%Y-%m-%d %H:%M UTC"))
-
-
-# ── API handlers ────────────────────────────────────────────────────────────
-
-def _get_db(request):
-    return request.app["_portfolio_conn"]()
-
-
-def _compute_nav(row):
-    """Compute NAV per token and Price/NAV ratio from a snapshot row dict."""
-    treas = (row.get('treasury_multisig_usd') or 0) + (row.get('lp_usdc_total') or 0)
-    adj = row.get('adjusted_circulating_supply') or 0
-    price = row.get('price_usd') or 0
-    nav = treas / adj if adj > 0 else 0
-    ratio = price / nav if nav > 0 else 0
-    return treas, nav, ratio
-
-
-async def handle_portfolio_page(request):
-    conn = _get_db(request)
-    try:
-        rows = conn.execute("""
-            SELECT * FROM coin_snapshots
-            WHERE snapshot_date = (SELECT MAX(snapshot_date) FROM coin_snapshots)
-            ORDER BY market_cap_usd DESC
-        """).fetchall()
-        coins = []
-        for r in rows:
-            d = dict(r)
-            treas, nav, ratio = _compute_nav(d)
-            d['treasury_usd'] = treas
-            d['nav_per_token'] = nav
-            d['price_nav_ratio'] = ratio
-            coins.append(d)
-        now = datetime.now(timezone.utc)
-        html = render_portfolio_page(coins, now)
-        return web.Response(text=html, content_type='text/html')
-    finally:
-        conn.close()
-
-
-async def handle_nav_ratios(request):
-    """Server-side computed NAV ratios — only returns dates with valid data."""
-    conn = _get_db(request)
-    try:
-        try:
-            days = min(int(request.query.get('days', '90')), 365)
-        except (ValueError, TypeError):
-            days = 90
-        rows = conn.execute("""
-            SELECT name, snapshot_date, price_usd, treasury_multisig_usd,
-                   lp_usdc_total, adjusted_circulating_supply
-            FROM coin_snapshots
-            WHERE snapshot_date >= date('now', ? || ' days')
-              AND adjusted_circulating_supply IS NOT NULL
-              AND adjusted_circulating_supply > 0
-            ORDER BY name, snapshot_date
-        """, (f'-{days}',)).fetchall()
-
-        coin_ratios = {}
-        all_dates = set()
-        for r in rows:
-            d = dict(r)
-            name = d['name']
-            date = d['snapshot_date']
-            _, nav, ratio = _compute_nav(d)
-            if nav > 0 and ratio > 0:
-                if name not in coin_ratios:
-                    coin_ratios[name] = {}
-                coin_ratios[name][date] = round(ratio, 3)
-                all_dates.add(date)
-
-        sorted_dates = sorted(all_dates)
-        series = {}
-        for name, date_map in coin_ratios.items():
-            series[name] = [date_map.get(d) for d in sorted_dates]
-
-        return web.json_response({
-            'dates': sorted_dates,
-            'series': series,
-        })
-    finally:
-        conn.close()
-
-
-async def handle_portfolio_history(request):
-    conn = _get_db(request)
-    try:
-        try:
-            days = min(int(request.query.get('days', '90')), 365)
-        except (ValueError, TypeError):
-            days = 90
-        rows = conn.execute("""
-            SELECT * FROM coin_snapshots
-            WHERE snapshot_date >= date('now', ? || ' days')
-            ORDER BY name, snapshot_date
-        """, (f'-{days}',)).fetchall()
-        history = {}
-        for r in rows:
-            d = dict(r)
-            key = d['name']
-            if key not in history:
-                history[key] = []
-            history[key].append(d)
-        return web.json_response({'history': history})
-    finally:
-        conn.close()
-
-
-async def handle_portfolio_latest(request):
-    conn = _get_db(request)
-    try:
-        rows = conn.execute("""
-            SELECT * FROM coin_snapshots
-            WHERE snapshot_date = (SELECT MAX(snapshot_date) FROM coin_snapshots)
-            ORDER BY market_cap_usd DESC
-        """).fetchall()
-        coins = []
-        for r in rows:
-            d = dict(r)
-            treas, nav, ratio = _compute_nav(d)
-            d['treasury_usd'] = treas
-            d['nav_per_token'] = nav
-            d['price_nav_ratio'] = ratio
-            coins.append(d)
-        return web.json_response({'coins': coins, 'date': coins[0]['snapshot_date'] if coins else None})
-    finally:
-        conn.close()
-
-
-def register_portfolio_routes(app, get_conn):
-    app["_portfolio_conn"] = get_conn
-    app.router.add_get("/portfolio", handle_portfolio_page)
-    app.router.add_get("/api/portfolio/nav-ratios", handle_nav_ratios)
-    app.router.add_get("/api/portfolio/history", handle_portfolio_history)
-    app.router.add_get("/api/portfolio/latest", handle_portfolio_latest)
--- a/diagnostics/dashboard_prs.py
+++ b/diagnostics/dashboard_prs.py
@ -1,8 +1,8 @@
 """PR Lifecycle dashboard — single-page view of every PR through the pipeline.

-Sortable table: PR#, summary, claims, domain, outcome, evals, evaluator, cost, date.
-Click any row to expand: timeline, claim list, issues summary.
-Hero cards: total PRs, merge rate, median eval rounds, total claims, total cost.
+Sortable table: PR#, summary, claims, domain, contributor, outcome, evals, evaluator, cost, date.
+Click any row to expand: claim titles, eval chain, timeline, reviews, issues.
+Hero cards: total PRs, merge rate, total claims, est. cost.

 Data sources: prs table, audit_log (eval rounds), review_records.
 Owner: Ship
@ -14,7 +14,7 @@ from shared_ui import render_page


 EXTRA_CSS = """
-  .page-content { max-width: 1600px !important; }
+  .content-wrapper { max-width: 1600px !important; }
  .filters { display: flex; gap: 12px; flex-wrap: wrap; margin-bottom: 16px; }
  .filters select, .filters input {
    background: #161b22; color: #c9d1d9; border: 1px solid #30363d;
@ -22,14 +22,15 @@ EXTRA_CSS = """
  .filters select:focus, .filters input:focus { border-color: #58a6ff; outline: none; }
  .pr-table { width: 100%; border-collapse: collapse; font-size: 13px; table-layout: fixed; }
  .pr-table th:nth-child(1) { width: 50px; }    /* PR# */
-  .pr-table th:nth-child(2) { width: 30%; }     /* Summary */
+  .pr-table th:nth-child(2) { width: 28%; }     /* Summary */
  .pr-table th:nth-child(3) { width: 50px; }    /* Claims */
-  .pr-table th:nth-child(4) { width: 12%; }     /* Domain */
-  .pr-table th:nth-child(5) { width: 10%; }     /* Outcome */
-  .pr-table th:nth-child(6) { width: 50px; }    /* Evals */
-  .pr-table th:nth-child(7) { width: 16%; }     /* Evaluator */
-  .pr-table th:nth-child(8) { width: 70px; }    /* Cost */
-  .pr-table th:nth-child(9) { width: 90px; }    /* Date */
+  .pr-table th:nth-child(4) { width: 11%; }     /* Domain */
+  .pr-table th:nth-child(5) { width: 10%; }     /* Contributor */
+  .pr-table th:nth-child(6) { width: 10%; }     /* Outcome */
+  .pr-table th:nth-child(7) { width: 44px; }    /* Evals */
+  .pr-table th:nth-child(8) { width: 12%; }     /* Evaluator */
+  .pr-table th:nth-child(9) { width: 60px; }    /* Cost */
+  .pr-table th:nth-child(10) { width: 80px; }   /* Date */
  .pr-table td { overflow: hidden; text-overflow: ellipsis; white-space: nowrap; padding: 8px 6px; }
  .pr-table td:nth-child(2) { white-space: normal; overflow: visible; line-height: 1.4; }
  .pr-table th { cursor: pointer; user-select: none; position: relative; padding: 8px 18px 8px 6px; }
@ -48,22 +49,24 @@ EXTRA_CSS = """
  .pr-table .pr-link:hover { text-decoration: underline; }
  .pr-table td .summary-text { font-size: 12px; color: #c9d1d9; }
  .pr-table td .review-snippet { font-size: 11px; color: #f85149; margin-top: 2px; opacity: 0.8; }
-  .pr-table td .model-tag { font-size: 9px; color: #6e7681; background: #21262d; border-radius: 3px; padding: 1px 4px; display: inline-block; margin: 1px 0; }
+  .pr-table td .model-tag { font-size: 10px; color: #6e7681; background: #161b22; border-radius: 3px; padding: 1px 4px; }
+  .pr-table td .contributor-tag { font-size: 11px; color: #d2a8ff; }
+  .pr-table td .contributor-self { font-size: 11px; color: #6e7681; font-style: italic; }
  .pr-table td .expand-chevron { display: inline-block; width: 12px; color: #484f58; font-size: 10px; transition: transform 0.2s; }
  .pr-table tr.expanded .expand-chevron { transform: rotate(90deg); color: #58a6ff; }
-  .pr-table td .cost-val { font-size: 12px; color: #8b949e; }
-  .pr-table td .claims-count { font-size: 13px; color: #c9d1d9; text-align: center; }
-  .pr-table td .evals-count { font-size: 13px; text-align: center; }
  .trace-panel { background: #0d1117; border: 1px solid #30363d; border-radius: 8px;
    padding: 16px; margin: 4px 0 8px 0; font-size: 12px; display: none; }
  .trace-panel.open { display: block; }
-  .trace-panel .section-title { color: #58a6ff; font-size: 12px; font-weight: 600; margin: 12px 0 6px; }
-  .trace-panel .section-title:first-child { margin-top: 0; }
-  .trace-panel .claim-list { list-style: none; padding: 0; margin: 0; }
-  .trace-panel .claim-list li { padding: 4px 0; border-bottom: 1px solid #21262d; color: #c9d1d9; font-size: 12px; }
-  .trace-panel .claim-list li:last-child { border-bottom: none; }
-  .trace-panel .issues-box { background: #1c1017; border: 1px solid #f8514930; border-radius: 6px;
+  .trace-panel h4 { color: #58a6ff; font-size: 12px; margin: 12px 0 6px 0; }
+  .trace-panel h4:first-child { margin-top: 0; }
+  .claim-list { list-style: none; padding: 0; margin: 0; }
+  .claim-list li { padding: 4px 0 4px 16px; border-left: 2px solid #238636; color: #c9d1d9; font-size: 12px; line-height: 1.5; }
+  .claim-list li .claim-confidence { font-size: 10px; color: #8b949e; margin-left: 6px; }
+  .issues-box { background: #1c1210; border: 1px solid #f8514933; border-radius: 6px;
    padding: 8px 12px; margin: 4px 0; font-size: 12px; color: #f85149; }
+  .eval-chain { background: #161b22; border-radius: 6px; padding: 8px 12px; margin: 4px 0; font-size: 12px; }
+  .eval-chain .chain-step { display: inline-block; margin-right: 6px; }
+  .eval-chain .chain-arrow { color: #484f58; margin: 0 4px; }
  .trace-timeline { list-style: none; padding: 0; }
  .trace-timeline li { padding: 4px 0; border-left: 2px solid #30363d; padding-left: 12px; margin-left: 8px; }
  .trace-timeline li .ts { color: #484f58; font-size: 11px; }
@ -73,12 +76,6 @@ EXTRA_CSS = """
  .trace-timeline li.ev-changes .ev { color: #d29922; }
  .review-text { background: #161b22; padding: 8px 12px; border-radius: 4px;
    margin: 4px 0; white-space: pre-wrap; font-size: 11px; color: #8b949e; max-height: 200px; overflow-y: auto; }
-  .eval-chain { background: #161b22; border-radius: 6px; padding: 8px 12px; margin: 4px 0 8px;
-    font-size: 12px; display: flex; gap: 12px; flex-wrap: wrap; align-items: center; }
-  .eval-chain .step { display: flex; align-items: center; gap: 4px; }
-  .eval-chain .step-label { color: #8b949e; font-size: 11px; }
-  .eval-chain .step-model { color: #c9d1d9; font-size: 11px; font-weight: 600; }
-  .eval-chain .arrow { color: #484f58; }
  .pagination { display: flex; gap: 8px; align-items: center; justify-content: center; margin-top: 16px; }
  .pagination button { background: #161b22; color: #c9d1d9; border: 1px solid #30363d;
    border-radius: 4px; padding: 4px 12px; cursor: pointer; font-size: 12px; }
@ -96,7 +93,6 @@ def render_prs_page(now: datetime) -> str:
    <div class="grid" id="hero-cards">
      <div class="card"><div class="label">Total PRs</div><div class="value blue" id="kpi-total">--</div><div class="detail" id="kpi-total-detail"></div></div>
      <div class="card"><div class="label">Merge Rate</div><div class="value green" id="kpi-merge-rate">--</div><div class="detail" id="kpi-merge-detail"></div></div>
-      <div class="card"><div class="label">Median Eval Rounds</div><div class="value" id="kpi-rounds">--</div><div class="detail" id="kpi-rounds-detail"></div></div>
      <div class="card"><div class="label">Total Claims</div><div class="value blue" id="kpi-claims">--</div><div class="detail" id="kpi-claims-detail"></div></div>
      <div class="card"><div class="label">Est. Cost</div><div class="value" id="kpi-cost">--</div><div class="detail" id="kpi-cost-detail"></div></div>
    </div>
@ -104,6 +100,7 @@ def render_prs_page(now: datetime) -> str:
    <!-- Filters -->
    <div class="filters">
      <select id="filter-domain"><option value="">All Domains</option></select>
+      <select id="filter-contributor"><option value="">All Contributors</option></select>
      <select id="filter-outcome">
        <option value="">All Outcomes</option>
        <option value="merged">Merged</option>
@ -133,9 +130,10 @@ def render_prs_page(now: datetime) -> str:
            <th data-col="summary">Summary <span class="sort-arrow">&#9650;</span></th>
            <th data-col="claims_count">Claims <span class="sort-arrow">&#9650;</span></th>
            <th data-col="domain">Domain <span class="sort-arrow">&#9650;</span></th>
+            <th data-col="submitted_by">Contributor <span class="sort-arrow">&#9650;</span></th>
            <th data-col="status">Outcome <span class="sort-arrow">&#9650;</span></th>
            <th data-col="eval_rounds">Evals <span class="sort-arrow">&#9650;</span></th>
-            <th data-col="evaluator">Evaluator <span class="sort-arrow">&#9650;</span></th>
+            <th data-col="evaluator_label">Evaluator <span class="sort-arrow">&#9650;</span></th>
            <th data-col="est_cost">Cost <span class="sort-arrow">&#9650;</span></th>
            <th data-col="created_at">Date <span class="sort-arrow">&#9650;</span></th>
          </tr>
@ -152,42 +150,71 @@ def render_prs_page(now: datetime) -> str:
    </div>
    """

-    # Use single-quoted JS strings throughout to avoid Python/HTML escaping issues
    scripts = """<script>
-    const PAGE_SIZE = 50;
-    const FORGEJO = 'https://git.livingip.xyz/teleo/teleo-codex/pulls/';
-    let allData = [];
-    let filtered = [];
-    let sortCol = 'number';
-    let sortAsc = false;
-    let page = 0;
-    let expandedPr = null;
+    var PAGE_SIZE = 50;
+    var FORGEJO = 'https://git.livingip.xyz/teleo/teleo-codex/pulls/';
+    var allData = [];
+    var filtered = [];
+    var sortCol = 'number';
+    var sortAsc = false;
+    var page = 0;
+    var expandedPr = null;
+
+    // Tier-based cost estimates in USD (per eval round)
+    var TIER_COSTS = {
+      'DEEP': 0.145,     // Haiku triage + Gemini Flash domain + Opus Leo
+      'STANDARD': 0.043, // Haiku triage + Gemini Flash domain + Sonnet Leo
+      'LIGHT': 0.027     // Haiku triage + Gemini Flash domain only
+    };
+
+    function estimateCost(pr) {
+      var tier = pr.tier || 'STANDARD';
+      var rounds = pr.eval_rounds || 1;
+      var baseCost = TIER_COSTS[tier] || TIER_COSTS['STANDARD'];
+      return baseCost * rounds;
+    }
+
+    function fmtCost(val) {
+      if (val == null || val === 0) return '--';
+      return '$' + val.toFixed(3);
+    }

    function loadData() {
      var days = document.getElementById('filter-days').value;
      var url = '/api/pr-lifecycle' + (days !== '0' ? '?days=' + days : '?days=9999');
      fetch(url).then(function(r) { return r.json(); }).then(function(data) {
        allData = data.prs || [];
+        // Compute derived fields
+        allData.forEach(function(p) {
+          p.est_cost = estimateCost(p);
+          // Evaluator label for sorting
+          p.evaluator_label = p.domain_agent || p.agent || '--';
+        });
        populateFilters(allData);
        updateKPIs(data);
        applyFilters();
      }).catch(function() {
        document.getElementById('pr-tbody').innerHTML =
-          '<tr><td colspan="9" style="text-align:center;color:#f85149;">Failed to load data</td></tr>';
+          '<tr><td colspan="10" style="text-align:center;color:#f85149;">Failed to load data</td></tr>';
      });
    }

    function populateFilters(prs) {
-      var domains = [], seenD = {};
+      var domains = [], contribs = [], seenD = {}, seenC = {};
      prs.forEach(function(p) {
        if (p.domain && !seenD[p.domain]) { seenD[p.domain] = 1; domains.push(p.domain); }
+        var c = p.submitted_by || 'unknown';
+        if (!seenC[c]) { seenC[c] = 1; contribs.push(c); }
      });
-      domains.sort();
+      domains.sort(); contribs.sort();
      var domSel = document.getElementById('filter-domain');
-      var curDom = domSel.value;
+      var conSel = document.getElementById('filter-contributor');
+      var curDom = domSel.value, curCon = conSel.value;
      domSel.innerHTML = '<option value="">All Domains</option>' +
        domains.map(function(d) { return '<option value="' + esc(d) + '">' + esc(d) + '</option>'; }).join('');
-      domSel.value = curDom;
+      conSel.innerHTML = '<option value="">All Contributors</option>' +
+        contribs.map(function(c) { return '<option value="' + esc(c) + '">' + esc(c) + '</option>'; }).join('');
+      domSel.value = curDom; conSel.value = curCon;
    }

    function updateKPIs(data) {
@ -199,47 +226,29 @@ def render_prs_page(now: datetime) -> str:
      document.getElementById('kpi-merge-rate').textContent = fmtPct(rate);
      document.getElementById('kpi-merge-detail').textContent = fmtNum(data.open) + ' open';

-      document.getElementById('kpi-rounds').textContent =
-        data.median_rounds != null ? data.median_rounds.toFixed(1) : '--';
-      document.getElementById('kpi-rounds-detail').textContent =
-        data.max_rounds != null ? 'max: ' + data.max_rounds : '';
-
-      var totalClaims = 0, mergedClaims = 0;
-      var totalCost = 0;
-      var actualCount = 0, estCount = 0;
+      var totalClaims = 0, mergedClaims = 0, totalCost = 0;
      (data.prs || []).forEach(function(p) {
        totalClaims += (p.claims_count || 1);
        if (p.status === 'merged') mergedClaims += (p.claims_count || 1);
-        totalCost += (p.cost || 0);
-        if (p.cost_is_actual) actualCount++; else estCount++;
+        totalCost += estimateCost(p);
      });
      document.getElementById('kpi-claims').textContent = fmtNum(totalClaims);
      document.getElementById('kpi-claims-detail').textContent = fmtNum(mergedClaims) + ' merged';

-      // Show actual DB total if available, otherwise sum from PRs
-      var costLabel = '';
-      if (data.actual_total_cost > 0) {
-        document.getElementById('kpi-cost').textContent = '$' + data.actual_total_cost.toFixed(2);
-        costLabel = 'from costs table';
-      } else if (actualCount > 0) {
-        document.getElementById('kpi-cost').textContent = '$' + totalCost.toFixed(2);
-        costLabel = actualCount + ' actual, ' + estCount + ' est.';
-      } else {
-        document.getElementById('kpi-cost').textContent = '$' + totalCost.toFixed(2);
-        costLabel = 'ALL ESTIMATED';
-      }
-      var costPerClaim = totalClaims > 0 ? totalCost / totalClaims : 0;
-      document.getElementById('kpi-cost-detail').textContent =
-        '$' + costPerClaim.toFixed(3) + '/claim \u00b7 ' + costLabel;
+      document.getElementById('kpi-cost').textContent = '$' + totalCost.toFixed(2);
+      var perClaim = totalClaims > 0 ? totalCost / totalClaims : 0;
+      document.getElementById('kpi-cost-detail').textContent = '$' + perClaim.toFixed(3) + '/claim';
    }

    function applyFilters() {
      var dom = document.getElementById('filter-domain').value;
+      var con = document.getElementById('filter-contributor').value;
      var out = document.getElementById('filter-outcome').value;
      var tier = document.getElementById('filter-tier').value;

      filtered = allData.filter(function(p) {
        if (dom && p.domain !== dom) return false;
+        if (con && (p.submitted_by || 'unknown') !== con) return false;
        if (out && p.status !== out) return false;
        if (tier && p.tier !== tier) return false;
        return true;
@ -269,19 +278,6 @@ def render_prs_page(now: datetime) -> str:
      return s.length > n ? s.substring(0, n) + '...' : s;
    }

-    function shortModel(m) {
-      if (!m) return '';
-      // Shorten model names for display
-      if (m.indexOf('gemini-2.5-flash') !== -1) return 'Gemini Flash';
-      if (m.indexOf('claude-sonnet') !== -1 || m.indexOf('sonnet-4') !== -1) return 'Sonnet';
-      if (m.indexOf('claude-opus') !== -1 || m.indexOf('opus') !== -1) return 'Opus';
-      if (m.indexOf('haiku') !== -1) return 'Haiku';
-      if (m.indexOf('gpt-4o') !== -1) return 'GPT-4o';
-      // fallback: strip provider prefix
-      var parts = m.split('/');
-      return parts[parts.length - 1];
-    }
-
    function renderTable() {
      var tbody = document.getElementById('pr-tbody');
      var start = page * PAGE_SIZE;
@ -289,7 +285,7 @@ def render_prs_page(now: datetime) -> str:
      var totalPages = Math.ceil(filtered.length / PAGE_SIZE);

      if (slice.length === 0) {
-        tbody.innerHTML = '<tr><td colspan="9" style="text-align:center;color:#8b949e;">No PRs match filters</td></tr>';
+        tbody.innerHTML = '<tr><td colspan="10" style="text-align:center;color:#8b949e;">No PRs match filters</td></tr>';
        return;
      }

@ -301,40 +297,37 @@ def render_prs_page(now: datetime) -> str:
                        (p.tier || '').toLowerCase() === 'standard' ? 'tier-standard' : 'tier-light';
        var date = p.created_at ? p.created_at.substring(0, 10) : '--';

-        // Summary
+        // Summary: first claim title
        var summary = p.summary || '--';
-        var reviewSnippet = '';
-        if (p.status === 'closed' && p.review_snippet) {
-          reviewSnippet = '<div class="review-snippet">' + esc(truncate(p.review_snippet, 120)) + '</div>';
-        }

        // Outcome with tier badge
-        var outcomeLabel = esc(p.status || '--');
        var tierBadge = p.tier ? ' <span class="' + tierClass + '" style="font-size:10px;">' + esc(p.tier) + '</span>' : '';

-        // Evaluator column: domain agent + model
+        // Review snippet: shown whenever present, not only for closed PRs
+        var reviewSnippet = '';
+        if (p.review_snippet) {
+          reviewSnippet = '<div class="review-snippet">' + esc(truncate(p.review_snippet, 100)) + '</div>';
+        }
+
+        // Contributor display
+        var contributor = p.submitted_by || '--';
+        var contribClass = 'contributor-tag';
+        if (contributor.indexOf('self-directed') >= 0 || contributor === 'unknown') {
+          contribClass = 'contributor-self';
+        }
+
+        // Evaluator: domain agent + model tag
        var evaluator = '';
        if (p.domain_agent) {
-          evaluator = '<div style="font-size:12px;color:#c9d1d9;">' + esc(p.domain_agent) + '</div>';
-        }
-        if (p.domain_model) {
-          evaluator += '<div class="model-tag">' + esc(shortModel(p.domain_model)) + '</div>';
-        }
-        if (p.leo_model) {
-          evaluator += '<div class="model-tag">' + esc(shortModel(p.leo_model)) + '</div>';
-        }
-        if (!evaluator) evaluator = '<span style="color:#484f58;">--</span>';
-
-        // Cost — actual from DB or estimated (flagged)
-        var costStr;
-        if (p.cost != null && p.cost > 0) {
-          if (p.cost_is_actual) {
-            costStr = '<span class="cost-val">$' + p.cost.toFixed(3) + '</span>';
-          } else {
-            costStr = '<span class="cost-val" style="opacity:0.5;" title="Estimated — no actual cost tracked">~$' + p.cost.toFixed(3) + '</span>';
+          var modelShort = '';
+          if (p.domain_model) {
+            var m = p.domain_model;
+            if (m.indexOf('gemini') >= 0) modelShort = 'Gemini Flash';
+            else if (m.indexOf('gpt-4o') >= 0) modelShort = 'GPT-4o';
+            else if (m.indexOf('sonnet') >= 0) modelShort = 'Sonnet';
+            else modelShort = m.split('/').pop();
          }
-        } else {
-          costStr = '<span style="color:#484f58;">--</span>';
+          evaluator = esc(p.domain_agent) + (modelShort ? ' <span class="model-tag">' + esc(modelShort) + '</span>' : '');
        }

        rows.push(
@ -342,16 +335,17 @@ def render_prs_page(now: datetime) -> str:
          '<td><span class="expand-chevron">&#9654;</span> ' +
            '<a class="pr-link" href="' + FORGEJO + p.number + '" target="_blank" rel="noopener" onclick="event.stopPropagation();">#' + p.number + '</a></td>' +
          '<td style="white-space:normal;"><span class="summary-text">' + esc(summary) + '</span>' + reviewSnippet + '</td>' +
-          '<td style="text-align:center;">' + (p.claims_count || '--') + '</td>' +
+          '<td style="text-align:center;">' + (p.claims_count || 1) + '</td>' +
          '<td>' + esc(p.domain || '--') + '</td>' +
-          '<td class="' + outClass + '">' + outcomeLabel + tierBadge + '</td>' +
+          '<td><span class="' + contribClass + '">' + esc(truncate(contributor, 20)) + '</span></td>' +
+          '<td class="' + outClass + '">' + esc(p.status || '--') + tierBadge + '</td>' +
          '<td style="text-align:center;">' + (p.eval_rounds || '--') + '</td>' +
          '<td>' + evaluator + '</td>' +
-          '<td>' + costStr + '</td>' +
+          '<td>' + fmtCost(p.est_cost) + '</td>' +
          '<td>' + date + '</td>' +
          '</tr>' +
-          '<tr id="trace-' + p.number + '" style="display:none;"><td colspan="9" style="padding:0;">' +
-          '<div class="trace-panel" id="panel-' + p.number + '">Loading trace...</div>' +
+          '<tr id="trace-' + p.number + '" style="display:none;"><td colspan="10" style="padding:0;">' +
+          '<div class="trace-panel" id="panel-' + p.number + '">Loading...</div>' +
          '</td></tr>'
        );
      });
@ -414,46 +408,34 @@ def render_prs_page(now: datetime) -> str:
    });

    function loadTrace(pr, panel) {
-      // Also find this PR in allData for claim list
+      // Find the PR data for claim titles
      var prData = null;
-      allData.forEach(function(p) { if (p.number == pr) prData = p; });
+      for (var i = 0; i < allData.length; i++) {
+        if (allData[i].number == pr) { prData = allData[i]; break; }
+      }

      fetch('/api/trace/' + pr).then(function(r) { return r.json(); }).then(function(data) {
        var html = '';

-        // --- Claims contained in this PR ---
-        if (prData && prData.claim_titles && prData.claim_titles.length > 0) {
-          html += '<div class="section-title">Claims (' + prData.claim_titles.length + ')</div>';
-          html += '<ul class="claim-list">';
-          prData.claim_titles.forEach(function(t) {
-            html += '<li>' + esc(t) + '</li>';
-          });
-          html += '</ul>';
+        // ─── Claims contained in this PR ───
+        if (prData && prData.description) {
+          var titles = prData.description.split('|').map(function(t) { return t.trim(); }).filter(Boolean);
+          if (titles.length > 0) {
+            html += '<h4>Claims (' + titles.length + ')</h4>';
+            html += '<ul class="claim-list">';
+            titles.forEach(function(t) {
+              html += '<li>' + esc(t) + '</li>';
+            });
+            html += '</ul>';
+          }
        }

-        // --- Issues summary ---
-        var issues = [];
-        if (data.timeline) {
-          data.timeline.forEach(function(ev) {
-            if (ev.detail && ev.detail.issues) {
-              var iss = ev.detail.issues;
-              if (typeof iss === 'string') { try { iss = JSON.parse(iss); } catch(e) { iss = [iss]; } }
-              if (Array.isArray(iss)) {
-                iss.forEach(function(i) {
-                  var label = String(i).replace(/_/g, ' ');
-                  if (issues.indexOf(label) === -1) issues.push(label);
-                });
-              }
-            }
-          });
-        }
+        // ─── Issues (if any) ───
        if (prData && prData.review_snippet) {
          html += '<div class="issues-box">' + esc(prData.review_snippet) + '</div>';
-        } else if (issues.length > 0) {
-          html += '<div class="issues-box">Issues: ' + issues.map(esc).join(', ') + '</div>';
        }

-        // --- Eval chain (who reviewed with what model) ---
+        // ─── Eval chain with models ───
        var models = {};
        if (data.timeline) {
          data.timeline.forEach(function(ev) {
@ -464,23 +446,38 @@ def render_prs_page(now: datetime) -> str:
            }
          });
        }
-        if (Object.keys(models).length > 0) {
-          html += '<div class="eval-chain">';
-          html += '<strong style="color:#58a6ff;">Eval chain:</strong> ';
-          var parts = [];
-          if (models['triage.haiku_triage'] || models['triage.deterministic_triage'])
-            parts.push('<span class="step"><span class="step-label">Triage</span> <span class="step-model">' + shortModel(models['triage.haiku_triage'] || 'deterministic') + '</span></span>');
-          if (models['domain_review'])
-            parts.push('<span class="step"><span class="step-label">Domain</span> <span class="step-model">' + shortModel(models['domain_review']) + '</span></span>');
-          if (models['leo_review'])
-            parts.push('<span class="step"><span class="step-label">Leo</span> <span class="step-model">' + shortModel(models['leo_review']) + '</span></span>');
-          html += parts.length > 0 ? parts.join(' <span class="arrow">&#8594;</span> ') : '<span style="color:#484f58;">No model data</span>';
+
+        html += '<div class="eval-chain"><strong style="color:#58a6ff;">Eval Chain:</strong> ';
+        var chain = [];
+        if (models['triage.haiku_triage'] || models['triage.deterministic_triage']) {
+          chain.push('<span class="chain-step">Triage <span class="model-tag">' +
+            esc(models['triage.haiku_triage'] || 'deterministic') + '</span></span>');
+        }
+        if (models['domain_review']) {
+          chain.push('<span class="chain-step">Domain <span class="model-tag">' +
+            esc(models['domain_review']) + '</span></span>');
+        }
+        if (models['leo_review']) {
+          chain.push('<span class="chain-step">Leo <span class="model-tag">' +
+            esc(models['leo_review']) + '</span></span>');
+        }
+        html += chain.length > 0 ? chain.join('<span class="chain-arrow">&#8594;</span>') :
+          '<span style="color:#484f58;">No model data</span>';
+        html += '</div>';
+
+        // ─── Source + contributor metadata ───
+        if (data.pr) {
+          html += '<div style="margin:8px 0;font-size:12px;color:#8b949e;">';
+          if (data.pr.source_path) html += 'Source: <span style="color:#c9d1d9;">' + esc(data.pr.source_path) + '</span> &middot; ';
+          if (prData && prData.submitted_by) html += 'Contributor: <span style="color:#d2a8ff;">' + esc(prData.submitted_by) + '</span> &middot; ';
+          if (data.pr.tier) html += 'Tier: <span style="color:#c9d1d9;">' + esc(data.pr.tier) + '</span> &middot; ';
+          html += '<a class="pr-link" href="' + FORGEJO + pr + '" target="_blank" rel="noopener">View on Forgejo</a>';
          html += '</div>';
        }

-        // --- Timeline ---
+        // ─── Timeline ───
        if (data.timeline && data.timeline.length > 0) {
-          html += '<div class="section-title">Timeline</div>';
+          html += '<h4>Timeline</h4>';
          html += '<ul class="trace-timeline">';
          data.timeline.forEach(function(ev) {
            var cls = ev.event === 'approved' ? 'ev-approved' :
@ -491,7 +488,7 @@ def render_prs_page(now: datetime) -> str:
            if (ev.detail) {
              if (ev.detail.tier) detail += ' tier=' + ev.detail.tier;
              if (ev.detail.reason) detail += ' &#8212; ' + esc(ev.detail.reason);
-              if (ev.detail.model) detail += ' [' + esc(shortModel(ev.detail.model)) + ']';
+              if (ev.detail.model) detail += ' [' + esc(ev.detail.model) + ']';
              if (ev.detail.review_text) {
                detail += '<div class="review-text">' + esc(ev.detail.review_text).substring(0, 2000) + '</div>';
              }
@ -509,19 +506,19 @@ def render_prs_page(now: datetime) -> str:
          });
          html += '</ul>';
        } else {
-          html += '<div style="color:#484f58;font-size:12px;margin-top:8px;">No timeline events</div>';
+          html += '<div style="color:#484f58;font-size:12px;margin:8px 0;">No timeline events</div>';
        }

-        // --- Reviews ---
+        // ─── Reviews ───
        if (data.reviews && data.reviews.length > 0) {
-          html += '<div class="section-title">Reviews</div>';
+          html += '<h4>Reviews</h4>';
          data.reviews.forEach(function(r) {
            var cls = r.outcome === 'approved' ? 'badge-green' :
                      r.outcome === 'rejected' ? 'badge-red' : 'badge-yellow';
            html += '<div style="margin:4px 0;">' +
              '<span class="badge ' + cls + '">' + esc(r.outcome) + '</span> ' +
              '<span style="color:#8b949e;font-size:11px;">' + esc(r.reviewer || '') + ' ' +
-              (r.model ? '[' + esc(shortModel(r.model)) + ']' : '') + ' ' +
+              (r.model ? '[' + esc(r.model) + ']' : '') + ' ' +
              (r.reviewed_at || '').substring(0, 19) + '</span>';
            if (r.rejection_reason) {
              html += ' <code>' + esc(r.rejection_reason) + '</code>';
@ -540,7 +537,7 @@ def render_prs_page(now: datetime) -> str:
    }

    // Filter listeners
-    ['filter-domain', 'filter-outcome', 'filter-tier'].forEach(function(id) {
+    ['filter-domain', 'filter-contributor', 'filter-outcome', 'filter-tier'].forEach(function(id) {
      document.getElementById(id).addEventListener('change', applyFilters);
    });
    document.getElementById('filter-days').addEventListener('change', loadData);
--- a/diagnostics/dashboard_routes.py
+++ b/diagnostics/dashboard_routes.py
--- a/diagnostics/research_routes.py
+++ b/diagnostics/research_routes.py
@ -1,279 +0,0 @@
-"""Dashboard API routes for research session + cost tracking.
-
-Argus-side read-only endpoints. These query the data that
-research_tracking.py writes to pipeline.db.
-
-Add to app.py after alerting_routes setup.
-"""
-
-import json
-import sqlite3
-from aiohttp import web
-
-
-def _conn(app):
-    """Read-only connection to pipeline.db."""
-    db_path = app["db_path"]
-    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
-    conn.row_factory = sqlite3.Row
-    return conn
-
-
-async def handle_api_research_sessions(request):
-    """GET /api/research-sessions?agent=&domain=&days=7
-
-    Returns research sessions with linked sources and cost data.
-    """
-    agent = request.query.get("agent")
-    domain = request.query.get("domain")
-    try:
-        days = int(request.query.get("days", 7))
-    except (ValueError, TypeError):
-        days = 7
-
-    conn = _conn(request.app)
-    try:
-        where = ["rs.started_at >= datetime('now', ?)"]
-        params = [f"-{days} days"]
-
-        if agent:
-            where.append("rs.agent = ?")
-            params.append(agent)
-        if domain:
-            where.append("rs.domain = ?")
-            params.append(domain)
-
-        where_clause = " AND ".join(where)
-
-        sessions = conn.execute(f"""
-            SELECT rs.*,
-                   GROUP_CONCAT(s.path, '||') as source_paths,
-                   GROUP_CONCAT(s.status, '||') as source_statuses,
-                   GROUP_CONCAT(s.claims_count, '||') as source_claims,
-                   GROUP_CONCAT(COALESCE(s.cost_usd, 0), '||') as source_costs
-            FROM research_sessions rs
-            LEFT JOIN sources s ON s.session_id = rs.id
-            WHERE {where_clause}
-            GROUP BY rs.id
-            ORDER BY rs.started_at DESC
-        """, params).fetchall()
-
-        result = []
-        for s in sessions:
-            sources = []
-            if s["source_paths"]:
-                paths = s["source_paths"].split("||")
-                statuses = (s["source_statuses"] or "").split("||")
-                claims = (s["source_claims"] or "").split("||")
-                costs = (s["source_costs"] or "").split("||")
-                for i, p in enumerate(paths):
-                    sources.append({
-                        "path": p,
-                        "status": statuses[i] if i < len(statuses) else None,
-                        "claims_count": int(claims[i]) if i < len(claims) and claims[i] else 0,
-                        "extraction_cost": float(costs[i]) if i < len(costs) and costs[i] else 0,
-                    })
-
-            result.append({
-                "id": s["id"],
-                "agent": s["agent"],
-                "domain": s["domain"],
-                "topic": s["topic"],
-                "reasoning": s["reasoning"],
-                "summary": s["summary"],
-                "sources_planned": s["sources_planned"],
-                "sources_produced": s["sources_produced"],
-                "model": s["model"],
-                "input_tokens": s["input_tokens"],
-                "output_tokens": s["output_tokens"],
-                "research_cost": s["cost_usd"],
-                "extraction_cost": sum(src["extraction_cost"] for src in sources),
-                "total_cost": s["cost_usd"] + sum(src["extraction_cost"] for src in sources),
-                "total_claims": sum(src["claims_count"] for src in sources),
-                "status": s["status"],
-                "started_at": s["started_at"],
-                "completed_at": s["completed_at"],
-                "sources": sources,
-            })
-
-        # Summary stats
-        total_sessions = len(result)
-        total_cost = sum(r["total_cost"] for r in result)
-        total_claims = sum(r["total_claims"] for r in result)
-        total_sources = sum(r["sources_produced"] for r in result)
-
-        return web.json_response({
-            "summary": {
-                "sessions": total_sessions,
-                "total_cost": round(total_cost, 2),
-                "total_claims": total_claims,
-                "total_sources": total_sources,
-                "avg_cost_per_claim": round(total_cost / total_claims, 4) if total_claims else 0,
-                "avg_cost_per_session": round(total_cost / total_sessions, 4) if total_sessions else 0,
-            },
-            "sessions": result,
-        })
-    finally:
-        conn.close()
-
-
-async def handle_api_costs(request):
-    """GET /api/costs?days=14&by=stage|model|date
-
-    Comprehensive cost breakdown. Works with EXISTING data in costs table
-    plus the new extraction costs once backfilled.
-    """
-    try:
-        days = int(request.query.get("days", 14))
-    except (ValueError, TypeError):
-        days = 14
-    group_by = request.query.get("by", "stage")
-
-    conn = _conn(request.app)
-    try:
-        valid_groups = {"stage", "model", "date"}
-        if group_by not in valid_groups:
-            group_by = "stage"
-
-        rows = conn.execute(f"""
-            SELECT {group_by},
-                   SUM(calls) as total_calls,
-                   SUM(input_tokens) as total_input,
-                   SUM(output_tokens) as total_output,
-                   SUM(cost_usd) as total_cost
-            FROM costs
-            WHERE date >= date('now', ?)
-            GROUP BY {group_by}
-            ORDER BY total_cost DESC
-        """, (f"-{days} days",)).fetchall()
-
-        result = []
-        for r in rows:
-            result.append({
-                group_by: r[group_by],
-                "calls": r["total_calls"],
-                "input_tokens": r["total_input"],
-                "output_tokens": r["total_output"],
-                "cost_usd": round(r["total_cost"], 4),
-            })
-
-        grand_total = sum(r["cost_usd"] for r in result)
-
-        # Also get per-agent cost from sources table (extraction costs)
-        agent_costs = conn.execute("""
-            SELECT p.agent,
-                   COUNT(DISTINCT s.path) as sources,
-                   SUM(s.cost_usd) as extraction_cost,
-                   SUM(s.claims_count) as claims
-            FROM sources s
-            LEFT JOIN prs p ON p.source_path = s.path
-            WHERE s.cost_usd > 0
-            GROUP BY p.agent
-            ORDER BY extraction_cost DESC
-        """).fetchall()
-
-        agent_breakdown = []
-        for r in agent_costs:
-            agent_breakdown.append({
-                "agent": r["agent"] or "unlinked",
-                "sources": r["sources"],
-                "extraction_cost": round(r["extraction_cost"], 2),
-                "claims": r["claims"],
-                "cost_per_claim": round(r["extraction_cost"] / r["claims"], 4) if r["claims"] else 0,
-            })
-
-        return web.json_response({
-            "period_days": days,
-            "grand_total": round(grand_total, 2),
-            "by_" + group_by: result,
-            "by_agent": agent_breakdown,
-        })
-    finally:
-        conn.close()
-
-
-async def handle_api_source_detail(request):
-    """GET /api/source/{path}
-
-    Full lifecycle of a single source: research session → extraction → claims → eval outcomes.
-    """
-    source_path = request.match_info["path"]
-
-    conn = _conn(request.app)
-    try:
-        # Try exact match first, fall back to suffix match (anchored)
-        source = conn.execute(
-            "SELECT * FROM sources WHERE path = ?",
-            (source_path,),
-        ).fetchone()
-        if not source:
-            # Suffix match — anchor with / prefix to avoid substring hits
-            source = conn.execute(
-                "SELECT * FROM sources WHERE path LIKE ? ORDER BY length(path) LIMIT 1",
-                (f"%/{source_path}",),
-            ).fetchone()
-
-        if not source:
-            return web.json_response({"error": "Source not found"}, status=404)
-
-        result = dict(source)
-
-        # Get research session if linked
-        if source["session_id"]:
-            session = conn.execute(
-                "SELECT * FROM research_sessions WHERE id = ?",
-                (source["session_id"],),
-            ).fetchone()
-            result["research_session"] = dict(session) if session else None
-        else:
-            result["research_session"] = None
-
-        # Get PRs from this source
-        prs = conn.execute(
-            "SELECT number, status, domain, agent, tier, leo_verdict, domain_verdict, "
-            "cost_usd, created_at, merged_at, commit_type, transient_retries, substantive_retries, last_error "
-            "FROM prs WHERE source_path = ?",
-            (source["path"],),
-        ).fetchall()
-        result["prs"] = [dict(p) for p in prs]
-
-        # Get eval events from audit_log for those PRs
-        # NOTE: audit_log.detail is mixed — some rows are JSON (evaluate events),
-        # some are plain text. Use json_valid() to filter safely.
-        pr_numbers = [p["number"] for p in prs]
-        if pr_numbers:
-            placeholders = ",".join("?" * len(pr_numbers))
-            evals = conn.execute(f"""
-                SELECT * FROM audit_log
-                WHERE stage = 'evaluate'
-                AND json_valid(detail)
-                AND json_extract(detail, '$.pr') IN ({placeholders})
-                ORDER BY timestamp
-            """, pr_numbers).fetchall()
-            result["eval_history"] = [
-                {"timestamp": e["timestamp"], "event": e["event"],
-                 "detail": json.loads(e["detail"]) if e["detail"] else None}
-                for e in evals
-            ]
-        else:
-            result["eval_history"] = []
-
-        return web.json_response(result)
-    finally:
-        conn.close()
-
-
-def setup_research_routes(app):
-    """Register research tracking routes. Call from create_app()."""
-    app.router.add_get("/api/research-sessions", handle_api_research_sessions)
-    app.router.add_get("/api/costs", handle_api_costs)
-    app.router.add_get("/api/source/{path:.+}", handle_api_source_detail)
-
-
-# Public paths to add to auth middleware
-RESEARCH_PUBLIC_PATHS = frozenset({
-    "/api/research-sessions",
-    "/api/costs",
-})
-# /api/source/{path} needs prefix matching — add to auth middleware:
-# if path.startswith("/api/source/"): allow
--- a/diagnostics/review_queue.py
+++ b/diagnostics/review_queue.py
@ -140,7 +140,7 @@ async def fetch_review_queue(
    if forgejo_token:
        headers["Authorization"] = f"token {forgejo_token}"

-    connector = aiohttp.TCPConnector()  # Default SSL verification — Forgejo token must not be exposed to MITM
+    connector = aiohttp.TCPConnector(ssl=False)
    async with aiohttp.ClientSession(headers=headers, connector=connector) as session:
        # Fetch open PRs
        url = f"{FORGEJO_BASE}/repos/{REPO}/pulls?state=open&limit=50&sort=oldest"
--- a/diagnostics/shared_ui.py
+++ b/diagnostics/shared_ui.py
@ -11,7 +11,6 @@ PAGES = [
    {"path": "/health", "label": "Knowledge Health", "icon": "&#9829;"},
    {"path": "/agents", "label": "Agents", "icon": "&#9733;"},
    {"path": "/epistemic", "label": "Epistemic", "icon": "&#9878;"},
-    {"path": "/portfolio", "label": "Portfolio", "icon": "&#9733;"},
 ]


--- a/scripts/embed-claims.py
+++ b/scripts/embed-claims.py
--- a/evaluate-trigger.sh
+++ b/evaluate-trigger.sh
@ -0,0 +1,621 @@
+#!/usr/bin/env bash
+# evaluate-trigger.sh: find unreviewed PRs, run a multi-agent review, auto-merge if approved.
+#
+# Reviews each PR with up to THREE agents:
+#   1. Leo (evaluator) — quality gates, cross-domain connections, coherence
+#   2. Domain agent — domain expertise, duplicate check, technical accuracy
+#   3. Ganymede (code reviewer) — code quality, correctness, safety (code PRs only)
+#
+# Ganymede reviews any PR that touches code files (ops/, diagnostics/, .py, .sh, etc.)
+#
+# After all reviews, auto-merges if (see "Verdict protocol" below):
+#   - Leo's comment contains the marker "<!-- VERDICT:LEO:APPROVE -->"
+#   - Domain agent's comment contains "<!-- VERDICT:<AGENT_KEY>:APPROVE -->" (if applicable)
+#   - Ganymede's comment contains its APPROVE verdict marker (if code PR)
+#   - No territory violations (files outside proposer's domain)
+#
+# Usage:
+#   ./ops/evaluate-trigger.sh              # review + auto-merge approved PRs
+#   ./ops/evaluate-trigger.sh 47           # review a specific PR by number
+#   ./ops/evaluate-trigger.sh --dry-run    # show what would be reviewed, don't run
+#   ./ops/evaluate-trigger.sh --leo-only   # skip domain agent, just run Leo
+#   ./ops/evaluate-trigger.sh --no-merge   # review only, don't auto-merge (old behavior)
+#
+# Requirements:
+#   - claude CLI (claude -p for headless mode)
+#   - gh CLI authenticated with repo access
+#   - Run from the teleo-codex repo root
+#
+# Safety:
+#   - Lockfile prevents concurrent runs
+#   - Auto-merge requires ALL reviewers to approve + no territory violations
+#   - Each PR runs sequentially to avoid branch conflicts
+#   - Timeout: 20 minutes per agent per PR
+#   - Pre-flight checks: clean working tree, gh auth
+#
+# Verdict protocol:
+#   All agents use `gh pr comment` (NOT `gh pr review`) because all agents
+#   share the m3taversal GitHub account — `gh pr review --approve` fails
+#   when the PR author and reviewer are the same user. The merge check
+#   parses issue comments for structured verdict markers instead.
+
+set -euo pipefail
+
+# Allow nested Claude Code sessions (headless spawned from interactive)
+unset CLAUDECODE 2>/dev/null || true
+
+REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
+cd "$REPO_ROOT"
+
+LOCKFILE="/tmp/evaluate-trigger.lock"
+LOG_DIR="$REPO_ROOT/ops/sessions"
+TIMEOUT_SECONDS=1200
+DRY_RUN=false
+LEO_ONLY=false
+NO_MERGE=false
+SPECIFIC_PR=""
+
+# --- Code PR detection ---
+# Returns "true" if the PR touches code files (ops/, diagnostics/, .py, .sh, .js, .html, .css, .json)
+# These PRs need Ganymede code review in addition to Leo's quality review.
+detect_code_pr() {
+  local pr_number="$1"
+  local files
+
+  files=$(gh pr view "$pr_number" --json files --jq '.files[].path' 2>/dev/null || echo "")
+
+  if echo "$files" | grep -qE "^ops/|^diagnostics/|\.py$|\.sh$|\.js$|\.html$|\.css$|\.json$"; then
+    echo "true"
+  else
+    echo "false"
+  fi
+}
+
+# --- Domain routing map ---
+# Maps branch prefix or domain directory to agent name and domain (echoed as "agent domain")
+detect_domain_agent() {
+  local pr_number="$1"
+  local branch files domain agent
+
+  branch=$(gh pr view "$pr_number" --json headRefName --jq '.headRefName' 2>/dev/null || echo "")
+  files=$(gh pr view "$pr_number" --json files --jq '.files[].path' 2>/dev/null || echo "")
+
+  # Try branch prefix first
+  case "$branch" in
+    rio/*|*/internet-finance*) agent="rio"; domain="internet-finance" ;;
+    clay/*|*/entertainment*)   agent="clay"; domain="entertainment" ;;
+    theseus/*|*/ai-alignment*) agent="theseus"; domain="ai-alignment" ;;
+    vida/*|*/health*)          agent="vida"; domain="health" ;;
+    astra/*|*/space-development*) agent="astra"; domain="space-development" ;;
+    leo/*|*/grand-strategy*)   agent="leo"; domain="grand-strategy" ;;
+    contrib/*)
+      # External contributor — detect domain from changed files (fall through to file check)
+      agent=""; domain=""
+      ;;
+    *)
+      agent=""; domain=""
+      ;;
+  esac
+
+  # If no agent detected from branch prefix, check changed files
+  if [ -z "$agent" ]; then
+    if echo "$files" | grep -q "domains/internet-finance/"; then
+      agent="rio"; domain="internet-finance"
+    elif echo "$files" | grep -q "domains/entertainment/"; then
+      agent="clay"; domain="entertainment"
+    elif echo "$files" | grep -q "domains/ai-alignment/"; then
+      agent="theseus"; domain="ai-alignment"
+    elif echo "$files" | grep -q "domains/health/"; then
+      agent="vida"; domain="health"
+    elif echo "$files" | grep -q "domains/space-development/"; then
+      agent="astra"; domain="space-development"
+    fi
+  fi
+
+  echo "$agent $domain"
+}
+
+# --- Parse arguments ---
+for arg in "$@"; do
+  case "$arg" in
+    --dry-run) DRY_RUN=true ;;
+    --leo-only) LEO_ONLY=true ;;
+    --no-merge) NO_MERGE=true ;;
+    [0-9]*) SPECIFIC_PR="$arg" ;;
+    --help|-h)
+      head -23 "$0" | tail -21
+      exit 0
+      ;;
+    *)
+      echo "Unknown argument: $arg"
+      exit 1
+      ;;
+  esac
+done
+
+# --- Pre-flight checks ---
+if ! gh auth status >/dev/null 2>&1; then
+  echo "ERROR: gh CLI not authenticated. Run 'gh auth login' first."
+  exit 1
+fi
+
+if ! command -v claude >/dev/null 2>&1; then
+  echo "ERROR: claude CLI not found. Install it first."
+  exit 1
+fi
+
+# Check for dirty working tree (ignore ops/, .claude/, .github/ which may contain local-only files)
+DIRTY_FILES=$(git status --porcelain | grep -v '^?? ops/' | grep -v '^ M ops/' | grep -v '^?? \.claude/' | grep -v '^ M \.claude/' | grep -v '^?? \.github/' | grep -v '^ M \.github/' || true)
+if [ -n "$DIRTY_FILES" ]; then
+  echo "ERROR: Working tree is dirty. Clean up before running."
+  echo "$DIRTY_FILES"
+  exit 1
+fi
+
+# --- Lockfile (prevent concurrent runs) ---
+if [ -f "$LOCKFILE" ]; then
+  LOCK_PID=$(cat "$LOCKFILE" 2>/dev/null || echo "")
+  if [ -n "$LOCK_PID" ] && kill -0 "$LOCK_PID" 2>/dev/null; then
+    echo "Another evaluate-trigger is running (PID $LOCK_PID). Exiting."
+    exit 1
+  else
+    echo "Stale lockfile found. Removing."
+    rm -f "$LOCKFILE"
+  fi
+fi
+echo $$ > "$LOCKFILE"
+trap 'rm -f "$LOCKFILE"' EXIT
+
+# --- Ensure log directory exists ---
+mkdir -p "$LOG_DIR"
+
+# --- Find PRs to review ---
+if [ -n "$SPECIFIC_PR" ]; then
+  PR_STATE=$(gh pr view "$SPECIFIC_PR" --json state --jq '.state' 2>/dev/null || echo "NOT_FOUND")
+  if [ "$PR_STATE" != "OPEN" ]; then
+    echo "PR #$SPECIFIC_PR is $PR_STATE (not OPEN). Reviewing anyway for testing."
+  fi
+  PRS_TO_REVIEW="$SPECIFIC_PR"
+else
+  # NOTE: gh pr list silently returns empty in some worktree configs; use gh api instead
+  OPEN_PRS=$(gh api repos/:owner/:repo/pulls --jq '.[].number' 2>/dev/null || echo "")
+
+  if [ -z "$OPEN_PRS" ]; then
+    echo "No open PRs found. Nothing to review."
+    exit 0
+  fi
+
+  PRS_TO_REVIEW=""
+  for pr in $OPEN_PRS; do
+    # Check if this PR already has a Leo verdict comment (avoid re-reviewing)
+    LEO_COMMENTED=$(gh pr view "$pr" --json comments \
+      --jq '[.comments[] | select(.body | test("VERDICT:LEO:(APPROVE|REQUEST_CHANGES)"))] | length' 2>/dev/null || echo "0")
+    LAST_COMMIT_DATE=$(gh pr view "$pr" --json commits --jq '.commits[-1].committedDate' 2>/dev/null || echo "")
+
+    if [ "$LEO_COMMENTED" = "0" ]; then
+      PRS_TO_REVIEW="$PRS_TO_REVIEW $pr"
+    else
+      # Check if new commits since last Leo review
+      LAST_LEO_DATE=$(gh pr view "$pr" --json comments \
+        --jq '[.comments[] | select(.body | test("VERDICT:LEO:")) | .createdAt] | last' 2>/dev/null || echo "")
+      if [ -n "$LAST_COMMIT_DATE" ] && [ -n "$LAST_LEO_DATE" ] && [[ "$LAST_COMMIT_DATE" > "$LAST_LEO_DATE" ]]; then
+        echo "PR #$pr: New commits since last review. Queuing for re-review."
+        PRS_TO_REVIEW="$PRS_TO_REVIEW $pr"
+      else
+        echo "PR #$pr: Already reviewed. Skipping."
+      fi
+    fi
+  done
+
+  PRS_TO_REVIEW=$(echo "$PRS_TO_REVIEW" | xargs)
+
+  if [ -z "$PRS_TO_REVIEW" ]; then
+    echo "All open PRs are up to date. Nothing to do."
+    exit 0
+  fi
+fi
+
+echo "PRs to review: $PRS_TO_REVIEW"
+
+if [ "$DRY_RUN" = true ]; then
+  for pr in $PRS_TO_REVIEW; do
+    read -r agent domain <<< "$(detect_domain_agent "$pr")"
+    is_code=$(detect_code_pr "$pr")
+    reviewers="Leo + ${agent:-unknown} (${domain:-unknown domain})"
+    [ "$is_code" = "true" ] && reviewers="$reviewers + Ganymede (code)"
+    echo "[DRY RUN] PR #$pr — $reviewers"
+  done
+  exit 0
+fi
+
+# --- Run headless reviews on each PR ---
+run_agent_review() {
+  local pr="$1" agent_name="$2" prompt="$3" model="$4"
+  local timestamp log_file review_file
+
+  timestamp=$(date +%Y%m%d-%H%M%S)
+  log_file="$LOG_DIR/${agent_name}-review-pr${pr}-${timestamp}.log"
+  review_file="/tmp/${agent_name}-review-pr${pr}.md"
+
+  echo "  Running ${agent_name} (model: ${model})..."
+  echo "  Log: $log_file"
+
+  if perl -e "alarm $TIMEOUT_SECONDS; exec @ARGV" claude -p \
+    --model "$model" \
+    --allowedTools "Read,Write,Edit,Bash,Glob,Grep" \
+    --permission-mode bypassPermissions \
+    "$prompt" \
+    > "$log_file" 2>&1; then
+    echo "  ${agent_name}: Review posted."
+    rm -f "$review_file"
+    return 0
+  else
+    local exit_code=$?
+    if [ "$exit_code" -eq 142 ] || [ "$exit_code" -eq 124 ]; then
+      echo "  ${agent_name}: TIMEOUT after ${TIMEOUT_SECONDS}s."
+    else
+      echo "  ${agent_name}: FAILED (exit code $exit_code)."
+    fi
+    rm -f "$review_file"
+    return 1
+  fi
+}
+
+# --- Territory violation check ---
+# Verifies all changed files are within the proposer's expected territory
+check_territory_violations() {
+  local pr_number="$1"
+  local branch files proposer violations
+
+  branch=$(gh pr view "$pr_number" --json headRefName --jq '.headRefName' 2>/dev/null || echo "")
+  files=$(gh pr view "$pr_number" --json files --jq '.files[].path' 2>/dev/null || echo "")
+
+  # Determine proposer from branch prefix
+  proposer=$(echo "$branch" | cut -d'/' -f1)
+
+  # Map proposer to allowed directories
+  local allowed_domains=""
+  case "$proposer" in
+    rio)     allowed_domains="domains/internet-finance/" ;;
+    clay)    allowed_domains="domains/entertainment/" ;;
+    theseus) allowed_domains="domains/ai-alignment/" ;;
+    vida)    allowed_domains="domains/health/" ;;
+    astra)   allowed_domains="domains/space-development/" ;;
+    leo)     allowed_domains="core/|foundations/" ;;
+    contrib) echo ""; return 0 ;;  # External contributors — skip territory check
+    *)       echo ""; return 0 ;;  # Unknown proposer — skip check
+  esac
+
+  # Check each file. Allowed: inbox/archive/, agents/{proposer}/, maps/, foundations/, and the agent's own domain
+  violations=""
+  while IFS= read -r file; do
+    [ -z "$file" ] && continue
+    # Always allowed: inbox/archive, own agent dir, maps/, foundations/ (any agent can propose foundation claims)
+    if echo "$file" | grep -qE "^inbox/archive/|^agents/${proposer}/|^maps/|^foundations/"; then
+      continue
+    fi
+    # Check against allowed domain directories (group the alternation so ^ anchors every branch)
+    if echo "$file" | grep -qE "^(${allowed_domains})"; then
+      continue
+    fi
+    violations="${violations}  - ${file}\n"
+  done <<< "$files"
+
+  if [ -n "$violations" ]; then
+    echo -e "$violations"
+  else
+    echo ""
+  fi
+}
+
+# --- Auto-merge check ---
+# Parses issue comments for structured verdict markers.
+# Verdict protocol: agents post `<!-- VERDICT:AGENT_KEY:APPROVE -->` or
+# `<!-- VERDICT:AGENT_KEY:REQUEST_CHANGES -->` as HTML comments in their review.
+# This is machine-parseable and invisible in the rendered comment.
+check_merge_eligible() {
+  local pr_number="$1"
+  local domain_agent="$2"
+  local leo_passed="$3"
+  local is_code_pr="${4:-false}"
+  local ganymede_passed="${5:-true}"
+
+  # Gate 1: Leo must have completed without timeout/error
+  if [ "$leo_passed" != "true" ]; then
+    echo "BLOCK: Leo review failed or timed out"
+    return 1
+  fi
+
+  # Gate 2: Check Leo's verdict from issue comments
+  local leo_verdict
+  leo_verdict=$(gh pr view "$pr_number" --json comments \
+    --jq '[.comments[] | select(.body | test("VERDICT:LEO:")) | .body] | last' 2>/dev/null || echo "")
+
+  if echo "$leo_verdict" | grep -q "VERDICT:LEO:APPROVE"; then
+    echo "Leo: APPROVED"
+  elif echo "$leo_verdict" | grep -q "VERDICT:LEO:REQUEST_CHANGES"; then
+    echo "BLOCK: Leo requested changes"
+    return 1
+  else
+    echo "BLOCK: Could not find Leo's verdict marker in PR comments"
+    return 1
+  fi
+
+  # Gate 3: Check domain agent verdict (if applicable)
+  if [ -n "$domain_agent" ] && [ "$domain_agent" != "leo" ]; then
+    local domain_key
+    domain_key=$(echo "$domain_agent" | tr '[:lower:]' '[:upper:]')
+    local domain_verdict
+    domain_verdict=$(gh pr view "$pr_number" --json comments \
+      --jq "[.comments[] | select(.body | test(\"VERDICT:${domain_key}:\")) | .body] | last" 2>/dev/null || echo "")
+
+    if echo "$domain_verdict" | grep -q "VERDICT:${domain_key}:APPROVE"; then
+      echo "Domain agent ($domain_agent): APPROVED"
+    elif echo "$domain_verdict" | grep -q "VERDICT:${domain_key}:REQUEST_CHANGES"; then
+      echo "BLOCK: $domain_agent requested changes"
+      return 1
+    else
+      echo "BLOCK: No verdict marker found for $domain_agent"
+      return 1
+    fi
+  else
+    echo "Domain agent: N/A (leo-only or grand-strategy)"
+  fi
+
+  # Gate 4: Ganymede code review (for code PRs)
+  if [ "$is_code_pr" = "true" ]; then
+    if [ "$ganymede_passed" != "true" ]; then
+      echo "BLOCK: Ganymede code review failed or timed out"
+      return 1
+    fi
+
+    local ganymede_verdict
+    ganymede_verdict=$(gh pr view "$pr_number" --json comments \
+      --jq '[.comments[] | select(.body | test("VERDICT:GANYMEDE:")) | .body] | last' 2>/dev/null || echo "")
+
+    if echo "$ganymede_verdict" | grep -q "VERDICT:GANYMEDE:APPROVE"; then
+      echo "Ganymede (code review): APPROVED"
+    elif echo "$ganymede_verdict" | grep -q "VERDICT:GANYMEDE:REQUEST_CHANGES"; then
+      echo "BLOCK: Ganymede requested code changes"
+      return 1
+    else
+      echo "BLOCK: No verdict marker found for Ganymede code review"
+      return 1
+    fi
+  fi
+
+  # Gate 5: Territory violations
+  local violations
+  violations=$(check_territory_violations "$pr_number")
+
+  if [ -n "$violations" ]; then
+    echo "BLOCK: Territory violations detected:"
+    echo -e "$violations"
+    return 1
+  else
+    echo "Territory: clean"
+  fi
+
+  return 0
+}
+
+REVIEWED=0
+FAILED=0
+MERGED=0
+
+for pr in $PRS_TO_REVIEW; do
+  echo ""
+  echo "=== PR #$pr ==="
+  echo "Started: $(date)"
+
+  # Detect which domain agent should review
+  read -r DOMAIN_AGENT DOMAIN <<< "$(detect_domain_agent "$pr")"
+  echo "Domain: ${DOMAIN:-unknown} | Agent: ${DOMAIN_AGENT:-none detected}"
+
+  # --- Review 1: Leo (evaluator) ---
+  LEO_REVIEW_FILE="/tmp/leo-review-pr${pr}.md"
+  LEO_PROMPT="You are Leo. Read agents/leo/identity.md, agents/leo/beliefs.md, agents/leo/reasoning.md, and skills/evaluate.md.
+
+Review PR #${pr} on this repo.
+
+First, run: gh pr view ${pr} --json title,body,files,additions,deletions
+Then checkout the PR branch: gh pr checkout ${pr}
+Read every changed file completely.
+
+Before evaluating, scan the existing knowledge base for duplicate and contradiction checks:
+- List claim files in the relevant domain directory (e.g., domains/${DOMAIN}/)
+- Read titles to check for semantic duplicates
+- Check for contradictions with existing claims in that domain and in foundations/
+
+For each proposed claim, evaluate against these 11 quality criteria from CLAUDE.md:
+1. Specificity — Is this specific enough to disagree with?
+2. Evidence — Is there traceable evidence in the body?
+3. Description quality — Does the description add info beyond the title?
+4. Confidence calibration — Does the confidence level match the evidence?
+5. Duplicate check — Does this already exist in the knowledge base?
+6. Contradiction check — Does this contradict an existing claim? If so, is the contradiction explicit?
+7. Value add — Does this genuinely expand what the knowledge base knows?
+8. Wiki links — Do all [[links]] point to real files?
+9. Scope qualification — Does the claim specify structural vs functional, micro vs macro, causal vs correlational?
+10. Universal quantifier check — Does the title use unwarranted universals (all, always, never, the only)?
+11. Counter-evidence acknowledgment — For likely or higher: is opposing evidence acknowledged?
+
+Also check:
+- Source archive updated correctly (status field)
+- Commit messages follow conventions
+- Files are in the correct domain directory
+- Cross-domain connections that the proposer may have missed
+
+Write your complete review to ${LEO_REVIEW_FILE}
+
+CRITICAL — Verdict format: Your review MUST end with exactly one of these verdict markers (as an HTML comment on its own line):
+  <!-- VERDICT:LEO:APPROVE -->
+  <!-- VERDICT:LEO:REQUEST_CHANGES -->
+
+Then post the review as an issue comment:
+  gh pr comment ${pr} --body-file ${LEO_REVIEW_FILE}
+
+IMPORTANT: Use 'gh pr comment' NOT 'gh pr review'. We use a shared GitHub account so gh pr review --approve fails.
+DO NOT merge — the orchestrator handles merge decisions after all reviews are posted.
+Work autonomously. Do not ask for confirmation."
+
+  if run_agent_review "$pr" "leo" "$LEO_PROMPT" "opus"; then
+    LEO_PASSED=true
+  else
+    LEO_PASSED=false
+  fi
+
+  # Return to main between reviews
+  git checkout main 2>/dev/null || git checkout -f main
+  PR_BRANCH=$(gh pr view "$pr" --json headRefName --jq '.headRefName' 2>/dev/null || echo "")
+  [ -n "$PR_BRANCH" ] && git branch -D "$PR_BRANCH" 2>/dev/null || true
+
+  # --- Review 2: Domain agent ---
+  if [ "$LEO_ONLY" = true ]; then
+    echo "  Skipping domain agent review (--leo-only)."
+  elif [ -z "$DOMAIN_AGENT" ]; then
+    echo "  Could not detect domain agent. Skipping domain review."
+  elif [ "$DOMAIN_AGENT" = "leo" ]; then
+    echo "  Domain is grand-strategy (Leo's territory). Single review sufficient."
+  else
+    DOMAIN_REVIEW_FILE="/tmp/${DOMAIN_AGENT}-review-pr${pr}.md"
+    AGENT_NAME_UPPER=$(echo "${DOMAIN_AGENT}" | awk '{print toupper(substr($0,1,1)) substr($0,2)}')
+    AGENT_KEY_UPPER=$(echo "${DOMAIN_AGENT}" | tr '[:lower:]' '[:upper:]')
+    DOMAIN_PROMPT="You are ${AGENT_NAME_UPPER}. Read agents/${DOMAIN_AGENT}/identity.md, agents/${DOMAIN_AGENT}/beliefs.md, and skills/evaluate.md.
+
+You are reviewing PR #${pr} as the domain expert for ${DOMAIN}.
+
+First, run: gh pr view ${pr} --json title,body,files,additions,deletions
+Then checkout the PR branch: gh pr checkout ${pr}
+Read every changed file completely.
+
+Your review focuses on DOMAIN EXPERTISE — things only a ${DOMAIN} specialist would catch:
+
+1. **Technical accuracy** — Are the claims factually correct within the ${DOMAIN} domain?
+2. **Domain duplicates** — Do any claims duplicate existing knowledge in domains/${DOMAIN}/?
+   Scan the directory and read titles carefully.
+3. **Missing context** — What important nuance from the ${DOMAIN} domain is the claim missing?
+4. **Belief impact** — Do any claims affect your current beliefs? Read agents/${DOMAIN_AGENT}/beliefs.md
+   and flag if any belief needs updating.
+5. **Connections** — What existing claims in your domain should be wiki-linked?
+6. **Confidence calibration** — From your domain expertise, is the confidence level right?
+
+Write your review to ${DOMAIN_REVIEW_FILE}
+
+CRITICAL — Verdict format: Your review MUST end with exactly one of these verdict markers (as an HTML comment on its own line):
+  <!-- VERDICT:${AGENT_KEY_UPPER}:APPROVE -->
+  <!-- VERDICT:${AGENT_KEY_UPPER}:REQUEST_CHANGES -->
+
+Then post the review as an issue comment:
+  gh pr comment ${pr} --body-file ${DOMAIN_REVIEW_FILE}
+
+IMPORTANT: Use 'gh pr comment' NOT 'gh pr review'. We use a shared GitHub account so gh pr review --approve fails.
+Sign your review as ${AGENT_NAME_UPPER} (domain reviewer for ${DOMAIN}).
+DO NOT duplicate Leo's quality gate checks — he covers those.
+DO NOT merge — the orchestrator handles merge decisions after all reviews are posted.
+Work autonomously. Do not ask for confirmation."
+
+    run_agent_review "$pr" "$DOMAIN_AGENT" "$DOMAIN_PROMPT" "sonnet"
+
+    # Clean up branch again
+    git checkout main 2>/dev/null || git checkout -f main
+    [ -n "$PR_BRANCH" ] && git branch -D "$PR_BRANCH" 2>/dev/null || true
+  fi
+
+  # --- Review 3: Ganymede code review (for PRs touching code files) ---
+  IS_CODE_PR=$(detect_code_pr "$pr")
+  GANYMEDE_PASSED=true
+
+  if [ "$IS_CODE_PR" = "true" ] && [ "$LEO_ONLY" != true ]; then
+    echo "  Code files detected — running Ganymede code review."
+    GANYMEDE_REVIEW_FILE="/tmp/ganymede-review-pr${pr}.md"
+    GANYMEDE_PROMPT="You are Ganymede, the code quality reviewer for the Teleo collective.
+
+Review PR #${pr} for code quality, correctness, and safety.
+
+First, run: gh pr view ${pr} --json title,body,files,additions,deletions
+Then checkout the PR branch: gh pr checkout ${pr}
+Read every changed file completely. Also read the existing versions of modified files on main for comparison.
+
+Your review focuses on CODE QUALITY — things a code reviewer catches:
+
+1. **Correctness** — Does the code do what it claims? Are there logic errors, off-by-one bugs, or unhandled edge cases?
+2. **Safety** — Any security issues? SQL injection, path traversal, unchecked inputs, secrets in code?
+3. **Breaking changes** — Does this change file formats, API responses, DB schemas, or config structures that other agents depend on? If so, is there a migration path?
+4. **Error handling** — Will failures be visible or silent? Are there bare excepts, missing error messages, or swallowed exceptions?
+5. **Integration** — Does the code work with the existing system? Are imports correct, paths valid, dependencies present?
+6. **Simplicity** — Is this more complex than it needs to be? Could it be simpler?
+
+Also check:
+- systemd ReadWritePaths if new file write paths are introduced
+- Path format consistency (absolute vs relative)
+- Concurrent edit risk on shared files (app.py, bot.py, etc.)
+
+Write your review to ${GANYMEDE_REVIEW_FILE}
+
+CRITICAL — Verdict format: Your review MUST end with exactly one of these verdict markers (as an HTML comment on its own line):
+  <!-- VERDICT:GANYMEDE:APPROVE -->
+  <!-- VERDICT:GANYMEDE:REQUEST_CHANGES -->
+
+Then post the review as an issue comment:
+  gh pr comment ${pr} --body-file ${GANYMEDE_REVIEW_FILE}
+
+IMPORTANT: Use 'gh pr comment' NOT 'gh pr review'. We use a shared GitHub account so gh pr review --approve fails.
+Sign your review as Ganymede (code reviewer).
+DO NOT duplicate Leo's knowledge quality checks — he covers those. You cover code.
+DO NOT merge — the orchestrator handles merge decisions after all reviews are posted.
+Work autonomously. Do not ask for confirmation."
+
+    if run_agent_review "$pr" "ganymede" "$GANYMEDE_PROMPT" "sonnet"; then
+      GANYMEDE_PASSED=true
+    else
+      GANYMEDE_PASSED=false
+    fi
+
+    # Clean up branch
+    git checkout main 2>/dev/null || git checkout -f main
+    [ -n "$PR_BRANCH" ] && git branch -D "$PR_BRANCH" 2>/dev/null || true
+  elif [ "$IS_CODE_PR" = "true" ] && [ "$LEO_ONLY" = true ]; then
+    echo "  Code files detected but skipping Ganymede review (--leo-only)."
+  fi
+
+  if [ "$LEO_PASSED" = true ]; then
+    REVIEWED=$((REVIEWED + 1))
+  else
+    FAILED=$((FAILED + 1))
+  fi
+
+  # --- Auto-merge decision ---
+  if [ "$NO_MERGE" = true ]; then
+    echo "  Auto-merge: skipped (--no-merge)"
+  elif [ "$LEO_PASSED" != "true" ]; then
+    echo "  Auto-merge: skipped (Leo review failed)"
+  else
+    echo ""
+    echo "  --- Merge eligibility check ---"
+    MERGE_LOG=$(check_merge_eligible "$pr" "$DOMAIN_AGENT" "$LEO_PASSED" "$IS_CODE_PR" "$GANYMEDE_PASSED")
+    MERGE_RESULT=$?
+    echo "$MERGE_LOG" | sed 's/^/    /'
+
+    if [ "$MERGE_RESULT" -eq 0 ]; then
+      echo "  Auto-merge: ALL GATES PASSED — merging PR #$pr"
+      if gh pr merge "$pr" --squash 2>&1; then
+        echo "  PR #$pr: MERGED successfully."
+        MERGED=$((MERGED + 1))
+      else
+        echo "  PR #$pr: Merge FAILED. May need manual intervention."
+      fi
+    else
+      echo "  Auto-merge: BLOCKED — see reasons above"
+    fi
+  fi
+
+  echo "Finished: $(date)"
+done
+
+echo ""
+echo "=== Summary ==="
+echo "Reviewed: $REVIEWED"
+echo "Failed: $FAILED"
+echo "Merged: $MERGED"
+echo "Logs: $LOG_DIR"
--- a/extract-cron.sh
+++ b/extract-cron.sh
@ -0,0 +1,179 @@
+#!/bin/bash
+# Extract claims from unprocessed sources in inbox/archive/
+# Runs via cron on VPS every 15 minutes.
+#
+# Concurrency model:
+#   - Lockfile prevents overlapping runs
+#   - MAX_SOURCES=5 per cycle (works through backlog over multiple runs)
+#   - Sequential processing (one source at a time)
+#   - 50 sources landing at once = ~10 cron cycles to clear, not 50 parallel agents
+#
+# Domain routing:
+#   - Reads domain: field from source frontmatter
+#   - Maps to the domain agent (rio, clay, theseus, vida, astra, leo)
+#   - Runs extraction AS that agent — their territory, their extraction
+#   - Skips sources with status: processing (agent handling it themselves)
+#
+# Flow:
+#   1. Pull latest main
+#   2. Find sources with status: unprocessed (skip processing/processed/null-result)
+#   3. For each: run Claude headless to extract claims as the domain agent
+#   4. The agent updates the source status to processed (or null-result)
+#   5. Commit extractions, push, open PR
+#
+# The eval pipeline (webhook.py) handles review and merge separately.
+
+set -euo pipefail
+
+REPO_DIR="/opt/teleo-eval/workspaces/extract"
+REPO_URL="http://m3taversal:$(cat /opt/teleo-eval/secrets/forgejo-admin-token)@localhost:3000/teleo/teleo-codex.git"
+CLAUDE_BIN="/home/teleo/.local/bin/claude"
+LOG_DIR="/opt/teleo-eval/logs"
+LOG="$LOG_DIR/extract-cron.log"
+LOCKFILE="/tmp/extract-cron.lock"
+MAX_SOURCES=5  # Process at most 5 sources per run to limit cost
+
+log() { echo "[$(date -Iseconds)] $*" >> "$LOG"; }
+
+# --- Lock ---
+if [ -f "$LOCKFILE" ]; then
+    pid=$(cat "$LOCKFILE" 2>/dev/null || true)
+    if kill -0 "$pid" 2>/dev/null; then
+        log "SKIP: already running (pid $pid)"
+        exit 0
+    fi
+    log "WARN: stale lockfile, removing"
+    rm -f "$LOCKFILE"
+fi
+echo $$ > "$LOCKFILE"
+trap 'rm -f "$LOCKFILE"' EXIT
+
+# --- Ensure repo clone ---
+if [ ! -d "$REPO_DIR/.git" ]; then
+    log "Cloning repo..."
+    git clone "$REPO_URL" "$REPO_DIR" >> "$LOG" 2>&1
+fi
+
+cd "$REPO_DIR"
+
+# --- Pull latest main ---
+git checkout main >> "$LOG" 2>&1
+git pull --rebase >> "$LOG" 2>&1
+
+# --- Find unprocessed sources ---
+UNPROCESSED=$(grep -rl '^status: unprocessed' inbox/archive/ 2>/dev/null | head -n "$MAX_SOURCES" || true)
+
+if [ -z "$UNPROCESSED" ]; then
+    log "No unprocessed sources found"
+    exit 0
+fi
+
+COUNT=$(echo "$UNPROCESSED" | wc -l | tr -d ' ')
+log "Found $COUNT unprocessed source(s)"
+
+# --- Process each source ---
+for SOURCE_FILE in $UNPROCESSED; do
+    SLUG=$(basename "$SOURCE_FILE" .md)
+    BRANCH="extract/$SLUG"
+
+    log "Processing: $SOURCE_FILE → branch $BRANCH"
+
+    # Create branch from main
+    git checkout main >> "$LOG" 2>&1
+    git branch -D "$BRANCH" 2>/dev/null || true
+    git checkout -b "$BRANCH" >> "$LOG" 2>&1
+
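+    # Read domain from frontmatter (|| true: a missing domain: field falls through to the leo default instead of tripping set -e)
+    DOMAIN=$(grep '^domain:' "$SOURCE_FILE" | head -1 | sed 's/domain: *//' | tr -d '"' | tr -d "'" | xargs || true)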
+
+    # Map domain to agent
+    case "$DOMAIN" in
+        internet-finance) AGENT="rio" ;;
+        entertainment) AGENT="clay" ;;
+        ai-alignment) AGENT="theseus" ;;
+        health) AGENT="vida" ;;
+        space-development) AGENT="astra" ;;
+        *) AGENT="leo" ;;
+    esac
+
+    AGENT_TOKEN=$(cat "/opt/teleo-eval/secrets/forgejo-${AGENT}-token" 2>/dev/null || cat /opt/teleo-eval/secrets/forgejo-leo-token)
+
+    log "Domain: $DOMAIN, Agent: $AGENT"
+
+    # Run Claude headless to extract claims
+    EXTRACT_PROMPT="You are $AGENT, a Teleo knowledge base agent. Extract claims from this source.
+
+READ these files first:
+- skills/extract.md (extraction process)
+- schemas/claim.md (claim format)
+- $SOURCE_FILE (the source to extract from)
+
+Then scan domains/$DOMAIN/ to check for duplicate claims.
+
+EXTRACT claims following the process in skills/extract.md:
+1. Read the source completely
+2. Separate evidence from interpretation
+3. Extract candidate claims (specific, disagreeable, evidence-backed)
+4. Check for duplicates against existing claims in domains/$DOMAIN/
+5. Write claim files to domains/$DOMAIN/ with proper YAML frontmatter
+6. Update $SOURCE_FILE: set status to 'processed', add processed_by: $AGENT, processed_date: $(date +%Y-%m-%d), and claims_extracted list
+
+If no claims can be extracted, update $SOURCE_FILE: set status to 'null-result' and add notes explaining why.
+
+IMPORTANT: Use the Edit tool to update the source file status. Use the Write tool to create new claim files. Do not create claims that duplicate existing ones."
+
+    # Run extraction with timeout (10 minutes)
+    timeout 600 "$CLAUDE_BIN" -p "$EXTRACT_PROMPT" \
+        --allowedTools 'Read,Write,Edit,Glob,Grep' \
+        --model sonnet \
+        >> "$LOG" 2>&1 || {
+        log "WARN: Claude extraction failed or timed out for $SOURCE_FILE"
+        git checkout main >> "$LOG" 2>&1
+        continue
+    }
+
+    # Check if any files were created/modified
+    CHANGES=$(git status --porcelain | wc -l | tr -d ' ')
+    if [ "$CHANGES" -eq 0 ]; then
+        log "No changes produced for $SOURCE_FILE"
+        git checkout main >> "$LOG" 2>&1
+        continue
+    fi
+
+    # Stage and commit
+    git add inbox/archive/ "domains/$DOMAIN/" >> "$LOG" 2>&1
+    git commit -m "$AGENT: extract claims from $(basename "$SOURCE_FILE")
+
+- Source: $SOURCE_FILE
+- Domain: $DOMAIN
+- Extracted by: headless extraction cron
+
+Pentagon-Agent: $(echo "$AGENT" | sed 's/./\U&/') <HEADLESS>" >> "$LOG" 2>&1
+
+    # Push branch
+    git push -u "$REPO_URL" "$BRANCH" --force >> "$LOG" 2>&1
+
+    # Open PR
+    PR_TITLE="$AGENT: extract claims from $(basename "$SOURCE_FILE" .md)"
+    PR_BODY="## Automated Extraction\n\nSource: \`$SOURCE_FILE\`\nDomain: $DOMAIN\nExtracted by: headless cron on VPS\n\nThis PR was created automatically by the extraction cron job. Claims were extracted using \`skills/extract.md\` process via Claude headless."
+
+    curl -s -X POST "http://localhost:3000/api/v1/repos/teleo/teleo-codex/pulls" \
+        -H "Authorization: token $AGENT_TOKEN" \
+        -H "Content-Type: application/json" \
+        -d "{
+            \"title\": \"$PR_TITLE\",
+            \"body\": \"$PR_BODY\",
+            \"base\": \"main\",
+            \"head\": \"$BRANCH\"
+        }" >> "$LOG" 2>&1
+
+    log "PR opened for $SOURCE_FILE"
+
+    # Back to main for next source
+    git checkout main >> "$LOG" 2>&1
+
+    # Brief pause between extractions
+    sleep 5
+done
+
+log "Extraction run complete: processed $COUNT source(s)"
--- a/scripts/extract-decisions.py
+++ b/scripts/extract-decisions.py
--- a/scripts/extract-graph-data.py
+++ b/scripts/extract-graph-data.py
--- a/fetch_coins.py
+++ b/fetch_coins.py
@ -1,841 +0,0 @@
-#!/usr/bin/env python3
-"""
-Ownership Coin Portfolio Data Fetcher
-
-Reads entity files for token addresses, fetches current and historical
-price data from DexScreener and CoinGecko, stores daily snapshots in
-pipeline.db coin_snapshots table.
-
-Usage:
-  python3 fetch_coins.py --daily          # Today's snapshot (current prices + on-chain)
-  python3 fetch_coins.py --backfill       # Historical daily prices from CoinGecko
-  python3 fetch_coins.py --backfill-days 90  # Last N days only
-"""
-
-import argparse
-import datetime
-import json
-import logging
-import os
-import sqlite3
-import sys
-import time
-from pathlib import Path
-
-import urllib.request
-import base58
-import yaml
-
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s %(levelname)s %(message)s",
-)
-logger = logging.getLogger("fetch_coins")
-
-MAIN_WORKTREE = Path(os.environ.get("MAIN_WORKTREE", "/opt/teleo-eval/workspaces/main"))
-DB_PATH = Path(os.environ.get("DB_PATH", "/opt/teleo-eval/pipeline/pipeline.db"))
-ENTITY_DIR = MAIN_WORKTREE / "entities" / "internet-finance"
-
-DEXSCREENER_TOKEN_URL = "https://api.dexscreener.com/tokens/v1/solana/{mint}"
-COINGECKO_HISTORY_URL = (
-    "https://api.coingecko.com/api/v3/coins/solana/contract/{mint}"
-    "/market_chart?vs_currency=usd&days={days}"
-)
-COINGECKO_RATE_LIMIT = 6.0  # seconds between requests (free tier — 10-15 req/min)
-
-USDC_MINT = "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v"
-SOLANA_RPC = "https://api.mainnet-beta.solana.com"
-
-
-def _http_get_json(url, retries=3, timeout=15):
-    for attempt in range(retries + 1):
-        try:
-            req = urllib.request.Request(url, headers={
-                "Accept": "application/json",
-                "User-Agent": "teleo-portfolio/1.0",
-            })
-            with urllib.request.urlopen(req, timeout=timeout) as resp:
-                return json.loads(resp.read())
-        except urllib.error.HTTPError as e:
-            if e.code == 429 and attempt < retries:
-                wait = 15 * (attempt + 1)
-                logger.info("Rate limited, waiting %ds...", wait)
-                time.sleep(wait)
-                continue
-            logger.warning("HTTP %d for %s", e.code, url[:80])
-            return None
-        except Exception as e:
-            if attempt < retries:
-                time.sleep(2 ** attempt)
-                continue
-            logger.warning("HTTP GET failed after %d attempts: %s — %s", retries + 1, url[:80], e)
-            return None
-
-
-def load_ownership_coins():
-    """Read entity files and return list of coin dicts with chain data."""
-    coins = []
-    for f in sorted(ENTITY_DIR.glob("*.md")):
-        content = f.read_text()
-        if "---" not in content:
-            continue
-        parts = content.split("---", 2)
-        if len(parts) < 3:
-            continue
-        try:
-            fm = yaml.safe_load(parts[1])
-        except Exception:
-            continue
-        if not isinstance(fm, dict):
-            continue
-        if fm.get("subtype") != "ownership-coin":
-            continue
-        if fm.get("status") == "liquidated":
-            continue
-
-        chain = fm.get("chain") or {}
-        if isinstance(chain, str):
-            chain = {}
-        raise_data = fm.get("raise") or {}
-        ops = fm.get("operations") or {}
-        liq = fm.get("liquidation") or {}
-
-        coins.append({
-            "name": fm.get("name", f.stem),
-            "ticker": fm.get("ticker"),
-            "status": fm.get("status", "unknown"),
-            "token_mint": chain.get("token_mint"),
-            "treasury_multisig": chain.get("treasury_multisig"),
-            "lp_pools": chain.get("lp_pools") or [],
-            "vesting_wallets": chain.get("vesting_wallets") or [],
-            "investor_locked_tokens": chain.get("investor_locked_tokens") or 0,
-            "meteora_seed_tokens": chain.get("meteora_seed_tokens") or 0,
-            "initial_price": raise_data.get("initial_token_price_usd"),
-            "amount_raised": raise_data.get("amount_raised_usd"),
-            "monthly_allowance": ops.get("monthly_allowance_usd"),
-            "liquidation_date": liq.get("date"),
-            "liquidation_return": liq.get("return_per_dollar"),
-            "file": f.name,
-        })
-
-    return coins
-
-
-def ensure_schema(conn):
-    """Create coin_snapshots table if it doesn't exist."""
-    conn.execute("""
-        CREATE TABLE IF NOT EXISTS coin_snapshots (
-            id INTEGER PRIMARY KEY AUTOINCREMENT,
-            snapshot_date TEXT NOT NULL,
-            name TEXT NOT NULL,
-            ticker TEXT,
-            token_mint TEXT,
-            status TEXT,
-            price_usd REAL,
-            market_cap_usd REAL,
-            fdv_usd REAL,
-            circulating_supply REAL,
-            total_supply REAL,
-            volume_24h_usd REAL,
-            liquidity_usd REAL,
-            treasury_multisig_usd REAL,
-            lp_usdc_total REAL,
-            lp_pools_detail TEXT,
-            equity_value_usd REAL,
-            initial_price_usd REAL,
-            amount_raised_usd REAL,
-            monthly_allowance_usd REAL,
-            effective_liq_price REAL,
-            delta_pct REAL,
-            months_runway REAL,
-            protocol_owned_tokens REAL,
-            adjusted_circulating_supply REAL,
-            data_source TEXT,
-            fetched_at TEXT NOT NULL,
-            UNIQUE(snapshot_date, name)
-        )
-    """)
-    # Legacy migration — these columns exist in CREATE TABLE but may be missing in older DBs
-    for col in ("protocol_owned_tokens", "adjusted_circulating_supply", "treasury_protocol_tokens", "vesting_tokens"):
-        try:
-            conn.execute(f"ALTER TABLE coin_snapshots ADD COLUMN {col} REAL")
-        except sqlite3.OperationalError:
-            pass
-    conn.execute("""
-        CREATE INDEX IF NOT EXISTS idx_coin_snapshots_date
-        ON coin_snapshots(snapshot_date)
-    """)
-    conn.execute("""
-        CREATE INDEX IF NOT EXISTS idx_coin_snapshots_name
-        ON coin_snapshots(name)
-    """)
-    conn.commit()
-
-
-def fetch_dexscreener(mint):
-    """Get current price, mcap, fdv, volume, liquidity from DexScreener."""
-    url = DEXSCREENER_TOKEN_URL.format(mint=mint)
-    data = _http_get_json(url)
-    if not data:
-        return None
-
-    pairs = data if isinstance(data, list) else data.get("pairs", [])
-    if not pairs:
-        return None
-
-    # Use highest-liquidity pair
-    best = max(pairs, key=lambda p: (p.get("liquidity") or {}).get("usd", 0))
-    liq = best.get("liquidity") or {}
-
-    return {
-        "price_usd": float(best["priceUsd"]) if best.get("priceUsd") else None,
-        "market_cap_usd": best.get("marketCap"),
-        "fdv_usd": best.get("fdv"),
-        "volume_24h_usd": (best.get("volume") or {}).get("h24"),
-        "liquidity_usd": liq.get("usd"),
-        "circulating_supply": None,  # DexScreener doesn't provide this directly
-        "total_supply": None,
-    }
-
-
-def fetch_coingecko_history(mint, days=365):
-    """Get daily price history from CoinGecko."""
-    url = COINGECKO_HISTORY_URL.format(mint=mint, days=days)
-    data = _http_get_json(url)
-    if not data or "prices" not in data:
-        return []
-
-    daily = {}
-    for ts_ms, price in data["prices"]:
-        dt = datetime.datetime.fromtimestamp(ts_ms / 1000, tz=datetime.timezone.utc)
-        date_str = dt.strftime("%Y-%m-%d")
-        daily[date_str] = price  # last value for that day wins (CoinGecko returns multiple per day)
-
-    market_caps = {}
-    for ts_ms, mc in data.get("market_caps", []):
-        dt = datetime.datetime.fromtimestamp(ts_ms / 1000, tz=datetime.timezone.utc)
-        date_str = dt.strftime("%Y-%m-%d")
-        market_caps[date_str] = mc
-
-    volumes = {}
-    for ts_ms, vol in data.get("total_volumes", []):
-        dt = datetime.datetime.fromtimestamp(ts_ms / 1000, tz=datetime.timezone.utc)
-        date_str = dt.strftime("%Y-%m-%d")
-        volumes[date_str] = vol
-
-    result = []
-    for date_str in sorted(daily.keys()):
-        result.append({
-            "date": date_str,
-            "price_usd": daily[date_str],
-            "market_cap_usd": market_caps.get(date_str),
-            "volume_24h_usd": volumes.get(date_str),
-        })
-
-    return result
-
-
-def fetch_solana_token_supply(mint):
-    """Get token supply from Solana RPC."""
-    payload = {
-        "jsonrpc": "2.0",
-        "id": 1,
-        "method": "getTokenSupply",
-        "params": [mint],
-    }
-    req = urllib.request.Request(
-        SOLANA_RPC,
-        data=json.dumps(payload).encode(),
-        headers={"Content-Type": "application/json"},
-    )
-    try:
-        with urllib.request.urlopen(req, timeout=10) as resp:
-            data = json.loads(resp.read())
-        val = data.get("result", {}).get("value", {})
-        amount = val.get("uiAmount")
-        return {"total_supply": amount}
-    except Exception as e:
-        logger.warning("Solana RPC getTokenSupply failed for %s: %s", mint[:12], e)
-        return {}
-
-
-def fetch_solana_usdc_balance(wallet_address):
-    """Get USDC balance for a wallet from Solana RPC."""
-    if not wallet_address:
-        return None
-    payload = {
-        "jsonrpc": "2.0",
-        "id": 1,
-        "method": "getTokenAccountsByOwner",
-        "params": [
-            wallet_address,
-            {"mint": USDC_MINT},
-            {"encoding": "jsonParsed"},
-        ],
-    }
-    req = urllib.request.Request(
-        SOLANA_RPC,
-        data=json.dumps(payload).encode(),
-        headers={"Content-Type": "application/json"},
-    )
-    try:
-        with urllib.request.urlopen(req, timeout=10) as resp:
-            data = json.loads(resp.read())
-        accounts = data.get("result", {}).get("value", [])
-        total = 0.0
-        for acct in accounts:
-            info = acct.get("account", {}).get("data", {}).get("parsed", {}).get("info", {})
-            token_amount = info.get("tokenAmount", {})
-            total += float(token_amount.get("uiAmount", 0))
-        return total
-    except Exception as e:
-        logger.warning("Solana RPC USDC balance failed for %s: %s", wallet_address[:12], e)
-        return None
-
-
-def fetch_solana_token_balance(wallet_address, token_mint):
-    """Get balance of a specific SPL token for a wallet from Solana RPC."""
-    if not wallet_address or not token_mint:
-        return None
-    payload = {
-        "jsonrpc": "2.0",
-        "id": 1,
-        "method": "getTokenAccountsByOwner",
-        "params": [
-            wallet_address,
-            {"mint": token_mint},
-            {"encoding": "jsonParsed"},
-        ],
-    }
-    for attempt in range(3):
-        req = urllib.request.Request(
-            SOLANA_RPC,
-            data=json.dumps(payload).encode(),
-            headers={"Content-Type": "application/json"},
-        )
-        try:
-            with urllib.request.urlopen(req, timeout=10) as resp:
-                data = json.loads(resp.read())
-            if "error" in data:
-                code = data["error"].get("code", 0)
-                if code == 429 and attempt < 2:
-                    wait = 10 * (attempt + 1)
-                    logger.info("RPC rate limited for %s, retrying in %ds...", wallet_address[:12], wait)
-                    time.sleep(wait)
-                    continue
-                logger.warning("RPC error for %s: %s", wallet_address[:12], data["error"])
-                return None
-            accounts = data.get("result", {}).get("value", [])
-            total = 0.0
-            for acct in accounts:
-                info = acct.get("account", {}).get("data", {}).get("parsed", {}).get("info", {})
-                token_amount = info.get("tokenAmount", {})
-                total += float(token_amount.get("uiAmount") or 0)
-            return total
-        except urllib.error.HTTPError as e:
-            if e.code == 429 and attempt < 2:
-                wait = 10 * (attempt + 1)
-                logger.info("RPC 429 for %s, retrying in %ds...", wallet_address[:12], wait)
-                time.sleep(wait)
-                continue
-            logger.warning("Solana RPC token balance failed for %s (mint %s): %s",
-                           wallet_address[:12], token_mint[:12], e)
-            return None
-        except Exception as e:
-            logger.warning("Solana RPC token balance failed for %s (mint %s): %s",
-                           wallet_address[:12], token_mint[:12], e)
-            return None
-    return None
-
-
-
-# Meteora program IDs
-METEORA_CPAMM = "cpamdpZCGKUy5JxQXB4dcpGPiikHawvSWAd6mEn1sGG"
-METEORA_DLMM = "LBUZKhRxPF3XUpBCjp4YzTKgLccjZhTSDM9YuVaPwxo"
-# CPAMM: vault_a at byte 232, vault_b at byte 264
-# DLMM:  reserve_x at byte 152, reserve_y at byte 184
-
-def _resolve_meteora_vaults(pool_address):
-    """For Meteora pools, read account data to find actual token vaults.
-
-    Returns (vault_a_addr, vault_b_addr, program_type) or (None, None, None).
-    """
-    import base64
-    payload = {
-        "jsonrpc": "2.0", "id": 1,
-        "method": "getAccountInfo",
-        "params": [pool_address, {"encoding": "base64"}],
-    }
-    for attempt in range(3):
-        try:
-            req = urllib.request.Request(
-                SOLANA_RPC,
-                data=json.dumps(payload).encode(),
-                headers={"Content-Type": "application/json"},
-            )
-            with urllib.request.urlopen(req, timeout=15) as resp:
-                data = json.loads(resp.read())
-            if "error" in data:
-                code = data["error"].get("code", 0)
-                if code == 429 and attempt < 2:
-                    time.sleep(10 * (attempt + 1))
-                    continue
-                return None, None, None
-            val = data.get("result", {}).get("value")
-            if not val:
-                return None, None, None
-            owner = val.get("owner", "")
-            raw = base64.b64decode(val["data"][0])
-
-            if owner == METEORA_CPAMM and len(raw) >= 296:
-                va = base58.b58encode(raw[232:264]).decode()
-                vb = base58.b58encode(raw[264:296]).decode()
-                return va, vb, "cpamm"
-            elif owner == METEORA_DLMM and len(raw) >= 216:
-                va = base58.b58encode(raw[152:184]).decode()
-                vb = base58.b58encode(raw[184:216]).decode()
-                return va, vb, "dlmm"
-            return None, None, None
-        except urllib.error.HTTPError as e:
-            if e.code == 429 and attempt < 2:
-                time.sleep(10 * (attempt + 1))
-                continue
-            return None, None, None
-        except Exception:
-            return None, None, None
-    return None, None, None
-
-
-def _fetch_vault_balance(vault_address):
-    """Get token balance from a vault/reserve account. Returns (mint, amount) or (None, 0)."""
-    payload = {
-        "jsonrpc": "2.0", "id": 1,
-        "method": "getAccountInfo",
-        "params": [vault_address, {"encoding": "jsonParsed"}],
-    }
-    for attempt in range(3):
-        try:
-            req = urllib.request.Request(
-                SOLANA_RPC,
-                data=json.dumps(payload).encode(),
-                headers={"Content-Type": "application/json"},
-            )
-            with urllib.request.urlopen(req, timeout=15) as resp:
-                data = json.loads(resp.read())
-            if "error" in data:
-                code = data["error"].get("code", 0)
-                if code == 429 and attempt < 2:
-                    time.sleep(10 * (attempt + 1))
-                    continue
-                return None, 0.0
-            val = data.get("result", {}).get("value")
-            if not val or not isinstance(val.get("data"), dict):
-                return None, 0.0
-            info = val["data"]["parsed"]["info"]
-            mint = info["mint"]
-            amt = float(info["tokenAmount"]["uiAmountString"])
-            return mint, amt
-        except urllib.error.HTTPError as e:
-            if e.code == 429 and attempt < 2:
-                time.sleep(10 * (attempt + 1))
-                continue
-            return None, 0.0
-        except Exception:
-            return None, 0.0
-    return None, 0.0
-
-
-def fetch_lp_wallet_balances(lp_pools, token_mint):
-    """Query LP wallets for USDC balance and protocol-owned tokens.
-
-    Returns (lp_usdc_total, protocol_owned_tokens, lp_details_list).
-    """
-    if not lp_pools:
-        return 0.0, 0.0, []
-
-    total_usdc = 0.0
-    total_protocol_tokens = 0.0
-    details = []
-
-    for pool in lp_pools:
-        address = pool.get("address")
-        dex = pool.get("dex", "unknown")
-        if not address:
-            continue
-
-        pool_usdc = 0.0
-        pool_tokens = 0.0
-
-        # Try Meteora vault resolution first (CPAMM + DLMM)
-        if dex == "meteora":
-            vault_a, vault_b, prog_type = _resolve_meteora_vaults(address)
-            if vault_a and vault_b:
-                logger.info("Meteora %s pool %s: vaults %s, %s", prog_type, address[:12], vault_a[:12], vault_b[:12])
-                time.sleep(2)
-                for vault_addr in [vault_a, vault_b]:
-                    mint, amt = _fetch_vault_balance(vault_addr)
-                    if mint and amt > 0:
-                        if mint == USDC_MINT:
-                            pool_usdc += amt
-                        elif token_mint and mint == token_mint:
-                            pool_tokens += amt
-                    time.sleep(2)
-            else:
-                logger.warning("Meteora vault resolution failed for %s, falling back to getTokenAccountsByOwner", address[:12])
-
-        # Fallback: getTokenAccountsByOwner (works for futarchy-amm and non-Meteora pools)
-        if pool_usdc == 0 and pool_tokens == 0:
-            payload = {
-                "jsonrpc": "2.0",
-                "id": 1,
-                "method": "getTokenAccountsByOwner",
-                "params": [
-                    address,
-                    {"programId": "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA"},
-                    {"encoding": "jsonParsed"},
-                ],
-            }
-            for attempt in range(3):
-                try:
-                    req = urllib.request.Request(
-                        SOLANA_RPC,
-                        data=json.dumps(payload).encode(),
-                        headers={"Content-Type": "application/json"},
-                    )
-                    with urllib.request.urlopen(req, timeout=15) as resp:
-                        data = json.loads(resp.read())
-                    if "error" in data:
-                        code = data["error"].get("code", 0)
-                        if code == 429 and attempt < 2:
-                            logger.info("RPC rate limited for %s, retrying in %ds...", address[:12], 10 * (attempt + 1))
-                            time.sleep(10 * (attempt + 1))
-                            continue
-                        logger.warning("RPC error for LP %s: %s", address[:12], data["error"])
-                        break
-                    for acct in data.get("result", {}).get("value", []):
-                        info = acct["account"]["data"]["parsed"]["info"]
-                        mint = info["mint"]
-                        amt = float(info["tokenAmount"]["uiAmountString"])
-                        if amt == 0:
-                            continue
-                        if mint == USDC_MINT:
-                            pool_usdc += amt
-                        elif token_mint and mint == token_mint:
-                            pool_tokens += amt
-                    break
-                except urllib.error.HTTPError as e:
-                    if e.code == 429 and attempt < 2:
-                        wait = 10 * (attempt + 1)
-                        logger.info("RPC 429 for %s, retrying in %ds...", address[:12], wait)
-                        time.sleep(wait)
-                        continue
-                    logger.warning("LP wallet query failed for %s (%s): %s", dex, address[:12], e)
-                    break
-                except Exception as e:
-                    logger.warning("LP wallet query failed for %s (%s): %s", dex, address[:12], e)
-                    break
-
-        total_usdc += pool_usdc
-        total_protocol_tokens += pool_tokens
-        details.append({
-            "dex": dex,
-            "address": address,
-            "usdc": round(pool_usdc, 2),
-            "protocol_tokens": round(pool_tokens, 2),
-        })
-        time.sleep(5)
-
-    return total_usdc, total_protocol_tokens, details
-
-
-def compute_derived(row, coin):
-    """Compute effective liquidation price, delta, equity, runway."""
-    price = row.get("price_usd")
-    treasury = row.get("treasury_multisig_usd") or 0
-    lp_total = row.get("lp_usdc_total") or 0
-    mcap = row.get("market_cap_usd") or 0
-    monthly = coin.get("monthly_allowance")
-    protocol_tokens = row.get("protocol_owned_tokens") or 0
-    total_supply = row.get("total_supply")
-
-    cash_total = treasury + lp_total
-
-    adj_circ = row.get("adjusted_circulating_supply")
-    if not adj_circ and total_supply and total_supply > 0:
-        adj_circ = total_supply - protocol_tokens
-        row["adjusted_circulating_supply"] = adj_circ
-
-    if adj_circ and adj_circ > 0:
-        row["effective_liq_price"] = cash_total / adj_circ
-        if price and price > 0:
-            original_mcap = row.get("market_cap_usd")
-            row["market_cap_usd"] = price * adj_circ
-            mcap = row["market_cap_usd"]
-            if original_mcap and abs(mcap - original_mcap) > 1:
-                logger.debug("%s: adjusted mcap $%.0f (was $%.0f, protocol_owned=%s)",
-                             row.get("name", "?"), mcap, original_mcap, protocol_tokens)
-    if price and price > 0 and row.get("effective_liq_price"):
-        row["delta_pct"] = ((row["effective_liq_price"] / price) - 1) * 100
-
-    row["equity_value_usd"] = mcap - cash_total if mcap else None
-
-    if monthly and monthly > 0 and treasury:
-        row["months_runway"] = treasury / monthly
-
-    return row
-
-
-def upsert_snapshot(conn, row):
-    """Insert or replace a daily snapshot."""
-    conn.execute("""
-        INSERT OR REPLACE INTO coin_snapshots (
-            snapshot_date, name, ticker, token_mint, status,
-            price_usd, market_cap_usd, fdv_usd,
-            circulating_supply, total_supply,
-            volume_24h_usd, liquidity_usd,
-            treasury_multisig_usd, lp_usdc_total, lp_pools_detail,
-            equity_value_usd, initial_price_usd, amount_raised_usd,
-            monthly_allowance_usd, effective_liq_price, delta_pct,
-            months_runway, protocol_owned_tokens, adjusted_circulating_supply,
-            treasury_protocol_tokens, vesting_tokens,
-            data_source, fetched_at
-        ) VALUES (
-            :snapshot_date, :name, :ticker, :token_mint, :status,
-            :price_usd, :market_cap_usd, :fdv_usd,
-            :circulating_supply, :total_supply,
-            :volume_24h_usd, :liquidity_usd,
-            :treasury_multisig_usd, :lp_usdc_total, :lp_pools_detail,
-            :equity_value_usd, :initial_price_usd, :amount_raised_usd,
-            :monthly_allowance_usd, :effective_liq_price, :delta_pct,
-            :months_runway, :protocol_owned_tokens, :adjusted_circulating_supply,
-            :treasury_protocol_tokens, :vesting_tokens,
-            :data_source, :fetched_at
-        )
-    """, row)
-
-
-def cmd_daily(coins, conn):
-    """Fetch current data for all coins and store today's snapshot."""
-    today = datetime.date.today().isoformat()
-    now = datetime.datetime.now(datetime.timezone.utc).isoformat()
-
-    for coin in coins:
-        mint = coin["token_mint"]
-        if not mint:
-            logger.info("Skipping %s — no token mint", coin["name"])
-            continue
-
-        logger.info("Fetching %s (%s)...", coin["name"], coin["ticker"])
-
-        # Current price from DexScreener
-        dex = fetch_dexscreener(mint)
-        if not dex:
-            logger.warning("DexScreener returned nothing for %s — trying last known price", coin["name"])
-            last_row = conn.execute(
-                "SELECT price_usd FROM coin_snapshots WHERE name=? AND price_usd IS NOT NULL ORDER BY snapshot_date DESC LIMIT 1",
-                (coin["name"],)
-            ).fetchone()
-            if last_row and last_row[0]:
-                dex = {"price_usd": last_row[0], "market_cap_usd": None, "fdv_usd": None, "volume_24h_usd": None, "liquidity_usd": None, "circulating_supply": None, "total_supply": None}
-                logger.info("  Using last known price: $%.4f", last_row[0])
-            else:
-                logger.warning("  No historical price either — skipping %s", coin["name"])
-                continue
-
-        # Token supply from Solana RPC
-        supply = fetch_solana_token_supply(mint)
-        time.sleep(4)
-
-        # Treasury USDC balance + protocol token balance
-        treasury_usd = None
-        treasury_tokens = 0.0
-        if coin.get("treasury_multisig"):
-            treasury_usd = fetch_solana_usdc_balance(coin["treasury_multisig"])
-            time.sleep(2)
-            treas_tok = fetch_solana_token_balance(coin["treasury_multisig"], mint)
-            if treas_tok and treas_tok > 0:
-                treasury_tokens = treas_tok
-                logger.info("  %s treasury holds %.0f protocol tokens", coin["name"], treasury_tokens)
-            time.sleep(2)
-
-        time.sleep(4)
-
-        # Vesting wallet scanning — tokens locked in vesting contracts
-        vesting_tokens = 0.0
-        if coin.get("vesting_wallets"):
-            for vw in coin["vesting_wallets"]:
-                vw_addr = vw.get("address") if isinstance(vw, dict) else vw
-                if not vw_addr:
-                    continue
-                vt = fetch_solana_token_balance(vw_addr, mint)
-                if vt and vt > 0:
-                    vesting_tokens += vt
-                    label = vw.get("label", vw_addr[:12]) if isinstance(vw, dict) else vw_addr[:12]
-                    logger.info("  %s vesting wallet (%s) holds %.0f tokens", coin["name"], label, vt)
-                time.sleep(2)
-
-        # LP pool balances — query each wallet for USDC + protocol-owned tokens
-        lp_total = 0.0
-        protocol_tokens = 0.0
-        lp_detail = None
-        if coin.get("lp_pools"):
-            lp_total, protocol_tokens, lp_details_list = fetch_lp_wallet_balances(
-                coin["lp_pools"], mint
-            )
-            lp_detail = json.dumps(lp_details_list) if lp_details_list else None
-
-        total_supply = supply.get("total_supply")
-
-        # Adjusted circulating supply: total - LP tokens - treasury tokens - vesting tokens - investor-locked tokens - Meteora seed tokens
-        investor_locked = float(coin.get("investor_locked_tokens") or 0)
-        meteora_seed = float(coin.get("meteora_seed_tokens") or 0)
-        all_protocol_tokens = protocol_tokens + treasury_tokens + vesting_tokens + investor_locked + meteora_seed
-        if investor_locked > 0:
-            logger.info("  %s investor locked tokens: %.0f", coin["name"], investor_locked)
-        if meteora_seed > 0:
-            logger.info("  %s meteora seed tokens: %.0f", coin["name"], meteora_seed)
-        adj_circ = None
-        if total_supply and total_supply > 0:
-            adj_circ = total_supply - all_protocol_tokens
-
-        # If we have adj_circ and price, always compute mcap from adjusted supply; otherwise fall back to total supply when mcap is missing
-        if adj_circ and dex.get("price_usd"):
-            dex["market_cap_usd"] = adj_circ * dex["price_usd"]
-        elif total_supply and dex.get("price_usd") and not dex.get("market_cap_usd"):
-            dex["market_cap_usd"] = total_supply * dex["price_usd"]
-
-        row = {
-            "snapshot_date": today,
-            "name": coin["name"],
-            "ticker": coin["ticker"],
-            "token_mint": mint,
-            "status": coin["status"],
-            "price_usd": dex.get("price_usd"),
-            "market_cap_usd": dex.get("market_cap_usd"),
-            "fdv_usd": dex.get("fdv_usd"),
-            "circulating_supply": dex.get("circulating_supply"),
-            "total_supply": total_supply,
-            "volume_24h_usd": dex.get("volume_24h_usd"),
-            "liquidity_usd": dex.get("liquidity_usd"),
-            "treasury_multisig_usd": treasury_usd,
-            "lp_usdc_total": lp_total if lp_total else None,
-            "lp_pools_detail": lp_detail,
-            "equity_value_usd": None,
-            "initial_price_usd": coin.get("initial_price"),
-            "amount_raised_usd": coin.get("amount_raised"),
-            "monthly_allowance_usd": coin.get("monthly_allowance"),
-            "effective_liq_price": None,
-            "delta_pct": None,
-            "months_runway": None,
-            "protocol_owned_tokens": all_protocol_tokens if all_protocol_tokens else None,
-            "treasury_protocol_tokens": treasury_tokens if treasury_tokens else None,
-            "vesting_tokens": vesting_tokens if vesting_tokens else None,
-            "adjusted_circulating_supply": adj_circ,
-            "data_source": "dexscreener+solana_rpc",
-            "fetched_at": now,
-        }
-
-        row = compute_derived(row, coin)
-        upsert_snapshot(conn, row)
-        lp_msg = f" lp_usdc=${row.get('lp_usdc_total') or 0:,.0f} lp_tokens={protocol_tokens:,.0f} treas_tokens={treasury_tokens:,.0f}" if row.get("lp_usdc_total") or treasury_tokens else ""
-        logger.info("  %s: $%.4f mcap=$%s adj_circ=%s%s",
-                     coin["name"], row["price_usd"] or 0,
-                     f'{row["market_cap_usd"]:,.0f}' if row["market_cap_usd"] else "N/A",
-                     f'{row["adjusted_circulating_supply"]:,.0f}' if row.get("adjusted_circulating_supply") else "N/A",
-                     lp_msg)
-        time.sleep(1)
-
-    conn.commit()
-    logger.info("Daily snapshot complete for %s", today)
-
-
-def cmd_backfill(coins, conn, days=365):
-    """Backfill historical daily prices from CoinGecko."""
-    now = datetime.datetime.now(datetime.timezone.utc).isoformat()
-
-    for coin in coins:
-        mint = coin["token_mint"]
-        if not mint:
-            logger.info("Skipping %s — no token mint", coin["name"])
-            continue
-
-        logger.info("Backfilling %s (%s) — %d days...", coin["name"], coin["ticker"], days)
-        history = fetch_coingecko_history(mint, days=days)
-
-        if not history:
-            logger.warning("No CoinGecko history for %s", coin["name"])
-            time.sleep(COINGECKO_RATE_LIMIT)
-            continue
-
-        inserted = 0
-        for point in history:
-            row = {
-                "snapshot_date": point["date"],
-                "name": coin["name"],
-                "ticker": coin["ticker"],
-                "token_mint": mint,
-                "status": coin["status"],
-                "price_usd": point["price_usd"],
-                "market_cap_usd": point.get("market_cap_usd"),
-                "fdv_usd": None,
-                "circulating_supply": None,
-                "total_supply": None,
-                "volume_24h_usd": point.get("volume_24h_usd"),
-                "liquidity_usd": None,
-                "treasury_multisig_usd": None,
-                "lp_usdc_total": None,
-                "lp_pools_detail": None,
-                "equity_value_usd": None,
-                "initial_price_usd": coin.get("initial_price"),
-                "amount_raised_usd": coin.get("amount_raised"),
-                "monthly_allowance_usd": coin.get("monthly_allowance"),
-                "effective_liq_price": None,
-                "delta_pct": None,
-                "months_runway": None,
-                "protocol_owned_tokens": None,
-                "adjusted_circulating_supply": None,
-                "treasury_protocol_tokens": None,
-                "vesting_tokens": None,
-                "data_source": "coingecko_history",
-                "fetched_at": now,
-            }
-            upsert_snapshot(conn, row)
-            inserted += 1
-
-        conn.commit()
-        logger.info("  %s: %d daily snapshots inserted", coin["name"], inserted)
-        time.sleep(COINGECKO_RATE_LIMIT)
-
-    logger.info("Backfill complete")
-
-
-def main():
-    parser = argparse.ArgumentParser(description="Ownership coin portfolio data fetcher")
-    parser.add_argument("--daily", action="store_true", help="Fetch today's snapshot")
-    parser.add_argument("--backfill", action="store_true", help="Backfill historical prices")
-    parser.add_argument("--backfill-days", type=int, default=365, help="Days to backfill (default: 365)")
-    args = parser.parse_args()
-
-    if not args.daily and not args.backfill:
-        parser.error("Specify --daily or --backfill")
-
-    coins = load_ownership_coins()
-    logger.info("Loaded %d ownership coins (%d with token mints)",
-                len(coins), sum(1 for c in coins if c["token_mint"]))
-
-    conn = sqlite3.connect(str(DB_PATH), timeout=30)
-    conn.execute("PRAGMA journal_mode=WAL")
-    conn.execute("PRAGMA busy_timeout=30000")
-    ensure_schema(conn)
-
-    try:
-        if args.backfill:
-            cmd_backfill(coins, conn, days=args.backfill_days)
-        if args.daily:
-            cmd_daily(coins, conn)
-    finally:
-        conn.close()
-
-
-if __name__ == "__main__":
-    main()
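
Reviewer note: the derived fields computed by the removed `compute_derived` reduce to a handful of ratios. The sketch below restates that arithmetic in isolation, assuming every input is present and positive. The function name `derive_metrics` and all sample numbers are hypothetical and do not come from the repository.

```python
def derive_metrics(price, treasury_usd, lp_usdc, total_supply,
                   protocol_tokens, monthly_allowance):
    """Hypothetical standalone restatement of the compute_derived arithmetic."""
    cash_total = treasury_usd + lp_usdc              # treasury USDC + LP USDC
    adj_circ = total_supply - protocol_tokens        # supply not held by the protocol
    effective_liq_price = cash_total / adj_circ      # cash backing per circulating token
    market_cap = price * adj_circ                    # mcap on adjusted supply
    delta_pct = (effective_liq_price / price - 1) * 100
    return {
        "adjusted_circulating_supply": adj_circ,
        "effective_liq_price": effective_liq_price,
        "market_cap_usd": market_cap,
        "delta_pct": delta_pct,
        "equity_value_usd": market_cap - cash_total,
        "months_runway": treasury_usd / monthly_allowance,
    }


# Illustrative inputs only.
metrics = derive_metrics(
    price=0.50,
    treasury_usd=300_000,
    lp_usdc=100_000,
    total_supply=1_200_000,
    protocol_tokens=200_000,
    monthly_allowance=25_000,
)
for key, value in metrics.items():
    print(f"{key}: {value:,.4f}")
```

A negative `delta_pct` means the token trades above its cash backing per circulating token. Unlike this sketch, the original function guards each step against missing or zero inputs.
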
--- a/deploy/fix-ownership.sh
+++ b/deploy/fix-ownership.sh
--- a/hermes-agent/install-hermes.sh
+++ b/hermes-agent/install-hermes.sh
--- a/lib/attribution.py
+++ b/lib/attribution.py
@ -21,92 +21,6 @@ logger = logging.getLogger("pipeline.attribution")

 VALID_ROLES = frozenset({"sourcer", "extractor", "challenger", "synthesizer", "reviewer"})

-# Agent-owned branch prefixes — PRs from these branches get Pentagon-Agent trailer
-# credit for challenger/synthesizer roles. Pipeline-infra branches (extract/ reweave/
-# fix/ ingestion/) are deliberately excluded: they're automation, not contribution.
-# Single source of truth; imported by contributor.py and backfill-events.py.
-AGENT_BRANCH_PREFIXES = (
-    "rio/", "theseus/", "leo/", "vida/", "astra/", "clay/", "oberon/",
-)
-
-# Handle sanity: lowercase alphanumerics, hyphens, underscores. 1-39 chars (matches
-# GitHub's handle rules). Rejects garbage like "governance---meritocratic-voting-+-futarchy"
-# or "sec-interpretive-release-s7-2026-09-(march-17" that upstream frontmatter hygiene
-# bugs produce. Apply at parse time so bad handles never reach the contributors table.
-_HANDLE_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,38}$")
-
-
-def _valid_handle(handle: str) -> bool:
-    """Return True if handle matches the handle format (alphanum + _-, ≤39 chars)."""
-    if not handle or not isinstance(handle, str):
-        return False
-    h = handle.strip().lower().lstrip("@")
-    if h.endswith("-") or h.endswith("_"):
-        return False
-    return bool(_HANDLE_RE.match(h))
-
-
-def _filter_valid_handles(result: dict) -> dict:
-    """Drop entries with invalid handles from a parsed attribution dict."""
-    filtered: dict[str, list[dict]] = {role: [] for role in VALID_ROLES}
-    for role, entries in result.items():
-        for entry in entries:
-            if _valid_handle(entry.get("handle", "")):
-                filtered[role].append(entry)
-    return filtered
-
-
-# ─── Handle normalization + kind classification (schema v24) ──────────────
-
-# Known Pentagon agents. Used to classify contributor kind='agent' so the
-# leaderboard can filter them out of the default person view.
-PENTAGON_AGENTS = frozenset({
-    "rio", "leo", "theseus", "vida", "clay", "astra",
-    "oberon", "argus", "rhea", "ganymede", "epimetheus", "hermes", "ship",
-    "pipeline",  # pipeline-owned commits (extract/*, reweave/*, fix/*)
-})
-
-
-def normalize_handle(handle: str, conn=None) -> str:
-    """Canonicalize a handle: lowercase, strip @, resolve alias if conn provided.
-
-    Examples:
-      '@thesensatore' → 'thesensatore'
-      'Cameron' → 'cameron' → 'cameron-s1' (via alias if seeded)
-      'CNBC' → 'cnbc'
-
-    Always lowercases and strips @ prefix. Alias resolution requires a conn
-    argument (not always available at parse time; merge-time writer passes it).
-    """
-    if not handle:
-        return ""
-    h = handle.strip().lower().lstrip("@")
-    if conn is None:
-        return h
-    try:
-        row = conn.execute(
-            "SELECT canonical FROM contributor_aliases WHERE alias = ?", (h,),
-        ).fetchone()
-        if row:
-            return row["canonical"] if isinstance(row, dict) or hasattr(row, "keys") else row[0]
-    except Exception:
-        # Alias table might not exist yet on pre-v24 DBs — degrade gracefully.
-        logger.debug("normalize_handle: alias lookup failed for %r", h, exc_info=True)
-    return h
-
-
-def classify_kind(handle: str) -> str:
-    """Return 'agent' for known Pentagon agents, 'person' otherwise.
-
-    The 'org' kind (CNBC, SpaceNews, etc.) is assigned by operator review,
-    not inferred here. Keeping heuristics narrow: we know our own agents;
-    everything else defaults to person until explicitly classified.
-    """
-    h = handle.strip().lower().lstrip("@")
-    if h in PENTAGON_AGENTS:
-        return "agent"
-    return "person"
-

 # ─── Parse attribution from claim content ──────────────────────────────────

@ -137,11 +51,7 @@ def parse_attribution(fm: dict) -> dict[str, list[dict]]:
            elif isinstance(entries, str):
                # Single entry as string
                result[role].append({"handle": entries.strip().lower().lstrip("@"), "agent_id": None, "context": None})
-        # Fall through to the filter at the end (don't early-return). The nested
-        # block path was skipping the handle sanity filter, letting garbage like
-        # "senator-elissa-slotkin-/-the-hill" through when it was written into
-        # frontmatter during the legacy-fallback era.
-        return _filter_valid_handles(result)
+        return result

    # Flat format fallback (attribution_sourcer, attribution_extractor, etc.)
    for role in VALID_ROLES:
@ -154,40 +64,22 @@ def parse_attribution(fm: dict) -> dict[str, list[dict]]:
                    if isinstance(v, str):
                        result[role].append({"handle": v.strip().lower().lstrip("@"), "agent_id": None, "context": None})

-    # Bare-key flat format: `sourcer: alexastrum`, `extractor: leo`, etc.
-    # This is what extract.py writes (line 290: f'sourcer: "{sourcer}"') — the most
-    # common format in practice (~42% of claim files). The Apr 24 incident traced
-    # missing leaderboard entries to this format being silently dropped because the
-    # parser only checked the `attribution_*` prefix.
-    # Only fill if the role wasn't already populated by the prefixed form, to avoid
-    # double-counting when both formats coexist on the same claim.
-    for role in VALID_ROLES:
-        if result[role]:
-            continue
-        bare_val = fm.get(role)
-        if isinstance(bare_val, str) and bare_val.strip():
-            result[role].append({"handle": bare_val.strip().lower().lstrip("@"), "agent_id": None, "context": None})
-        elif isinstance(bare_val, list):
-            for v in bare_val:
-                if isinstance(v, str) and v.strip():
-                    result[role].append({"handle": v.strip().lower().lstrip("@"), "agent_id": None, "context": None})
-                elif isinstance(v, dict) and v.get("handle"):
-                    result[role].append({
-                        "handle": v["handle"].strip().lower().lstrip("@"),
-                        "agent_id": v.get("agent_id"),
-                        "context": v.get("context"),
-                    })
+    # Legacy fallback: infer from source field
+    if not any(result[r] for r in VALID_ROLES):
+        source = fm.get("source", "")
+        if isinstance(source, str) and source:
+            # Try to extract author handle from source string
+            # Patterns: "@handle", "Author Name", "org, description"
+            handle_match = re.search(r"@(\w+)", source)
+            if handle_match:
+                result["sourcer"].append({"handle": handle_match.group(1).lower(), "agent_id": None, "context": source})
+            else:
+                # Use first word/phrase before comma as sourcer handle
+                author = source.split(",")[0].strip().lower().replace(" ", "-")
+                if author and len(author) > 1:
+                    result["sourcer"].append({"handle": author, "agent_id": None, "context": source})

-    # Legacy `source` heuristic REMOVED (Ganymede review, Apr 24). It fabricated
-    # handles from descriptive source strings — "governance---meritocratic-voting-+-
-    # futarchy", "cameron-(contributor)", "sec-interpretive-release-s7-2026-09-
-    # (march-17". Hit rate on real handles was near-zero, false-positive rate was
-    # high. Claims without explicit attribution now return empty (better surface as
-    # data hygiene than invent fake contributors).
-
-    # Filter to valid handles only. Bad handles (garbage from upstream frontmatter
-    # bugs) get dropped rather than written to the contributors table.
-    return _filter_valid_handles(result)
+    return result


 def parse_attribution_from_file(filepath: str) -> dict[str, list[dict]]:
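
Reviewer note: this hunk reinstates the `source`-based fallback and drops the handle filter. The sketch below shows how the two interact. It is a simplified mirror of the fallback (the `len(author) > 1` guard is omitted) paired with the removed `_HANDLE_RE` pattern. The helper names and sample `source` strings are hypothetical; the second string is reverse-engineered from the garbage handle quoted in the removed comment.

```python
import re

# Same pattern as the removed _HANDLE_RE in this diff.
HANDLE_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,38}$")


def legacy_sourcer_handle(source: str) -> str:
    """Simplified mirror of the fallback: @handle if present, else first phrase."""
    match = re.search(r"@(\w+)", source)
    if match:
        return match.group(1).lower()
    return source.split(",")[0].strip().lower().replace(" ", "-")


def is_valid_handle(handle: str) -> bool:
    """Same checks as the removed _valid_handle."""
    h = handle.strip().lower().lstrip("@")
    if h.endswith(("-", "_")):
        return False
    return bool(HANDLE_RE.match(h))


# Hypothetical source strings for illustration.
samples = [
    "@thesensatore, thread on futarchy",
    "governance - meritocratic voting + futarchy",
]
for source in samples:
    handle = legacy_sourcer_handle(source)
    print(f"{handle!r}: valid={is_valid_handle(handle)}")
```

With the filter removed, the second kind of handle would be returned by `parse_attribution` unchanged.
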
--- a/lib/cascade.py
+++ b/lib/cascade.py
@ -9,7 +9,7 @@ the same atomic-write pattern as lib-state.sh.
 """

 import asyncio
-import secrets
+import hashlib
 import json
 import logging
 import os
@ -116,8 +116,8 @@ def _write_inbox_message(agent: str, subject: str, body: str) -> bool:
        return False

    ts = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
-    nonce = secrets.token_hex(3)
-    filename = f"cascade-{ts}-{nonce}-{subject[:60]}.md"
+    file_hash = hashlib.md5(f"{agent}-{subject}-{body[:200]}".encode()).hexdigest()[:8]
+    filename = f"cascade-{ts}-{subject[:60]}-{file_hash}.md"
    final_path = inbox_dir / filename

    try:
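
Reviewer note: the filename change swaps a random nonce for a content hash, which changes what happens when the same message is written twice within one second. The sketch below compares the two schemes. The helper names and sample values are hypothetical; only the two filename expressions are taken from the hunk.

```python
import hashlib
import secrets


def hashed_name(ts: str, agent: str, subject: str, body: str) -> str:
    """Filename scheme from the '+' lines: deterministic content hash suffix."""
    file_hash = hashlib.md5(f"{agent}-{subject}-{body[:200]}".encode()).hexdigest()[:8]
    return f"cascade-{ts}-{subject[:60]}-{file_hash}.md"


def nonce_name(ts: str, subject: str) -> str:
    """Filename scheme from the '-' lines: random 3-byte nonce."""
    nonce = secrets.token_hex(3)
    return f"cascade-{ts}-{nonce}-{subject[:60]}.md"


# Illustrative values only.
ts = "20260413-101527"
first = hashed_name(ts, "leo", "claim-updated", "body text")
second = hashed_name(ts, "leo", "claim-updated", "body text")
changed = hashed_name(ts, "leo", "claim-updated", "different body")

print(first == second)   # identical input in the same second: same path
print(first == changed)  # different body: different hash suffix
print(nonce_name(ts, "claim-updated"))
```

Whether the same-path behaviour acts as deduplication or as an overwrite depends on the write logic that follows, which this hunk does not show.
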
--- a/lib/config.py
+++ b/lib/config.py
@ -156,13 +156,13 @@ CONTRIBUTOR_TIER_RULES = {
    },
 }

-# Role weights for CI computation (must match core/contribution-architecture.md)
+# Role weights for CI computation (must match schemas/contribution-weights.yaml)
 CONTRIBUTION_ROLE_WEIGHTS = {
-    "challenger": 0.35,
-    "synthesizer": 0.25,
-    "reviewer": 0.20,
    "sourcer": 0.15,
-    "extractor": 0.05,
+    "extractor": 0.40,
+    "challenger": 0.20,
+    "synthesizer": 0.15,
+    "reviewer": 0.10,
 }

 # --- Circuit breakers ---
@ -200,9 +200,6 @@ MERGE_INTERVAL = 30
 FIX_INTERVAL = 60
 HEALTH_CHECK_INTERVAL = 60

-# --- Extraction gates ---
-EXTRACTION_COOLDOWN_HOURS = 4  # Skip sources with any PR activity in this window. Defense-in-depth for DB-status filter.
-
 # --- Retrieval (Telegram bot) ---
 RETRIEVAL_RRF_K = 20  # RRF smoothing constant — tuned for 5-10 results per source
 RETRIEVAL_ENTITY_BOOST = 1.5  # RRF score multiplier for claims wiki-linked from matched entities
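
Reviewer note: both weight tables in this hunk sum to 1.0, but they rank the roles very differently. The sketch below checks the sums and compares one contributor under each table. The scoring function is an assumption (a plain count-times-weight sum); the actual CI computation is not part of this diff.

```python
import math

# Values from the '-' lines of the hunk.
REMOVED_WEIGHTS = {
    "challenger": 0.35, "synthesizer": 0.25, "reviewer": 0.20,
    "sourcer": 0.15, "extractor": 0.05,
}
# Values from the '+' and context lines of the hunk.
ADDED_WEIGHTS = {
    "sourcer": 0.15, "extractor": 0.40, "challenger": 0.20,
    "synthesizer": 0.15, "reviewer": 0.10,
}


def weighted_score(role_counts: dict, weights: dict) -> float:
    """Hypothetical scoring: sum of (contribution count x role weight)."""
    return sum(weights[role] * count for role, count in role_counts.items())


for name, table in (("removed", REMOVED_WEIGHTS), ("added", ADDED_WEIGHTS)):
    assert math.isclose(sum(table.values()), 1.0), f"{name} weights must sum to 1"

# Illustrative contributor: two extractions and one sourced claim.
counts = {"extractor": 2, "sourcer": 1}
print(weighted_score(counts, REMOVED_WEIGHTS))
print(weighted_score(counts, ADDED_WEIGHTS))
```

Under this assumed scoring, the same extraction-heavy contributor scores several times higher with the added table. This is worth confirming against `schemas/contribution-weights.yaml` before merging.
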
--- a/lib/connect.py
+++ b/lib/connect.py
@ -63,7 +63,7 @@ def _build_search_text(content: str) -> str:
    return " ".join(parts)


-def _add_related_edges(claim_path: str, neighbor_slugs: list[str]) -> bool:
+def _add_related_edges(claim_path: str, neighbor_titles: list[str]) -> bool:
    """Add related edges to a claim's frontmatter. Returns True if modified."""
    try:
        with open(claim_path) as f:
@ -87,10 +87,10 @@ def _add_related_edges(claim_path: str, neighbor_slugs: list[str]) -> bool:

    # Add new edges
    added = []
-    for slug in neighbor_slugs:
-        if slug.strip().lower() not in existing_lower:
-            added.append(slug)
-            existing_lower.add(slug.strip().lower())
+    for title in neighbor_titles:
+        if title.strip().lower() not in existing_lower:
+            added.append(title)
+            existing_lower.add(title.strip().lower())

    if not added:
        return False
@ -107,6 +107,7 @@ def _add_related_edges(claim_path: str, neighbor_slugs: list[str]) -> bool:

 def connect_new_claims(
    claim_paths: list[str],
+    domain: str | None = None,
    threshold: float = CONNECT_THRESHOLD,
    max_neighbors: int = CONNECT_MAX_NEIGHBORS,
 ) -> dict:
@ -114,6 +115,7 @@ def connect_new_claims(

    Args:
        claim_paths: List of file paths to newly-written claim files.
+        domain: Optional domain filter for Qdrant search.
        threshold: Minimum cosine similarity for connection.
        max_neighbors: Maximum edges to add per claim.

@ -167,28 +169,27 @@ def connect_new_claims(
            stats["skipped_no_neighbors"] += 1
            continue

-        # Extract neighbor slugs (filename stems, not titles — reciprocal edges need resolvable names)
-        neighbor_slugs = []
+        # Extract neighbor titles
+        neighbor_titles = []
        for hit in hits:
            payload = hit.get("payload", {})
-            claim_path_qdrant = payload.get("claim_path", "")
-            if claim_path_qdrant:
-                slug = claim_path_qdrant.rsplit("/", 1)[-1].replace(".md", "")
-                neighbor_slugs.append(slug)
+            title = payload.get("claim_title", "")
+            if title:
+                neighbor_titles.append(title)

-        if not neighbor_slugs:
+        if not neighbor_titles:
            stats["skipped_no_neighbors"] += 1
            continue

        # Add edges to the new claim's frontmatter
-        if _add_related_edges(claim_path, neighbor_slugs):
+        if _add_related_edges(claim_path, neighbor_titles):
            stats["connected"] += 1
-            stats["edges_added"] += len(neighbor_slugs)
+            stats["edges_added"] += len(neighbor_titles)
            stats["connections"].append({
                "claim": os.path.basename(claim_path),
-                "neighbors": neighbor_slugs,
+                "neighbors": neighbor_titles,
            })
-            logger.info("Connected %s → %d neighbors", os.path.basename(claim_path), len(neighbor_slugs))
+            logger.info("Connected %s → %d neighbors", os.path.basename(claim_path), len(neighbor_titles))
        else:
            stats["skipped_no_neighbors"] += 1

--- a/lib/contributor.py
+++ b/lib/contributor.py
@ -1,491 +0,0 @@
-"""Contributor attribution — tracks who contributed what and calculates tiers.
-
-Extracted from merge.py (Phase 5 decomposition). Functions:
- is_knowledge_pr: diff classification (knowledge vs pipeline-only)
- refine_commit_type: extract → challenge/enrich refinement from diff content
- record_contributor_attribution: parse trailers + frontmatter, upsert contributors
- upsert_contributor: insert/update contributor record with role counts
- insert_contribution_event: event-sourced credit log (schema v24)
- recalculate_tier: tier promotion based on config rules
-"""
-
-import json
-import logging
-import re
-
-from . import config, db
-from .attribution import AGENT_BRANCH_PREFIXES, classify_kind, normalize_handle
-from .forgejo import get_pr_diff
-
-logger = logging.getLogger("pipeline.contributor")
-
-
-# ─── Event schema (v24) ───────────────────────────────────────────────────
-
-# Role → CI weight, per Cory's confirmed schema (Apr 24 conversation).
-# Humans-are-always-author rule: agents never accumulate author credit;
-# evaluator (0.05) is the only agent-facing role. Internal agents still earn
-# author/challenger/synthesizer on their own autonomous research PRs but
-# surface in the kind='agent' leaderboard, not the default person view.
-ROLE_WEIGHTS = {
-    "author": 0.30,
-    "challenger": 0.25,
-    "synthesizer": 0.20,
-    "originator": 0.15,
-    "evaluator": 0.05,
-}
-
-
-def insert_contribution_event(
-    conn,
-    handle: str,
-    role: str,
-    pr_number: int,
-    *,
-    claim_path: str | None = None,
-    domain: str | None = None,
-    channel: str | None = None,
-    timestamp: str | None = None,
-) -> bool:
-    """Emit a contribution_events row. Idempotent via UNIQUE constraint.
-
-    Returns True if the event was inserted, False if the constraint blocked it
-    (same handle/role/pr/claim_path combo already recorded — safe to replay).
-
-    Canonicalizes handle via alias table. Classifies kind from handle.
-    Falls back silently if contribution_events table doesn't exist yet (pre-v24).
-    """
-    if role not in ROLE_WEIGHTS:
-        logger.warning("insert_contribution_event: unknown role %r", role)
-        return False
-    weight = ROLE_WEIGHTS[role]
-    canonical = normalize_handle(handle, conn=conn)
-    if not canonical:
-        return False
-    kind = classify_kind(canonical)
-    try:
-        cur = conn.execute(
-            """INSERT OR IGNORE INTO contribution_events
-               (handle, kind, role, weight, pr_number, claim_path, domain, channel, timestamp)
-               VALUES (?, ?, ?, ?, ?, ?, ?, ?, COALESCE(?, datetime('now')))""",
-            (canonical, kind, role, weight, pr_number, claim_path, domain, channel, timestamp),
-        )
-        return cur.rowcount > 0
-    except Exception:
-        logger.debug("insert_contribution_event failed for pr=%d handle=%r role=%r",
-                     pr_number, canonical, role, exc_info=True)
-        return False
-
-
-def is_knowledge_pr(diff: str) -> bool:
-    """Check if a PR touches knowledge files (claims, decisions, core, foundations).
-
-    Knowledge PRs get full CI attribution weight.
-    Pipeline-only PRs (inbox, entities, agents, archive) get zero CI weight.
-
-    Mixed PRs count as knowledge — if a PR adds a claim, it gets attribution
-    even if it also moves source files. Knowledge takes priority. (Ganymede review)
-    """
-    knowledge_prefixes = ("domains/", "core/", "foundations/", "decisions/")
-
-    for line in diff.split("\n"):
-        if line.startswith("+++ b/") or line.startswith("--- a/"):
-            path = line.split("/", 1)[1] if "/" in line else ""
-            if any(path.startswith(p) for p in knowledge_prefixes):
-                return True
-
-    return False
-
-
-COMMIT_TYPE_TO_ROLE = {
-    "challenge": "challenger",
-    "enrich": "synthesizer",
-    "extract": "extractor",
-    "research": "synthesizer",
-    "entity": "extractor",
-    "reweave": "synthesizer",
-    "fix": "extractor",
-}
-
-
-def commit_type_to_role(commit_type: str) -> str:
-    """Map a refined commit_type to a contributor role."""
-    return COMMIT_TYPE_TO_ROLE.get(commit_type, "extractor")
-
-
-def refine_commit_type(diff: str, branch_commit_type: str) -> str:
-    """Refine commit_type from diff content when branch prefix is ambiguous.
-
-    Branch prefix gives initial classification (extract, research, entity, etc.).
-    For 'extract' branches, diff content can distinguish:
-    - challenge: adds challenged_by edges to existing claims
-    - enrich: modifies existing claim frontmatter without new files
-    - extract: creates new claim files (default for extract branches)
-
-    Only refines 'extract' type — other branch types (research, entity, reweave, fix)
-    are already specific enough.
-    """
-    if branch_commit_type != "extract":
-        return branch_commit_type
-
-    new_files = 0
-    modified_files = 0
-    has_challenge_edge = False
-
-    in_diff_header = False
-    current_is_new = False
-    for line in diff.split("\n"):
-        if line.startswith("diff --git"):
-            in_diff_header = True
-            current_is_new = False
-        elif line.startswith("new file"):
-            current_is_new = True
-        elif line.startswith("+++ b/"):
-            path = line[6:]
-            if any(path.startswith(p) for p in ("domains/", "core/", "foundations/")):
-                if current_is_new:
-                    new_files += 1
-                else:
-                    modified_files += 1
-            in_diff_header = False
-        elif line.startswith("+") and not line.startswith("+++"):
-            if "challenged_by:" in line or "challenges:" in line:
-                has_challenge_edge = True
-
-    if has_challenge_edge and new_files == 0:
-        return "challenge"
-    if modified_files > 0 and new_files == 0:
-        return "enrich"
-    return "extract"
-
-
-async def record_contributor_attribution(conn, pr_number: int, branch: str, git_fn):
-    """Record contributor attribution after a successful merge.
-
-    Parses git trailers and claim frontmatter to identify contributors
-    and their roles. Upserts into contributors table. Refines commit_type
-    from diff content. Pipeline-only PRs (no knowledge files) are skipped.
-
-    Args:
-        git_fn: async callable matching _git signature (for git log parsing).
-    """
-    from datetime import date as _date
-
-    today = _date.today().isoformat()
-
-    # Get the PR diff to parse claim frontmatter for attribution blocks
-    diff = await get_pr_diff(pr_number)
-    if not diff:
-        return
-
-    # Pipeline-only PRs (inbox, entities, agents) don't count toward CI
-    if not is_knowledge_pr(diff):
-        logger.info("PR #%d: pipeline-only commit — skipping CI attribution", pr_number)
-        return
-
-    # Refine commit_type from diff content (branch prefix may be too broad)
-    row = conn.execute(
-        "SELECT commit_type, submitted_by, domain, source_channel, leo_verdict, "
-        "domain_verdict, domain_agent, merged_at FROM prs WHERE number = ?",
-        (pr_number,),
-    ).fetchone()
-    branch_type = row["commit_type"] if row and row["commit_type"] else "extract"
-    refined_type = refine_commit_type(diff, branch_type)
-    if refined_type != branch_type:
-        conn.execute("UPDATE prs SET commit_type = ? WHERE number = ?", (refined_type, pr_number))
-        logger.info("PR #%d: commit_type refined %s → %s", pr_number, branch_type, refined_type)
-
-    # Schema v24 event-sourcing context. Fetched once per PR, reused across emit sites.
-    pr_domain = row["domain"] if row else None
-    pr_channel = row["source_channel"] if row else None
-    pr_submitted_by = row["submitted_by"] if row else None
-    # Use the PR's merged_at timestamp so event time matches the actual merge.
-    # If a merge retries after a crash, this keeps forward-emitted and backfilled
-    # events on the same timeline. Falls back to datetime('now') in the writer.
-    pr_merged_at = row["merged_at"] if row and row["merged_at"] else None
-
-    # ── AUTHOR event (schema v24, double-write) ──
-    # Humans-are-always-author rule: the human in the loop gets author credit.
-    # Precedence: prs.submitted_by (set by extract.py from source proposed_by, or
-    # by discover for human PRs) → git author of first commit → branch-prefix agent.
-    # Pentagon-owned infra branches (extract/ reweave/ fix/ ingestion/) don't get
-    # author events from branch prefix; extract/ PRs carry submitted_by from the
-    # source's proposed_by field so the human who submitted gets credit via path 1.
-    author_candidate: str | None = None
-    if pr_submitted_by:
-        author_candidate = pr_submitted_by
-    else:
-        # External GitHub PRs: git author of the FIRST commit on the branch is
-        # the real submitter. `git log -1` would return the latest commit, which
-        # mis-credits multi-commit PRs where a reviewer rebased or force-pushed.
-        # Take the last line of the unreversed log (= oldest commit, since git
-        # log defaults to reverse-chronological). Ganymede review, Apr 24.
-        rc_author_log, author_log = await git_fn(
-            "log", f"origin/main..origin/{branch}", "--no-merges",
-            "--format=%an", timeout=5,
-        )
-        if rc_author_log == 0 and author_log.strip():
-            lines = [line for line in author_log.strip().split("\n") if line.strip()]
-            if lines:
-                candidate = lines[-1].strip().lower()
-                if candidate and candidate not in {"teleo", "teleo-bot", "pipeline",
-                                                   "github-actions[bot]", "forgejo-actions"}:
-                    author_candidate = candidate
-        # Agent-owned branches with no submitted_by: theseus/research-*, leo/*, etc.
-        if not author_candidate and branch.startswith(AGENT_BRANCH_PREFIXES):
-            # Autonomous agent PR (theseus/research-*, leo/entity-*, etc.) —
-            # credit goes to the agent as author per Cory's directive.
-            author_candidate = branch.split("/", 1)[0]
-
-    if author_candidate:
-        insert_contribution_event(
-            conn, author_candidate, "author", pr_number,
-            claim_path=None, domain=pr_domain, channel=pr_channel,
-            timestamp=pr_merged_at,
-        )
-
-    # ── EVALUATOR events (schema v24) ──
-    # Leo reviews every PR (STANDARD/DEEP tiers). domain_agent is the second
-    # reviewer. Both earn evaluator credit (0.05) per approved PR. Skip when
-    # verdict is 'request_changes' — failed review isn't contribution credit.
-    if row:
-        if row["leo_verdict"] == "approve":
-            insert_contribution_event(
-                conn, "leo", "evaluator", pr_number,
-                claim_path=None, domain=pr_domain, channel=pr_channel,
-                timestamp=pr_merged_at,
-            )
-        if row["domain_verdict"] == "approve" and row["domain_agent"]:
-            dagent = row["domain_agent"].strip().lower()
-            if dagent and dagent != "leo":  # don't double-credit leo
-                insert_contribution_event(
-                    conn, dagent, "evaluator", pr_number,
-                    claim_path=None, domain=pr_domain, channel=pr_channel,
-                    timestamp=pr_merged_at,
-                )
-
-    # Parse Pentagon-Agent trailer from branch commit messages
-    agents_found: set[str] = set()
-    # Agent-owned branches (theseus/*, rio/*, etc.) give the trailer-named agent
-    # challenger/synthesizer credit based on refined commit_type. Pipeline-owned
-    # branches (extract/*, reweave/*, etc.) don't — those are infra, not work.
-    is_agent_branch = branch.startswith(AGENT_BRANCH_PREFIXES)
-    _TRAILER_EVENT_ROLE = {
-        "challenge": "challenger",
-        "enrich": "synthesizer",
-        "research": "synthesizer",
-        "reweave": "synthesizer",
-    }
-    rc, log_output = await git_fn(
-        "log", f"origin/main..origin/{branch}", "--format=%b%n%N",
-        timeout=10,
-    )
-    if rc == 0:
-        for match in re.finditer(r"Pentagon-Agent:\s*(\S+)\s*<([^>]+)>", log_output):
-            agent_name = match.group(1).lower()
-            agent_uuid = match.group(2)
-            role = commit_type_to_role(refined_type)
-            upsert_contributor(
-                conn, agent_name, agent_uuid, role, today,
-            )
-            # Event-emit only for agent-owned branches where the trailer's agent
-            # actually did the substantive work (challenger/synthesizer).
-            event_role = _TRAILER_EVENT_ROLE.get(refined_type)
-            if is_agent_branch and event_role:
-                insert_contribution_event(
-                    conn, agent_name, event_role, pr_number,
-                    claim_path=None, domain=pr_domain, channel=pr_channel,
-                    timestamp=pr_merged_at,
-                )
-            agents_found.add(agent_name)
-
-    # Parse attribution from NEWLY ADDED knowledge files via the canonical attribution
-    # parser (lib/attribution.py). The previous diff-line regex parser dropped
-    # both the bare-key flat format (`sourcer: alexastrum`) and the nested
-    # `attribution:` block format because it only matched `- handle: "X"` lines.
-    # The Apr 24 incident traced missing leaderboard entries (alexastrum=0,
-    # thesensatore=0, cameron-s1=0) directly to this parser's blind spots.
-    #
-    # --diff-filter=A restricts to added files only (Ganymede review): enrich and
-    # challenge PRs modify existing claims, and re-crediting the existing sourcer on
-    # every modification would inflate counts. The synthesizer/challenger/reviewer
-    # roles for those PRs are credited via the Pentagon-Agent trailer path above.
-    rc_files, files_output = await git_fn(
-        "diff", "--name-only", "--diff-filter=A",
-        f"origin/main...origin/{branch}", timeout=10,
-    )
-    if rc_files == 0 and files_output:
-        from pathlib import Path
-        from . import config
-        from .attribution import parse_attribution_from_file
-
-        main_root = Path(config.MAIN_WORKTREE)
-        # Match is_knowledge_pr's gate exactly. Entities/convictions are excluded
-        # here because is_knowledge_pr skips entity-only PRs at line 123 — so a
-        # broader list here only matters for mixed PRs where the narrower list
-        # already matches via the claim file. Widening requires Cory sign-off
-        # since it would change leaderboard accounting (entity-only PRs → CI credit).
-        knowledge_prefixes = ("domains/", "core/", "foundations/", "decisions/")
-        author_canonical = normalize_handle(author_candidate, conn=conn) if author_candidate else None
-        for rel_path in files_output.strip().split("\n"):
-            rel_path = rel_path.strip()
-            if not rel_path.endswith(".md"):
-                continue
-            if not rel_path.startswith(knowledge_prefixes):
-                continue
-            full = main_root / rel_path
-            if not full.exists():
-                continue  # file removed in this PR
-            attribution = parse_attribution_from_file(str(full))
-            for role, entries in attribution.items():
-                for entry in entries:
-                    handle = entry.get("handle")
-                    if handle:
-                        upsert_contributor(
-                            conn, handle, entry.get("agent_id"), role, today,
-                        )
-                        # Event-emit: only 'sourcer' frontmatter entries become
-                        # originator events. 'extractor' frontmatter = infrastructure
-                        # (the Sonnet extraction agent), no event. challenger/
-                        # synthesizer frontmatter is extremely rare at extract time.
-                        # Skip originator if same as author — avoids double-credit
-                        # when someone submits their own content (self-authored).
-                        if role == "sourcer":
-                            origin_canonical = normalize_handle(handle, conn=conn)
-                            if origin_canonical and origin_canonical != author_canonical:
-                                insert_contribution_event(
-                                    conn, handle, "originator", pr_number,
-                                    claim_path=rel_path,
-                                    domain=pr_domain, channel=pr_channel,
-                                    timestamp=pr_merged_at,
-                                )
-
-    # Fallback: if no Pentagon-Agent trailer found, try git commit authors
-    _BOT_AUTHORS = frozenset({
-        "m3taversal", "teleo", "teleo-bot", "pipeline",
-        "github-actions[bot]", "forgejo-actions",
-    })
-    if not agents_found:
-        rc_author, author_output = await git_fn(
-            "log", f"origin/main..origin/{branch}", "--no-merges",
-            "--format=%an", timeout=10,
-        )
-        if rc_author == 0 and author_output.strip():
-            for author_line in author_output.strip().split("\n"):
-                author_name = author_line.strip().lower()
-                if author_name and author_name not in _BOT_AUTHORS:
-                    role = commit_type_to_role(refined_type)
-                    upsert_contributor(conn, author_name, None, role, today)
-                    # Event-model parity: emit challenger/synthesizer event when
-                    # the fallback credits a human/agent for that kind of work.
-                    # Without this, external-contributor challenge/enrich PRs
-                    # accumulate legacy counts but disappear from event-sourced
-                    # leaderboards when Phase B cuts over. (Ganymede review.)
-                    event_role_fb = _TRAILER_EVENT_ROLE.get(refined_type)
-                    if event_role_fb:
-                        insert_contribution_event(
-                            conn, author_name, event_role_fb, pr_number,
-                            claim_path=None, domain=pr_domain, channel=pr_channel,
-                            timestamp=pr_merged_at,
-                        )
-                    agents_found.add(author_name)
-
-        if not agents_found:
-            fb_row = conn.execute(
-                "SELECT agent FROM prs WHERE number = ?", (pr_number,)
-            ).fetchone()
-            if fb_row and fb_row["agent"] and fb_row["agent"] != "external":
-                pr_agent = fb_row["agent"].lower()
-                role = commit_type_to_role(refined_type)
-                upsert_contributor(conn, pr_agent, None, role, today)
-                event_role_fb = _TRAILER_EVENT_ROLE.get(refined_type)
-                if event_role_fb:
-                    insert_contribution_event(
-                        conn, pr_agent, event_role_fb, pr_number,
-                        claim_path=None, domain=pr_domain, channel=pr_channel,
-                        timestamp=pr_merged_at,
-                    )
-
-
-def upsert_contributor(
-    conn, handle: str, agent_id: str | None, role: str, date_str: str,
-):
-    """Upsert a contributor record, incrementing the appropriate role count."""
-    role_col = f"{role}_count"
-    if role_col not in (
-        "sourcer_count", "extractor_count", "challenger_count",
-        "synthesizer_count", "reviewer_count",
-    ):
-        logger.warning("Unknown contributor role: %s", role)
-        return
-
-    existing = conn.execute(
-        "SELECT handle FROM contributors WHERE handle = ?", (handle,)
-    ).fetchone()
-
-    if existing:
-        conn.execute(
-            f"""UPDATE contributors SET
-                {role_col} = {role_col} + 1,
-                claims_merged = claims_merged + CASE WHEN ? IN ('extractor', 'sourcer') THEN 1 ELSE 0 END,
-                last_contribution = ?,
-                updated_at = datetime('now')
-            WHERE handle = ?""",
-            (role, date_str, handle),
-        )
-    else:
-        conn.execute(
-            f"""INSERT INTO contributors (handle, agent_id, first_contribution, last_contribution, {role_col}, claims_merged)
-            VALUES (?, ?, ?, ?, 1, CASE WHEN ? IN ('extractor', 'sourcer') THEN 1 ELSE 0 END)""",
-            (handle, agent_id, date_str, date_str, role),
-        )
-
-    # Recalculate tier
-    recalculate_tier(conn, handle)
-
-
-def recalculate_tier(conn, handle: str):
-    """Recalculate contributor tier based on config rules."""
-    from datetime import date as _date, datetime as _dt
-
-    row = conn.execute(
-        "SELECT claims_merged, challenges_survived, first_contribution, tier FROM contributors WHERE handle = ?",
-        (handle,),
-    ).fetchone()
-    if not row:
-        return
-
-    current_tier = row["tier"]
-    claims_merged = row["claims_merged"] or 0
-    challenges_survived = row["challenges_survived"] or 0
-    first_contribution = row["first_contribution"]
-
-    days_since_first = 0
-    if first_contribution:
-        try:
-            first_date = _dt.strptime(first_contribution, "%Y-%m-%d").date()
-            days_since_first = (_date.today() - first_date).days
-        except ValueError:
-            pass
-
-    # Check veteran first (higher tier)
-    vet_rules = config.CONTRIBUTOR_TIER_RULES["veteran"]
-    if (claims_merged >= vet_rules["claims_merged"]
-            and days_since_first >= vet_rules["min_days_since_first"]
-            and challenges_survived >= vet_rules["challenges_survived"]):
-        new_tier = "veteran"
-    elif claims_merged >= config.CONTRIBUTOR_TIER_RULES["contributor"]["claims_merged"]:
-        new_tier = "contributor"
-    else:
-        new_tier = "new"
-
-    if new_tier != current_tier:
-        conn.execute(
-            "UPDATE contributors SET tier = ?, updated_at = datetime('now') WHERE handle = ?",
-            (new_tier, handle),
-        )
-        logger.info("Contributor %s: tier %s → %s", handle, current_tier, new_tier)
-        db.audit(
-            conn, "contributor", "tier_change",
-            json.dumps({"handle": handle, "from": current_tier, "to": new_tier}),
-        )
--- a/lib/db.py
+++ b/lib/db.py
@ -9,7 +9,7 @@ from . import config

 logger = logging.getLogger("pipeline.db")

-SCHEMA_VERSION = 26
+SCHEMA_VERSION = 19

 SCHEMA_SQL = """
 CREATE TABLE IF NOT EXISTS schema_version (
@ -35,15 +35,6 @@ CREATE TABLE IF NOT EXISTS sources (
    feedback TEXT,
    -- eval feedback for re-extraction (JSON)
    cost_usd REAL DEFAULT 0,
-    -- v26: provenance — publisher (news org / venue) + content author.
-    -- publisher_id references publishers(id) when source is from a known org.
-    -- original_author_handle references contributors(handle) when author is in our system.
-    -- original_author is free-text fallback ("Kim et al.", "Robin Hanson") — not credit-bearing.
-    publisher_id INTEGER REFERENCES publishers(id),
-    content_type TEXT,
-    -- article | paper | tweet | conversation | self_authored | webpage | podcast
-    original_author TEXT,
-    original_author_handle TEXT REFERENCES contributors(handle),
    created_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now'))
 );
@ -79,8 +70,6 @@ CREATE TABLE IF NOT EXISTS prs (
    last_attempt TEXT,
    cost_usd REAL DEFAULT 0,
    auto_merge INTEGER DEFAULT 0,
-    github_pr INTEGER,
-    source_channel TEXT,
    created_at TEXT DEFAULT (datetime('now')),
    merged_at TEXT
 );
@ -166,83 +155,11 @@ CREATE TABLE IF NOT EXISTS response_audit (
 CREATE INDEX IF NOT EXISTS idx_sources_status ON sources(status);
 CREATE INDEX IF NOT EXISTS idx_prs_status ON prs(status);
 CREATE INDEX IF NOT EXISTS idx_prs_domain ON prs(domain);
-CREATE INDEX IF NOT EXISTS idx_prs_source_path ON prs(source_path) WHERE source_path IS NOT NULL;
 CREATE INDEX IF NOT EXISTS idx_costs_date ON costs(date);
 CREATE INDEX IF NOT EXISTS idx_audit_stage ON audit_log(stage);
 CREATE INDEX IF NOT EXISTS idx_response_audit_ts ON response_audit(timestamp);
 CREATE INDEX IF NOT EXISTS idx_response_audit_agent ON response_audit(agent);
 CREATE INDEX IF NOT EXISTS idx_response_audit_chat_ts ON response_audit(chat_id, timestamp);
-
--- Event-sourced contributions (schema v24).
--- One row per credit-earning event. Idempotent via two partial UNIQUE indexes
--- (SQLite treats NULL != NULL in UNIQUE constraints, so a single composite
--- UNIQUE with nullable claim_path would allow evaluator-event duplicates).
--- Leaderboards are SQL aggregations over this table; contributors becomes a materialized cache.
-CREATE TABLE IF NOT EXISTS contribution_events (
-    id INTEGER PRIMARY KEY AUTOINCREMENT,
-    handle TEXT NOT NULL,
-    kind TEXT NOT NULL DEFAULT 'person',
-    -- person | org | agent
-    role TEXT NOT NULL,
-    -- author | originator | challenger | synthesizer | evaluator
-    weight REAL NOT NULL,
-    pr_number INTEGER NOT NULL,
-    claim_path TEXT,
-    -- NULL for PR-level events (e.g. evaluator). Set for per-claim events.
-    domain TEXT,
-    channel TEXT,
-    -- telegram | github | agent | web | unknown
-    timestamp TEXT NOT NULL DEFAULT (datetime('now'))
-);
--- Per-claim events: unique on (handle, role, pr_number, claim_path) when path IS NOT NULL.
-CREATE UNIQUE INDEX IF NOT EXISTS idx_ce_unique_claim ON contribution_events(
-    handle, role, pr_number, claim_path
-) WHERE claim_path IS NOT NULL;
--- PR-level events (evaluator, author, trailer-based): unique on (handle, role, pr_number) when path IS NULL.
-CREATE UNIQUE INDEX IF NOT EXISTS idx_ce_unique_pr ON contribution_events(
-    handle, role, pr_number
-) WHERE claim_path IS NULL;
-CREATE INDEX IF NOT EXISTS idx_ce_handle_ts ON contribution_events(handle, timestamp);
-CREATE INDEX IF NOT EXISTS idx_ce_domain_ts ON contribution_events(domain, timestamp);
-CREATE INDEX IF NOT EXISTS idx_ce_pr ON contribution_events(pr_number);
-CREATE INDEX IF NOT EXISTS idx_ce_role_ts ON contribution_events(role, timestamp);
-CREATE INDEX IF NOT EXISTS idx_ce_kind_ts ON contribution_events(kind, timestamp);
-
--- Handle aliasing. @thesensatore → thesensatore. cameron → cameron-s1.
--- Writers call resolve_alias(handle) before inserting events or upserting contributors.
-CREATE TABLE IF NOT EXISTS contributor_aliases (
-    alias TEXT PRIMARY KEY,
-    canonical TEXT NOT NULL,
-    created_at TEXT DEFAULT (datetime('now'))
-);
-CREATE INDEX IF NOT EXISTS idx_aliases_canonical ON contributor_aliases(canonical);
-
--- Publishers: news orgs, academic venues, social platforms. NOT contributors — these
--- provide metadata/provenance for sources, never earn leaderboard credit. Separating
--- these from contributors prevents CNBC/SpaceNews from dominating the leaderboard.
--- (Apr 24 Cory directive: "only credit the original source if its on X or tg")
-CREATE TABLE IF NOT EXISTS publishers (
-    id INTEGER PRIMARY KEY AUTOINCREMENT,
-    name TEXT NOT NULL UNIQUE,
-    kind TEXT CHECK(kind IN ('news', 'academic', 'social_platform', 'podcast', 'self', 'internal', 'legal', 'government', 'research_org', 'commercial', 'other')),
-    url_pattern TEXT,
-    created_at TEXT DEFAULT (datetime('now'))
-);
-CREATE INDEX IF NOT EXISTS idx_publishers_name ON publishers(name);
-CREATE INDEX IF NOT EXISTS idx_publishers_kind ON publishers(kind);
-
--- Multi-platform identity: one contributor, many handles. Enables the leaderboard to
--- unify @thesensatore (X) + thesensatore (TG) + thesensatore@github into one person.
--- Writers check this table after resolving aliases to find canonical contributor handle.
-CREATE TABLE IF NOT EXISTS contributor_identities (
-    contributor_handle TEXT NOT NULL,
-    platform TEXT NOT NULL CHECK(platform IN ('x', 'telegram', 'github', 'email', 'web', 'internal')),
-    platform_handle TEXT NOT NULL,
-    verified INTEGER DEFAULT 0,
-    created_at TEXT DEFAULT (datetime('now')),
-    PRIMARY KEY (platform, platform_handle)
-);
-CREATE INDEX IF NOT EXISTS idx_identities_contributor ON contributor_identities(contributor_handle);
 """


@ -278,7 +195,6 @@ def transaction(conn: sqlite3.Connection):
 # Branch prefix → (agent, commit_type) mapping.
 # Single source of truth — used by merge.py at INSERT time and migration v7 backfill.
 # Unknown prefixes → ('unknown', 'unknown') + warning log.
-# Keep in sync with _CHANNEL_MAP below.
 BRANCH_PREFIX_MAP = {
    "extract": ("pipeline", "extract"),
    "ingestion": ("pipeline", "extract"),
@ -291,7 +207,6 @@ BRANCH_PREFIX_MAP = {
    "leo": ("leo", "entity"),
    "reweave": ("pipeline", "reweave"),
    "fix": ("pipeline", "fix"),
-    "contrib": ("external", "contrib"),
 }


@ -301,9 +216,6 @@ def classify_branch(branch: str) -> tuple[str, str]:
    Returns ('unknown', 'unknown') and logs a warning for unrecognized prefixes.
    """
    prefix = branch.split("/", 1)[0] if "/" in branch else branch
-    # Fork PR branches: gh-pr-N/original-branch
-    if prefix.startswith("gh-pr-"):
-        return ("external", "contrib")
    result = BRANCH_PREFIX_MAP.get(prefix)
    if result is None:
        logger.warning("Unknown branch prefix %r in branch %r — defaulting to ('unknown', 'unknown')", prefix, branch)
@ -311,47 +223,6 @@ def classify_branch(branch: str) -> tuple[str, str]:
    return result


-# Keep in sync with BRANCH_PREFIX_MAP above.
-#
-# Valid source_channel values: github | telegram | agent | maintenance | web | unknown
-#   - github: external contributor PR (set via sync-mirror.sh github_pr linking,
-#     or from gh-pr-* branches, or any time github_pr is provided)
-#   - telegram: message captured by telegram bot (must be tagged explicitly by
-#     ingestion — extract/* default is "unknown" because the bare branch prefix
-#     can no longer distinguish telegram-origin from github-origin extractions)
-#   - agent: per-agent research branches (rio/, theseus/, etc.)
-#   - maintenance: pipeline housekeeping (reweave/, epimetheus/, fix/)
-#   - web: future in-app submissions (chat UI or form posts)
-#   - unknown: fallback when provenance cannot be determined
-_CHANNEL_MAP = {
-    "extract": "unknown",
-    "ingestion": "unknown",
-    "rio": "agent",
-    "theseus": "agent",
-    "astra": "agent",
-    "vida": "agent",
-    "clay": "agent",
-    "leo": "agent",
-    "oberon": "agent",
-    "reweave": "maintenance",
-    "epimetheus": "maintenance",
-    "fix": "maintenance",
-}
-
-
-def classify_source_channel(branch: str, *, github_pr: int | None = None) -> str:
-    """Derive source_channel from branch prefix and github_pr flag.
-
-    Precedence: github_pr flag > gh-pr- branch prefix > _CHANNEL_MAP lookup.
-    extract/* defaults to "unknown" — callers with better provenance (telegram
-    bot, web submission handler) must override at PR-insert time.
-    """
-    if github_pr is not None or branch.startswith("gh-pr-"):
-        return "github"
-    prefix = branch.split("/", 1)[0] if "/" in branch else branch
-    return _CHANNEL_MAP.get(prefix, "unknown")
-
-
 def migrate(conn: sqlite3.Connection):
    """Run schema migrations."""
    conn.executescript(SCHEMA_SQL)
@ -608,9 +479,6 @@ def migrate(conn: sqlite3.Connection):
        logger.info("Migration v11: added auto_merge column to prs table")


-    # v12-v16 ran manually on VPS before code was version-controlled.
-    # Their changes are consolidated into v17+ migrations below.
-
    if current < 17:
        # Add prompt/pipeline version tracking per PR
        for col, default in [
@ -662,189 +530,6 @@ def migrate(conn: sqlite3.Connection):
        conn.commit()
        logger.info("Migration v19: added submitted_by to prs and sources tables")

-    if current < 20:
-        for col, default in [
-            ("conflict_rebase_attempts", "INTEGER DEFAULT 0"),
-            ("merge_failures", "INTEGER DEFAULT 0"),
-            ("merge_cycled", "INTEGER DEFAULT 0"),
-        ]:
-            try:
-                conn.execute(f"ALTER TABLE prs ADD COLUMN {col} {default}")
-            except sqlite3.OperationalError:
-                pass
-        conn.commit()
-        logger.info("Migration v20: added conflict retry columns to prs")
-
-    if current < 21:
-        try:
-            conn.execute("ALTER TABLE prs ADD COLUMN github_pr INTEGER")
-        except sqlite3.OperationalError:
-            pass
-        conn.execute(
-            "CREATE INDEX IF NOT EXISTS idx_prs_github_pr ON prs (github_pr) WHERE github_pr IS NOT NULL"
-        )
-        conn.commit()
-        logger.info("Migration v21: added github_pr column + index to prs")
-
-    if current < 22:
-        try:
-            conn.execute("ALTER TABLE prs ADD COLUMN source_channel TEXT")
-        except sqlite3.OperationalError:
-            pass
-        conn.execute("""
-            UPDATE prs SET source_channel = CASE
-                WHEN github_pr IS NOT NULL THEN 'github'
-                WHEN branch LIKE 'gh-pr-%%' THEN 'github'
-                WHEN branch LIKE 'theseus/%%' THEN 'agent'
-                WHEN branch LIKE 'rio/%%' THEN 'agent'
-                WHEN branch LIKE 'astra/%%' THEN 'agent'
-                WHEN branch LIKE 'clay/%%' THEN 'agent'
-                WHEN branch LIKE 'vida/%%' THEN 'agent'
-                WHEN branch LIKE 'oberon/%%' THEN 'agent'
-                WHEN branch LIKE 'leo/%%' THEN 'agent'
-                WHEN branch LIKE 'reweave/%%' THEN 'maintenance'
-                WHEN branch LIKE 'epimetheus/%%' THEN 'maintenance'
-                WHEN branch LIKE 'fix/%%' THEN 'maintenance'
-                WHEN branch LIKE 'extract/%%' THEN 'telegram'
-                WHEN branch LIKE 'ingestion/%%' THEN 'telegram'
-                ELSE 'unknown'
-            END
-            WHERE source_channel IS NULL
-        """)
-        conn.commit()
-        logger.info("Migration v22: added source_channel to prs + backfilled from branch prefix")
-
-    if current < 23:
-        conn.execute(
-            "CREATE INDEX IF NOT EXISTS idx_prs_source_path ON prs(source_path) WHERE source_path IS NOT NULL"
-        )
-        conn.commit()
-        logger.info("Migration v23: added idx_prs_source_path for auto-close dedup lookup")
-
-    if current < 24:
-        # Event-sourced contributions table + alias table + kind column on contributors.
-        # Non-breaking: contributors table stays; events are written in addition via
-        # double-write in merge.py. Leaderboards switch to events in Phase B.
-        conn.executescript("""
-            CREATE TABLE IF NOT EXISTS contribution_events (
-                id INTEGER PRIMARY KEY AUTOINCREMENT,
-                handle TEXT NOT NULL,
-                kind TEXT NOT NULL DEFAULT 'person',
-                role TEXT NOT NULL,
-                weight REAL NOT NULL,
-                pr_number INTEGER NOT NULL,
-                claim_path TEXT,
-                domain TEXT,
-                channel TEXT,
-                timestamp TEXT NOT NULL DEFAULT (datetime('now'))
-            );
-            -- Partial unique indexes handle SQLite's NULL != NULL UNIQUE semantics.
-            -- Per-claim events dedup on 4-tuple; PR-level events dedup on 3-tuple.
-            CREATE UNIQUE INDEX IF NOT EXISTS idx_ce_unique_claim ON contribution_events(
-                handle, role, pr_number, claim_path
-            ) WHERE claim_path IS NOT NULL;
-            CREATE UNIQUE INDEX IF NOT EXISTS idx_ce_unique_pr ON contribution_events(
-                handle, role, pr_number
-            ) WHERE claim_path IS NULL;
-            CREATE INDEX IF NOT EXISTS idx_ce_handle_ts ON contribution_events(handle, timestamp);
-            CREATE INDEX IF NOT EXISTS idx_ce_domain_ts ON contribution_events(domain, timestamp);
-            CREATE INDEX IF NOT EXISTS idx_ce_pr ON contribution_events(pr_number);
-            CREATE INDEX IF NOT EXISTS idx_ce_role_ts ON contribution_events(role, timestamp);
-            CREATE INDEX IF NOT EXISTS idx_ce_kind_ts ON contribution_events(kind, timestamp);
-
-            CREATE TABLE IF NOT EXISTS contributor_aliases (
-                alias TEXT PRIMARY KEY,
-                canonical TEXT NOT NULL,
-                created_at TEXT DEFAULT (datetime('now'))
-            );
-            CREATE INDEX IF NOT EXISTS idx_aliases_canonical ON contributor_aliases(canonical);
-        """)
-        try:
-            conn.execute("ALTER TABLE contributors ADD COLUMN kind TEXT DEFAULT 'person'")
-        except sqlite3.OperationalError:
-            pass  # column already exists
-        # Seed known aliases. @thesensatore → thesensatore catches the zombie row Argus flagged.
-        # cameron → cameron-s1 reconciles the Leo-flagged missing contributor.
-        conn.executemany(
-            "INSERT OR IGNORE INTO contributor_aliases (alias, canonical) VALUES (?, ?)",
-            [
-                ("@thesensatore", "thesensatore"),
-                ("cameron", "cameron-s1"),
-            ],
-        )
-        # Seed kind='agent' for known Pentagon agents so the events writer picks it up.
-        # Must stay in sync with lib/attribution.PENTAGON_AGENTS — drift causes
-        # contributors.kind to disagree with classify_kind() output for future
-        # inserts. (Ganymede review: "pipeline" was missing until Apr 24.)
-        pentagon_agents = [
-            "rio", "leo", "theseus", "vida", "clay", "astra",
-            "oberon", "argus", "rhea", "ganymede", "epimetheus", "hermes", "ship",
-            "pipeline",
-        ]
-        for agent in pentagon_agents:
-            conn.execute(
-                "UPDATE contributors SET kind = 'agent' WHERE handle = ?",
-                (agent,),
-            )
-        conn.commit()
-        logger.info("Migration v24: added contribution_events + contributor_aliases tables, kind column")
-
-    if current < 25:
-        # v24 seeded 13 Pentagon agents but missed "pipeline" — classify_kind()
-        # treats it as agent so contributors.kind drifted from event-insert output.
-        # Idempotent corrective UPDATE: fresh installs have no "pipeline" row
-        # (no-op), upgraded envs flip it if it exists. (Ganymede review Apr 24.)
-        conn.execute(
-            "UPDATE contributors SET kind = 'agent' WHERE handle = 'pipeline'"
-        )
-        conn.commit()
-        logger.info("Migration v25: patched kind='agent' for pipeline handle")
-
-    if current < 26:
-        # Add publishers + contributor_identities. Non-breaking — new tables only.
-        # No existing data moved. Classification into publishers happens via a
-        # separate script (scripts/reclassify-contributors.py) with Cory-reviewed
-        # seed list. CHECK constraint on contributors.kind deferred to v27 after
-        # classification completes. (Apr 24 Cory directive: "fix schema, don't
-        # filter output" — separate contributors from publishers at the data layer.)
-        conn.executescript("""
-            CREATE TABLE IF NOT EXISTS publishers (
-                id INTEGER PRIMARY KEY AUTOINCREMENT,
-                name TEXT NOT NULL UNIQUE,
-                kind TEXT CHECK(kind IN ('news', 'academic', 'social_platform', 'podcast', 'self', 'internal', 'legal', 'government', 'research_org', 'commercial', 'other')),
-                url_pattern TEXT,
-                created_at TEXT DEFAULT (datetime('now'))
-            );
-            CREATE INDEX IF NOT EXISTS idx_publishers_name ON publishers(name);
-            CREATE INDEX IF NOT EXISTS idx_publishers_kind ON publishers(kind);
-
-            CREATE TABLE IF NOT EXISTS contributor_identities (
-                contributor_handle TEXT NOT NULL,
-                platform TEXT NOT NULL CHECK(platform IN ('x', 'telegram', 'github', 'email', 'web', 'internal')),
-                platform_handle TEXT NOT NULL,
-                verified INTEGER DEFAULT 0,
-                created_at TEXT DEFAULT (datetime('now')),
-                PRIMARY KEY (platform, platform_handle)
-            );
-            CREATE INDEX IF NOT EXISTS idx_identities_contributor ON contributor_identities(contributor_handle);
-        """)
-        # Extend sources with provenance columns. ALTER TABLE ADD COLUMN is
-        # idempotent-safe via try/except because SQLite doesn't support IF NOT EXISTS
-        # on column adds.
-        for col_sql in (
-            "ALTER TABLE sources ADD COLUMN publisher_id INTEGER REFERENCES publishers(id)",
-            "ALTER TABLE sources ADD COLUMN content_type TEXT",
-            "ALTER TABLE sources ADD COLUMN original_author TEXT",
-            "ALTER TABLE sources ADD COLUMN original_author_handle TEXT REFERENCES contributors(handle)",
-        ):
-            try:
-                conn.execute(col_sql)
-            except sqlite3.OperationalError as e:
-                if "duplicate column" not in str(e).lower():
-                    raise
-        conn.commit()
-        logger.info("Migration v26: added publishers + contributor_identities tables + sources provenance columns")
-
    if current < SCHEMA_VERSION:
        conn.execute(
            "INSERT OR REPLACE INTO schema_version (version) VALUES (?)",
--- a/lib/domains.py
+++ b/lib/domains.py
@ -37,11 +37,6 @@ _AGENT_PRIMARY_DOMAIN: dict[str, str] = {
    "leo": "grand-strategy",
 }

-_INGESTION_SOURCE_DOMAIN: dict[str, str] = {
-    "futardio": "internet-finance",
-    "metadao": "internet-finance",
-}
-

 def agent_for_domain(domain: str | None) -> str:
    """Get the reviewing agent for a domain. Falls back to Leo."""
@ -87,14 +82,6 @@ def detect_domain_from_branch(branch: str) -> str | None:
    """Extract domain from branch name like 'rio/claims-futarchy' → 'internet-finance'.

    Uses agent prefix → primary domain mapping for pipeline branches.
-    For ingestion branches, checks the rest of the name for source-type hints.
    """
    prefix = branch.split("/")[0].lower() if "/" in branch else ""
-    if prefix in _AGENT_PRIMARY_DOMAIN:
-        return _AGENT_PRIMARY_DOMAIN[prefix]
-    if prefix == "ingestion":
-        rest = branch.split("/", 1)[1].lower() if "/" in branch else ""
-        for source_key, domain in _INGESTION_SOURCE_DOMAIN.items():
-            if source_key in rest:
-                return domain
-    return None
+    return _AGENT_PRIMARY_DOMAIN.get(prefix)
--- a/lib/eval_actions.py
+++ b/lib/eval_actions.py
@ -1,260 +0,0 @@
-"""PR disposition actions — async Forgejo + DB operations for end-of-eval decisions.
-
-Extracted from evaluate.py to isolate the "do something to this PR" functions
-from orchestration logic. Contains:
-
- post_formal_approvals: submit Forgejo reviews from 2 agents (not PR author)
- terminate_pr: close PR, post rejection comment, requeue source
- dispose_rejected_pr: disposition logic for rejected PRs on attempt 2+
-
-All functions are async (Forgejo API calls). Dependencies: forgejo, db, config,
-pr_state, feedback, eval_parse.
-"""
-
-import asyncio
-import json
-import logging
-
-from . import config, db
-from .eval_parse import classify_issues
-from .feedback import format_rejection_comment
-from .forgejo import api as forgejo_api, get_agent_token, get_pr_diff, repo_path
-from .github_feedback import on_closed, on_eval_complete
-from .pr_state import close_pr
-
-logger = logging.getLogger("pipeline.eval_actions")
-
-
-async def post_formal_approvals(pr_number: int, pr_author: str):
-    """Submit formal Forgejo reviews from 2 agents (not the PR author)."""
-    approvals = 0
-    for agent_name in ["leo", "vida", "theseus", "clay", "astra", "rio"]:
-        if agent_name == pr_author:
-            continue
-        if approvals >= 2:
-            break
-        token = get_agent_token(agent_name)
-        if token:
-            result = await forgejo_api(
-                "POST",
-                repo_path(f"pulls/{pr_number}/reviews"),
-                {"body": "Approved.", "event": "APPROVED"},
-                token=token,
-            )
-            if result is not None:
-                approvals += 1
-                logger.debug("Formal approval for PR #%d by %s (%d/2)", pr_number, agent_name, approvals)
-
-
-async def terminate_pr(conn, pr_number: int, reason: str):
-    """Terminal state: close PR on Forgejo, mark source needs_reextraction."""
-    # Get issue tags for structured feedback
-    row = conn.execute("SELECT eval_issues, agent FROM prs WHERE number = ?", (pr_number,)).fetchone()
-    issues = []
-    if row and row["eval_issues"]:
-        try:
-            issues = json.loads(row["eval_issues"])
-        except (json.JSONDecodeError, TypeError):
-            pass
-
-    # Post structured rejection comment with quality gate guidance
-    if issues:
-        feedback_body = format_rejection_comment(issues, source="eval_terminal")
-        comment_body = (
-            f"**Closed by eval pipeline** — {reason}.\n\n"
-            f"Evaluated {config.MAX_EVAL_ATTEMPTS} times without passing. "
-            f"Source will be re-queued with feedback.\n\n"
-            f"{feedback_body}"
-        )
-    else:
-        comment_body = (
-            f"**Closed by eval pipeline** — {reason}.\n\n"
-            f"Evaluated {config.MAX_EVAL_ATTEMPTS} times without passing. "
-            f"Source will be re-queued with feedback."
-        )
-
-    await forgejo_api(
-        "POST",
-        repo_path(f"issues/{pr_number}/comments"),
-        {"body": comment_body},
-    )
-    closed = await close_pr(conn, pr_number, last_error=reason)
-    if not closed:
-        logger.warning("PR #%d: Forgejo close failed — skipping source requeue, will retry next cycle", pr_number)
-        return
-
-    try:
-        await on_closed(conn, pr_number, reason=reason)
-    except Exception:
-        logger.exception("PR #%d: GitHub close feedback failed (non-fatal)", pr_number)
-
-    # Tag source for re-extraction with feedback
-    cursor = conn.execute(
-        """UPDATE sources SET status = 'needs_reextraction',
-           updated_at = datetime('now')
-           WHERE path = (SELECT source_path FROM prs WHERE number = ?)""",
-        (pr_number,),
-    )
-    if cursor.rowcount == 0:
-        logger.warning("PR #%d: no source_path linked — source not requeued for re-extraction", pr_number)
-
-    db.audit(
-        conn,
-        "evaluate",
-        "pr_terminated",
-        json.dumps(
-            {
-                "pr": pr_number,
-                "reason": reason,
-            }
-        ),
-    )
-    logger.info("PR #%d: TERMINATED — %s", pr_number, reason)
-
-
-async def dispose_rejected_pr(conn, pr_number: int, eval_attempts: int, all_issues: list[str]):
-    """Disposition logic for rejected PRs, keyed on attempt count.
-
-    Auto-close gate (all attempts): near-duplicate of an already-merged PR for
-    the same source — close immediately. Avoids the Apr 22 runaway-damage
-    pattern where a source extracted 20+ times in a short window produced
-    dozens of open PRs that all had to be closed manually.
-
-    Attempt 1: normal — back to open, wait for fix.
-    Attempt 2: check issue classification.
-      - Mechanical only: keep open for one more attempt (auto-fix future).
-      - Substantive or mixed: close PR, requeue source.
-    Attempt 3+: terminal.
-    """
-    # Auto-close near-duplicate when a merged sibling for the same source exists.
-    # Runs before the attempt-count branches so it catches the common runaway
-    # case on attempt 1 instead of waiting for attempt 2's terminate path.
-    #
-    # Exact-match requirement (Ganymede review): compound rejections like
-    # ["near_duplicate", "factual_discrepancy"] carry signal about the merged
-    # sibling being wrong or limited — we want humans to see those. Only the
-    # pure single-issue case is safe to auto-close.
-    if all_issues == ["near_duplicate"]:
-        existing_merged = conn.execute(
-            """SELECT p2.number, p1.source_path FROM prs p1
-               JOIN prs p2 ON p2.source_path = p1.source_path
-               WHERE p1.number = ?
-                 AND p1.source_path IS NOT NULL
-                 AND p2.number != p1.number
-                 AND p2.status = 'merged'
-               LIMIT 1""",
-            (pr_number,),
-        ).fetchone()
-        if existing_merged:
-            sibling = existing_merged[0]
-            source_path = existing_merged[1]
-
-            # Enrichment guard: LLM reviewers can flag enrichment prose as
-            # "redundant" via eval_parse regex, tagging near_duplicate even
-            # though validate.py's structural check only fires on NEW files.
-            # If the PR only MODIFIES existing files (no "new file mode" in
-            # diff), it's an enrichment — skip auto-close so a human reviews.
-            #
-            # 10s timeout bounds damage when Forgejo is wedged (Apr 22 incident:
-            # hung for 2.5h). Conservative fallback: skip auto-close on any
-            # failure — fall through to normal rejection path.
-            try:
-                diff = await asyncio.wait_for(get_pr_diff(pr_number), timeout=10)
-            except Exception:  # asyncio.TimeoutError is an Exception subclass
-                logger.warning(
-                    "PR #%d: diff fetch failed/timed out for near-dup guard — skipping auto-close",
-                    pr_number, exc_info=True,
-                )
-                diff = None
-
-            if not diff:
-                # None or empty — conservative fallback, fall through to attempt-count branches
-                pass
-            elif "new file mode" not in diff:
-                logger.info(
-                    "PR #%d: near_duplicate but modifies-only (enrichment) — skipping auto-close",
-                    pr_number,
-                )
-            else:
-                logger.info(
-                    "PR #%d: auto-closing near-duplicate of merged PR #%d (same source)",
-                    pr_number, sibling,
-                )
-                # Post a brief explanation before closing (best-effort — non-fatal)
-                try:
-                    await forgejo_api(
-                        "POST",
-                        repo_path(f"issues/{pr_number}/comments"),
-                        {"body": (
-                            f"Auto-closed: near-duplicate of already-merged PR "
-                            f"#{sibling} (same source: `{source_path}`)."
-                        )},
-                    )
-                except Exception:
-                    logger.debug("PR #%d: auto-close comment failed (non-fatal)", pr_number, exc_info=True)
-                await close_pr(
-                    conn, pr_number,
-                    last_error=f"auto_closed_near_duplicate: merged sibling #{sibling}",
-                )
-                db.audit(
-                    conn, "evaluate", "auto_closed_near_duplicate",
-                    json.dumps({
-                        "pr": pr_number,
-                        "merged_sibling": sibling,
-                        "source_path": source_path,
-                        "eval_attempts": eval_attempts,
-                    }),
-                )
-                return
-
-    if eval_attempts < 2:
-        # Attempt 1: post structured feedback so agent learns, but don't close
-        if all_issues:
-            feedback_body = format_rejection_comment(all_issues, source="eval_attempt_1")
-            await forgejo_api(
-                "POST",
-                repo_path(f"issues/{pr_number}/comments"),
-                {"body": feedback_body},
-            )
-        return
-
-    classification = classify_issues(all_issues)
-
-    if eval_attempts >= config.MAX_EVAL_ATTEMPTS:
-        # Terminal
-        await terminate_pr(conn, pr_number, f"eval budget exhausted after {eval_attempts} attempts")
-        return
-
-    if classification == "mechanical":
-        # Mechanical issues only — keep open for one more attempt.
-        # Future: auto-fix module will push fixes here.
-        logger.info(
-            "PR #%d: attempt %d, mechanical issues only (%s) — keeping open for fix attempt",
-            pr_number,
-            eval_attempts,
-            all_issues,
-        )
-        db.audit(
-            conn,
-            "evaluate",
-            "mechanical_retry",
-            json.dumps(
-                {
-                    "pr": pr_number,
-                    "attempt": eval_attempts,
-                    "issues": all_issues,
-                }
-            ),
-        )
-    else:
-        # Substantive, mixed, or unknown — close and requeue
-        logger.info(
-            "PR #%d: attempt %d, %s issues (%s) — closing and requeuing source",
-            pr_number,
-            eval_attempts,
-            classification,
-            all_issues,
-        )
-        await terminate_pr(
-            conn, pr_number, f"substantive issues after {eval_attempts} attempts: {', '.join(all_issues)}"
-        )
--- a/lib/eval_parse.py
+++ b/lib/eval_parse.py
@ -1,434 +0,0 @@
-"""Pure parsing functions for the eval stage — zero I/O, zero async.
-
-Extracted from evaluate.py to isolate testable parsing logic from
-orchestration, DB, and Forgejo API calls.
-
-Contents:
- Diff helpers: filter, classify, tier routing
- Verdict/issue parsing: structured tags + prose inference
- Batch response parsing: fan-out validation
-
-All functions are pure (input → output). The only external dependency
-is config.MECHANICAL_ISSUE_TAGS / config.SUBSTANTIVE_ISSUE_TAGS for
-classify_issues.
-"""
-
-import logging
-import re
-
-from . import config
-
-logger = logging.getLogger("pipeline.eval_parse")
-
-
-# ─── Diff helpers ──────────────────────────────────────────────────────────
-
-
-def filter_diff(diff: str) -> tuple[str, str]:
-    """Filter diff to only review-relevant files.
-
-    Returns (review_diff, entity_diff).
-    Strips: inbox/, schemas/, skills/, agents/*/musings/
-    """
-    sections = re.split(r"(?=^diff --git )", diff, flags=re.MULTILINE)
-    skip_patterns = [r"^diff --git a/(inbox/(archive|queue|null-result)|schemas|skills|agents/[^/]+/musings)/"]
-    core_domains = {"living-agents", "living-capital", "teleohumanity", "mechanisms"}
-
-    claim_sections = []
-    entity_sections = []
-
-    for section in sections:
-        if not section.strip():
-            continue
-        if any(re.match(p, section) for p in skip_patterns):
-            continue
-        entity_match = re.match(r"^diff --git a/entities/([^/]+)/", section)
-        if entity_match and entity_match.group(1) not in core_domains:
-            entity_sections.append(section)
-            continue
-        claim_sections.append(section)
-
-    return "".join(claim_sections), "".join(entity_sections)
-
-
-def extract_changed_files(diff: str) -> str:
-    """Extract changed file paths from diff."""
-    return "\n".join(
-        line.replace("diff --git a/", "").split(" b/")[0] for line in diff.split("\n") if line.startswith("diff --git")
-    )
-
-
-def is_musings_only(diff: str) -> bool:
-    """Check if PR only modifies musing files."""
-    has_musings = False
-    has_other = False
-    for line in diff.split("\n"):
-        if line.startswith("diff --git"):
-            if "agents/" in line and "/musings/" in line:
-                has_musings = True
-            else:
-                has_other = True
-    return has_musings and not has_other
-
-
-def diff_contains_claim_type(diff: str) -> bool:
-    """Claim-shape detector: check if any file in diff has type: claim in frontmatter.
-
-    Mechanical check ($0). If YAML declares type: claim, this is a factual claim —
-    not an entity update or formatting fix. Must be classified STANDARD minimum
-    regardless of Haiku triage. Catches factual claims disguised as LIGHT content.
-    (Theseus: converts semantic problem to mechanical check)
-    """
-    for line in diff.split("\n"):
-        if line.startswith("+") and not line.startswith("+++"):
-            stripped = line[1:].strip()
-            if stripped in ("type: claim", 'type: "claim"', "type: 'claim'"):
-                return True
-    return False
-
-
-def deterministic_tier(diff: str) -> str | None:
-    """Deterministic tier routing — skip Haiku triage for obvious cases.
-
-    Checks diff file patterns before calling the LLM. Returns tier string
-    if deterministic, None if Haiku triage is needed.
-
-    Rules (Leo-calibrated):
-    - All files in entities/ only → LIGHT
-    - All files in inbox/ only (queue, archive, null-result) → LIGHT
-    - Any file in core/ or foundations/ → DEEP (structural KB changes)
-    - Has challenged_by field → DEEP (challenges existing claims)
-    - Modifies existing file (not new) in domains/ → None (not auto-DEEP; Haiku triage decides)
-    - Otherwise → None (needs Haiku triage)
-
-    NOTE: Cross-domain wiki links are NOT a DEEP signal — most claims link
-    across domains, that's the whole point of the knowledge graph (Leo).
-    """
-    changed_files = []
-    for line in diff.split("\n"):
-        if line.startswith("diff --git a/"):
-            path = line.replace("diff --git a/", "").split(" b/")[0]
-            changed_files.append(path)
-
-    if not changed_files:
-        return None
-
-    # All entities/ only → LIGHT
-    if all(f.startswith("entities/") for f in changed_files):
-        logger.info("Deterministic tier: LIGHT (all files in entities/)")
-        return "LIGHT"
-
-    # All inbox/ only (queue, archive, null-result) → LIGHT
-    if all(f.startswith("inbox/") for f in changed_files):
-        logger.info("Deterministic tier: LIGHT (all files in inbox/)")
-        return "LIGHT"
-
-    # Any file in core/ or foundations/ → DEEP (structural KB changes)
-    if any(f.startswith("core/") or f.startswith("foundations/") for f in changed_files):
-        logger.info("Deterministic tier: DEEP (touches core/ or foundations/)")
-        return "DEEP"
-
-    # Check diff content for DEEP signals
-    has_challenged_by = False
-    new_files: set[str] = set()
-
-    lines = diff.split("\n")
-    for i, line in enumerate(lines):
-        # Detect new files
-        if line.startswith("--- /dev/null") and i + 1 < len(lines) and lines[i + 1].startswith("+++ b/"):
-            new_files.add(lines[i + 1][6:])
-        # Check for challenged_by field
-        if line.startswith("+") and not line.startswith("+++"):
-            stripped = line[1:].strip()
-            if stripped.startswith("challenged_by:"):
-                has_challenged_by = True
-
-    if has_challenged_by:
-        logger.info("Deterministic tier: DEEP (has challenged_by field)")
-        return "DEEP"
-
-    # NOTE: Modified existing domain claims are NOT auto-DEEP — enrichments
-    # (appending evidence) are common and should be STANDARD. Let Haiku triage
-    # distinguish enrichments from structural changes.
-
-    return None
-
-
-# ─── Verdict parsing ──────────────────────────────────────────────────────
-
-
-def parse_verdict(review_text: str, reviewer: str) -> str:
-    """Parse VERDICT tag from review. Returns 'approve' or 'request_changes'."""
-    upper = reviewer.upper()
-    if f"VERDICT:{upper}:APPROVE" in review_text:
-        return "approve"
-    elif f"VERDICT:{upper}:REQUEST_CHANGES" in review_text:
-        return "request_changes"
-    else:
-        logger.warning("No parseable verdict from %s — treating as request_changes", reviewer)
-        return "request_changes"
-
-
-# Map model-invented tags to valid tags. Models consistently ignore the valid
-# tag list and invent their own. This normalizes them. (Ganymede, Mar 14)
-_TAG_ALIASES: dict[str, str] = {
-    "schema_violation": "frontmatter_schema",
-    "missing_schema_fields": "frontmatter_schema",
-    "missing_schema": "frontmatter_schema",
-    "schema": "frontmatter_schema",
-    "missing_frontmatter": "frontmatter_schema",
-    "redundancy": "near_duplicate",
-    "duplicate": "near_duplicate",
-    "missing_confidence": "confidence_miscalibration",
-    "confidence_error": "confidence_miscalibration",
-    "vague_claims": "scope_error",
-    "unfalsifiable": "scope_error",
-    "unverified_wiki_links": "broken_wiki_links",
-    "unverified-wiki-links": "broken_wiki_links",
-    "missing_wiki_links": "broken_wiki_links",
-    "invalid_wiki_links": "broken_wiki_links",
-    "wiki_link_errors": "broken_wiki_links",
-    "overclaiming": "title_overclaims",
-    "title_overclaim": "title_overclaims",
-    "date_error": "date_errors",
-    "factual_error": "factual_discrepancy",
-    "factual_inaccuracy": "factual_discrepancy",
-}
-
-VALID_ISSUE_TAGS = {"broken_wiki_links", "frontmatter_schema", "title_overclaims",
-                    "confidence_miscalibration", "date_errors", "factual_discrepancy",
-                    "near_duplicate", "scope_error"}
-
-
-def normalize_tag(tag: str) -> str | None:
-    """Normalize a model-generated tag to a valid tag, or None if unrecognizable."""
-    tag = tag.strip().lower().replace("-", "_")
-    if tag in VALID_ISSUE_TAGS:
-        return tag
-    if tag in _TAG_ALIASES:
-        return _TAG_ALIASES[tag]
-    # Fuzzy: check if any valid tag is a substring or vice versa
-    for valid in VALID_ISSUE_TAGS:
-        if valid in tag or tag in valid:
-            return valid
-    return None
-
-
-# ─── Issue parsing ─────────────────────────────────────────────────────────
-
-
-# Keyword patterns for inferring issue tags from unstructured review prose.
-# Conservative: only match unambiguous indicators. Order doesn't matter.
-_PROSE_TAG_PATTERNS: dict[str, list[re.Pattern]] = {
-    "frontmatter_schema": [
-        re.compile(r"frontmatter", re.IGNORECASE),
-        re.compile(r"missing.{0,20}(type|domain|confidence|source|created)\b", re.IGNORECASE),
-        re.compile(r"yaml.{0,10}(invalid|missing|error|schema)", re.IGNORECASE),
-        re.compile(r"required field", re.IGNORECASE),
-        re.compile(r"lacks?.{0,15}(required|yaml|schema|fields)", re.IGNORECASE),
-        re.compile(r"missing.{0,15}(schema|fields|frontmatter)", re.IGNORECASE),
-        re.compile(r"schema.{0,10}(compliance|violation|missing|invalid)", re.IGNORECASE),
-    ],
-    "broken_wiki_links": [
-        re.compile(r"(broken|dead|invalid).{0,10}(wiki.?)?link", re.IGNORECASE),
-        re.compile(r"wiki.?link.{0,20}(not found|missing|broken|invalid|resolv|unverif)", re.IGNORECASE),
-        re.compile(r"\[\[.{1,80}\]\].{0,20}(not found|doesn.t exist|missing)", re.IGNORECASE),
-        re.compile(r"unverified.{0,10}(wiki|link)", re.IGNORECASE),
-    ],
-    "factual_discrepancy": [
-        re.compile(r"factual.{0,10}(error|inaccura|discrepanc|incorrect)", re.IGNORECASE),
-        re.compile(r"misrepresent", re.IGNORECASE),
-    ],
-    "confidence_miscalibration": [
-        re.compile(r"confidence.{0,20}(too high|too low|miscalibrat|overstat|should be)", re.IGNORECASE),
-        re.compile(r"(overstat|understat).{0,20}confidence", re.IGNORECASE),
-    ],
-    "scope_error": [
-        re.compile(r"scope.{0,10}(error|too broad|overscop|unscoped)", re.IGNORECASE),
-        re.compile(r"unscoped.{0,10}(universal|claim)", re.IGNORECASE),
-        re.compile(r"(vague|unfalsifiable).{0,15}(claim|assertion)", re.IGNORECASE),
-        re.compile(r"not.{0,10}(specific|falsifiable|disagreeable).{0,10}enough", re.IGNORECASE),
-    ],
-    "title_overclaims": [
-        re.compile(r"title.{0,20}(overclaim|overstat|too broad)", re.IGNORECASE),
-        re.compile(r"overclaim", re.IGNORECASE),
-    ],
-    "near_duplicate": [
-        re.compile(r"near.?duplicate", re.IGNORECASE),
-        re.compile(r"(very|too) similar.{0,20}(claim|title|existing)", re.IGNORECASE),
-        re.compile(r"duplicate.{0,20}(of|claim|title|existing|information)", re.IGNORECASE),
-        re.compile(r"redundan", re.IGNORECASE),
-    ],
-}
-
-
-def parse_issues(review_text: str) -> list[str]:
-    """Extract issue tags from review.
-
-    First tries structured <!-- ISSUES: tag1, tag2 --> comment with tag normalization.
-    Falls back to keyword inference from prose.
-    """
-    match = re.search(r"<!-- ISSUES: ([^>]+) -->", review_text)
-    if match:
-        raw_tags = [tag.strip() for tag in match.group(1).split(",") if tag.strip()]
-        normalized = []
-        for tag in raw_tags:
-            norm = normalize_tag(tag)
-            if norm and norm not in normalized:
-                normalized.append(norm)
-            elif norm is None:
-                logger.debug("Unrecognized issue tag '%s' — dropped", tag)
-        if normalized:
-            return normalized
-    # Fallback: infer tags from review prose
-    return infer_issues_from_prose(review_text)
-
-
-def infer_issues_from_prose(review_text: str) -> list[str]:
-    """Infer issue tags from unstructured review text via keyword matching.
-
-    Fallback for reviews that reject without structured <!-- ISSUES: --> tags.
-    Conservative: requires at least one unambiguous keyword match per tag.
-    """
-    inferred = []
-    for tag, patterns in _PROSE_TAG_PATTERNS.items():
-        if any(p.search(review_text) for p in patterns):
-            inferred.append(tag)
-    return inferred
-
-
-def classify_issues(issues: list[str]) -> str:
-    """Classify issue tags as 'mechanical', 'substantive', 'mixed', or 'unknown'."""
-    if not issues:
-        return "unknown"
-    mechanical = set(issues) & config.MECHANICAL_ISSUE_TAGS
-    substantive = set(issues) & config.SUBSTANTIVE_ISSUE_TAGS
-    if substantive and not mechanical:
-        return "substantive"
-    if mechanical and not substantive:
-        return "mechanical"
-    if mechanical and substantive:
-        return "mixed"
-    return "unknown"  # tags not in either set
-
-
-# ─── Batch response parsing ───────────────────────────────────────────────
-
-
-def parse_batch_response(response: str, pr_numbers: list[int], agent: str) -> dict[int, str]:
-    """Parse batched domain review into per-PR review sections.
-
-    Returns {pr_number: review_text} for each PR found in the response.
-    Missing PRs are omitted — caller handles fallback.
-    """
-    agent_upper = agent.upper()
-    result: dict[int, str] = {}
-
-    # Split by PR verdict markers: <!-- PR:NNN VERDICT:AGENT:... -->
-    # Each marker terminates the previous PR's section
-    pattern = re.compile(
-        r"<!-- PR:(\d+) VERDICT:" + re.escape(agent_upper) + r":(APPROVE|REQUEST_CHANGES) -->"
-    )
-
-    matches = list(pattern.finditer(response))
-    if not matches:
-        return result
-
-    for i, match in enumerate(matches):
-        pr_num = int(match.group(1))
-        marker_end = match.end()
-
-        # Find the start of this PR's section by looking for the section header
-        # or the end of the previous verdict
-        section_header = f"=== PR #{pr_num}"
-        header_pos = response.rfind(section_header, 0, match.start())
-
-        if header_pos >= 0:
-            # Extract from header to end of verdict marker
-            section_text = response[header_pos:marker_end].strip()
-        else:
-            # No header found — extract from previous marker end to this marker end
-            prev_end = matches[i - 1].end() if i > 0 else 0
-            section_text = response[prev_end:marker_end].strip()
-
-        # Re-format as individual review comment
-        # Strip the batch section header, keep just the review content
-        # Add batch label for traceability
-        pr_nums_str = ", ".join(f"#{n}" for n in pr_numbers)
-        review_text = (
-            f"*(batch review with PRs {pr_nums_str})*\n\n"
-            f"{section_text}\n"
-        )
-        result[pr_num] = review_text
-
-    return result
-
-
-def validate_batch_fanout(
-    parsed: dict[int, str],
-    pr_diffs: list[dict],
-    agent: str,
-) -> tuple[dict[int, str], list[int]]:
-    """Validate batch fan-out for completeness and cross-contamination.
-
-    Returns (valid_reviews, fallback_pr_numbers).
-    - valid_reviews: reviews that passed validation
-    - fallback_pr_numbers: PRs that need individual review (missing or cross-contaminated)
-    """
-    valid: dict[int, str] = {}
-    fallback: list[int] = []
-
-    # Build file map: pr_number → set of path segments for matching.
-    # Use full paths (e.g., "domains/internet-finance/dao.md") not bare filenames
-    # to avoid false matches on short names like "dao.md" or "space.md" (Leo note #3).
-    pr_files: dict[int, set[str]] = {}
-    for pr in pr_diffs:
-        files = set()
-        for line in pr["diff"].split("\n"):
-            if line.startswith("diff --git a/"):
-                path = line.replace("diff --git a/", "").split(" b/")[0]
-                files.add(path)
-                # Also add the last 2 path segments (e.g., "internet-finance/dao.md")
-                # for models that abbreviate paths
-                parts = path.split("/")
-                if len(parts) >= 2:
-                    files.add("/".join(parts[-2:]))
-        pr_files[pr["number"]] = files
-
-    for pr in pr_diffs:
-        pr_num = pr["number"]
-
-        # Completeness check: is there a review for this PR?
-        if pr_num not in parsed:
-            logger.warning("Batch fan-out: PR #%d missing from response — fallback to individual", pr_num)
-            fallback.append(pr_num)
-            continue
-
-        review = parsed[pr_num]
-
-        # Cross-contamination check: does review mention at least one file from this PR?
-        # Use path segments (min 10 chars) to avoid false substring matches on short names.
-        my_files = pr_files.get(pr_num, set())
-        mentions_own_file = any(f in review for f in my_files if len(f) >= 10)
-
-        if not mentions_own_file and my_files:
-            # Check if it references files from OTHER PRs (cross-contamination signal)
-            other_files = set()
-            for other_pr in pr_diffs:
-                if other_pr["number"] != pr_num:
-                    other_files.update(pr_files.get(other_pr["number"], set()))
-            mentions_other = any(f in review for f in other_files if len(f) >= 10)
-
-            if mentions_other:
-                logger.warning(
-                    "Batch fan-out: PR #%d review references files from another PR — cross-contamination, fallback",
-                    pr_num,
-                )
-                fallback.append(pr_num)
-                continue
-            # If it doesn't mention any files at all, could be a generic review — accept it
-            # (some PRs have short diffs where the model doesn't reference filenames)
-
-        valid[pr_num] = review
-
-    return valid, fallback
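
Review note: the removed batch parser splits one model response into per-PR sections using the `<!-- PR:NNN VERDICT:AGENT:... -->` markers. The sketch below is a simplified illustration of that idea, not the removed implementation: it ignores the `=== PR #N` section headers and always slices from the end of the previous marker. The function name `split_batch_review` and the sample text are hypothetical.

```python
import re


def split_batch_review(response: str, agent: str) -> dict[int, tuple[str, str]]:
    """Split a batched review into {pr_number: (verdict, section_text)}.

    Simplified assumption: each section runs from the end of the previous
    verdict marker to the end of its own marker.
    """
    marker = re.compile(
        r"<!-- PR:(\d+) VERDICT:" + re.escape(agent.upper()) + r":(APPROVE|REQUEST_CHANGES) -->"
    )
    sections: dict[int, tuple[str, str]] = {}
    prev_end = 0
    for m in marker.finditer(response):
        pr_num = int(m.group(1))
        verdict = m.group(2).lower()
        # The marker itself stays in the section so the verdict remains traceable.
        sections[pr_num] = (verdict, response[prev_end:m.end()].strip())
        prev_end = m.end()
    return sections


# Hypothetical two-PR batch response, invented for illustration.
sample = (
    "=== PR #101 ===\nLooks fine.\n<!-- PR:101 VERDICT:LEO:APPROVE -->\n"
    "=== PR #102 ===\nTitle overclaims.\n<!-- PR:102 VERDICT:LEO:REQUEST_CHANGES -->\n"
)
parsed = split_batch_review(sample, "leo")
```

A PR whose marker is absent simply does not appear in the result, which is why the removed `validate_batch_fanout` treats missing PR numbers as candidates for individual review.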
--- a/lib/evaluate.py
+++ b/lib/evaluate.py
--- a/lib/extract.py
+++ b/lib/extract.py
@ -33,12 +33,10 @@ from pathlib import Path

 from . import config
 from .costs import record_usage
-from .db import classify_source_channel
 from .domains import agent_for_domain
 from .extraction_prompt import build_extraction_prompt
 from .forgejo import api as forgejo_api
 from .llm import openrouter_call
-from .connect import connect_new_claims
 from .post_extract import load_existing_claims_from_repo, validate_and_fix_claims
 from .worktree_lock import async_main_worktree_lock

@ -102,28 +100,14 @@ def _get_kb_index(domain: str) -> str:

    # Fallback: build from repo
    main = config.MAIN_WORKTREE
-    sections = []
-
-    # Domain claims
    claims = []
    domain_dir = main / "domains" / domain
    if domain_dir.is_dir():
        for f in domain_dir.glob("*.md"):
            if not f.name.startswith("_"):
-                claims.append(f"- {f.stem}")
-    sections.append(f"## Claims in domains/{domain}/\n" + "\n".join(sorted(claims)))
+                claims.append(f"- {f.name}")

-    # Domain entities — so the LLM knows what entities exist for connections
-    entities = []
-    entity_dir = main / "entities" / domain
-    if entity_dir.is_dir():
-        for f in entity_dir.glob("*.md"):
-            if not f.name.startswith("_"):
-                entities.append(f"- {f.stem}")
-    if entities:
-        sections.append(f"## Entities in entities/{domain}/\n" + "\n".join(sorted(entities)))
-
-    text = "\n\n".join(sections)
+    text = f"## Claims in domains/{domain}/\n" + "\n".join(sorted(claims))
    _kb_index_cache[domain] = text
    return text

@ -230,46 +214,18 @@ def _parse_extraction_json(text: str) -> dict | None:
        return None


-def _build_claim_content(claim: dict, agent: str, source_format: str | None = None, source_file: str = "") -> str:
+def _build_claim_content(claim: dict, agent: str) -> str:
    """Build claim markdown file content from extraction JSON."""
    today = date.today().isoformat()
    domain = claim.get("domain", "")
    title = claim.get("title", claim.get("filename", "").replace("-", " ").replace(".md", ""))
    description = claim.get("description", "")
-    raw_confidence = claim.get("confidence", "experimental")
-    _CONFIDENCE_MAP = {
-        "proven": "proven", "likely": "likely", "experimental": "experimental",
-        "speculative": "speculative", "high": "likely", "medium": "experimental",
-        "low": "speculative", "very high": "proven", "moderate": "experimental",
-    }
-    confidence = _CONFIDENCE_MAP.get(raw_confidence.lower().strip(), "experimental") if isinstance(raw_confidence, str) else "experimental"
+    confidence = claim.get("confidence", "experimental")
    source_ref = claim.get("source", "")
    body = claim.get("body", "")
    scope = claim.get("scope", "")
    sourcer = claim.get("sourcer", "")
-    related_claims = claim.get("related_claims", [])
-    connections = claim.get("connections", [])
-
-    edge_fields = {"supports": [], "challenges": [], "related": []}
-    for conn in connections:
-        target = conn.get("target", "")
-        rel = conn.get("relationship", "related")
-        if target and rel in edge_fields:
-            target = target.replace(".md", "")
-            if target not in edge_fields[rel]:
-                edge_fields[rel].append(target)
-    for r in related_claims[:5]:
-        r_clean = r.replace(".md", "").strip("[]").strip()
-        if r_clean and r_clean not in edge_fields["related"]:
-            edge_fields["related"].append(r_clean)
-
-    edge_lines = []
-    for edge_type in ("supports", "challenges", "related"):
-        targets = edge_fields[edge_type]
-        if targets:
-            edge_lines.append(f"{edge_type}:")
-            for t in targets:
-                edge_lines.append(f"  - {t}")
+    related = claim.get("related_claims", [])

    lines = [
        "---",
@ -282,16 +238,14 @@ def _build_claim_content(claim: dict, agent: str, source_format: str | None = No
        f"created: {today}",
        f"agent: {agent}",
    ]
-    if source_file:
-        lines.append(f"sourced_from: {source_file}")
    if scope:
        lines.append(f"scope: {scope}")
    if sourcer:
        lines.append(f'sourcer: "{sourcer}"')
-    if source_format and source_format.lower() == "conversation":
-        lines.append("verified: false")
-        lines.append("source_type: conversation")
-    lines.extend(edge_lines)
+    if related:
+        lines.append("related_claims:")
+        for r in related:
+            lines.append(f'  - "[[{r}]]"')
    lines.append("---")
    lines.append("")
    lines.append(f"# {title}")
@ -310,14 +264,6 @@ def _build_entity_content(entity: dict, domain: str) -> str:
    description = entity.get("content", "")

    if description:
-        # Strip code fences the LLM may have wrapped the content in
-        description = description.strip()
-        if description.startswith("```"):
-            first_nl = description.find("\n")
-            if first_nl != -1:
-                description = description[first_nl + 1:]
-        if description.endswith("```"):
-            description = description[:-3].rstrip()
        return description

    name = entity.get("filename", "").replace("-", " ").replace(".md", "").title()
@ -354,7 +300,6 @@ async def _extract_one_source(
    rationale = fm.get("rationale")
    intake_tier = fm.get("intake_tier")
    proposed_by = fm.get("proposed_by")
-    source_format = fm.get("format")

    logger.info("Extracting: %s (domain: %s, agent: %s)", source_file, domain, agent_name)

@ -378,7 +323,6 @@ async def _extract_one_source(
        proposed_by=proposed_by,
        prior_art=prior_art,
        previous_feedback=feedback,
-        source_format=source_format,
    )

    # 4. Call LLM (OpenRouter — not Claude Max CLI)
@ -432,10 +376,9 @@ async def _extract_one_source(
        filename = c.get("filename", "")
        if not filename:
            continue
-        filename = Path(filename).name  # Strip directory components — LLM output may contain path traversal
        if not filename.endswith(".md"):
            filename += ".md"
-        content = _build_claim_content(c, agent_lower, source_format=source_format, source_file=f"{domain}/{source_file}" if domain else source_file)
+        content = _build_claim_content(c, agent_lower)
        claim_files.append({"filename": filename, "domain": c.get("domain", domain), "content": content})

    # Build entity file contents
@ -444,7 +387,6 @@ async def _extract_one_source(
        filename = e.get("filename", "")
        if not filename:
            continue
-        filename = Path(filename).name  # Strip directory components — LLM output may contain path traversal
        if not filename.endswith(".md"):
            filename += ".md"
        action = e.get("action", "create")
@ -452,31 +394,6 @@ async def _extract_one_source(
            content = _build_entity_content(e, domain)
            entity_files.append({"filename": filename, "domain": domain, "content": content})

-    # 6.5. Pre-filter near-duplicates BEFORE post-extract validation
-    # Uses same SequenceMatcher threshold as tier0. Catches duplicates cheaply ($0)
-    # before they create PRs and burn eval cycles.
-    if claim_files and existing_claims:
-        from difflib import SequenceMatcher as _SM
-        _DEDUP_THRESHOLD = 0.85
-        filtered = []
-        for cf in claim_files:
-            title_lower = Path(cf["filename"]).stem.replace("-", " ").lower()
-            title_words = set(title_lower.split()[:6])
-            is_dup = False
-            for existing in existing_claims:
-                existing_lower = existing.replace("-", " ").lower()
-                if len(title_words & set(existing_lower.split()[:6])) < 2:
-                    continue
-                if _SM(None, title_lower, existing_lower).ratio() >= _DEDUP_THRESHOLD:
-                    logger.info("Extract-dedup: skipping near-duplicate '%s' (matches '%s')", cf["filename"], existing)
-                    is_dup = True
-                    break
-            if not is_dup:
-                filtered.append(cf)
-        if len(filtered) < len(claim_files):
-            logger.info("Extract-dedup: filtered %d/%d near-duplicates", len(claim_files) - len(filtered), len(claim_files))
-        claim_files = filtered
-
    # 7. Post-extraction validation
    if claim_files:
        kept_claims, rejected_claims, stats = validate_and_fix_claims(
@ -491,19 +408,8 @@ async def _extract_one_source(
            )
        claim_files = kept_claims

-    if not claim_files and not entity_files and not enrichments:
-        logger.info("No valid claims/entities/enrichments after validation for %s — archiving as null-result", source_file)
-        # Mark DB as null_result so queue scan won't re-extract even if file stays in queue
-        # (the main-worktree push in _archive_source frequently fails — DB is authoritative).
-        try:
-            conn.execute(
-                """INSERT INTO sources (path, status, updated_at) VALUES (?, 'null_result', datetime('now'))
-                   ON CONFLICT(path) DO UPDATE SET status='null_result', updated_at=datetime('now')""",
-                (source_path,),
-            )
-            conn.commit()
-        except Exception:
-            logger.debug("Failed to mark source as null_result in DB", exc_info=True)
+    if not claim_files and not entity_files:
+        logger.info("No valid claims/entities after validation for %s — archiving as null-result", source_file)
        await _archive_source(source_path, domain, "null-result")
        return 0, 0

@ -541,83 +447,13 @@ async def _extract_one_source(
        fpath.write_text(ef["content"], encoding="utf-8")
        files_written.append(f"entities/{domain}/{ef['filename']}")

-    # Write enrichments as modifications to existing claim files
-    for enr in enrichments:
-        target = enr.get("target_file", "")
-        evidence = enr.get("evidence", "")
-        enr_type = enr.get("type", "extend")  # confirm|challenge|extend
-        source_ref = enr.get("source_ref", source_file)
-        if not target or not evidence:
-            continue
-        # Find the target claim file in the worktree (search domains/)
-        target_stem = Path(target.replace(".md", "")).name
-        found = None
-        for domain_dir in (worktree / "domains").iterdir():
-            candidate = domain_dir / f"{target_stem}.md"
-            if candidate.exists():
-                found = candidate
-                break
-        if not found:
-            logger.debug("Enrichment target %s not found in worktree", target)
-            continue
-        # Append enrichment evidence to the claim file
-        existing = found.read_text(encoding="utf-8")
-        label = {"confirm": "Supporting", "challenge": "Challenging", "extend": "Extending"}.get(enr_type, "Additional")
-        enrichment_block = f"\n\n## {label} Evidence\n\n**Source:** {source_ref}\n\n{evidence}\n"
-        found.write_text(existing + enrichment_block, encoding="utf-8")
-        rel_path = str(found.relative_to(worktree))
-        if rel_path not in files_written:
-            files_written.append(rel_path)
-        logger.info("Enrichment applied to %s (%s)", target, enr_type)
-
    if not files_written:
        logger.info("No files written for %s — cleaning up", source_file)
-        # Path B null-result: enrichments existed but all targets missing in worktree.
-        # No PR, no cooldown match — without DB update this re-extracts every 60s.
-        # (Ganymede review, commit 469cb7f follow-up.)
-        try:
-            conn.execute(
-                """INSERT INTO sources (path, status, updated_at) VALUES (?, 'null_result', datetime('now'))
-                   ON CONFLICT(path) DO UPDATE SET status='null_result', updated_at=datetime('now')""",
-                (source_path,),
-            )
-            conn.commit()
-        except Exception:
-            logger.debug("Failed to mark source as null_result (path B)", exc_info=True)
        await _git("checkout", "main", cwd=str(EXTRACT_WORKTREE))
        await _git("branch", "-D", branch, cwd=str(EXTRACT_WORKTREE))
        await _archive_source(source_path, domain, "null-result")
        return 0, 0

-    # Post-write: connect new claims to existing KB via vector search (non-fatal)
-    claim_paths = [str(worktree / f) for f in files_written if f.startswith("domains/")]
-    if claim_paths:
-        try:
-            connect_stats = connect_new_claims(claim_paths)
-            if connect_stats["connected"] > 0:
-                logger.info(
-                    "Extract-connect: %d/%d claims → %d edges",
-                    connect_stats["connected"], len(claim_paths), connect_stats["edges_added"],
-                )
-        except Exception:
-            logger.warning("Extract-connect failed (non-fatal)", exc_info=True)
-
-    # Archive the source WITHIN the extract branch (not via separate push on main).
-    # Prevents the runaway-extraction race: when archive-to-main push fails (non-FF,
-    # non-pushable worktree state), file returns to queue and gets re-extracted every
-    # cycle. Moving the archive into the extract branch makes it atomic with the PR
-    # merge — when the PR merges, the source is archived automatically.
-    try:
-        archive_rel = _archive_source_in_worktree(
-            worktree, source_path, domain, "processed", agent_lower, extract_model,
-        )
-        if archive_rel:
-            files_written.append(archive_rel["new"])
-            # The queue file was deleted; git add handles the removal
-            await _git("add", "inbox/queue/", cwd=str(EXTRACT_WORKTREE))
-    except Exception:
-        logger.exception("In-branch archive failed for %s (continuing)", source_file)
-
    # Stage and commit
    for f in files_written:
        await _git("add", f, cwd=str(EXTRACT_WORKTREE))
@ -700,32 +536,17 @@ async def _extract_one_source(
            for c in claims_raw if c.get("title") or c.get("filename")
        )

-        # Success path: mark source as 'extracting' so queue scan's DB-status filter
-        # skips it between PR creation and merge. Without this, cooldown is load-bearing
-        # (Ganymede review, commit 469cb7f follow-up).
-        try:
-            conn.execute(
-                """INSERT INTO sources (path, status, updated_at) VALUES (?, 'extracting', datetime('now'))
-                   ON CONFLICT(path) DO UPDATE SET status='extracting', updated_at=datetime('now')""",
-                (source_path,),
-            )
-            conn.commit()
-        except Exception:
-            logger.debug("Failed to mark source as extracting", exc_info=True)
-
        # Upsert: if discover_external_prs already created the row, update it;
        # if not, create a partial row that discover will complete.
-        source_channel = classify_source_channel(branch)
        try:
            conn.execute(
-                """INSERT INTO prs (number, branch, status, submitted_by, source_path, description, source_channel)
-                   VALUES (?, ?, 'open', ?, ?, ?, ?)
+                """INSERT INTO prs (number, branch, status, submitted_by, source_path, description)
+                   VALUES (?, ?, 'open', ?, ?, ?)
                   ON CONFLICT(number) DO UPDATE SET
                     submitted_by = excluded.submitted_by,
                     source_path = excluded.source_path,
-                     description = COALESCE(excluded.description, prs.description),
-                     source_channel = COALESCE(prs.source_channel, excluded.source_channel)""",
-                (pr_num, branch, contributor, source_path, claim_titles, source_channel),
+                     description = COALESCE(excluded.description, prs.description)""",
+                (pr_num, branch, contributor, source_path, claim_titles),
            )
            conn.commit()
        except Exception:
@ -746,69 +567,12 @@ async def _extract_one_source(
    # Clean up extract worktree
    await _git("checkout", "main", cwd=str(EXTRACT_WORKTREE))

-    # Note: source archival happened in-branch before commit (see _archive_source_in_worktree).
-    # Do NOT call _archive_source() here — the broken main-worktree-push path caused the
-    # runaway extraction bug. Archive is now atomic with PR merge.
+    # 10. Archive source on main
+    await _archive_source(source_path, domain, "processed", agent_lower)

    return 1, 0


-def _archive_source_in_worktree(
-    worktree: Path,
-    source_path: str,
-    domain: str,
-    status: str,
-    agent: str | None,
-    extraction_model: str,
-) -> dict | None:
-    """Move source file from inbox/queue/ to inbox/archive/<domain>/ WITHIN extract worktree.
-
-    Updates frontmatter (status, processed_by, processed_date, extraction_model) and
-    returns {"old": old_rel_path, "new": new_rel_path} or None if not found.
-
-    The caller commits this change as part of the extract branch, so the archive lands
-    atomically with the PR merge — no separate push on main required.
-    """
-    queue_path = worktree / source_path
-    if not queue_path.exists():
-        logger.warning("Source %s not found in worktree queue — skipping in-branch archive", source_path)
-        return None
-
-    if status == "null-result":
-        dest_dir = worktree / "inbox" / "null-result"
-    else:
-        dest_dir = worktree / "inbox" / "archive" / (domain or "unknown")
-    dest_dir.mkdir(parents=True, exist_ok=True)
-    dest_path = dest_dir / queue_path.name
-
-    content = queue_path.read_text(encoding="utf-8")
-    today = date.today().isoformat()
-    content = re.sub(r"^status: unprocessed", f"status: {status}", content, flags=re.MULTILINE)
-    if agent and "processed_by:" not in content:
-        content = re.sub(
-            r"(^status: \w+)",
-            rf"\1\nprocessed_by: {agent}\nprocessed_date: {today}",
-            content,
-            count=1,
-            flags=re.MULTILINE,
-        )
-    if "extraction_model:" not in content:
-        content = re.sub(
-            r"(^status: \w+.*?)(\n---)",
-            rf'\1\nextraction_model: "{extraction_model}"\2',
-            content,
-            count=1,
-            flags=re.MULTILINE | re.DOTALL,
-        )
-
-    dest_path.write_text(content, encoding="utf-8")
-    queue_path.unlink()
-
-    old_rel = str(queue_path.relative_to(worktree))
-    new_rel = str(dest_path.relative_to(worktree))
-    return {"old": old_rel, "new": new_rel}
-
-
 async def _archive_source(
    source_path: str,
    domain: str,
@ -900,31 +664,18 @@ async def extract_cycle(conn, max_workers=None) -> tuple[int, int]:
    if not queue_dir.exists():
        return 0, 0

-    # DB-authoritative status filter: exclude sources where DB records non-unprocessed state.
-    # File frontmatter alone isn't reliable — archive pushes can fail, leaving stale file state.
-    # The sources table is the authoritative record of whether a source has been processed.
-    db_non_unprocessed = {
-        r["path"] for r in conn.execute(
-            "SELECT path FROM sources WHERE status != 'unprocessed'"
-        ).fetchall()
-    }
-
    unprocessed = []
    for f in sorted(queue_dir.glob("*.md")):
        try:
            content = f.read_text(encoding="utf-8")
            fm = _parse_source_frontmatter(content)
-            if fm.get("status") != "unprocessed":
-                continue
-            rel_path = str(f.relative_to(main))
-            if rel_path in db_non_unprocessed:
-                continue
-            unprocessed.append((rel_path, content, fm))
+            if fm.get("status") == "unprocessed":
+                unprocessed.append((str(f.relative_to(main)), content, fm))
        except Exception:
            logger.debug("Failed to read source %s", f, exc_info=True)

-    # Don't early-return here — re-extraction sources may exist even when queue is empty
-    # (the re-extraction check runs after open-PR filtering below)
+    if not unprocessed:
+        return 0, 0

    # Filter out sources that already have open extraction PRs
    open_pr_slugs = set()
@ -956,44 +707,10 @@ async def extract_cycle(conn, max_workers=None) -> tuple[int, int]:
        if skipped:
            logger.info("Skipped %d source(s) with existing open PRs", skipped)

-    # Cooldown: skip sources with ANY PR in last EXTRACTION_COOLDOWN_HOURS.
-    # Defense-in-depth for DB-status filter — catches the window between PR
-    # creation and DB status update if anything races.
-    if unprocessed:
-        cooldown_hours = config.EXTRACTION_COOLDOWN_HOURS
-        recent_source_paths = {
-            r["source_path"] for r in conn.execute(
-                """SELECT DISTINCT source_path FROM prs
-                   WHERE source_path IS NOT NULL
-                   AND created_at > datetime('now', ? || ' hours')""",
-                (f"-{cooldown_hours}",),
-            ).fetchall() if r["source_path"]
-        }
-        if recent_source_paths:
-            before = len(unprocessed)
-            unprocessed = [
-                (sp, c, f) for sp, c, f in unprocessed
-                if sp not in recent_source_paths
-            ]
-            cooled = before - len(unprocessed)
-            if cooled:
-                logger.info("Cooldown: skipped %d source(s) with PRs in last %dh", cooled, cooldown_hours)
-
-    # ── Check for re-extraction sources (must run even when queue is empty) ──
-    reextract_rows = conn.execute(
-        """SELECT path, feedback FROM sources
-           WHERE status = 'needs_reextraction' AND feedback IS NOT NULL
-           ORDER BY updated_at ASC LIMIT ?""",
-        (max(1, MAX_SOURCES - len(unprocessed)),),
-    ).fetchall()
-
-    if not unprocessed and not reextract_rows:
+    if not unprocessed:
        return 0, 0

-    if unprocessed:
-        logger.info("Extract cycle: %d unprocessed source(s) found, processing up to %d", len(unprocessed), MAX_SOURCES)
-    if reextract_rows:
-        logger.info("Extract cycle: %d source(s) queued for re-extraction", len(reextract_rows))
+    logger.info("Extract cycle: %d unprocessed source(s) found, processing up to %d", len(unprocessed), MAX_SOURCES)

    # Load existing claims for dedup
    existing_claims = load_existing_claims_from_repo(str(main))
@ -1006,6 +723,14 @@ async def extract_cycle(conn, max_workers=None) -> tuple[int, int]:
    total_ok = 0
    total_err = 0

+    # ── Re-extraction: pick up sources that failed eval and have feedback ──
+    reextract_rows = conn.execute(
+        """SELECT path, feedback FROM sources
+           WHERE status = 'needs_reextraction' AND feedback IS NOT NULL
+           ORDER BY updated_at ASC LIMIT ?""",
+        (max(1, MAX_SOURCES - len(unprocessed)),),
+    ).fetchall()
+
    for row in reextract_rows:
        reex_path = row["path"]
        # Source was archived — read from archive location
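Note on the removed cooldown filter in the hunk above: it builds the time window by binding the hour count as a parameter and concatenating it into a SQLite `datetime` modifier. The following is a minimal sketch of that query in isolation, assuming a reduced `prs` table with only `source_path` and `created_at` columns. The names `recent_source_paths` and `make_demo_db` are hypothetical helpers for illustration and are not part of the pipeline.

```python
import sqlite3


def recent_source_paths(conn: sqlite3.Connection, cooldown_hours: int) -> set:
    """Return source paths that have any PR created within the cooldown window."""
    rows = conn.execute(
        """SELECT DISTINCT source_path FROM prs
           WHERE source_path IS NOT NULL
           AND created_at > datetime('now', ? || ' hours')""",
        # The bound value "-24" is concatenated in SQL to the modifier "-24 hours".
        (f"-{cooldown_hours}",),
    ).fetchall()
    return {r[0] for r in rows}


def make_demo_db() -> sqlite3.Connection:
    """Hypothetical in-memory fixture: one recent PR, one old PR, one without a path."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prs (source_path TEXT, created_at TEXT)")
    conn.execute("INSERT INTO prs VALUES ('inbox/a.md', datetime('now', '-1 hours'))")
    conn.execute("INSERT INTO prs VALUES ('inbox/b.md', datetime('now', '-48 hours'))")
    conn.execute("INSERT INTO prs VALUES (NULL, datetime('now'))")
    return conn
```

Binding the number and appending the unit in SQL keeps the query parameterized, so the hour count never has to be formatted into the statement text itself.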
--- a/lib/extraction_prompt.py
+++ b/lib/extraction_prompt.py
@ -6,7 +6,7 @@ The extraction prompt focuses on WHAT to extract:
 - Identify entity data
 - Check for duplicates against KB index

-Mechanical enforcement (frontmatter format, dates, filenames)
+Mechanical enforcement (frontmatter format, wiki links, dates, filenames)
 is handled by post_extract.py AFTER the LLM returns.

 Design principle (Leo): mechanical rules in code, judgment in prompts.
@ -29,7 +29,6 @@ def build_extraction_prompt(
    proposed_by: str | None = None,
    prior_art: list[dict] | None = None,
    previous_feedback: dict | None = None,
-    source_format: str | None = None,
 ) -> str:
    """Build the lean extraction prompt.

@ -46,7 +45,6 @@ def build_extraction_prompt(
        prior_art: Qdrant search results — existing claims semantically similar to this source.
                   Each dict has: claim_title, claim_path, description, score.
                   Injected as connection candidates for extract-time linking.
-        source_format: Source format hint (e.g. "conversation" for Telegram chats).

    Returns:
        The complete prompt string
@ -98,7 +96,7 @@ Set `contributor_thesis_extractable: true` if you extracted the contributor's th
                    "factual_discrepancy": "Check facts carefully — verify dates, numbers, and attributions against the source text.",
                    "near_duplicate": "Check the KB index more carefully — this claim may already exist. Prefer enrichment over duplication.",
                    "scope_error": "Scope claims correctly — don't mix structural, functional, and causal claims in one.",
-                    "broken_wiki_links": "Do NOT use [[wiki links]] in body text. Use the connections and related_claims JSON fields instead.",
+                    "broken_wiki_links": "Ensure wiki links reference real entities/claims in the KB.",
                }
                guidance = issue_guidance.get(issue, f"Address: {issue}")
                feedback_lines.append(f"- **{issue}**: {guidance}")
@ -119,7 +117,6 @@ Set `contributor_thesis_extractable: true` if you extracted the contributor's th
            "These existing claims are topically related to this source. For each NEW claim you extract,",
            "check this list and specify connections in the `connections` array.\n",
        ]
-        high_sim = []
        for i, pa in enumerate(prior_art[:10], 1):
            title = pa.get("claim_title", "untitled")
            path = pa.get("claim_path", "")
@ -129,103 +126,11 @@ Set `contributor_thesis_extractable: true` if you extracted the contributor's th
            pa_lines.append(f"{i}. **{title}** (`{filename}`, similarity: {score:.2f})")
            if desc:
                pa_lines.append(f"   {desc}")
-            if score >= 0.75:
-                high_sim.append(title)
        pa_lines.append("")
-        if high_sim:
-            pa_lines.append("**WARNING — HIGH SIMILARITY MATCHES (score >= 0.75):**")
-            pa_lines.append("The following existing claims are very similar to themes in this source.")
-            pa_lines.append("Do NOT extract new claims that restate these — use ENRICHMENT instead:")
-            for hs in high_sim:
-                pa_lines.append(f"  - {hs}")
-            pa_lines.append("")
        connection_candidates = "\n".join(pa_lines)
    else:
        connection_candidates = ""

-    # Build conversation extraction section (for Telegram/chat sources)
-    if source_format and source_format.lower() == "conversation":
-        conversation_section = """
-## Conversation Source — Special Extraction Rules
-
-This source is a **conversation between a human domain expert and an AI agent**.
-The extraction rules are DIFFERENT from article sources:
-
-### Who said what matters
-
-- **The human (@m3taversal / contributor)** is the domain expert. Their statements carry
-  authority — especially corrections, pushback, and factual assertions.
-- **The AI agent's responses** are secondary. They are useful for context (what was being
-  discussed) and for confirming when the human's correction landed (look for "you're right",
-  "fair point", confidence drops).
-
-### Corrections are the HIGHEST-VALUE content
-
-When the human says "that's wrong", "not true", "you're wrong", "out of date", or similar:
-
-1. **Extract the correction as a claim or enrichment.** The human is correcting the KB's
-   understanding. This is precisely what the KB needs.
-2. **The correction itself IS the claim.** "Curated launches had significantly more committed
-   capital than permissionless launches" is a testable, disagreeable proposition — extract it
-   AS A CLAIM, not just an enrichment. If the correction states something specific enough to
-   disagree with, it's a claim. Extract it even if it's only one sentence.
-3. **Short corrections are HIGH value, not low value.** A 15-word correction that fixes a
-   factual error is worth more than a 500-word article that confirms what we already know.
-   NEVER null-result a conversation just because the human's message is short.
-4. **Map corrections to existing claims.** Search the KB index for claims that the correction
-   challenges. Output BOTH a new claim (the corrected understanding) AND an enrichment
-   (type: "challenge") targeting the existing claim. The enrichment links the correction
-   to what it corrects; the claim captures the corrected knowledge as a standalone proposition.
-
-### Bot LEARNING lines are extraction hints
-
-When the AI agent includes a `LEARNING:` line, it's a pre-extracted correction. Use it as
-a starting point — but reformulate it as a proper claim (the LEARNING line is often too
-casual or too specific to the conversation context).
-
-### Bot CONFIDENCE drops are signals
-
-When the AI agent drops its confidence score after a correction, that CONFIRMS the human
-was right. Low confidence (0.3-0.5) after pushback = strong signal the correction is valid.
-
-### Trust hierarchy for numbers and specifics
-
-**CRITICAL:** Neither the human NOR the AI agent should be treated as authoritative sources
-for specific numbers, dates, dollar amounts, or statistics UNLESS they cite a verifiable
-external source (on-chain data, official announcements, published reports).
-
-- **Bot-generated numbers are ALWAYS unverified.** When the AI agent says "$25.6M committed
-  capital" or "15x oversubscription" — these are the bot's best guess, NOT verified data.
-  NEVER extract bot-generated numbers as evidence in a claim.
-- **Human-asserted numbers are ALSO unverified** unless they cite a source. "It raised $11.4M"
-  from the human is a claim about a number, not proof of the number.
-- **Extract the DIRECTIONAL insight, not the specific figures.** "Curated launches attracted
-  significantly more committed capital than permissionless launches" is extractable.
-  "$25.6M vs $11.4M" is not — unless the conversation cites where those numbers come from.
-- **If specific figures are important to the claim, flag them.** Add a note in the claim body:
-  "Note: specific figures cited in conversation require verification against on-chain data."
-
-The goal: capture WHAT the human is asserting (the mechanism, the direction, the pattern)
-without laundering unverified numbers into the knowledge base as if they were evidence.
-
-### Anti-circularity rule
-
-If the AI agent is simply reflecting the human's thesis back (restating what the human said
-in different words), do NOT extract that as a claim sourced from the agent. That's circular.
-Only extract claims that either:
-- Represent the human's ORIGINAL assertion (source it to the human)
-- Introduce genuinely NEW information from the agent's knowledge (source it to the agent + context)
-
-### Retrieval-only conversations → null_result
-
-If the conversation is purely a lookup request ("what is X", "give me a list of Y",
-"what's the market cap of Z") with no analytical content, corrections, or novel claims,
-return an empty extraction (null_result). The dividing line: did the human ASSERT something
-or only ASK something?
-"""
-    else:
-        conversation_section = ""
-
    return f"""You are {agent}, extracting knowledge from a source for TeleoHumanity's collective knowledge base.

 ## Your Task
@ -290,16 +195,14 @@ Single source = experimental at most. Pitch rhetoric or marketing copy = specula
 **File:** {source_file}

 {source_content}
-{conversation_section}{contributor_directive}{previous_feedback_section}{connection_candidates}
-## KB Index (existing claims and entities — check for duplicates, enrichment targets, and connections)
+{contributor_directive}{previous_feedback_section}{connection_candidates}
+## KB Index (existing claims — check for duplicates and enrichment targets)

 {kb_index}

 ## Output Format

-Return valid JSON. The post-processor handles frontmatter formatting and dates — focus on the intellectual content.
-
-**Do NOT use [[wiki links]] in body text.** Express all cross-references through the `connections` and `related_claims` JSON fields instead. Inline [[links]] are stripped by the post-processor — use the structured JSON fields which capture relationship type and reason.
+Return valid JSON. The post-processor handles frontmatter formatting, wiki links, and dates — focus on the intellectual content.

 ```json
 {{
--- a/lib/fixer.py
+++ b/lib/fixer.py
@ -22,7 +22,6 @@ import logging
 from pathlib import Path

 from . import config, db
-from .pr_state import close_pr, reset_for_reeval, start_fixing
 from .validate import WIKI_LINK_RE, load_existing_claims

 logger = logging.getLogger("pipeline.fixer")
@ -63,9 +62,19 @@ async def _fix_wiki_links_in_pr(conn, pr_number: int) -> dict:
    between new claims in the same PR are preserved.
    """
    # Atomic claim — prevent concurrent fixers and evaluators
-    if not start_fixing(conn, pr_number):
+    cursor = conn.execute(
+        "UPDATE prs SET status = 'fixing', last_attempt = datetime('now') WHERE number = ? AND status = 'open'",
+        (pr_number,),
+    )
+    if cursor.rowcount == 0:
        return {"pr": pr_number, "skipped": True, "reason": "not_open"}

+    # Increment fix_attempts
+    conn.execute(
+        "UPDATE prs SET fix_attempts = COALESCE(fix_attempts, 0) + 1 WHERE number = ?",
+        (pr_number,),
+    )
+
    # Get PR branch from DB first, fall back to Forgejo API
    row = conn.execute("SELECT branch FROM prs WHERE number = ?", (pr_number,)).fetchone()
    branch = row["branch"] if row and row["branch"] else None
@ -168,7 +177,18 @@ async def _fix_wiki_links_in_pr(conn, pr_number: int) -> dict:
        # Reset eval state BEFORE push — if daemon crashes between push and
        # reset, the PR would be permanently stuck at max eval_attempts.
        # Reset-first: worst case is one wasted eval cycle on old content.
-        reset_for_reeval(conn, pr_number)
+        conn.execute(
+            """UPDATE prs SET
+               status = 'open',
+               eval_attempts = 0,
+               eval_issues = '[]',
+               tier0_pass = NULL,
+               domain_verdict = 'pending',
+               leo_verdict = 'pending',
+               last_error = NULL
+               WHERE number = ?""",
+            (pr_number,),
+        )

        rc, out = await _git("push", "origin", branch, cwd=worktree_path, timeout=30)
        if rc != 0:
@ -222,11 +242,15 @@ async def fix_cycle(conn, max_workers=None) -> tuple[int, int]:
            try:
                await _gc_forgejo("POST", _gc_repo_path(f"issues/{pr_num}/comments"),
                                  {"body": "Auto-closed: fix budget exhausted. Source will be re-extracted."})
-                await close_pr(conn, pr_num, last_error='fix budget exhausted — auto-closed')
+                await _gc_forgejo("PATCH", _gc_repo_path(f"pulls/{pr_num}"), {"state": "closed"})
                if branch:
                    await _gc_forgejo("DELETE", _gc_repo_path(f"branches/{branch}"))
            except Exception as e:
                logger.warning("GC: failed to close PR #%d on Forgejo: %s", pr_num, e)
+            conn.execute(
+                "UPDATE prs SET status = 'closed', last_error = 'fix budget exhausted — auto-closed' WHERE number = ?",
+                (pr_num,),
+            )
        logger.info("GC: closed %d exhausted PRs (DB + Forgejo + branch cleanup)", len(gc_rows))

    batch_limit = min(max_workers or config.MAX_FIX_PER_CYCLE, config.MAX_FIX_PER_CYCLE)
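Note on the "atomic claim" in the `lib/fixer.py` hunks above: the inlined `UPDATE ... WHERE number = ? AND status = 'open'` acts as a compare-and-set, and `cursor.rowcount` tells the caller whether it won the claim. The following is a minimal sketch of that pattern on its own, assuming a reduced `prs` table. The names `try_claim` and `make_demo_db` are hypothetical and only mirror the statement shown in the diff; this is not the removed `start_fixing` helper, whose body is not visible here.

```python
import sqlite3


def try_claim(conn: sqlite3.Connection, pr_number: int) -> bool:
    """Move a PR from 'open' to 'fixing'; return True only if this caller made the change."""
    cursor = conn.execute(
        "UPDATE prs SET status = 'fixing', last_attempt = datetime('now') "
        "WHERE number = ? AND status = 'open'",
        (pr_number,),
    )
    # rowcount is 1 when the row was still 'open', 0 when it was already
    # claimed, closed, or does not exist.
    return cursor.rowcount == 1


def make_demo_db() -> sqlite3.Connection:
    """Hypothetical in-memory fixture with a single open PR."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prs (number INTEGER PRIMARY KEY, status TEXT, last_attempt TEXT)"
    )
    conn.execute("INSERT INTO prs (number, status) VALUES (1, 'open')")
    return conn
```

Because the status check and the status change happen in one statement, a second worker running the same call afterwards sees `rowcount == 0` and can skip the PR, which is the `"reason": "not_open"` branch in the diff.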
--- a/lib/frontmatter.py
+++ b/lib/frontmatter.py
@ -1,142 +0,0 @@
-"""Pure YAML frontmatter parsing and serialization for claim/entity files.
-
-Shared by merge (reweave merge, reciprocal edges) and reweave scripts.
-All functions are pure — zero I/O, zero async, zero DB.
-
-Extracted from merge.py Phase 6 of decomposition (Ganymede-approved plan).
-"""
-
-import yaml
-
-
-def _yaml_quote(value: str) -> str:
-    """Quote a YAML list value if it contains characters that would break parsing."""
-    s = str(value)
-    if ":" in s or s.startswith(("{", "[", "'", '"', "*", "&", "!", "|", ">")):
-        escaped = s.replace('"', '\\"')
-        return f'"{escaped}"'
-    return s
-
-
-# Edge field names recognized in claim frontmatter.
-# Order matters: serialize_edge_fields writes them in this order when appending new fields.
-REWEAVE_EDGE_FIELDS = ("supports", "challenges", "challenged_by", "depends_on", "related", "reweave_edges")
-
-# Reciprocal edge mapping: when A has edge_type → B, B gets reciprocal → A.
-# When A supports B, B also supports A (approximately symmetric).
-# When A challenges B, B is challenged_by A (NOT symmetric — direction matters).
-RECIPROCAL_EDGE_MAP = {
-    "supports": "supports",
-    "challenges": "challenged_by",
-    "related": "related",
-    "depends_on": "related",  # A depends_on B → B is related to A (not symmetric)
-}
-
-
-def parse_yaml_frontmatter(text: str) -> tuple[dict | None, str, str]:
-    """Parse YAML frontmatter from markdown text.
-
-    Returns (frontmatter_dict, raw_fm_text, body_text_including_closing_delimiter).
-    Returns (None, "", text) if no valid frontmatter found.
-    raw_fm_text is the text between the --- delimiters (no delimiters, no leading newline).
-    """
-    if not text.startswith("---"):
-        return None, "", text
-    end = text.find("\n---", 3)
-    if end == -1:
-        return None, "", text
-    try:
-        raw_fm_text = text[4:end]  # skip "---\n", stop before "\n---"
-        fm = yaml.safe_load(raw_fm_text)
-        body = text[end:]  # includes closing \n--- and body
-        return (fm if isinstance(fm, dict) else None), raw_fm_text, body
-    except Exception:
-        return None, "", text
-
-
-def union_edge_lists(main_edges: list, branch_edges: list) -> list:
-    """Union two edge lists, preserving order from main (append new at end).
-
-    Deduplicates by lowercase slug. Main's order is preserved; branch-only
-    edges are appended in their original order.
-    """
-    seen = set()
-    result = []
-    for edge in main_edges:
-        key = str(edge).strip().lower()
-        if key not in seen:
-            seen.add(key)
-            result.append(edge)
-    for edge in branch_edges:
-        key = str(edge).strip().lower()
-        if key not in seen:
-            seen.add(key)
-            result.append(edge)
-    return result
-
-
-def serialize_edge_fields(raw_fm_text: str, merged_edges: dict[str, list]) -> str:
-    """Splice merged edge fields into raw frontmatter text, preserving all other fields byte-identical.
-
-    Only modifies REWEAVE_EDGE_FIELDS lines. All other frontmatter (title, confidence, type, etc.)
-    stays exactly as it was in the source text — no yaml.dump reformatting.
-
-    Args:
-        raw_fm_text: The raw YAML text between the --- delimiters (no delimiters included).
-        merged_edges: {field_name: [edge_values]} for each edge field that should be present.
-    """
-    lines = raw_fm_text.split("\n")
-    result_lines = []
-    i = 0
-    fields_written = set()
-
-    while i < len(lines):
-        line = lines[i]
-        # Check if this line starts an edge field
-        matched_field = None
-        for field in REWEAVE_EDGE_FIELDS:
-            if line.startswith(f"{field}:"):
-                matched_field = field
-                break
-
-        if matched_field:
-            fields_written.add(matched_field)
-            # Skip the old field and its list items (may be indented with spaces)
-            i += 1
-            while i < len(lines) and lines[i] and (lines[i][0] in (' ', '-')):
-                i += 1
-            # Write the merged version
-            edges = merged_edges.get(matched_field, [])
-            if edges:
-                result_lines.append(f"{matched_field}:")
-                for edge in edges:
-                    result_lines.append(f"- {_yaml_quote(edge)}")
-            # Don't increment i — it's already past the old field
-            continue
-        else:
-            result_lines.append(line)
-            i += 1
-
-    # Append any new edge fields that didn't exist in the original
-    for field in REWEAVE_EDGE_FIELDS:
-        if field not in fields_written:
-            edges = merged_edges.get(field, [])
-            if edges:
-                result_lines.append(f"{field}:")
-                for edge in edges:
-                    result_lines.append(f"- {_yaml_quote(edge)}")
-
-    return "\n".join(result_lines)
-
-
-def serialize_frontmatter(raw_fm_text: str, merged_edges: dict[str, list], body: str) -> str:
-    """Rebuild markdown file: splice merged edges into raw frontmatter, append body.
-
-    Uses string-level surgery — only edge fields are modified. All other frontmatter
-    stays byte-identical to the source. No yaml.dump reformatting.
-    """
-    spliced = serialize_edge_fields(raw_fm_text, merged_edges)
-    # body starts with \n--- (closing delimiter + body text)
-    if body.startswith("\n"):
-        return f"---\n{spliced}{body}"
-    return f"---\n{spliced}\n{body}"
--- a/lib/github_feedback.py
+++ b/lib/github_feedback.py
@ -1,187 +0,0 @@
-"""GitHub PR feedback — posts pipeline status to GitHub PRs for external contributors.
-
-Three touchpoints:
-1. Discovery ack: when pipeline discovers a mirrored PR
-2. Eval review: when evaluation completes (approved or rejected with reasoning)
-3. Merge/close outcome: when PR is merged or permanently closed
-
-Only fires for PRs with a github_pr link (set by sync-mirror.sh).
-All calls are non-fatal — GitHub feedback never blocks the pipeline.
-"""
-
-import logging
-import os
-
-import aiohttp
-
-from . import config
-
-logger = logging.getLogger("pipeline.github_feedback")
-
-GITHUB_API = "https://api.github.com"
-GITHUB_REPO = "living-ip/teleo-codex"
-
-_BOT_ACCOUNTS = frozenset({"m3taversal", "teleo-bot", "teleo", "github-actions[bot]"})
-
-
-def _github_pat() -> str | None:
-    pat_file = config.SECRETS_DIR / "github-pat"
-    if pat_file.exists():
-        return pat_file.read_text().strip()
-    return os.environ.get("GITHUB_PAT")
-
-
-async def _post_comment(github_pr: int, body: str) -> bool:
-    pat = _github_pat()
-    if not pat:
-        logger.warning("No GitHub PAT — skipping feedback for GH PR #%d", github_pr)
-        return False
-
-    url = f"{GITHUB_API}/repos/{GITHUB_REPO}/issues/{github_pr}/comments"
-    headers = {
-        "Authorization": f"Bearer {pat}",
-        "Accept": "application/vnd.github+json",
-        "X-GitHub-Api-Version": "2022-11-28",
-    }
-
-    try:
-        async with aiohttp.ClientSession() as session:
-            async with session.post(
-                url, headers=headers, json={"body": body},
-                timeout=aiohttp.ClientTimeout(total=30),
-            ) as resp:
-                if resp.status >= 400:
-                    text = await resp.text()
-                    logger.error("GitHub comment on PR #%d failed: %d %s", github_pr, resp.status, text[:200])
-                    return False
-                logger.info("GitHub comment posted on PR #%d", github_pr)
-                return True
-    except Exception:
-        logger.exception("GitHub comment on PR #%d failed", github_pr)
-        return False
-
-
-async def _close_github_pr(github_pr: int) -> bool:
-    pat = _github_pat()
-    if not pat:
-        return False
-
-    url = f"{GITHUB_API}/repos/{GITHUB_REPO}/pulls/{github_pr}"
-    headers = {
-        "Authorization": f"Bearer {pat}",
-        "Accept": "application/vnd.github+json",
-        "X-GitHub-Api-Version": "2022-11-28",
-    }
-
-    try:
-        async with aiohttp.ClientSession() as session:
-            async with session.patch(
-                url, headers=headers, json={"state": "closed"},
-                timeout=aiohttp.ClientTimeout(total=30),
-            ) as resp:
-                if resp.status >= 400:
-                    text = await resp.text()
-                    logger.error("GitHub close PR #%d failed: %d %s", github_pr, resp.status, text[:200])
-                    return False
-                logger.info("GitHub PR #%d closed", github_pr)
-                return True
-    except Exception:
-        logger.exception("GitHub close PR #%d failed", github_pr)
-        return False
-
-
-def _get_github_pr(conn, forgejo_pr: int) -> int | None:
-    row = conn.execute(
-        "SELECT github_pr FROM prs WHERE number = ? AND github_pr IS NOT NULL",
-        (forgejo_pr,),
-    ).fetchone()
-    return row["github_pr"] if row else None
-
-
-async def on_discovery(conn, forgejo_pr: int):
-    """Post discovery acknowledgment to GitHub PR."""
-    gh_pr = _get_github_pr(conn, forgejo_pr)
-    if not gh_pr:
-        return
-
-    body = (
-        "Your contribution has been received by the Teleo evaluation pipeline. "
-        "It's queued for automated review (priority: high).\n\n"
-        "You'll receive updates here as it progresses through evaluation.\n\n"
-        "_Automated message from the [LivingIP](https://livingip.xyz) pipeline._"
-    )
-    await _post_comment(gh_pr, body)
-
-
-async def on_eval_complete(conn, forgejo_pr: int, *, outcome: str, review_text: str = None, issues: list[str] = None):
-    """Post evaluation result to GitHub PR.
-
-    outcome: 'approved', 'rejected', 'changes_requested'
-    """
-    gh_pr = _get_github_pr(conn, forgejo_pr)
-    if not gh_pr:
-        return
-
-    if outcome == "approved":
-        body = "**Evaluation: Approved**\n\nYour contribution passed automated review and is queued for merge."
-        if review_text:
-            safe_text = review_text[:3000].replace("</details>", "&lt;/details&gt;")
-            body += f"\n\n<details>\n<summary>Review details</summary>\n\n{safe_text}\n\n</details>"
-    elif outcome == "rejected":
-        body = "**Evaluation: Changes Requested**\n\n"
-        if issues:
-            body += "Issues found:\n"
-            for issue in issues:
-                body += f"- {issue}\n"
-        if review_text:
-            safe_text = review_text[:3000].replace("</details>", "&lt;/details&gt;")
-            body += f"\n<details>\n<summary>Full review</summary>\n\n{safe_text}\n\n</details>"
-        body += (
-            "\n\nThe pipeline will attempt automated fixes where possible. "
-            "If fixes fail, the PR will be closed — you're welcome to resubmit."
-        )
-    else:
-        body = f"**Evaluation: {outcome}**\n\n"
-        if review_text:
-            body += review_text[:3000]
-
-    body += "\n\n_Automated message from the [LivingIP](https://livingip.xyz) pipeline._"
-    await _post_comment(gh_pr, body)
-
-
-async def on_merged(conn, forgejo_pr: int, *, claims_count: int = None):
-    """Post merge confirmation and close GitHub PR."""
-    gh_pr = _get_github_pr(conn, forgejo_pr)
-    if not gh_pr:
-        return
-
-    body = "**Merged!** Your contribution has been merged into the knowledge base."
-    if claims_count and claims_count > 0:
-        body += f" ({claims_count} claim{'s' if claims_count != 1 else ''} added)"
-    body += (
-        "\n\nThank you for contributing to LivingIP. "
-        "Your attribution has been recorded.\n\n"
-        "_Automated message from the [LivingIP](https://livingip.xyz) pipeline._"
-    )
-    await _post_comment(gh_pr, body)
-    await _close_github_pr(gh_pr)
-
-
-async def on_closed(conn, forgejo_pr: int, *, reason: str = None):
-    """Post closure notification and close GitHub PR."""
-    gh_pr = _get_github_pr(conn, forgejo_pr)
-    if not gh_pr:
-        return
-
-    body = "**Closed.** "
-    if reason:
-        body += reason
-    else:
-        body += "This PR was closed after evaluation."
-    body += (
-        "\n\nYou're welcome to resubmit with changes. "
-        "See the evaluation feedback above for guidance.\n\n"
-        "_Automated message from the [LivingIP](https://livingip.xyz) pipeline._"
-    )
-    await _post_comment(gh_pr, body)
-    await _close_github_pr(gh_pr)
--- a/lib/merge.py
+++ b/lib/merge.py
--- a/lib/post_merge.py
+++ b/lib/post_merge.py
@ -1,518 +0,0 @@
-"""Post-merge effects: embedding, reciprocal edges, source archiving.
-
-All functions run after a PR is merged to main. Non-fatal failures
-are logged but do not block the pipeline.
-
-Extracted from merge.py Phase 6b of decomposition.
-"""
-
-import asyncio
-import hashlib
-import json
-import logging
-import os
-import re
-import shutil
-from pathlib import Path
-from typing import Callable
-
-from . import config
-from .frontmatter import (
-    REWEAVE_EDGE_FIELDS,
-    RECIPROCAL_EDGE_MAP,
-    parse_yaml_frontmatter,
-    serialize_edge_fields,
-)
-
-try:
-    from .worktree_lock import async_main_worktree_lock
-except ImportError:
-    from worktree_lock import async_main_worktree_lock
-
-logger = logging.getLogger(__name__)
-
-
-# Accumulates source moves during a merge cycle, batch-committed at the end
-_pending_source_moves: list[tuple[str, str]] = []  # (queue_path, archive_path)
-
-
-def update_source_frontmatter_status(path: str, new_status: str):
-    """Update the status field in a source file's frontmatter. (Ganymede: 5 lines)"""
-    try:
-        text = open(path).read()
-        text = re.sub(r"^status: .*$", f"status: {new_status}", text, count=1, flags=re.MULTILINE)
-        open(path, "w").write(text)
-    except Exception as e:
-        logger.warning("Failed to update source status in %s: %s", path, e)
-
-
-async def embed_merged_claims(main_sha: str, branch_sha: str, git_fn: Callable):
-    """Embed new/changed claim files from a merged PR into Qdrant.
-
-    Diffs main_sha (pre-merge main HEAD) against branch_sha (merged branch tip)
-    to find ALL changed files across the entire branch, not just the last commit.
-    Also deletes Qdrant vectors for files removed by the branch.
-
-    Non-fatal — embedding failure does not block the merge pipeline.
-    """
-    try:
-        # --- Embed added/changed files ---
-        rc, diff_out = await git_fn(
-            "diff", "--name-only", "--diff-filter=ACMR",
-            main_sha, branch_sha,
-            cwd=str(config.MAIN_WORKTREE),
-            timeout=10,
-        )
-        if rc != 0:
-            logger.warning("embed: diff failed (rc=%d), skipping", rc)
-            return
-
-        embed_dirs = {"domains/", "core/", "foundations/", "decisions/", "entities/"}
-        md_files = [
-            f for f in diff_out.strip().split("\n")
-            if f.endswith(".md")
-            and any(f.startswith(d) for d in embed_dirs)
-            and not f.split("/")[-1].startswith("_")
-        ]
-
-        embedded = 0
-        for fpath in md_files:
-            full_path = config.MAIN_WORKTREE / fpath
-            if not full_path.exists():
-                continue
-            proc = await asyncio.create_subprocess_exec(
-                "python3", "/opt/teleo-eval/embed-claims.py", "--file", str(full_path),
-                stdout=asyncio.subprocess.PIPE,
-                stderr=asyncio.subprocess.PIPE,
-            )
-            stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=30)
-            if proc.returncode == 0 and b"OK" in stdout:
-                embedded += 1
-            else:
-                logger.warning("embed: failed for %s: %s", fpath, stderr.decode()[:200])
-
-        if embedded:
-            logger.info("embed: %d/%d files embedded into Qdrant", embedded, len(md_files))
-
-        # --- Delete vectors for removed files (Ganymede: stale vector cleanup) ---
-        rc, del_out = await git_fn(
-            "diff", "--name-only", "--diff-filter=D",
-            main_sha, branch_sha,
-            cwd=str(config.MAIN_WORKTREE),
-            timeout=10,
-        )
-        if rc == 0 and del_out.strip():
-            deleted_files = [
-                f for f in del_out.strip().split("\n")
-                if f.endswith(".md")
-                and any(f.startswith(d) for d in embed_dirs)
-            ]
-            if deleted_files:
-                point_ids = [hashlib.md5(f.encode()).hexdigest() for f in deleted_files]
-                try:
-                    import urllib.request
-                    req = urllib.request.Request(
-                        "http://localhost:6333/collections/teleo-claims/points/delete",
-                        data=json.dumps({"points": point_ids}).encode(),
-                        headers={"Content-Type": "application/json"},
-                        method="POST",
-                    )
-                    urllib.request.urlopen(req, timeout=10)
-                    logger.info("embed: deleted %d stale vectors from Qdrant", len(point_ids))
-                except Exception:
-                    logger.warning("embed: failed to delete stale vectors (non-fatal)")
-    except Exception:
-        logger.exception("embed: post-merge embedding failed (non-fatal)")
-
-
-def find_claim_file(slug: str):
-    """Find a claim file on disk by its slug. Searches domains/, core/, foundations/.
-
-    Returns Path or None.
-    """
-    worktree = config.MAIN_WORKTREE
-    for search_dir in ("domains", "core", "foundations"):
-        base = worktree / search_dir
-        if not base.is_dir():
-            continue
-        # Direct match
-        for md in base.rglob(f"{slug}.md"):
-            if not md.name.startswith("_"):
-                return md
-    return None
-
-
-def add_edge_to_file(file_path, edge_type: str, target_slug: str) -> bool:
-    """Add a single edge to a file's frontmatter. Returns True if modified."""
-    try:
-        content = file_path.read_text()
-    except Exception:
-        return False
-
-    fm, raw_fm, body = parse_yaml_frontmatter(content)
-    if fm is None:
-        return False
-
-    # Check for existing edge (dedup)
-    existing = fm.get(edge_type, [])
-    if isinstance(existing, str):
-        existing = [existing]
-    if not isinstance(existing, list):
-        existing = []
-
-    if any(str(e).strip().lower() == target_slug.lower() for e in existing):
-        return False  # Already exists
-
-    # Build merged edges (all edge fields, only modifying the target one)
-    merged_edges = {}
-    for field in REWEAVE_EDGE_FIELDS:
-        vals = fm.get(field, [])
-        if isinstance(vals, str):
-            vals = [vals]
-        if not isinstance(vals, list):
-            vals = []
-        merged_edges[field] = list(vals)
-
-    merged_edges.setdefault(edge_type, []).append(target_slug)
-
-    # Serialize using the same string-surgery approach as reweave
-    new_fm = serialize_edge_fields(raw_fm, merged_edges)
-    if body.startswith("\n"):
-        new_content = f"---\n{new_fm}{body}"
-    else:
-        new_content = f"---\n{new_fm}\n{body}"
-
-    try:
-        file_path.write_text(new_content)
-        return True
-    except Exception:
-        return False
-
-
-async def reciprocal_edges(main_sha: str, branch_sha: str, git_fn: Callable):
-    """Add reciprocal edges on existing claims after a PR merges.
-
-    When a new claim A has `supports: [B]` in its frontmatter, B should have
-    `supports: [A]` added to its own frontmatter. This gives A an incoming link,
-    preventing it from being an orphan.
-
-    Runs on main after cherry-pick merge. Non-fatal — orphans are recoverable.
-    Only processes new files (diff-filter=A), not modified files.
-    """
-    EDGE_FIELDS = ("supports", "challenges", "related")
-
-    try:
-        # Find newly added claim files
-        rc, diff_out = await git_fn(
-            "diff", "--name-only", "--diff-filter=A",
-            main_sha, branch_sha,
-            cwd=str(config.MAIN_WORKTREE),
-            timeout=10,
-        )
-        if rc != 0:
-            logger.warning("reciprocal_edges: diff failed (rc=%d), skipping", rc)
-            return
-
-        claim_dirs = {"domains/", "core/", "foundations/"}
-        new_claims = [
-            f for f in diff_out.strip().split("\n")
-            if f.endswith(".md")
-            and any(f.startswith(d) for d in claim_dirs)
-            and not f.split("/")[-1].startswith("_")
-            and "/entities/" not in f
-            and "/decisions/" not in f
-        ]
-
-        if not new_claims:
-            return
-
-        reciprocals_added = 0
-        modified_files = set()
-        for claim_path in new_claims:
-            full_path = config.MAIN_WORKTREE / claim_path
-            if not full_path.exists():
-                continue
-
-            try:
-                content = full_path.read_text()
-            except Exception:
-                continue
-
-            fm, raw_fm, body = parse_yaml_frontmatter(content)
-            if fm is None:
-                continue
-
-            # Get the new claim's slug (filename without .md)
-            claim_slug = claim_path.rsplit("/", 1)[-1].replace(".md", "")
-
-            # Collect all edge targets from this new claim
-            for field in EDGE_FIELDS:
-                targets = fm.get(field, [])
-                if isinstance(targets, str):
-                    targets = [targets]
-                if not isinstance(targets, list):
-                    continue
-
-                for target_slug in targets:
-                    target_slug = str(target_slug).strip()
-                    if not target_slug:
-                        continue
-
-                    # Find the target file on disk
-                    target_file = find_claim_file(target_slug)
-                    if target_file is None:
-                        continue
-
-                    # Add reciprocal edge: target now has field: [new_claim_slug]
-                    reciprocal_type = RECIPROCAL_EDGE_MAP.get(field, "related")
-                    if add_edge_to_file(target_file, reciprocal_type, claim_slug):
-                        reciprocals_added += 1
-                        modified_files.add(str(target_file))
-
-        if reciprocals_added > 0:
-            # Stage only the files we modified (never git add -A in automation)
-            for f in modified_files:
-                await git_fn("add", f, cwd=str(config.MAIN_WORKTREE))
-            rc, out = await git_fn(
-                "commit", "-m", f"reciprocal edges: {reciprocals_added} edges from {len(new_claims)} new claims",
-                cwd=str(config.MAIN_WORKTREE),
-            )
-            if rc == 0:
-                # Push immediately — batch-extract-50.sh does reset --hard origin/main
-                # every 15 min, which destroys unpushed local commits
-                push_rc, push_out = await git_fn(
-                    "push", "origin", "main",
-                    cwd=str(config.MAIN_WORKTREE),
-                    timeout=30,
-                )
-                if push_rc == 0:
-                    logger.info("reciprocal_edges: %d edges pushed to main (%d new claims)", reciprocals_added, len(new_claims))
-                else:
-                    logger.warning("reciprocal_edges: push failed (commit is local only): %s", push_out[:200])
-            else:
-                logger.warning("reciprocal_edges: commit failed: %s", out[:200])
-
-    except Exception:
-        logger.exception("reciprocal_edges: failed (non-fatal)")
-
-
-async def backlink_source_claims(main_sha: str, branch_sha: str, git_fn: Callable):
-    """After merge, update source files with claims_extracted backlinks.
-
-    Reads sourced_from from merged claim frontmatter, finds the source file,
-    and appends the claim filename to its claims_extracted list.
-    Only runs for newly added claims (diff-filter=A).
-    """
-    try:
-        rc, diff_out = await git_fn(
-            "diff", "--name-only", "--diff-filter=A",
-            main_sha, branch_sha,
-            cwd=str(config.MAIN_WORKTREE),
-            timeout=10,
-        )
-        if rc != 0:
-            logger.warning("backlink_source_claims: diff failed (rc=%d), skipping", rc)
-            return
-
-        claim_dirs = {"domains/", "core/", "foundations/"}
-        new_claims = [
-            f for f in diff_out.strip().split("\n")
-            if f.endswith(".md")
-            and any(f.startswith(d) for d in claim_dirs)
-            and not f.split("/")[-1].startswith("_")
-            and "/entities/" not in f
-            and "/decisions/" not in f
-        ]
-
-        if not new_claims:
-            return
-
-        modified_sources = {}
-        for claim_path in new_claims:
-            full_path = config.MAIN_WORKTREE / claim_path
-            if not full_path.exists():
-                continue
-
-            try:
-                content = full_path.read_text()
-            except Exception:
-                continue
-
-            fm, raw_fm, body = parse_yaml_frontmatter(content)
-            if fm is None:
-                continue
-
-            sourced_from = fm.get("sourced_from", "")
-            if not sourced_from:
-                continue
-
-            source_path = config.MAIN_WORKTREE / "inbox" / "archive" / sourced_from
-            if not source_path.exists():
-                logger.debug("backlink_source_claims: source %s not found at %s", sourced_from, source_path)
-                continue
-
-            claim_filename = claim_path.rsplit("/", 1)[-1].replace(".md", "")
-
-            try:
-                source_content = source_path.read_text()
-            except Exception:
-                continue
-
-            source_fm, source_raw_fm, source_body = parse_yaml_frontmatter(source_content)
-            if source_fm is None:
-                continue
-
-            existing_claims = source_fm.get("claims_extracted", [])
-            if isinstance(existing_claims, str):
-                existing_claims = [existing_claims]
-            if not isinstance(existing_claims, list):
-                existing_claims = []
-
-            if claim_filename in existing_claims:
-                continue
-
-            existing_claims.append(claim_filename)
-            new_block = "claims_extracted:\n" + "\n".join(f"- {c}" for c in existing_claims)
-
-            lines = source_content.split("\n")
-            if "claims_extracted:" not in source_content:
-                end_idx = None
-                for i, line in enumerate(lines):
-                    if i > 0 and line.strip() == "---":
-                        end_idx = i
-                        break
-                if end_idx is None:
-                    continue
-                lines.insert(end_idx, new_block)
-            else:
-                start_idx = None
-                end_idx = None
-                for i, line in enumerate(lines):
-                    if line.startswith("claims_extracted:"):
-                        start_idx = i
-                    elif start_idx is not None and not line.startswith("- "):
-                        end_idx = i
-                        break
-                if start_idx is None:
-                    continue
-                if end_idx is None:
-                    end_idx = len(lines)
-                lines[start_idx:end_idx] = new_block.split("\n")
-
-            modified_sources[str(source_path)] = "\n".join(lines)
-            logger.info("backlink_source_claims: added %s to %s", claim_filename, sourced_from)
-
-        if modified_sources:
-            async with async_main_worktree_lock():
-                for sp, content in modified_sources.items():
-                    Path(sp).write_text(content)
-                    await git_fn("add", sp, cwd=str(config.MAIN_WORKTREE))
-                rc, out = await git_fn(
-                    "commit", "-m", f"backlink: update claims_extracted on {len(modified_sources)} source(s)",
-                    cwd=str(config.MAIN_WORKTREE),
-                    timeout=15,
-                )
-                if rc == 0:
-                    push_rc, push_out = await git_fn(
-                        "push", "origin", "main",
-                        cwd=str(config.MAIN_WORKTREE),
-                        timeout=30,
-                    )
-                    if push_rc == 0:
-                        logger.info("backlink_source_claims: %d source(s) updated and pushed", len(modified_sources))
-                    else:
-                        logger.warning("backlink_source_claims: push failed: %s", push_out[:200])
-                else:
-                    logger.warning("backlink_source_claims: commit failed: %s", out[:200])
-
-    except Exception:
-        logger.exception("backlink_source_claims: failed (non-fatal)")
-
-
-def archive_source_for_pr(branch: str, domain: str, merged: bool = True):
-    """Move source from queue/ to archive/{domain}/ after PR merge or close.
-
-    Only handles extract/ branches (Ganymede: skip research sessions).
-    Updates frontmatter: 'processed' for merged, 'rejected' for closed.
-    Accumulates moves for batch commit at end of merge cycle.
-    """
-    if not branch.startswith("extract/"):
-        return
-
-    source_slug = branch.replace("extract/", "", 1)
-    main_dir = config.MAIN_WORKTREE if hasattr(config, "MAIN_WORKTREE") else "/opt/teleo-eval/workspaces/main"
-    queue_path = os.path.join(main_dir, "inbox", "queue", f"{source_slug}.md")
-    archive_dir = os.path.join(main_dir, "inbox", "archive", domain or "unknown")
-    archive_path = os.path.join(archive_dir, f"{source_slug}.md")
-
-    # Already in archive? Delete queue duplicate
-    if os.path.exists(archive_path):
-        if os.path.exists(queue_path):
-            try:
-                os.remove(queue_path)
-                _pending_source_moves.append((queue_path, "deleted"))
-                logger.info("Source dedup: deleted queue/%s (already in archive/%s)", source_slug, domain)
-            except Exception as e:
-                logger.warning("Source dedup failed: %s", e)
-        return
-
-    # Move from queue to archive
-    if os.path.exists(queue_path):
-        # Update frontmatter before moving (Ganymede: distinguish merged vs rejected)
-        update_source_frontmatter_status(queue_path, "processed" if merged else "rejected")
-        os.makedirs(archive_dir, exist_ok=True)
-        try:
-            shutil.move(queue_path, archive_path)
-            _pending_source_moves.append((queue_path, archive_path))
-            logger.info("Source archived: queue/%s → archive/%s/ (status=%s)",
-                        source_slug, domain, "processed" if merged else "rejected")
-        except Exception as e:
-            logger.warning("Source archive failed: %s", e)
-
-
-async def commit_source_moves(git_fn: Callable):
-    """Batch commit accumulated source moves. Called at end of merge cycle.
-
-    Rhea review: fetch+reset before touching files, use main_worktree_lock,
-    crash gap is self-healing (reset --hard reverts uncommitted moves).
-    """
-    if not _pending_source_moves:
-        return
-
-    main_dir = config.MAIN_WORKTREE if hasattr(config, "MAIN_WORKTREE") else "/opt/teleo-eval/workspaces/main"
-    count = len(_pending_source_moves)
-    _pending_source_moves.clear()
-
-    # Acquire file lock — coordinates with telegram bot and other daemon stages (Ganymede: Option C)
-    try:
-        async with async_main_worktree_lock(timeout=10):
-            # Sync worktree with remote (Rhea: fetch+reset, not pull)
-            await git_fn("fetch", "origin", "main", cwd=main_dir, timeout=30)
-            await git_fn("reset", "--hard", "origin/main", cwd=main_dir, timeout=30)
-
-            await git_fn("add", "-A", "inbox/", cwd=main_dir)
-
-            rc, out = await git_fn(
-                "commit", "-m",
-                f"pipeline: archive {count} source(s) post-merge\n\n"
-                f"Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>",
-                cwd=main_dir,
-            )
-            if rc != 0:
-                if "nothing to commit" in out:
-                    return
-                logger.warning("Source archive commit failed: %s", out)
-                return
-
-            for attempt in range(3):
-                await git_fn("pull", "--rebase", "origin", "main", cwd=main_dir, timeout=30)
-                rc_push, _ = await git_fn("push", "origin", "main", cwd=main_dir, timeout=30)
-                if rc_push == 0:
-                    logger.info("Committed + pushed %d source archive moves", count)
-                    return
-                await asyncio.sleep(2)
-
-            logger.warning("Failed to push source archive moves after 3 attempts")
-            await git_fn("reset", "--hard", "origin/main", cwd=main_dir)
-    except TimeoutError:
-        logger.warning("Source archive commit skipped: worktree lock timeout")
--- a/lib/pr_state.py
+++ b/lib/pr_state.py
@ -1,241 +0,0 @@
-"""PR state transitions — single source of truth for all status changes.
-
-Every UPDATE prs SET status = ... MUST go through this module.
-
-Invariants enforced:
- close: always syncs Forgejo (opt-out for reconciliation only)
- approve: requires non-empty domain (ValueError)
- merged: always sets merged_at, clears last_error
- conflict: always increments merge_failures, sets merge_cycled
-
-Why this exists: 36 hand-crafted status transitions across evaluate.py
-and merge.py produced 3 incidents (domain NULL, Forgejo ghost PRs,
-merge_cycled missing). Centralizing eliminates the entire class of
-"forgot to update X in this one code path" bugs.
-"""
-
-import logging
-
-from .forgejo import api as forgejo_api, repo_path
-
-logger = logging.getLogger("pipeline.pr_state")
-
-
-async def close_pr(
-    conn,
-    pr_number: int,
-    *,
-    last_error: str = None,
-    merge_cycled: bool = False,
-    inc_merge_failures: bool = False,
-    close_on_forgejo: bool = True,
-) -> bool:
-    """Close a PR in DB and on Forgejo. Returns True on success, False on Forgejo failure.
-
-    Args:
-        close_on_forgejo: False only when caller already closed on Forgejo
-            (reconciliation, ghost PR cleanup after manual close).
-
-    If Forgejo API fails, the DB update is SKIPPED to prevent ghost PRs
-    (DB says closed, Forgejo says open). The reconciliation loop in
-    merge.py._reconcile_db_state catches any that slip through.
-    """
-    if close_on_forgejo:
-        result = await forgejo_api("PATCH", repo_path(f"pulls/{pr_number}"), {"state": "closed"})
-        if result is None:
-            logger.error("close_pr: Forgejo API failed for PR #%d, skipping DB update", pr_number)
-            return False
-
-    parts = ["status = 'closed'"]
-    params = []
-
-    if last_error is not None:
-        parts.append("last_error = ?")
-        params.append(last_error)
-
-    if merge_cycled:
-        parts.append("merge_cycled = 1")
-
-    if inc_merge_failures:
-        parts.append("merge_failures = COALESCE(merge_failures, 0) + 1")
-
-    params.append(pr_number)
-    conn.execute(f"UPDATE prs SET {', '.join(parts)} WHERE number = ?", params)
-    return True
-
-
-def approve_pr(
-    conn,
-    pr_number: int,
-    *,
-    domain: str,
-    auto_merge: int = 0,
-    leo_verdict: str = None,
-    domain_verdict: str = None,
-):
-    """Approve a PR. Raises ValueError if domain is empty/None."""
-    if not domain:
-        raise ValueError(f"Cannot approve PR #{pr_number} without domain")
-
-    parts = ["status = 'approved'", "domain = COALESCE(domain, ?)"]
-    params = [domain]
-
-    parts.append("auto_merge = ?")
-    params.append(auto_merge)
-
-    if leo_verdict is not None:
-        parts.append("leo_verdict = ?")
-        params.append(leo_verdict)
-
-    if domain_verdict is not None:
-        parts.append("domain_verdict = ?")
-        params.append(domain_verdict)
-
-    params.append(pr_number)
-    conn.execute(f"UPDATE prs SET {', '.join(parts)} WHERE number = ?", params)
-
-
-def mark_merged(conn, pr_number: int):
-    """Mark PR as merged. Always sets merged_at, clears last_error."""
-    conn.execute(
-        "UPDATE prs SET status = 'merged', merged_at = datetime('now'), "
-        "last_error = NULL WHERE number = ?",
-        (pr_number,),
-    )
-
-
-def mark_conflict(conn, pr_number: int, *, last_error: str = None):
-    """Mark PR as conflict. Always increments merge_failures, sets merge_cycled."""
-    conn.execute(
-        "UPDATE prs SET status = 'conflict', merge_cycled = 1, "
-        "merge_failures = COALESCE(merge_failures, 0) + 1, "
-        "last_error = ? WHERE number = ?",
-        (last_error, pr_number),
-    )
-
-
-def mark_conflict_permanent(
-    conn,
-    pr_number: int,
-    *,
-    last_error: str = None,
-    conflict_rebase_attempts: int = None,
-):
-    """Mark PR as permanently conflicted (no more retries)."""
-    parts = ["status = 'conflict_permanent'"]
-    params = []
-
-    if last_error is not None:
-        parts.append("last_error = ?")
-        params.append(last_error)
-
-    if conflict_rebase_attempts is not None:
-        parts.append("conflict_rebase_attempts = ?")
-        params.append(conflict_rebase_attempts)
-
-    params.append(pr_number)
-    conn.execute(f"UPDATE prs SET {', '.join(parts)} WHERE number = ?", params)
-
-
-def reopen_pr(
-    conn,
-    pr_number: int,
-    *,
-    leo_verdict: str = None,
-    domain_verdict: str = None,
-    last_error: str = None,
-    eval_issues: str = None,
-    dec_eval_attempts: bool = False,
-    reset_for_reeval: bool = False,
-    conflict_rebase_attempts: int = None,
-):
-    """Set PR back to open.
-
-    Covers all reopen scenarios:
-    - Transient failure (API error): no extra args
-    - Rejection: leo_verdict + last_error + eval_issues
-    - Batch overflow: dec_eval_attempts=True
-    - Conflict resolved: reset_for_reeval=True
-    """
-    parts = ["status = 'open'"]
-    params = []
-
-    if reset_for_reeval:
-        parts.extend([
-            "leo_verdict = 'pending'",
-            "domain_verdict = 'pending'",
-            "eval_attempts = 0",
-        ])
-    else:
-        if leo_verdict is not None:
-            parts.append("leo_verdict = ?")
-            params.append(leo_verdict)
-        if domain_verdict is not None:
-            parts.append("domain_verdict = ?")
-            params.append(domain_verdict)
-
-    if last_error is not None:
-        parts.append("last_error = ?")
-        params.append(last_error)
-
-    if eval_issues is not None:
-        parts.append("eval_issues = ?")
-        params.append(eval_issues)
-
-    if dec_eval_attempts:
-        parts.append("eval_attempts = COALESCE(eval_attempts, 1) - 1")
-
-    if conflict_rebase_attempts is not None:
-        parts.append("conflict_rebase_attempts = ?")
-        params.append(conflict_rebase_attempts)
-
-    params.append(pr_number)
-    conn.execute(f"UPDATE prs SET {', '.join(parts)} WHERE number = ?", params)
-
-
-def start_fixing(conn, pr_number: int) -> bool:
-    """Atomically claim PR for fixing (status open -> fixing).
-
-    Also increments fix_attempts and sets last_attempt in one statement.
-    Returns True if claimed, False if already claimed.
-    """
-    cursor = conn.execute(
-        "UPDATE prs SET status = 'fixing', "
-        "fix_attempts = COALESCE(fix_attempts, 0) + 1, "
-        "last_attempt = datetime('now') "
-        "WHERE number = ? AND status = 'open'",
-        (pr_number,),
-    )
-    return cursor.rowcount > 0
-
-
-def reset_for_reeval(conn, pr_number: int):
-    """Reset a PR for re-evaluation after a fix.
-
-    Clears all eval state so the PR goes through the full eval cycle again.
-    Used by both mechanical fixer and substantive fixer after successful fixes.
-    """
-    conn.execute(
-        """UPDATE prs SET
-           status = 'open',
-           eval_attempts = 0,
-           eval_issues = '[]',
-           tier0_pass = NULL,
-           domain_verdict = 'pending',
-           leo_verdict = 'pending',
-           last_error = NULL
-           WHERE number = ?""",
-        (pr_number,),
-    )
-
-
-def start_review(conn, pr_number: int) -> bool:
-    """Atomically claim PR for review (status open -> reviewing).
-
-    Returns True if claimed, False if already claimed by another worker.
-    """
-    cursor = conn.execute(
-        "UPDATE prs SET status = 'reviewing' WHERE number = ? AND status = 'open'",
-        (pr_number,),
-    )
-    return cursor.rowcount > 0
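
The `start_fixing` and `start_review` helpers above claim a PR with a conditional `UPDATE` and read `rowcount` to learn whether this worker won. Below is a minimal sketch of that pattern against an in-memory SQLite database. It assumes a simplified `prs` table with only `number`, `status` and `fix_attempts` (the real schema is not shown in this diff), and `claim_for_fixing` is a hypothetical name used for illustration.

```python
import sqlite3


def claim_for_fixing(conn: sqlite3.Connection, pr_number: int) -> bool:
    """Hypothetical helper: claim a PR and bump fix_attempts in one statement.

    The WHERE clause only matches rows still in 'open', so a second caller
    updates zero rows and sees rowcount == 0.
    """
    cursor = conn.execute(
        "UPDATE prs SET status = 'fixing', "
        "fix_attempts = COALESCE(fix_attempts, 0) + 1 "
        "WHERE number = ? AND status = 'open'",
        (pr_number,),
    )
    return cursor.rowcount > 0


# Simplified stand-in for the real prs table (assumption, not the actual schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prs (number INTEGER PRIMARY KEY, status TEXT, fix_attempts INTEGER)"
)
conn.execute("INSERT INTO prs (number, status) VALUES (1, 'open')")

first = claim_for_fixing(conn, 1)   # wins the claim
second = claim_for_fixing(conn, 1)  # row is no longer 'open'
attempts = conn.execute(
    "SELECT fix_attempts FROM prs WHERE number = 1"
).fetchone()[0]
```

Because the status change and the counter increment share one statement, a failed claim never increments `fix_attempts`, and a successful claim cannot be observed without the increment.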
--- a/lib/stale_pr.py
+++ b/lib/stale_pr.py
@ -1,86 +1,220 @@
-"""Stale extraction PR cleanup — closes extraction PRs that produce no claims.
+"""Stale PR monitor — auto-close extraction PRs that produced no claims.

-When an extraction PR sits open >30 min with claims_count=0, it indicates:
- Extraction failed (model couldn't extract anything useful)
- Batch job stalled (no claims written)
- Source material is empty/junk
+Catches the failure mode where batch-extract creates a PR but extraction
+produces only source-file updates (no actual claims). These PRs sit open
+indefinitely, consuming merge queue bandwidth and confusing metrics.

-Auto-closing prevents zombie PRs from blocking the pipeline.
-Logs each close for root cause analysis (model failures, bad sources, etc.).
+Rules:
+  - PR branch starts with "extract/"
+  - PR is open for >30 minutes
+  - PR diff contains no .md files under domains/ or decisions/
+  → Auto-close with comment, log to audit_log as stale_extraction_closed

-Epimetheus owns this module.
+  - If same source branch has been stale-closed 2+ times
+  → Mark source as extraction_failed in pipeline.db sources table
+
+Called from the pipeline daemon (piggyback on validate_cycle interval)
+or standalone via: python3 -m lib.stale_pr
+
+Owner: Epimetheus
 """

-import json
 import logging
-from datetime import datetime, timezone
+import json
+import os
+import re
+import sqlite3
+import urllib.request
+from datetime import datetime, timedelta, timezone

-from . import config, db
-from .forgejo import api, repo_path
-from .pr_state import close_pr
+from . import config

 logger = logging.getLogger("pipeline.stale_pr")

-STALE_THRESHOLD_MINUTES = 45
+STALE_THRESHOLD_MINUTES = 30
+MAX_STALE_FAILURES = 2  # After this many stale closures, mark source as failed


-async def check_stale_prs(conn) -> tuple[int, int]:
-    """Auto-close extraction PRs open >30 min with zero claims.
+def _forgejo_api(method: str, path: str, body: dict | None = None) -> dict | list | None:
+    """Call Forgejo API. Returns parsed JSON or None on failure."""
+    token_file = config.FORGEJO_TOKEN_FILE
+    if not token_file.exists():
+        logger.error("No Forgejo token at %s", token_file)
+        return None
+    token = token_file.read_text().strip()

-    Returns (stale_closed, stale_errors) — count of closed PRs and close failures.
+    url = f"{config.FORGEJO_URL}/api/v1/{path}"
+    data = json.dumps(body).encode() if body else None
+    req = urllib.request.Request(
+        url,
+        data=data,
+        headers={
+            "Authorization": f"token {token}",
+            "Content-Type": "application/json",
+        },
+        method=method,
+    )
+    try:
+        with urllib.request.urlopen(req, timeout=15) as resp:
+            return json.loads(resp.read())
+    except Exception as e:
+        logger.warning("Forgejo API %s %s failed: %s", method, path, e)
+        return None
+
+
+def _pr_has_claim_files(pr_number: int) -> bool:
+    """Return True if the PR diff has a .md file under domains/ or decisions/ (False on API failure)."""
+    diff_data = _forgejo_api("GET", f"repos/{config.FORGEJO_OWNER}/{config.FORGEJO_REPO}/pulls/{pr_number}/files")
+    if not diff_data or not isinstance(diff_data, list):
+        return False
+
+    for file_entry in diff_data:
+        filename = file_entry.get("filename", "")
+        if filename.startswith("domains/") or filename.startswith("decisions/"):
+            # Check it's a .md file, not a directory marker
+            if filename.endswith(".md"):
+                return True
+    return False
+
+
+def _close_pr(pr_number: int, reason: str) -> bool:
+    """Close a PR with a comment explaining why."""
+    # Add comment
+    _forgejo_api("POST",
+        f"repos/{config.FORGEJO_OWNER}/{config.FORGEJO_REPO}/issues/{pr_number}/comments",
+        {"body": f"Auto-closed by stale PR monitor: {reason}\n\nPentagon-Agent: Epimetheus"},
+    )
+    # Close PR
+    result = _forgejo_api("PATCH",
+        f"repos/{config.FORGEJO_OWNER}/{config.FORGEJO_REPO}/pulls/{pr_number}",
+        {"state": "closed"},
+    )
+    return result is not None
+
+
+def _log_audit(conn: sqlite3.Connection, pr_number: int, branch: str):
+    """Log stale closure to audit_log."""
+    try:
+        conn.execute(
+            "INSERT INTO audit_log (timestamp, stage, event, detail) VALUES (datetime('now'), ?, ?, ?)",
+            ("monitor", "stale_extraction_closed", json.dumps({"pr": pr_number, "branch": branch})),
+        )
+        conn.commit()
+    except Exception as e:
+        logger.warning("Audit log write failed: %s", e)
+
+
+def _count_stale_closures(conn: sqlite3.Connection, branch: str) -> int:
+    """Count how many times this branch has been stale-closed."""
+    try:
+        row = conn.execute(
+            "SELECT COUNT(*) FROM audit_log WHERE event = 'stale_extraction_closed' AND detail LIKE ?",
+            (f'%"branch": "{branch}"%',),
+        ).fetchone()
+        return row[0] if row else 0
+    except Exception:
+        return 0
+
+
+def _mark_source_failed(conn: sqlite3.Connection, branch: str):
+    """Mark the source as extraction_failed after repeated stale closures."""
+    # Extract source name from branch: extract/source-name → source-name
+    source_name = branch.removeprefix("extract/")
+    try:
+        conn.execute(
+            "UPDATE sources SET status = 'extraction_failed', last_error = 'repeated_stale_extraction', updated_at = datetime('now') WHERE path LIKE ?",
+            (f"%{source_name}%",),
+        )
+        conn.commit()
+        logger.info("Marked source %s as extraction_failed (repeated stale closures)", source_name)
+    except Exception as e:
+        logger.warning("Failed to mark source as failed: %s", e)
+
+
+def check_stale_prs(conn: sqlite3.Connection) -> tuple[int, int]:
+    """Check for and close stale extraction PRs.
+
+    Returns (closed_count, error_count).
    """
-    stale_closed = 0
-    stale_errors = 0
+    closed = 0
+    errors = 0

-    # Find extraction PRs: open >30 min, source has 0 claims
-    stale_prs = conn.execute(
-        """SELECT p.number, p.branch, p.source_path, p.created_at
-           FROM prs p
-           LEFT JOIN sources s ON p.source_path = s.path
-           WHERE p.status = 'open'
-           AND p.commit_type = 'extract'
-           AND datetime(p.created_at) < datetime('now', '-' || ? || ' minutes')
-           AND COALESCE(s.claims_count, 0) = 0""",
-        (STALE_THRESHOLD_MINUTES,),
-    ).fetchall()
+    # Fetch all open PRs (paginated)
+    page = 1
+    all_prs = []
+    while True:
+        prs = _forgejo_api("GET",
+            f"repos/{config.FORGEJO_OWNER}/{config.FORGEJO_REPO}/pulls?state=open&limit=50&page={page}")
+        if not prs:
+            break
+        all_prs.extend(prs)
+        if len(prs) < 50:
+            break
+        page += 1

-    for pr in stale_prs:
-        pr_num = pr["number"]
-        source_path = pr["source_path"] or "unknown"
+    now = datetime.now(timezone.utc)

+    for pr in all_prs:
+        branch = pr.get("head", {}).get("ref", "")
+        if not branch.startswith("extract/"):
+            continue
+
+        # Check age
+        created_str = pr.get("created_at", "")
+        if not created_str:
+            continue
        try:
-            closed = await close_pr(conn, pr_num,
-                                    last_error=f"stale: no claims after {STALE_THRESHOLD_MINUTES} min")
-            if not closed:
-                stale_errors += 1
-                logger.warning(
-                    "Failed to close stale extraction PR #%d (%s, %s)",
-                    pr_num, source_path, pr["branch"],
-                )
-                continue
+            # Forgejo returns ISO format with Z suffix
+            created = datetime.fromisoformat(created_str.replace("Z", "+00:00"))
+        except ValueError:
+            continue

-            db.audit(
-                conn,
-                "watchdog",
-                "stale_pr_closed",
-                json.dumps({
-                    "pr": pr_num,
-                    "branch": pr["branch"],
-                    "source": source_path,
-                    "open_minutes": STALE_THRESHOLD_MINUTES,
-                }),
-            )
-            stale_closed += 1
-            logger.info(
-                "WATCHDOG: closed stale extraction PR #%d (no claims after %d min): %s",
-                pr_num, STALE_THRESHOLD_MINUTES, source_path,
-            )
+        age_minutes = (now - created).total_seconds() / 60
+        if age_minutes < STALE_THRESHOLD_MINUTES:
+            continue

-        except Exception as e:
-            stale_errors += 1
-            logger.warning(
-                "Stale PR close exception for #%d: %s",
-                pr_num, e,
-            )
+        pr_number = pr["number"]

-    return stale_closed, stale_errors
+        # Check if PR has claim files
+        if _pr_has_claim_files(pr_number):
+            continue  # PR has claims — not stale
+
+        # PR is stale — close it
+        logger.info("Stale PR #%d: branch=%s, age=%.0f min, no claim files — closing",
+                     pr_number, branch, age_minutes)
+
+        if _close_pr(pr_number, f"No claim files after {int(age_minutes)} minutes. Branch: {branch}"):
+            closed += 1
+            _log_audit(conn, pr_number, branch)
+
+            # Check for repeated failures
+            failure_count = _count_stale_closures(conn, branch)
+            if failure_count >= MAX_STALE_FAILURES:
+                _mark_source_failed(conn, branch)
+                logger.warning("Source %s marked as extraction_failed after %d stale closures",
+                               branch, failure_count)
+        else:
+            errors += 1
+            logger.warning("Failed to close stale PR #%d", pr_number)
+
+    if closed:
+        logger.info("Stale PR monitor: closed %d PRs", closed)
+
+    return closed, errors
+
+
+# Allow standalone execution
+if __name__ == "__main__":
+    import sys
+    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
+
+    db_path = config.DB_PATH
+    if not db_path.exists():
+        print(f"ERROR: Database not found at {db_path}", file=sys.stderr)
+        sys.exit(1)
+
+    conn = sqlite3.connect(str(db_path))
+    conn.row_factory = sqlite3.Row
+    closed, errs = check_stale_prs(conn)
+    print(f"Stale PR monitor: {closed} closed, {errs} errors")
+    conn.close()
--- a/lib/substantive_fixer.py
+++ b/lib/substantive_fixer.py
@ -24,7 +24,6 @@ from pathlib import Path

 from . import config, db
 from .forgejo import api as forgejo_api, get_agent_token, get_pr_diff, repo_path
-from .pr_state import close_pr, reset_for_reeval, start_fixing
 from .llm import openrouter_call

 logger = logging.getLogger("pipeline.substantive_fixer")
@ -226,10 +225,20 @@ def _classify_substantive(issues: list[str]) -> str:

 async def _fix_pr(conn, pr_number: int) -> dict:
    """Attempt a substantive fix on a single PR. Returns result dict."""
-    # Atomic claim — prevent concurrent fixers and evaluators
-    if not start_fixing(conn, pr_number):
+    # Atomic claim
+    cursor = conn.execute(
+        "UPDATE prs SET status = 'fixing', last_attempt = datetime('now') WHERE number = ? AND status = 'open'",
+        (pr_number,),
+    )
+    if cursor.rowcount == 0:
        return {"pr": pr_number, "skipped": True, "reason": "not_open"}

+    # Increment fix attempts
+    conn.execute(
+        "UPDATE prs SET fix_attempts = COALESCE(fix_attempts, 0) + 1 WHERE number = ?",
+        (pr_number,),
+    )
+
    row = conn.execute(
        "SELECT branch, source_path, domain, eval_issues, fix_attempts FROM prs WHERE number = ?",
        (pr_number,),
@ -262,7 +271,10 @@ async def _fix_pr(conn, pr_number: int) -> dict:

    if classification == "droppable":
        logger.info("PR #%d: droppable (%s) — closing", pr_number, issues)
-        await close_pr(conn, pr_number, last_error=f"droppable: {issues}")
+        conn.execute(
+            "UPDATE prs SET status = 'closed', last_error = ? WHERE number = ?",
+            (f"droppable: {issues}", pr_number),
+        )
        return {"pr": pr_number, "action": "closed_droppable", "issues": issues}

    # Refresh main worktree for source read (Ganymede: ensure freshness)
@ -290,8 +302,11 @@ async def _fix_pr(conn, pr_number: int) -> dict:
            conn, pr_number, claim_files, domain,
        )
        if result.get("converted"):
-            await close_pr(conn, pr_number,
-                           last_error=f"auto-enriched: {result['target_claim']} (sim={result['similarity']:.2f})")
+            conn.execute(
+                "UPDATE prs SET status = 'closed', last_error = ? WHERE number = ?",
+                (f"auto-enriched: {result['target_claim']} (sim={result['similarity']:.2f})", pr_number),
+            )
+            await forgejo_api("PATCH", repo_path(f"pulls/{pr_number}"), {"state": "closed"})
            await forgejo_api("POST", repo_path(f"issues/{pr_number}/comments"), {
                "body": (
                    f"**Auto-converted:** Evidence from this PR enriched "
@ -379,7 +394,18 @@ async def _fix_pr(conn, pr_number: int) -> dict:
            return {"pr": pr_number, "skipped": True, "reason": "nothing_to_commit"}

        # Reset eval state BEFORE push (same pattern as fixer.py)
-        reset_for_reeval(conn, pr_number)
+        conn.execute(
+            """UPDATE prs SET
+                status = 'open',
+                eval_attempts = 0,
+                eval_issues = '[]',
+                tier0_pass = NULL,
+                domain_verdict = 'pending',
+                leo_verdict = 'pending',
+                last_error = NULL
+                WHERE number = ?""",
+            (pr_number,),
+        )

        rc, out = await _git("push", "origin", branch, cwd=worktree_path, timeout=30)
        if rc != 0:
@ -473,7 +499,13 @@ async def _auto_convert_near_duplicate(

 async def _close_and_reextract(conn, pr_number: int, issues: list[str]):
    """Close PR and mark source for re-extraction with feedback."""
-    await close_pr(conn, pr_number, last_error=f"unfixable: {', '.join(issues)}")
+    await forgejo_api(
+        "PATCH", repo_path(f"pulls/{pr_number}"), {"state": "closed"},
+    )
+    conn.execute(
+        "UPDATE prs SET status = 'closed', last_error = ? WHERE number = ?",
+        (f"unfixable: {', '.join(issues)}", pr_number),
+    )
    conn.execute(
        """UPDATE sources SET status = 'needs_reextraction', feedback = ?,
           updated_at = datetime('now')
--- a/lib/validate.py
+++ b/lib/validate.py
@ -140,12 +140,7 @@ def validate_schema(fm: dict) -> list[str]:
    valid_conf = schema.get("valid_confidence")
    confidence = fm.get("confidence")
    if valid_conf and confidence and confidence not in valid_conf:
-        # Common LLM aliases — normalize before failing
-        _CONFIDENCE_ALIASES = {"high": "likely", "medium": "experimental", "low": "speculative", "very high": "proven", "moderate": "experimental"}
-        if isinstance(confidence, str) and confidence.lower().strip() in _CONFIDENCE_ALIASES:
-            pass  # Fixable by post-extract or fixer — don't gate on this
-        else:
-            violations.append(f"invalid_confidence:{confidence}")
+        violations.append(f"invalid_confidence:{confidence}")

    desc = fm.get("description")
    if isinstance(desc, str) and len(desc.strip()) < 10:
@ -555,16 +550,6 @@ def tier05_mechanical_check(diff: str, existing_claims: set[str] | None = None)
        is_new = filepath in new_files

        if is_new:
-            # Strip code fences — LLM agents sometimes wrap content in ```markdown or ```yaml
-            stripped = content.strip()
-            if stripped.startswith("```"):
-                first_nl = stripped.find("\n")
-                if first_nl != -1:
-                    stripped = stripped[first_nl + 1:]
-                if stripped.endswith("```"):
-                    stripped = stripped[:-3].strip()
-                content = stripped
-
            fm, body = parse_frontmatter(content)
            if fm is None:
                issues.append("frontmatter_schema")
@ -635,27 +620,6 @@ async def validate_pr(conn, pr_number: int) -> dict:
    # Extract claim files (domains/, core/, foundations/)
    claim_files = extract_claim_files_from_diff(diff)

-    # ── Backfill description (claim titles) if missing ──
-    # discover_external_prs creates rows without description. Extract H1 titles
-    # from the diff so the dashboard shows what the PR actually contains.
-    existing_desc = conn.execute(
-        "SELECT description FROM prs WHERE number = ?", (pr_number,)
-    ).fetchone()
-    if existing_desc and not (existing_desc["description"] or "").strip() and claim_files:
-        titles = []
-        for _fp, content in claim_files.items():
-            for line in content.split("\n"):
-                if line.startswith("# ") and len(line) > 3:
-                    titles.append(line[2:].strip())
-                    break
-        if titles:
-            desc = " | ".join(titles)
-            conn.execute(
-                "UPDATE prs SET description = ? WHERE number = ? AND (description IS NULL OR description = '')",
-                (desc, pr_number),
-            )
-            logger.info("PR #%d: backfilled description with %d claim titles", pr_number, len(titles))
-
    # ── Tier 0: per-claim validation ──
    # Only validates NEW files (not modified). Modified files have partial content
    # from diffs (only + lines) — frontmatter parsing fails on partial content,
--- a/lib/watchdog.py
+++ b/lib/watchdog.py
@ -104,83 +104,26 @@ async def watchdog_check(conn) -> dict:
            "action": "GC should auto-close these — check fixer.py GC logic",
        })

-    # 5. Tier0 blockage: auto-reset stuck PRs with retry cap
-    MAX_TIER0_RESETS = 3
-    TIER0_RESET_COOLDOWN_S = 3600
+    # 5. Tier0 blockage: many PRs with tier0_pass=0 (potential validation bug)
    tier0_blocked = conn.execute(
-        "SELECT number, branch FROM prs WHERE status = 'open' AND tier0_pass = 0"
-    ).fetchall()
-
-    if tier0_blocked:
-        reset_count = 0
-        permanent_count = 0
-
-        for pr in tier0_blocked:
-            row = conn.execute(
-                """SELECT COUNT(*) as n, MAX(timestamp) as last_ts FROM audit_log
-                   WHERE stage = 'watchdog' AND event = 'tier0_reset'
-                   AND json_extract(detail, '$.pr') = ?""",
-                (pr["number"],),
-            ).fetchone()
-            prior_resets = row["n"]
-
-            if prior_resets >= MAX_TIER0_RESETS:
-                permanent_count += 1
-                continue
-
-            last_reset = row["last_ts"]
-
-            if last_reset:
-                try:
-                    last_ts = datetime.fromisoformat(last_reset).replace(tzinfo=timezone.utc)
-                    age = (datetime.now(timezone.utc) - last_ts).total_seconds()
-                    if age < TIER0_RESET_COOLDOWN_S:
-                        continue
-                except (ValueError, TypeError):
-                    pass
-
-            conn.execute(
-                "UPDATE prs SET tier0_pass = NULL WHERE number = ?",
-                (pr["number"],),
-            )
-            db.audit(
-                conn, "watchdog", "tier0_reset",
-                json.dumps({
-                    "pr": pr["number"],
-                    "branch": pr["branch"],
-                    "attempt": prior_resets + 1,
-                    "max": MAX_TIER0_RESETS,
-                }),
-            )
-            reset_count += 1
-            logger.info(
-                "WATCHDOG: auto-reset tier0 for PR #%d (attempt %d/%d)",
-                pr["number"], prior_resets + 1, MAX_TIER0_RESETS,
-            )
-
-        if reset_count:
-            issues.append({
-                "type": "tier0_reset",
-                "severity": "info",
-                "detail": f"Auto-reset {reset_count} PRs stuck at tier0_pass=0 for re-validation",
-                "action": "Monitor — if same PRs fail again, check validate.py",
-            })
-        if permanent_count:
-            issues.append({
-                "type": "tier0_permanent_failure",
-                "severity": "warning",
-                "detail": f"{permanent_count} PRs exhausted {MAX_TIER0_RESETS} tier0 retries — manual intervention needed",
-                "action": "Inspect PR content or close stale PRs",
-            })
+        "SELECT COUNT(*) as n FROM prs WHERE status = 'open' AND tier0_pass = 0"
+    ).fetchone()["n"]
+    if tier0_blocked >= 5:
+        issues.append({
+            "type": "tier0_blockage",
+            "severity": "warning",
+            "detail": f"{tier0_blocked} PRs blocked at tier0_pass=0",
+            "action": "Check validate.py — may be the modified-file or wiki-link bug recurring",
+        })

    # 6. Stale extraction PRs: open >30 min with no claim files
    try:
-        stale_closed, stale_errors = await check_stale_prs(conn)
+        stale_closed, stale_errors = check_stale_prs(conn)
        if stale_closed > 0:
            issues.append({
                "type": "stale_prs_closed",
                "severity": "info",
-                "detail": f"Auto-closed {stale_closed} stale extraction PRs (no claims after 30 min)",
+                "detail": f"Auto-closed {stale_closed} stale extraction PRs (no claims after {30} min)",
                "action": "Check batch-extract logs for extraction failures",
            })
        if stale_errors > 0:
--- a/scripts/migrate-entity-schema.py
+++ b/scripts/migrate-entity-schema.py
--- a/scripts/migrate-source-archive.py
+++ b/scripts/migrate-source-archive.py
--- a/docs/multi-model-eval-architecture.md
+++ b/docs/multi-model-eval-architecture.md
--- a/observations/personality-layer-may-need-separation-from-knowledge-base.md
+++ b/observations/personality-layer-may-need-separation-from-knowledge-base.md
--- a/scripts/openrouter-extract-v2.py
+++ b/scripts/openrouter-extract-v2.py
--- a/ops/backfill-contributor-roles.py
+++ b/ops/backfill-contributor-roles.py
@ -1,113 +0,0 @@
-#!/usr/bin/env python3
-"""Backfill contributor role counts from prs.commit_type.
-
-Resets all role counts to 0, then re-derives them from the prs table's
-commit_type column using the COMMIT_TYPE_TO_ROLE mapping. This corrects
-the bug where all contributors were recorded as 'extractor' regardless
-of their actual commit_type.
-
-Usage:
-    python3 ops/backfill-contributor-roles.py [--dry-run]
-"""
-
-import argparse
-import sqlite3
-import sys
-import os
-
-sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
-from lib.contributor import COMMIT_TYPE_TO_ROLE, commit_type_to_role
-
-DB_PATH = os.environ.get("PIPELINE_DB", "/opt/teleo-eval/pipeline/pipeline.db")
-
-
-def backfill(db_path: str, dry_run: bool = False):
-    conn = sqlite3.connect(db_path)
-    conn.row_factory = sqlite3.Row
-
-    # Get all merged PRs with commit_type and agent
-    prs = conn.execute("""
-        SELECT number, commit_type, agent, branch
-        FROM prs
-        WHERE status = 'merged' AND agent IS NOT NULL
-        ORDER BY number
-    """).fetchall()
-
-    print(f"Processing {len(prs)} merged PRs...")
-
-    # Reset all role counts
-    if not dry_run:
-        conn.execute("""
-            UPDATE contributors SET
-                extractor_count = 0,
-                challenger_count = 0,
-                synthesizer_count = 0,
-                sourcer_count = 0
-        """)
-        print("Reset all role counts to 0")
-
-    # Tally roles from commit_type
-    role_counts: dict[str, dict[str, int]] = {}
-    for pr in prs:
-        agent = pr["agent"].lower() if pr["agent"] else None
-        if not agent or agent in ("external", "pipeline"):
-            continue
-
-        commit_type = pr["commit_type"] or "extract"
-        role = commit_type_to_role(commit_type)
-
-        if agent not in role_counts:
-            role_counts[agent] = {
-                "extractor_count": 0, "challenger_count": 0,
-                "synthesizer_count": 0, "sourcer_count": 0,
-                "reviewer_count": 0,
-            }
-        role_col = f"{role}_count"
-        if role_col in role_counts[agent]:
-            role_counts[agent][role_col] += 1
-
-    # Apply tallied counts
-    for handle, counts in sorted(role_counts.items()):
-        non_zero = {k: v for k, v in counts.items() if v > 0}
-        print(f"  {handle}: {non_zero or '(no knowledge PRs)'}")
-        if not dry_run and non_zero:
-            set_clauses = ", ".join(f"{k} = {v}" for k, v in non_zero.items())
-            conn.execute(
-                f"UPDATE contributors SET {set_clauses}, updated_at = datetime('now') WHERE handle = ?",
-                (handle,),
-            )
-
-    if not dry_run:
-        conn.commit()
-        print("\nBackfill committed.")
-    else:
-        print("\n[DRY RUN] No changes made.")
-
-    # Print summary
-    print("\nRole distribution across all contributors:")
-    if not dry_run:
-        rows = conn.execute("""
-            SELECT handle, extractor_count, challenger_count, synthesizer_count,
-                   sourcer_count, reviewer_count
-            FROM contributors
-            ORDER BY (extractor_count + challenger_count + synthesizer_count) DESC
-        """).fetchall()
-        for r in rows:
-            parts = []
-            if r["extractor_count"]: parts.append(f"extract:{r['extractor_count']}")
-            if r["challenger_count"]: parts.append(f"challenge:{r['challenger_count']}")
-            if r["synthesizer_count"]: parts.append(f"synthesize:{r['synthesizer_count']}")
-            if r["sourcer_count"]: parts.append(f"source:{r['sourcer_count']}")
-            if r["reviewer_count"]: parts.append(f"review:{r['reviewer_count']}")
-            if parts:
-                print(f"  {r['handle']}: {', '.join(parts)}")
-
-    conn.close()
-
-
-if __name__ == "__main__":
-    parser = argparse.ArgumentParser()
-    parser.add_argument("--dry-run", action="store_true")
-    parser.add_argument("--db", default=DB_PATH)
-    args = parser.parse_args()
-    backfill(args.db, args.dry_run)
--- a/scripts/backfill-descriptions.py
+++ b/scripts/backfill-descriptions.py
--- a/scripts/nightly-reweave.sh
+++ b/scripts/nightly-reweave.sh
@ -14,8 +14,8 @@ REWEAVE_SCRIPT="${PIPELINE_DIR}/reweave.py"
 LOG_DIR="/opt/teleo-eval/logs"
 LOCK_FILE="/opt/teleo-eval/workspaces/.reweave-nightly.lock"

-# Batch size per night — 200 orphans is ~$0.20 in Haiku calls
-BATCH_SIZE=200
+# Batch size per night — 50 orphans is ~$0.05 in Haiku calls
+BATCH_SIZE=50

 echo "=== Nightly reweave started at $(date -u +%Y-%m-%dT%H:%M:%SZ) ==="

--- a/scripts/reconcile-source-status.sh
+++ b/scripts/reconcile-source-status.sh
--- a/scripts/vector-gc.py
+++ b/scripts/vector-gc.py
--- a/research/prompts/changelog.md
+++ b/research/prompts/changelog.md
--- a/research/prompts/rio-system-v1.md
+++ b/research/prompts/rio-system-v1.md
--- a/docs/queue.md
+++ b/docs/queue.md
--- a/scripts/reconcile-sources.py
+++ b/scripts/reconcile-sources.py
--- a/research/prompts/research-prompt-leo-synthesis.md
+++ b/research/prompts/research-prompt-leo-synthesis.md
--- a/research/prompts/research-prompt-v2.md
+++ b/research/prompts/research-prompt-v2.md
--- a/research/research-session.sh
+++ b/research/research-session.sh
--- a/research/entity-session.sh
+++ b/research/entity-session.sh
@ -1,92 +0,0 @@
-#!/bin/bash
-set -e
-
-AGENT="rio"
-BRANCH="${AGENT}/entity-population-$(date +%Y-%m-%d)"
-WORKSPACE="/opt/teleo-eval/workspaces/entity-${AGENT}"
-LOG="/opt/teleo-eval/logs/entity-${AGENT}.log"
-BRIEF="/opt/teleo-eval/entity-research-brief.md"
-SCHEMA="/opt/teleo-eval/entity-schema.md"
-
-log() { echo "[$(date -Iseconds)] $1" | tee -a "$LOG"; }
-
-# Setup workspace
-if [ ! -d "$WORKSPACE" ]; then
-    log "Cloning fresh workspace..."
-    git clone http://localhost:3000/teleo/teleo-codex.git "$WORKSPACE"
-fi
-
-cd "$WORKSPACE"
-git checkout main
-git pull origin main
-git checkout -b "$BRANCH"
-
-# Copy schema into workspace
-cp "$SCHEMA" schemas/entity.md
-
-# Create entities directory
-mkdir -p entities/internet-finance
-
-log "On branch $BRANCH"
-log "Starting Claude entity population session..."
-
-# Build the prompt
-PROMPT="You are Rio, the internet finance domain agent for the Teleo Codex knowledge base.
-
-Your task: populate the first entity files for the knowledge base, focusing on the futarchic ecosystem.
-
-RESEARCH BRIEF:
-$(cat "$BRIEF")
-
-ENTITY SCHEMA:
-$(cat "$SCHEMA")
-
-INSTRUCTIONS:
-1. Read the research brief carefully
-2. Read the entity schema at schemas/entity.md
-3. Read existing claims in domains/internet-finance/ for context
-4. Read relevant source archives in inbox/archive/
-5. Use web search to find current data for each entity (market caps, metrics, recent events)
-6. Create entity files in entities/internet-finance/ following the schema exactly
-7. Start with the companies and people listed in the brief
-8. Create the market entity for futarchic markets
-9. Make sure all wiki links point to real existing files
-10. Add timeline events with dates
-11. Include competitive positioning for companies
-12. Include known positions and credibility basis for people
-
-Create all 12 entities listed in the brief. Quality over speed."
-
-# Run Claude
-timeout 5400 /home/teleo/.local/bin/claude -p "$PROMPT" \
-    --model opus \
-    --allowedTools Read,Write,Edit,Glob,Grep,WebSearch,WebFetch \
-     2>&1 | tee -a "$LOG" || true
-
-# Commit and push
-log "Session complete. Committing..."
-git add entities/ schemas/entity.md
-ENTITY_COUNT=$(find entities/ -name "*.md" | wc -l)
-git commit -m "rio: populate ${ENTITY_COUNT} entity files — futarchic ecosystem
-
- What: First entity population using new entity schema
- Why: Cory directive — agents need industry analysis, not just claims
- Schema: entities track companies, people, markets with temporal data
-
-Pentagon-Agent: Rio <CE7B8202-2877-4C70-8AAB-B05F832F50EA>" || log "Nothing to commit"
-
-git push -u origin "$BRANCH" || log "Push failed"
-
-# Create PR
-PR_URL=$(curl -s -X POST "http://localhost:3000/api/v1/repos/teleo/teleo-codex/pulls" \
-    -H "Authorization: token $(cat /opt/teleo-eval/secrets/forgejo-admin-token)" \
-    -H "Content-Type: application/json" \
-    -d "{
-        \"title\": \"rio: entity schema + ${ENTITY_COUNT} entity files — futarchic ecosystem\",
-        \"body\": \"## Summary\n\nNew entity schema + first population of entity files for the futarchic ecosystem.\n\nEntities track companies, people, and markets as dynamic objects with temporal attributes — a parallel input to beliefs alongside claims.\n\n### Entities created:\n- Companies: MetaDAO, Solomon, Ranger Finance, MycoRealms, Futardio, Aave, Polymarket\n- People: Stani Kulechov, Proph3t, Gabriel Shapiro, Felipe Montealegre\n- Markets: Futarchic Markets ecosystem\n\nDesigned by Leo, populated by Rio.\",
-        \"head\": \"${BRANCH}\",
-        \"base\": \"main\"
-    }" | python3 -c "import sys,json; print(json.load(sys.stdin).get('html_url','no url'))")
-
-log "PR opened: $PR_URL"
-log "=== Entity session complete for ${AGENT} ==="
--- a/research/vida-directed-session.sh
+++ b/research/vida-directed-session.sh
@ -1,212 +0,0 @@
-#!/bin/bash
-# Directed research session for Vida — MA/Senior Care/International
-# Wraps research-session.sh with a custom brief injected into the prompt
-set -euo pipefail
-
-AGENT="vida"
-MODEL="opus"
-REPO_DIR="/opt/teleo-eval/workspaces/research-${AGENT}"
-FORGEJO_URL="http://localhost:3000"
-FORGEJO_ADMIN_TOKEN=$(cat /opt/teleo-eval/secrets/forgejo-admin-token)
-AGENT_TOKEN=$(cat "/opt/teleo-eval/secrets/forgejo-${AGENT}-token")
-CLAUDE_BIN="/home/teleo/.local/bin/claude"
-LOG="/opt/teleo-eval/logs/research-${AGENT}.log"
-LOCKFILE="/tmp/research-${AGENT}.lock"
-DATE=$(date +%Y-%m-%d)
-BRANCH="${AGENT}/research-ma-senior-care-${DATE}"
-BRIEF_FILE="/opt/teleo-eval/vida-research-brief.md"
-DOMAIN="health"
-
-log() { echo "[$(date -Iseconds)] $*" >> "$LOG"; }
-
-# Lock
-if [ -f "$LOCKFILE" ]; then
-    pid=$(cat "$LOCKFILE" 2>/dev/null)
-    if kill -0 "$pid" 2>/dev/null; then
-        log "SKIP: research session already running for $AGENT (pid $pid)"
-        exit 0
-    fi
-    rm -f "$LOCKFILE"
-fi
-echo $$ > "$LOCKFILE"
-trap 'rm -f "$LOCKFILE"' EXIT
-
-log "=== Starting DIRECTED research session for $AGENT (model: $MODEL) ==="
-log "Topic: Medicare Advantage, Senior Care, International Comparisons"
-
-# Ensure repo
-if [ ! -d "$REPO_DIR/.git" ]; then
-    git -c http.extraHeader="Authorization: token $FORGEJO_ADMIN_TOKEN" \
-        clone "${FORGEJO_URL}/teleo/teleo-codex.git" "$REPO_DIR" >> "$LOG" 2>&1
-fi
-
-cd "$REPO_DIR"
-git config credential.helper "!f() { echo username=m3taversal; echo password=$FORGEJO_ADMIN_TOKEN; }; f"
-git remote set-url origin "${FORGEJO_URL}/teleo/teleo-codex.git" 2>/dev/null || true
-git checkout main >> "$LOG" 2>&1
-git pull --rebase >> "$LOG" 2>&1 || { git rebase --abort 2>/dev/null; git reset --hard origin/main >> "$LOG" 2>&1; }
-
-# Create branch
-git branch -D "$BRANCH" 2>/dev/null || true
-git checkout -b "$BRANCH" >> "$LOG" 2>&1
-
-# Read the brief
-BRIEF=$(cat "$BRIEF_FILE")
-
-RESEARCH_PROMPT="You are Vida, a Teleo knowledge base agent specializing in health and human flourishing.
-
-## Your Task: Directed Research Session
-
-You have a SPECIFIC research brief from the collective. This is not self-directed — follow the brief.
-
-### Step 1: Orient (5 min)
-Read these files:
- agents/vida/identity.md
- agents/vida/beliefs.md
- agents/vida/reasoning.md
- domains/health/_map.md
-
-### Step 2: Read Your Research Brief
-
-${BRIEF}
-
-### Step 3: Research via Web (75 min)
-
-For each track, use the WebSearch and WebFetch tools to find the specific sources listed in the brief. Archive everything substantive.
-
-**Search strategy:**
- Start with the named sources (MedPAC, KFF, Commonwealth Fund, etc.)
- Follow citations to primary data
- Look for recent (2024-2026) analysis that synthesizes historical data
- Don't just find one article per question — find the BEST source per question
-
-For each source found, create an archive file at:
-inbox/archive/YYYY-MM-DD-{author-or-org}-{brief-slug}.md
-
-Use this frontmatter:
---
-type: source
-title: \"Descriptive title\"
-author: \"Author or Organization\"
-url: https://original-url
-date: YYYY-MM-DD
-domain: health
-secondary_domains: []
-format: report | paper | article | data
-status: unprocessed
-priority: high | medium | low
-tags: [topic1, topic2]
---
-
-## Content
-[Key excerpts, data points, findings — enough for an extractor to work with]
-
-## Agent Notes
-**Why this matters:** [1-2 sentences connecting to beliefs]
-**What surprised me:** [Anything unexpected]
-**KB connections:** [Which existing health claims relate?]
-**Extraction hints:** [What claims should the extractor focus on?]
-
-## Curator Notes
-PRIMARY CONNECTION: [existing claim this most relates to]
-WHY ARCHIVED: [what gap this fills]
-EXTRACTION HINT: [scope the extractor's attention]
-
-### Step 3 Rules:
- Archive EVERYTHING substantive — do NOT extract claims yourself
- Set all sources to status: unprocessed
- Aim for 15-25 source archives across the three tracks
- Prioritize Track 1 (MA history) — that's the anchor
- Check inbox/archive/ for existing sources before creating duplicates
-
-### Step 4: Write Research Musing (5 min)
-Write to agents/vida/musings/research-ma-senior-care-${DATE}.md:
- What you found across the three tracks
- Key surprises or gaps
- Follow-up directions for next session
- Which of your beliefs got stronger or weaker
-
-### Step 5: Update Research Journal (3 min)
-Append to agents/vida/research-journal.md (create if needed):
-## Session ${DATE} — Medicare Advantage & Senior Care
-**Question:** [primary research question]
-**Key finding:** [most important thing learned]
-**Confidence shift:** [belief updates]
-
-### Step 6: Stop
-When done archiving and writing notes, STOP. Do not commit or push."
-
-log "Starting Claude Opus session..."
-timeout 5400 "$CLAUDE_BIN" -p "$RESEARCH_PROMPT" \
-    --allowedTools 'Read,Write,Edit,Glob,Grep,WebSearch,WebFetch' \
-    --model "$MODEL" \
-    --permission-mode bypassPermissions \
-    >> "$LOG" 2>&1 || {
-    log "WARN: Research session failed or timed out"
-    # Still try to commit whatever was produced
-}
-
-log "Claude session complete"
-
-# Check for changes
-CHANGED_FILES=$(git status --porcelain)
-if [ -z "$CHANGED_FILES" ]; then
-    log "No sources archived"
-    git checkout main >> "$LOG" 2>&1
-    exit 0
-fi
-
-# Stage and commit
-git add inbox/archive/ agents/vida/musings/ agents/vida/research-journal.md 2>/dev/null || true
-
-if git diff --cached --quiet; then
-    log "No valid changes to commit"
-    git checkout main >> "$LOG" 2>&1
-    exit 0
-fi
-
-SOURCE_COUNT=$(git diff --cached --name-only | grep -c "^inbox/archive/" || true)
-git commit -m "vida: directed research — MA, senior care, international comparisons
-
- ${SOURCE_COUNT} sources archived across 3 tracks
- Track 1: Medicare Advantage history & structure
- Track 2: Senior care infrastructure
- Track 3: International health system comparisons
-
-Pentagon-Agent: Vida <HEADLESS>" >> "$LOG" 2>&1
-
-git push -u origin "$BRANCH" --force >> "$LOG" 2>&1
-log "Pushed $BRANCH"
-
-# Open PR
-EXISTING_PR=$(curl -s "${FORGEJO_URL}/api/v1/repos/teleo/teleo-codex/pulls?state=open" \
-    -H "Authorization: token $AGENT_TOKEN" \
-    | jq -r ".[] | select(.head.ref == \"$BRANCH\") | .number" 2>/dev/null)
-
-if [ -n "$EXISTING_PR" ]; then
-    log "PR already exists (#$EXISTING_PR)"
-else
-    PR_JSON=$(jq -n \
-        --arg title "vida: directed research — Medicare Advantage, senior care, international comparisons" \
-        --arg body "## Directed Research Session
-
-Three-track investigation commissioned by Cory:
-
-**Track 1:** Medicare Advantage — full history from 1965 to present, risk adjustment, market structure, vertical integration
-**Track 2:** Senior care infrastructure — home health, PACE, caregiver crisis, aging demographics
-**Track 3:** International comparisons — Commonwealth Fund, Singapore, Costa Rica, NHS, Japan LTCI
-
-Sources archived for extraction by the claim pipeline." \
-        --arg base "main" \
-        --arg head "$BRANCH" \
-        '{title: $title, body: $body, base: $base, head: $head}')
-
-    curl -s -X POST "${FORGEJO_URL}/api/v1/repos/teleo/teleo-codex/pulls" \
-        -H "Authorization: token $AGENT_TOKEN" \
-        -H "Content-Type: application/json" \
-        -d "$PR_JSON" >> "$LOG" 2>&1
-    log "PR opened"
-fi
-
-git checkout main >> "$LOG" 2>&1
-log "=== Directed research session complete ==="
--- a/reweave.py
+++ b/reweave.py
@ -50,7 +50,7 @@ EDGE_FIELDS = ("supports", "challenges", "challenged_by", "depends_on", "related
 WIKI_LINK_RE = re.compile(r"\[\[([^\]]+)\]\]")

 # Thresholds (from calibration data — Mar 28)
-DEFAULT_THRESHOLD = 0.55       # Lowered from 0.70 — text-embedding-3-small scores 0.50-0.60 on conceptual matches
+DEFAULT_THRESHOLD = 0.70       # Elbow in score distribution
 DEFAULT_MAX_ORPHANS = 50       # Keep PRs reviewable
 DEFAULT_MAX_NEIGHBORS = 3      # Don't over-connect
 HAIKU_CONFIDENCE_FLOOR = 0.85  # Below this → default to "related"
@ -535,9 +535,8 @@ def _write_edge_regex(neighbor_path: Path, fm_text: str, body_text: str,
    field_re = re.compile(rf"^{edge_type}:\s*$", re.MULTILINE)
    inline_re = re.compile(rf'^{edge_type}:\s*\[', re.MULTILINE)

-    from lib.frontmatter import _yaml_quote
-    entry_line = f'- {_yaml_quote(orphan_title)}'
-    rw_line = f'- {_yaml_quote(orphan_title + "|" + edge_type + "|" + date_str)}'
+    entry_line = f'- {orphan_title}'
+    rw_line = f'- {orphan_title}|{edge_type}|{date_str}'

    if field_re.search(fm_text):
        # Multi-line list exists — find end of list, append
--- a/docs/schema-change-protocol.md
+++ b/docs/schema-change-protocol.md
--- a/scripts/audit-wiki-links.py
+++ b/scripts/audit-wiki-links.py
@ -1,259 +0,0 @@
-#!/usr/bin/env python3
-"""Audit wiki-links across the teleo-codex knowledge base.
-
-Crawls domains/, foundations/, core/, decisions/ for [[wiki-links]].
-Resolves each link against known claim files, entity files, and _map files.
-Reports dead links, orphaned claims, and link counts.
-
-Output: JSON to stdout with dead links, orphans, and per-file link counts.
-"""
-
-import json
-import os
-import re
-import sys
-import unicodedata
-from pathlib import Path
-
-CODEX_ROOT = Path(os.environ.get("CODEX_ROOT", "/opt/teleo-eval/workspaces/main"))
-CLAIM_DIRS = ["domains", "foundations", "core", "decisions"]
-ENTITY_DIR = "entities"
-
-WIKI_LINK_RE = re.compile(r"\[\[([^\]]+)\]\]")
-
-
-def slugify(title: str) -> str:
-    """Convert a wiki-link title to the kebab-case slug used for filenames."""
-    s = title.strip().lower()
-    s = unicodedata.normalize("NFKD", s)
-    s = re.sub(r"[^\w\s-]", "", s)
-    s = re.sub(r"[\s_]+", "-", s)
-    s = re.sub(r"-+", "-", s)
-    return s.strip("-")
-
-
-def build_index(codex: Path) -> dict:
-    """Build a lookup index of all resolvable targets.
-
-    Returns dict mapping normalized slug -> file path.
-    Also maps raw stem (filename without .md) -> file path.
-    """
-    index = {}
-
-    # Index claim files across all claim directories
-    for claim_dir in CLAIM_DIRS:
-        d = codex / claim_dir
-        if not d.exists():
-            continue
-        for md in d.rglob("*.md"):
-            stem = md.stem
-            rel = str(md.relative_to(codex))
-            # Map by stem (exact filename match)
-            index[stem.lower()] = rel
-            # Map by slugified stem
-            index[slugify(stem)] = rel
-
-    # Index entity files
-    entity_root = codex / ENTITY_DIR
-    if entity_root.exists():
-        for md in entity_root.rglob("*.md"):
-            stem = md.stem
-            rel = str(md.relative_to(codex))
-            index[stem.lower()] = rel
-            index[slugify(stem)] = rel
-
-    # Index maps/ directory (MOC-style overview docs)
-    maps_root = codex / "maps"
-    if maps_root.exists():
-        for md in maps_root.rglob("*.md"):
-            stem = md.stem
-            rel = str(md.relative_to(codex))
-            index[stem.lower()] = rel
-            index[slugify(stem)] = rel
-
-    # Index top-level docs that might be link targets
-    for special in ["overview.md", "livingip-overview.md"]:
-        p = codex / special
-        if p.exists():
-            index[p.stem.lower()] = str(p.relative_to(codex))
-
-    # Index agents/ beliefs and positions (sometimes linked)
-    agents_dir = codex / "agents"
-    if agents_dir.exists():
-        for md in agents_dir.rglob("*.md"):
-            stem = md.stem
-            rel = str(md.relative_to(codex))
-            index[stem.lower()] = rel
-
-    return index
-
-
-def resolve_link(link_text: str, index: dict, source_dir: str) -> str | None:
-    """Try to resolve a wiki-link target. Returns file path or None."""
-    text = link_text.strip()
-
-    # Special case: [[_map]] resolves to _map.md in the same domain directory
-    if text == "_map":
-        parts = source_dir.split("/")
-        if len(parts) >= 2:
-            candidate = f"{parts[0]}/{parts[1]}/_map.md"
-            if (CODEX_ROOT / candidate).exists():
-                return candidate
-        return None
-
-    # Path-style references like [[domains/health/_map]]
-    if "/" in text:
-        candidate = text.rstrip("/")
-        if not candidate.endswith(".md"):
-            candidate += ".md"
-        if (CODEX_ROOT / candidate).exists():
-            return candidate
-        return None
-
-    # Try exact stem match (lowercased)
-    key = text.lower()
-    if key in index:
-        return index[key]
-
-    # Try slugified version
-    slug = slugify(text)
-    if slug in index:
-        return index[slug]
-
-    # Try with common variations
-    for variant in [
-        slug.replace("metadaos", "metadao"),
-        slug.replace("ais", "ai"),
-    ]:
-        if variant in index:
-            return index[variant]
-
-    return None
-
-
-def audit(codex: Path) -> dict:
-    """Run the full wiki-link audit."""
-    index = build_index(codex)
-
-    dead_links = []       # {file, link, line_number}
-    link_counts = {}      # file -> {outbound: N, targets: []}
-    all_targets = set()   # files that are linked TO
-    all_files = set()     # all claim/foundation files
-
-    # Scan all markdown files in claim directories
-    for claim_dir in CLAIM_DIRS:
-        d = codex / claim_dir
-        if not d.exists():
-            continue
-        for md in d.rglob("*.md"):
-            rel = str(md.relative_to(codex))
-            all_files.add(rel)
-            source_dir = str(md.parent.relative_to(codex))
-
-            try:
-                content = md.read_text(encoding="utf-8")
-            except Exception:
-                continue
-
-            links_in_file = []
-            for i, line in enumerate(content.split("\n"), 1):
-                for match in WIKI_LINK_RE.finditer(line):
-                    link_text = match.group(1)
-                    # Links with | carry a display-text alias; keep only the target part
-                    if "|" in link_text:
-                        link_text = link_text.split("|")[0].strip()
-
-                    resolved = resolve_link(link_text, index, source_dir)
-                    if resolved:
-                        all_targets.add(resolved)
-                        links_in_file.append(resolved)
-                    else:
-                        dead_links.append({
-                            "file": rel,
-                            "link": link_text,
-                            "line": i,
-                        })
-
-            link_counts[rel] = {
-                "outbound": len(links_in_file),
-                "targets": links_in_file,
-            }
-
-    # Find orphaned claims (no inbound links AND no outbound links)
-    files_with_outbound = {f for f, c in link_counts.items() if c["outbound"] > 0}
-    orphaned = sorted(
-        f for f in all_files
-        if f not in all_targets
-        and f not in files_with_outbound
-        and not f.endswith("_map.md")  # MOC files are structural, not orphans
-    )
-
-    # Compute inbound link counts
-    inbound_counts = {}
-    for f, c in link_counts.items():
-        for target in c["targets"]:
-            inbound_counts[target] = inbound_counts.get(target, 0) + 1
-
-    # Claims with high outbound (good connectivity)
-    high_connectivity = sorted(
-        [(f, c["outbound"]) for f, c in link_counts.items() if c["outbound"] >= 3],
-        key=lambda x: -x[1],
-    )
-
-    # Summary stats
-    total_links = sum(c["outbound"] for c in link_counts.values())
-    files_with_links = sum(1 for c in link_counts.values() if c["outbound"] > 0)
-
-    # Domain breakdown of dead links
-    dead_by_domain = {}
-    for dl in dead_links:
-        parts = dl["file"].split("/")
-        domain = parts[1] if len(parts) >= 3 else parts[0]
-        dead_by_domain[domain] = dead_by_domain.get(domain, 0) + 1
-
-    # Domain breakdown of orphans
-    orphan_by_domain = {}
-    for o in orphaned:
-        parts = o.split("/")
-        domain = parts[1] if len(parts) >= 3 else parts[0]
-        orphan_by_domain[domain] = orphan_by_domain.get(domain, 0) + 1
-
-    return {
-        "summary": {
-            "total_files": len(all_files),
-            "total_links": total_links,
-            "files_with_links": files_with_links,
-            "files_without_links": len(all_files) - files_with_links,
-            "dead_link_count": len(dead_links),
-            "orphan_count": len(orphaned),
-            "avg_links_per_file": round(total_links / max(len(all_files), 1), 2),
-            "high_connectivity_count": len(high_connectivity),
-        },
-        "dead_links": dead_links,
-        "dead_by_domain": dict(sorted(dead_by_domain.items(), key=lambda x: -x[1])),
-        "orphaned": orphaned,
-        "orphan_by_domain": dict(sorted(orphan_by_domain.items(), key=lambda x: -x[1])),
-        "high_connectivity": [{"file": f, "outbound_links": n} for f, n in high_connectivity[:20]],
-        "inbound_top20": sorted(
-            [{"file": f, "inbound_links": n} for f, n in inbound_counts.items()],
-            key=lambda x: -x["inbound_links"],
-        )[:20],
-    }
-
-
-if __name__ == "__main__":
-    codex = Path(sys.argv[1]) if len(sys.argv) > 1 else CODEX_ROOT
-    result = audit(codex)
-    json.dump(result, sys.stdout, indent=2)
-    print()
-
-    # Print human-readable summary to stderr
-    s = result["summary"]
-    print(f"\n=== Wiki-Link Audit ===", file=sys.stderr)
-    print(f"Files scanned: {s['total_files']}", file=sys.stderr)
-    print(f"Total links: {s['total_links']}", file=sys.stderr)
-    print(f"Files with links: {s['files_with_links']} ({100*s['files_with_links']//max(s['total_files'],1)}%)", file=sys.stderr)
-    print(f"Dead links: {s['dead_link_count']}", file=sys.stderr)
-    print(f"Orphaned claims: {s['orphan_count']}", file=sys.stderr)
-    print(f"Avg links/file: {s['avg_links_per_file']}", file=sys.stderr)
-    print(f"High connectivity (≥3 links): {s['high_connectivity_count']}", file=sys.stderr)
--- a/scripts/backfill-events.py
+++ b/scripts/backfill-events.py
@ -1,618 +0,0 @@
-#!/usr/bin/env python3
-"""Backfill contribution_events by replaying merged PRs from pipeline.db + worktree.
-
-For each merged PR:
-  - Derive author from prs.submitted_by → git author → branch prefix
-  - Emit author event (role=author, weight=0.30, claim_path=NULL)
-  - For each claim file under a knowledge prefix, parse frontmatter and emit
-    originator events for sourcer entries that differ from the author
-  - Emit evaluator events for Leo (when leo_verdict='approve') and domain_agent
-    (when domain_verdict='approve' and not Leo)
-  - Emit challenger/synthesizer events for Pentagon-Agent trailers on
-    agent-owned branches (theseus/*, rio/*, etc.) based on commit_type
-
-Idempotent via the partial UNIQUE indexes on contribution_events. Safe to re-run.
-
-Usage:
-  python3 scripts/backfill-events.py --dry-run     # Count events without writing
-  python3 scripts/backfill-events.py               # Apply
-
-Runs read-only against the git worktree; only writes to pipeline.db.
-"""
-import argparse
-import os
-import re
-import sqlite3
-import subprocess
-import sys
-from collections import Counter
-from pathlib import Path
-
-DB_PATH = os.environ.get("PIPELINE_DB", "/opt/teleo-eval/pipeline/pipeline.db")
-REPO_DIR = os.environ.get("REPO_DIR", "/opt/teleo-eval/workspaces/main")
-
-# Role weights — must match lib/contributor.py ROLE_WEIGHTS.
-ROLE_WEIGHTS = {
-    "author": 0.30,
-    "challenger": 0.25,
-    "synthesizer": 0.20,
-    "originator": 0.15,
-    "evaluator": 0.05,
-}
-
-PENTAGON_AGENTS = frozenset({
-    "rio", "leo", "theseus", "vida", "clay", "astra",
-    "oberon", "argus", "rhea", "ganymede", "epimetheus", "hermes", "ship",
-    "pipeline",
-})
-
-# Keep in sync with lib/attribution.AGENT_BRANCH_PREFIXES.
-# Duplicated here because this script runs standalone (no pipeline package import).
-AGENT_BRANCH_PREFIXES = (
-    "rio/", "theseus/", "leo/", "vida/", "astra/", "clay/", "oberon/",
-)
-
-TRAILER_EVENT_ROLE = {
-    "challenge": "challenger",
-    "enrich": "synthesizer",
-    "research": "synthesizer",
-    "reweave": "synthesizer",
-}
-
-KNOWLEDGE_PREFIXES = ("domains/", "core/", "foundations/", "decisions/")
-
-BOT_AUTHORS = frozenset({
-    "teleo", "teleo-bot", "pipeline",
-    "github-actions[bot]", "forgejo-actions",
-})
-
-
-def normalize_handle(conn: sqlite3.Connection, handle: str) -> str:
-    if not handle:
-        return ""
-    h = handle.strip().lower().lstrip("@")
-    row = conn.execute("SELECT canonical FROM contributor_aliases WHERE alias = ?", (h,)).fetchone()
-    if row:
-        return row[0]
-    return h
-
-
-def classify_kind(handle: str) -> str:
-    h = handle.strip().lower().lstrip("@")
-    return "agent" if h in PENTAGON_AGENTS else "person"
-
-
-def parse_frontmatter(text: str):
-    """Minimal YAML frontmatter parser using PyYAML when available."""
-    if not text.startswith("---"):
-        return None
-    end = text.find("---", 3)
-    if end == -1:
-        return None
-    raw = text[3:end]
-    try:
-        import yaml
-        fm = yaml.safe_load(raw)
-        return fm if isinstance(fm, dict) else None
-    except ImportError:
-        return None
-    except Exception:
-        return None
-
-
-def extract_sourcers_from_file(path: Path) -> list[str]:
-    """Return the sourcer handles from a claim file's frontmatter.
-
-    Matches three formats:
-      1. Block: `attribution: { sourcer: [{handle: "x"}, ...] }`
-      2. Bare-key flat: `sourcer: alexastrum`
-      3. Prefix-keyed: `attribution_sourcer: alexastrum`
-    """
-    try:
-        content = path.read_text(encoding="utf-8")
-    except (FileNotFoundError, PermissionError, UnicodeDecodeError):
-        return []
-    fm = parse_frontmatter(content)
-    if not fm:
-        return []
-
-    handles: list[str] = []
-
-    attr = fm.get("attribution")
-    if isinstance(attr, dict):
-        entries = attr.get("sourcer", [])
-        if isinstance(entries, list):
-            for e in entries:
-                if isinstance(e, dict) and "handle" in e:
-                    handles.append(e["handle"])
-                elif isinstance(e, str):
-                    handles.append(e)
-        elif isinstance(entries, str):
-            handles.append(entries)
-        return handles
-
-    flat = fm.get("attribution_sourcer")
-    if flat:
-        if isinstance(flat, str):
-            handles.append(flat)
-        elif isinstance(flat, list):
-            handles.extend(v for v in flat if isinstance(v, str))
-        if handles:
-            return handles
-
-    bare = fm.get("sourcer")
-    if bare:
-        if isinstance(bare, str):
-            handles.append(bare)
-        elif isinstance(bare, list):
-            handles.extend(v for v in bare if isinstance(v, str))
-
-    return handles
-
-
-_HANDLE_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,38}$")
-
-
-def valid_handle(h: str) -> bool:
-    if not h:
-        return False
-    lower = h.strip().lower().lstrip("@")
-    if lower.endswith("-") or lower.endswith("_"):
-        return False
-    return bool(_HANDLE_RE.match(lower))
-
-
-def git(*args, cwd: str = REPO_DIR, timeout: int = 30) -> str:
-    """Run a git command, return stdout. Returns empty string on failure."""
-    try:
-        result = subprocess.run(
-            ["git", *args],
-            cwd=cwd, capture_output=True, text=True, timeout=timeout, check=False,
-        )
-        return result.stdout
-    except (subprocess.TimeoutExpired, OSError):
-        return ""
-
-
-def git_first_commit_author(pr_branch: str, merged_at: str) -> str:
-    """Best-effort: find git author of first non-merge commit on the branch.
-
-    PR branches are usually deleted after merge. We fall back to scanning main
-    commits around merged_at for commits matching the branch slug.
-    """
-    # Post-merge branches are cleaned up. For the backfill, we accept that this
-    # path rarely yields results and rely on submitted_by + branch prefix.
-    return ""
-
-
-def derive_author(conn: sqlite3.Connection, pr: dict) -> str | None:
-    """Author precedence: submitted_by → branch-prefix agent for agent-owned branches."""
-    if pr.get("submitted_by"):
-        cand = pr["submitted_by"].strip().lower().lstrip("@")
-        if cand and cand not in BOT_AUTHORS:
-            return cand
-    branch = pr.get("branch") or ""
-    if "/" in branch:
-        prefix = branch.split("/", 1)[0].lower()
-        if prefix in ("rio", "theseus", "leo", "vida", "clay", "astra", "oberon"):
-            return prefix
-    return None
-
-
-def find_pr_for_claim(
-    conn: sqlite3.Connection,
-    repo: Path,
-    md: Path,
-) -> tuple[int | None, str]:
-    """Recover the Forgejo PR number that introduced a claim file.
-
-    Returns (pr_number, strategy) — strategy is one of:
-      'sourced_from' — frontmatter sourced_from matched prs.source_path
-      'git_subject'  — git log first-add commit message matched a branch pattern
-      'title_desc'   — filename stem matched a title in prs.description
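-      'github_pr'    — recovery commit mentioned GitHub PR # → prs.github_pr
-      'git_time_proximity' — bare "Extract N claims" commit matched the earliest
-                       agent-branch PR merged within 24h after it
-      'none'         — no strategy found a match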
-
-    Order is chosen by reliability:
-      1. sourced_from (explicit provenance, most reliable when present)
-      2. git_subject  (covers Leo research, Cameron challenges, Theseus contrib)
-      3. title_desc   (current fallback — brittle when description is NULL)
-      4. github_pr    (recovery commits referencing erased GitHub PRs)
-    """
-    rel = str(md.relative_to(repo))
-
-    # Strategy 1: sourced_from frontmatter → prs.source_path
-    try:
-        content = md.read_text(encoding="utf-8")
-    except (FileNotFoundError, PermissionError, UnicodeDecodeError):
-        content = ""
-    fm = parse_frontmatter(content) if content else None
-    if fm:
-        sourced = fm.get("sourced_from")
-        candidate_paths: list[str] = []
-        if isinstance(sourced, str) and sourced:
-            candidate_paths.append(sourced)
-        elif isinstance(sourced, list):
-            candidate_paths.extend(s for s in sourced if isinstance(s, str))
-        for sp in candidate_paths:
-            stem = Path(sp).stem
-            if not stem:
-                continue
-            row = conn.execute(
-                """SELECT number FROM prs
-                   WHERE source_path LIKE ? AND status='merged'
-                   ORDER BY merged_at ASC LIMIT 1""",
-                (f"%{stem}.md",),
-            ).fetchone()
-            if row:
-                return row["number"], "sourced_from"
-
-    # Strategy 2: git log first-add commit → subject pattern → prs.branch
-    # Default log order is reverse-chronological; take the last line (oldest)
-    # to get the original addition, not later rewrites.
-    log_out = git(
-        "log", "--diff-filter=A", "--follow",
-        "--format=%H|||%s|||%b", "--", rel,
-    )
-    if log_out.strip():
-        # Split on the delimiter we chose. Each commit produces 3 fields but
-        # %b can contain blank lines — group by lines that look like a SHA.
-        blocks: list[tuple[str, str, str]] = []
-        current: list[str] = []
-        for line in log_out.splitlines():
-            if re.match(r"^[a-f0-9]{40}\|\|\|", line):
-                if current:
-                    parts = "\n".join(current).split("|||", 2)
-                    if len(parts) == 3:
-                        blocks.append((parts[0], parts[1], parts[2]))
-                current = [line]
-            else:
-                current.append(line)
-        if current:
-            parts = "\n".join(current).split("|||", 2)
-            if len(parts) == 3:
-                blocks.append((parts[0], parts[1], parts[2]))
-        if blocks:
-            # Oldest addition — git log defaults to reverse-chronological
-            _oldest_sha, subject, body = blocks[-1]
-
-            # Pattern: "<agent>: extract claims from <slug>"
-            m = re.match(r"^(\w+):\s*extract\s+claims\s+from\s+(\S+)", subject)
-            if m:
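-                # str.rstrip strips a character set, not a suffix; removesuffix keeps slugs ending in m/d intact.
-                slug = m.group(2).rstrip(".").removesuffix(".md")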
-                row = conn.execute(
-                    """SELECT number FROM prs
-                       WHERE branch LIKE ? AND status='merged'
-                       ORDER BY merged_at ASC LIMIT 1""",
-                    (f"extract/{slug}%",),
-                ).fetchone()
-                if row:
-                    return row["number"], "git_subject"
-
-            # Pattern: "<agent>: research session <date>"
-            m = re.match(r"^(\w+):\s*research\s+session\s+(\d{4}-\d{2}-\d{2})", subject)
-            if m:
-                agent = m.group(1).lower()
-                date = m.group(2)
-                row = conn.execute(
-                    """SELECT number FROM prs
-                       WHERE branch LIKE ? AND status='merged'
-                       ORDER BY merged_at ASC LIMIT 1""",
-                    (f"{agent}/research-{date}%",),
-                ).fetchone()
-                if row:
-                    return row["number"], "git_subject"
-
-            # Pattern: "<agent>: challenge" / contrib challenges / entity batches
-            m = re.match(r"^(\w+):\s*(?:challenge|contrib|entity|synthesize)", subject)
-            if m:
-                agent = m.group(1).lower()
-                row = conn.execute(
-                    """SELECT number FROM prs
-                       WHERE branch LIKE ? AND status='merged'
-                       ORDER BY merged_at ASC LIMIT 1""",
-                    (f"{agent}/%",),
-                ).fetchone()
-                if row:
-                    return row["number"], "git_subject"
-
-            # Recovery commits referencing erased GitHub PRs (Alex/Cameron).
-            # Subject: "Recover <who> contribution from GitHub PR #NN (...)".
-            # Match only when a corresponding prs row exists with github_pr=NN —
-            # otherwise the claims were direct-to-main without a Forgejo PR
-            # record, which requires a synthetic PR row (follow-up, not in
-            # this script's scope).
-            gh_match = re.search(r"GitHub\s+PR\s+#(\d+)", subject + "\n" + body)
-            if gh_match:
-                gh_pr = int(gh_match.group(1))
-                row = conn.execute(
-                    "SELECT number FROM prs WHERE github_pr = ? AND status='merged' LIMIT 1",
-                    (gh_pr,),
-                ).fetchone()
-                if row:
-                    return row["number"], "github_pr"
-
-            # Pattern: bare "Extract N claims from <source-fragment>" (no
-            # agent prefix). Used in early research PRs like Shaga's claims
-            # at PR #2025. Fall back to time-proximity: find the earliest
-            # agent-branch PR merged within 24h AFTER this commit's date.
-            m = re.match(r"^Extract\s+\d+\s+claims\s+from\b", subject)
-            if m:
-                # Get commit author date
-                date_out = git(
-                    "log", "-1", "--format=%aI", _oldest_sha, timeout=10,
-                )
-                commit_date = date_out.strip() if date_out.strip() else None
-                if commit_date:
-                    # git %aI returns ISO 8601 with T-separator; prs.merged_at
-                    # uses SQLite's space-separator. Lexicographic comparison
-                    # fails across formats (space<T), so normalize commit_date
-                    # via datetime() before comparing. Without this, PRs merged
-                    # within the same calendar day but earlier than the commit
-                    # hour are silently excluded (caught by Ganymede review —
-                    # Shaga's #2025 was dropped in favor of later #2032).
-                    row = conn.execute(
-                        """SELECT number FROM prs
-                           WHERE status='merged'
-                             AND merged_at >= datetime(?)
-                             AND merged_at <= datetime(datetime(?), '+24 hours')
-                             AND (branch LIKE 'leo/%' OR branch LIKE 'theseus/%'
-                                  OR branch LIKE 'rio/%' OR branch LIKE 'astra/%'
-                                  OR branch LIKE 'vida/%' OR branch LIKE 'clay/%')
-                           ORDER BY merged_at ASC LIMIT 1""",
-                        (commit_date, commit_date),
-                    ).fetchone()
-                    if row:
-                        return row["number"], "git_time_proximity"
-
-    return None, "none"
-
-
-def emit(conn, counts, dry_run, handle, role, pr_number, claim_path, domain, channel, timestamp):
-    canonical = normalize_handle(conn, handle)
-    if not valid_handle(canonical):
-        return
-    kind = classify_kind(canonical)
-    weight = ROLE_WEIGHTS[role]
-    counts[(role, "attempt")] += 1
-    if dry_run:
-        counts[(role, "would_insert")] += 1
-        return
-    cur = conn.execute(
-        """INSERT OR IGNORE INTO contribution_events
-           (handle, kind, role, weight, pr_number, claim_path, domain, channel, timestamp)
-           VALUES (?, ?, ?, ?, ?, ?, ?, ?, COALESCE(?, datetime('now')))""",
-        (canonical, kind, role, weight, pr_number, claim_path, domain, channel, timestamp),
-    )
-    if cur.rowcount > 0:
-        counts[(role, "inserted")] += 1
-    else:
-        counts[(role, "skipped_dup")] += 1
-
-
-def files_added_in_pr(pr_number: int, branch: str) -> list[str]:
-    """Best-effort: list added .md files in the PR.
-
-    Uses prs.source_path as a fallback signal (the claim being added). If the
-    branch no longer exists post-merge, this will return []; we accept the loss
-    for historical PRs where the granular per-claim events can't be recovered —
-    PR-level author/evaluator events still land correctly.
-    """
-    # Post-merge PR branches are deleted from Forgejo so we can't diff them.
-    # For the backfill we use prs.source_path — for extract/* PRs this points to
-    # the source inbox file; we can glob the claim files from the extract branch
-    # commit on main. But main's commits don't track which files a given PR touched.
-    # Accept the loss: backfill emits only PR-level events (author, evaluator,
-    # challenger/synthesizer). Originator events come from parsing claim files
-    # attributed to the branch via description field which lists claim titles.
-    return []
-
-
-def main():
-    parser = argparse.ArgumentParser()
-    parser.add_argument("--dry-run", action="store_true")
-    parser.add_argument("--limit", type=int, default=0, help="Process at most N PRs (0 = all)")
-    args = parser.parse_args()
-
-    if not Path(DB_PATH).exists():
-        print(f"ERROR: DB not found at {DB_PATH}", file=sys.stderr)
-        sys.exit(1)
-
-    conn = sqlite3.connect(DB_PATH, timeout=30)
-    conn.row_factory = sqlite3.Row
-
-    # Sanity: contribution_events exists (v24 migration applied)
-    try:
-        conn.execute("SELECT 1 FROM contribution_events LIMIT 1")
-    except sqlite3.OperationalError:
-        print("ERROR: contribution_events table missing. Run migration v24 first.", file=sys.stderr)
-        sys.exit(2)
-
-    # Walk all merged knowledge PRs
-    query = """
-        SELECT number, branch, domain, source_channel, submitted_by,
-               leo_verdict, domain_verdict, domain_agent,
-               commit_type, merged_at
-        FROM prs
-        WHERE status = 'merged'
-        ORDER BY merged_at ASC
-    """
-    if args.limit:
-        query += f" LIMIT {args.limit}"
-    prs = conn.execute(query).fetchall()
-    print(f"Replaying {len(prs)} merged PRs (dry_run={args.dry_run})...")
-
-    counts: Counter = Counter()
-    repo = Path(REPO_DIR)
-
-    for pr in prs:
-        pr_number = pr["number"]
-        branch = pr["branch"] or ""
-        domain = pr["domain"]
-        channel = pr["source_channel"]
-        merged_at = pr["merged_at"]
-
-        # Skip pipeline-only branches for author credit (extract/*, reweave/*,
-        # fix/*, ingestion/*, epimetheus/*) — those are infrastructure. But
-        # evaluator credit for Leo/domain_agent still applies.
-        is_pipeline_branch = branch.startswith((
-            "extract/", "reweave/", "fix/", "ingestion/", "epimetheus/",
-        ))
-
-        # ── AUTHOR ──
-        # For pipeline branches, submitted_by carries the real author (the
-        # human who submitted the source via Telegram/etc). For agent branches,
-        # the agent is author. For external branches (gh-pr-*), git author is
-        # in submitted_by from the sync-mirror pipeline.
-        author = derive_author(conn, dict(pr))
-        if author:
-            emit(conn, counts, args.dry_run, author, "author", pr_number,
-                 None, domain, channel, merged_at)
-
-        # ── EVALUATOR ──
-        if pr["leo_verdict"] == "approve":
-            emit(conn, counts, args.dry_run, "leo", "evaluator", pr_number,
-                 None, domain, channel, merged_at)
-        if pr["domain_verdict"] == "approve" and pr["domain_agent"]:
-            dagent = pr["domain_agent"].strip().lower()
-            if dagent and dagent != "leo":
-                emit(conn, counts, args.dry_run, dagent, "evaluator", pr_number,
-                     None, domain, channel, merged_at)
-
-        # ── CHALLENGER / SYNTHESIZER from branch+commit_type ──
-        # Only fires on agent-owned branches. Pipeline branches aren't creditable
-        # work (they're machine extraction, evaluator already captures the review).
-        if branch.startswith(AGENT_BRANCH_PREFIXES):
-            prefix = branch.split("/", 1)[0].lower()
-            event_role = TRAILER_EVENT_ROLE.get(pr["commit_type"] or "")
-            if event_role:
-                emit(conn, counts, args.dry_run, prefix, event_role, pr_number,
-                     None, domain, channel, merged_at)
-
-        # ── ORIGINATOR per claim ──
-        # Walk claim files currently on main whose content was added in this PR.
-        # We can't diff old branches (deleted post-merge), but for extract PRs
-        # the source_path + description carry claim titles — too lossy to build
-        # per-claim events reliably. Strategy: walk ALL claim files that have a
-        # sourcer in their frontmatter and assign them to the PR whose
-        # source_path matches (via description or filename heuristic).
-        # DEFERRED: per-claim originator events require branch introspection
-        # that fails on deleted branches. Backfill emits PR-level events only.
-        # Forward traffic (post-deploy) gets per-claim originator events via
-        # record_contributor_attribution's added-files walk.
-
-    if not args.dry_run:
-        conn.commit()
-
-    # Originator is emitted in the claim-level pass below, not the PR-level pass.
-    # Previous summary listed it here with attempted=0 which confused operators.
-    print("\n=== PR-level events (author, evaluator, challenger, synthesizer) ===")
-    for role in ("author", "challenger", "synthesizer", "evaluator"):
-        att = counts[(role, "attempt")]
-        if args.dry_run:
-            wi = counts[(role, "would_insert")]
-            print(f"  {role:12s} attempted={att:5d} would_insert={wi:5d}")
-        else:
-            ins = counts[(role, "inserted")]
-            skip = counts[(role, "skipped_dup")]
-            print(f"  {role:12s} attempted={att:5d} inserted={ins:5d} skipped_dup={skip:5d}")
-
-    # ── Per-claim originator pass ──
-    # Walk the knowledge tree, parse sourcer attribution, and attach each claim
-    # to its merging PR via find_pr_for_claim's multi-strategy recovery.
-    # Apr 24 rewrite (Ganymede-approved): replaces the single-strategy
-    # title→description match with four strategies in reliability order.
-    # Previous script missed PRs with NULL description (Cameron #3377) and
-    # cross-context claims (Shaga's Leo research). Fallback title-match is
-    # preserved to recover anything the git-log path misses.
-    print("\n=== Claim-level originator pass ===")
-    # Build title → pr_number map from prs.description (strategy 3 fallback)
-    title_to_pr: dict[str, int] = {}
-    for r in conn.execute(
-        "SELECT number, description FROM prs WHERE status='merged' AND description IS NOT NULL AND description != ''"
-    ).fetchall():
-        desc = r["description"] or ""
-        for title in desc.split(" | "):
-            title = title.strip()
-            if title:
-                # Last-writer wins. Conflicts are rare (titles unique in practice).
-                title_to_pr[title.lower()] = r["number"]
-
-    claim_counts = Counter()
-    strategy_counts = Counter()
-    claim_count = 0
-    originator_count = 0
-    for md in sorted(repo.glob("domains/**/*.md")) + \
-              sorted(repo.glob("core/**/*.md")) + \
-              sorted(repo.glob("foundations/**/*.md")) + \
-              sorted(repo.glob("decisions/**/*.md")):
-        rel = str(md.relative_to(repo))
-        stem = md.stem
-
-        # Strategies 1, 2, 4 via the helper (sourced_from, git_subject, github_pr).
-        pr_number, strategy = find_pr_for_claim(conn, repo, md)
-
-        # Strategy 3 (fallback): title-match against prs.description.
-        if not pr_number:
-            pr_number = title_to_pr.get(stem.lower())
-            if not pr_number:
-                pr_number = title_to_pr.get(stem.replace("-", " ").lower())
-            if pr_number:
-                strategy = "title_desc"
-
-        if not pr_number:
-            claim_counts["no_pr_match"] += 1
-            continue
-
-        sourcers = extract_sourcers_from_file(md)
-        if not sourcers:
-            claim_counts["no_sourcer"] += 1
-            continue
-
-        claim_count += 1
-        strategy_counts[strategy] += 1
-        # Look up author for this PR to skip self-credit
-        pr_row = conn.execute(
-            "SELECT submitted_by, branch, domain, source_channel, merged_at FROM prs WHERE number = ?",
-            (pr_number,),
-        ).fetchone()
-        if not pr_row:
-            continue
-        author = derive_author(conn, dict(pr_row))
-        author_canonical = normalize_handle(conn, author) if author else None
-
-        for src_handle in sourcers:
-            src_canonical = normalize_handle(conn, src_handle)
-            if not valid_handle(src_canonical):
-                claim_counts["invalid_handle"] += 1
-                continue
-            if src_canonical == author_canonical:
-                claim_counts["skip_self"] += 1
-                continue
-            emit(conn, counts, args.dry_run, src_handle, "originator", pr_number,
-                 rel, pr_row["domain"], pr_row["source_channel"], pr_row["merged_at"])
-            originator_count += 1
-
-    if not args.dry_run:
-        conn.commit()
-
-    print(f"  Claims processed: {claim_count}")
-    print(f"  Originator events emitted: {originator_count}")
-    print(f"  Breakdown: {dict(claim_counts)}")
-    print(f"  Strategy hits: {dict(strategy_counts)}")
-    att = counts[("originator", "attempt")]
-    if args.dry_run:
-        wi = counts[("originator", "would_insert")]
-        print(f"  {'originator':12s} attempted={att:5d} would_insert={wi:5d}")
-    else:
-        ins = counts[("originator", "inserted")]
-        skip = counts[("originator", "skipped_dup")]
-        print(f"  {'originator':12s} attempted={att:5d} inserted={ins:5d} skipped_dup={skip:5d}")
-
-    if not args.dry_run:
-        total = conn.execute("SELECT COUNT(*) FROM contribution_events").fetchone()[0]
-        print(f"\nTotal contribution_events rows: {total}")
-
-
-if __name__ == "__main__":
-    main()
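
Review note: the "strategy 3" title-match fallback in the removed originator pass can be sketched in isolation as below. This is a minimal sketch, not the original module: `build_title_index` and `match_stem` are hypothetical names, and the `(number, description)` tuples stand in for rows from the `prs` table.

```python
def build_title_index(rows):
    """Map lowercased claim title -> PR number from ' | '-separated descriptions."""
    index = {}
    for number, description in rows:
        for title in (description or "").split(" | "):
            title = title.strip()
            if title:
                # Last writer wins, as in the removed script.
                index[title.lower()] = number
    return index


def match_stem(index, stem):
    """Try the raw file stem first, then the stem with hyphens turned into spaces."""
    return index.get(stem.lower()) or index.get(stem.replace("-", " ").lower())


# Hypothetical sample rows: (PR number, description).
rows = [
    (101, "Alpha claim | Beta claim"),
    (102, "Gamma claim"),
    (103, None),
]
index = build_title_index(rows)
print(match_stem(index, "beta-claim"))
print(match_stem(index, "Gamma claim"))
print(match_stem(index, "delta-claim"))
```

The second lookup exists because claim files are named with hyphenated slugs while PR descriptions carry space-separated titles.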
--- a/scripts/backfill-reviewer-count.py
+++ b/scripts/backfill-reviewer-count.py
@ -1,143 +0,0 @@
-#!/usr/bin/env python3
-"""Backfill reviewer_count in contributors table from prs review data.
-
-Sources of review data:
-1. leo_verdict in prs table (approve/request_changes = Leo reviewed)
-2. domain_verdict + domain_agent in prs table (domain agent reviewed)
-3. Forgejo API reviews (agents that submitted reviews via Forgejo)
-
-Deduplication: If the same agent is both leo_verdict reviewer and domain_agent
-on the same PR, count it once per PR.
-"""
-import sqlite3
-import json
-import os
-import sys
-import urllib.request
-
-DB_PATH = os.environ.get("PIPELINE_DB", "/opt/teleo-eval/pipeline/pipeline.db")
-FORGEJO_URL = "http://localhost:3000/api/v1"
-REPO = "teleo/teleo-codex"
-
-def get_forgejo_token():
-    token_path = "/opt/teleo-eval/secrets/forgejo-admin-token"
-    if os.path.exists(token_path):
-        # Use a context manager so the file handle is closed promptly.
-        with open(token_path) as f:
-            return f.read().strip()
-    return os.environ.get("FORGEJO_TOKEN", "")
-
-def fetch_forgejo_reviews(pr_number, token):
-    """Fetch reviews from Forgejo API for a single PR."""
-    url = f"{FORGEJO_URL}/repos/{REPO}/pulls/{pr_number}/reviews"
-    req = urllib.request.Request(url, headers={"Authorization": f"token {token}"})
-    try:
-        with urllib.request.urlopen(req, timeout=5) as resp:
-            return json.loads(resp.read())
-    except Exception:
-        return []
-
-def main():
-    dry_run = "--dry-run" in sys.argv
-    skip_forgejo = "--skip-forgejo" in sys.argv
-
-    conn = sqlite3.connect(DB_PATH)
-    conn.row_factory = sqlite3.Row
-
-    # Step 1: Collect review events from prs table
-    # reviewer -> set of PR numbers they reviewed
-    reviewer_prs = {}
-
-    # Leo reviews (leo_verdict = approve or request_changes)
-    rows = conn.execute("""
-        SELECT number FROM prs
-        WHERE status='merged' AND leo_verdict IN ('approve', 'request_changes')
-    """).fetchall()
-    leo_prs = {r["number"] for r in rows}
-    if leo_prs:
-        reviewer_prs["leo"] = leo_prs
-    print(f"Leo reviews from leo_verdict: {len(leo_prs)}")
-
-    # Domain agent reviews
-    rows = conn.execute("""
-        SELECT number, domain_agent FROM prs
-        WHERE status='merged' AND domain_verdict IN ('approve', 'request_changes')
-        AND domain_agent IS NOT NULL AND domain_agent != ''
-    """).fetchall()
-    for r in rows:
-        agent = r["domain_agent"].lower()
-        if agent not in reviewer_prs:
-            reviewer_prs[agent] = set()
-        reviewer_prs[agent].add(r["number"])
-
-    # Print domain agent counts (before dedup with Leo)
-    for agent in sorted(reviewer_prs):
-        if agent != "leo":
-            print(f"  {agent} domain reviews: {len(reviewer_prs[agent])}")
-
-    # Leo as domain_agent overlaps with leo_verdict — already deduped by using sets
-    # Compare case-insensitively, matching the .lower() normalization above.
-    leo_domain = conn.execute("""
-        SELECT COUNT(*) as cnt FROM prs
-        WHERE status='merged' AND LOWER(domain_agent)='leo'
-        AND domain_verdict IN ('approve', 'request_changes')
-    """).fetchone()["cnt"]
-    print(f"  Leo as domain_agent: {leo_domain} (deduplicated into Leo's total)")
-
-    # Step 2: Optionally fetch Forgejo API reviews
-    if not skip_forgejo:
-        token = get_forgejo_token()
-        if token:
-            # Get all merged PR numbers
-            merged = conn.execute(
-                "SELECT number FROM prs WHERE status='merged'"
-            ).fetchall()
-            merged_numbers = [r["number"] for r in merged]
-
-            print(f"\nFetching Forgejo reviews for {len(merged_numbers)} merged PRs...")
-            forgejo_count = 0
-            for i, pr_num in enumerate(merged_numbers):
-                if i % 100 == 0 and i > 0:
-                    print(f"  ...{i}/{len(merged_numbers)}")
-                reviews = fetch_forgejo_reviews(pr_num, token)
-                for review in reviews:
-                    if review.get("state") in ("APPROVED", "REQUEST_CHANGES"):
-                        login = review["user"]["login"].lower()
-                        if login not in reviewer_prs:
-                            reviewer_prs[login] = set()
-                        reviewer_prs[login].add(pr_num)
-                        forgejo_count += 1
-            print(f"  Forgejo API reviews found: {forgejo_count}")
-        else:
-            print("\nNo Forgejo token found, skipping API reviews")
-    else:
-        print("\nSkipping Forgejo API reviews (--skip-forgejo)")
-
-    # Step 3: Compute final counts
-    print("\n--- Final reviewer counts ---")
-    existing = {r["handle"]: r["reviewer_count"] for r in
-                conn.execute("SELECT handle, reviewer_count FROM contributors").fetchall()}
-
-    updates = {}
-    for reviewer, prs in sorted(reviewer_prs.items()):
-        count = len(prs)
-        if reviewer in existing:
-            # Treat a NULL reviewer_count as 0 so the row is still updated.
-            current = existing[reviewer] or 0
-            updates[reviewer] = count
-            print(f"  {reviewer}: {current} -> {count} ({count - current:+d})")
-        else:
-            print(f"  {reviewer}: {count} reviews (no contributor record, skipping)")
-
-    # Step 4: Apply updates
-    if dry_run:
-        print(f"\n[DRY RUN] Would update {len(updates)} contributors")
-    else:
-        for handle, count in updates.items():
-            conn.execute(
-                "UPDATE contributors SET reviewer_count = ?, updated_at = datetime('now') WHERE handle = ?",
-                (count, handle)
-            )
-        conn.commit()
-        print(f"\nUpdated {len(updates)} contributors")
-
-    conn.close()
-
-if __name__ == "__main__":
-    main()
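
Review note: the once-per-PR deduplication described in the removed script's docstring comes down to keeping a set of PR numbers per reviewer. A minimal sketch, assuming review events arrive as `(reviewer, pr_number)` pairs; `count_reviews` is a hypothetical helper and the sample events are invented for illustration.

```python
from collections import defaultdict


def count_reviews(events):
    """Count distinct PRs per reviewer; handles are compared case-insensitively."""
    reviewer_prs = defaultdict(set)
    for reviewer, pr_number in events:
        # A set absorbs repeat reviews of the same PR by the same reviewer.
        reviewer_prs[reviewer.lower()].add(pr_number)
    return {reviewer: len(prs) for reviewer, prs in reviewer_prs.items()}


# Hypothetical events: Leo appears twice on PR 1 (leo_verdict and domain_agent).
events = [("leo", 1), ("Leo", 1), ("leo", 2), ("rio", 2)]
counts = count_reviews(events)
print(counts)
```

Because the sets are keyed on the lowercased handle, a reviewer recorded as both `Leo` and `leo` on the same PR is counted once.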
--- a/scripts/backfill-sourcer-attribution.py
+++ b/scripts/backfill-sourcer-attribution.py
@ -1,261 +0,0 @@
-#!/usr/bin/env python3
-"""Backfill sourcer/extractor/etc. attribution from claim frontmatter.
-
-Walks every merged knowledge file under domains/, entities/, decisions/,
-foundations/, convictions/, core/ and re-runs the canonical attribution
-parser (lib/attribution.py). For each parsed (handle, role) pair, increments
-the corresponding *_count column on the contributors table.
-
-Why this is needed (Apr 24 incident):
-  - lib/contributor.py used a diff-line regex parser that handled neither
-    the bare-key flat format (`sourcer: alexastrum`, ~42% of claims) nor
-    the nested `attribution: { sourcer: [...] }` block format used by Leo's
-    manual extractions (Shaga's claims).
-  - Result: alexastrum, thesensatore, cameron-s1, and similar handles were
-    silently dropped at merge time. Their contributor rows either don't
-    exist or are stuck at zero counts.
-
-Usage:
-    python3 backfill-sourcer-attribution.py --dry-run    # report deltas, no writes
-    python3 backfill-sourcer-attribution.py              # apply (additive: max(db, truth))
-    python3 backfill-sourcer-attribution.py --reset      # destructive: set absolute truth
-
-Default mode is ADDITIVE for safety: per-role count is set to max(current_db, truth).
-This preserves any existing high counts that came from non-frontmatter sources
-(e.g., m3taversal.sourcer=1011 reflects Telegram-curator credit accumulated via
-a different code path; truncating to the file-walk truth would be destructive).
-
-Use --reset to set absolute truth from the file walk only — this clobbers
-all existing role counts including legitimate non-frontmatter credit.
-
-Idempotency: additive mode is safe to re-run. A --reset run is gated by an
-audit_log marker; pass --force to override.
-"""
-import argparse
-import os
-import sqlite3
-import sys
-from collections import defaultdict
-from pathlib import Path
-
-# Allow running from anywhere — point at pipeline lib
-PIPELINE_ROOT = Path(__file__).resolve().parent.parent
-sys.path.insert(0, str(PIPELINE_ROOT))
-
-from lib.attribution import parse_attribution_from_file, VALID_ROLES  # noqa: E402
-
-DB_PATH = os.environ.get("PIPELINE_DB", "/opt/teleo-eval/pipeline/pipeline.db")
-REPO = Path(os.environ.get("REPO_DIR", "/opt/teleo-eval/workspaces/main"))
-KNOWLEDGE_PREFIXES = (
-    "domains", "entities", "decisions", "foundations", "convictions", "core",
-)
-
-
-def collect_attributions(repo_root: Path) -> dict[str, dict[str, int]]:
-    """Walk all knowledge files; return {handle: {role: count}}."""
-    counts: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
-    files_scanned = 0
-    files_with_attribution = 0
-
-    for prefix in KNOWLEDGE_PREFIXES:
-        base = repo_root / prefix
-        if not base.exists():
-            continue
-        for path in base.rglob("*.md"):
-            if path.name.startswith("_"):
-                continue
-            files_scanned += 1
-            attr = parse_attribution_from_file(str(path))
-            had_any = False
-            for role, entries in attr.items():
-                for entry in entries:
-                    handle = entry.get("handle")
-                    if handle:
-                        counts[handle][role] += 1
-                        had_any = True
-            if had_any:
-                files_with_attribution += 1
-
-    print(f"  Scanned {files_scanned} knowledge files", file=sys.stderr)
-    print(f"  {files_with_attribution} had parseable attribution", file=sys.stderr)
-    return counts
-
-
-def existing_contributors(conn) -> dict[str, dict[str, int]]:
-    """Return {handle: {role: count}} from the current DB."""
-    rows = conn.execute(
-        "SELECT handle, sourcer_count, extractor_count, challenger_count, "
-        "synthesizer_count, reviewer_count, claims_merged FROM contributors"
-    ).fetchall()
-    out = {}
-    for r in rows:
-        out[r["handle"]] = {
-            "sourcer": r["sourcer_count"] or 0,
-            "extractor": r["extractor_count"] or 0,
-            "challenger": r["challenger_count"] or 0,
-            "synthesizer": r["synthesizer_count"] or 0,
-            "reviewer": r["reviewer_count"] or 0,
-            "claims_merged": r["claims_merged"] or 0,
-        }
-    return out
-
-
-def claims_merged_for(role_counts: dict[str, int]) -> int:
-    """Mirror upsert_contributor logic: claims_merged += sourcer + extractor."""
-    return role_counts.get("sourcer", 0) + role_counts.get("extractor", 0)
-
-
-def main():
-    parser = argparse.ArgumentParser()
-    parser.add_argument("--dry-run", action="store_true",
-                        help="Report deltas without writing")
-    parser.add_argument("--reset", action="store_true",
-                        help="Destructive: set absolute truth from file walk "
-                             "(default is additive max(db, truth))")
-    parser.add_argument("--force", action="store_true",
-                        help="Re-run even if a previous --reset marker exists")
-    args = parser.parse_args()
-
-    if not REPO.exists():
-        print(f"ERROR: repo not found at {REPO}", file=sys.stderr)
-        sys.exit(1)
-
-    print(f"DB: {DB_PATH}", file=sys.stderr)
-    print(f"Repo: {REPO}", file=sys.stderr)
-    print("", file=sys.stderr)
-    print("Walking knowledge tree...", file=sys.stderr)
-
-    truth = collect_attributions(REPO)
-    print(f"  Found attributions for {len(truth)} unique handles", file=sys.stderr)
-    print("", file=sys.stderr)
-
-    conn = sqlite3.connect(DB_PATH, timeout=30)
-    conn.row_factory = sqlite3.Row
-    current = existing_contributors(conn)
-
-    # Compute deltas: new handles + handles with role-count mismatches
-    new_handles: list[tuple[str, dict[str, int]]] = []
-    role_deltas: list[tuple[str, dict[str, int], dict[str, int]]] = []
-
-    for handle, roles in truth.items():
-        if handle not in current:
-            new_handles.append((handle, dict(roles)))
-        else:
-            cur = current[handle]
-            mismatches = {r: roles.get(r, 0) for r in VALID_ROLES
-                          if roles.get(r, 0) != cur.get(r, 0)}
-            if mismatches:
-                role_deltas.append((handle, dict(roles), cur))
-
-    print(f"=== {len(new_handles)} NEW contributors to insert ===")
-    for handle, roles in sorted(new_handles, key=lambda x: -sum(x[1].values()))[:20]:
-        roles_str = ", ".join(f"{r}={c}" for r, c in roles.items() if c > 0)
-        print(f"  + {handle}: {roles_str} (claims_merged={claims_merged_for(roles)})")
-    if len(new_handles) > 20:
-        print(f"  ... and {len(new_handles) - 20} more")
-    print()
-
-    print(f"=== {len(role_deltas)} EXISTING contributors with count drift ===")
-    for handle, truth_roles, cur_roles in sorted(
-        role_deltas,
-        key=lambda x: -sum(x[1].values()),
-    )[:20]:
-        for role in VALID_ROLES:
-            t = truth_roles.get(role, 0)
-            c = cur_roles.get(role, 0)
-            if t != c:
-                print(f"  ~ {handle}.{role}: db={c} → truth={t} (Δ{t - c:+d})")
-    if len(role_deltas) > 20:
-        print(f"  ... and {len(role_deltas) - 20} more")
-    print()
-
-    if args.dry_run:
-        mode = "RESET" if args.reset else "ADDITIVE"
-        print(f"Dry run ({mode} mode) — no changes written.")
-        if not args.reset:
-            print("Default is ADDITIVE: existing high counts (e.g. m3taversal=1011) preserved.")
-            print("Pass --reset to clobber existing counts with file-walk truth.")
-        return
-
-    # Idempotency: --reset is gated by audit marker. Additive mode is always safe.
-    if args.reset:
-        marker = conn.execute(
-            "SELECT 1 FROM audit_log WHERE event = 'sourcer_attribution_backfill_reset' LIMIT 1"
-        ).fetchone()
-        if marker and not args.force:
-            print("ERROR: --reset has already run (audit marker present).")
-            print("Pass --force to re-run.")
-            sys.exit(2)
-
-    inserted = 0
-    updated = 0
-    preserved_higher = 0
-    for handle, roles in truth.items():
-        truth_counts = {
-            "sourcer": roles.get("sourcer", 0),
-            "extractor": roles.get("extractor", 0),
-            "challenger": roles.get("challenger", 0),
-            "synthesizer": roles.get("synthesizer", 0),
-            "reviewer": roles.get("reviewer", 0),
-        }
-
-        if handle in current:
-            cur = current[handle]
-            if args.reset:
-                # Preserve reviewer_count even on reset (PR-level not file-level)
-                final = dict(truth_counts)
-                final["reviewer"] = max(truth_counts["reviewer"], cur.get("reviewer", 0))
-            else:
-                # Additive: max of db vs truth, per role
-                final = {
-                    role: max(truth_counts[role], cur.get(role, 0))
-                    for role in truth_counts
-                }
-                if any(cur.get(r, 0) > truth_counts[r] for r in truth_counts):
-                    preserved_higher += 1
-
-            cm = final["sourcer"] + final["extractor"]
-            conn.execute(
-                """UPDATE contributors SET
-                    sourcer_count = ?,
-                    extractor_count = ?,
-                    challenger_count = ?,
-                    synthesizer_count = ?,
-                    reviewer_count = ?,
-                    claims_merged = ?,
-                    updated_at = datetime('now')
-                WHERE handle = ?""",
-                (final["sourcer"], final["extractor"], final["challenger"],
-                 final["synthesizer"], final["reviewer"], cm, handle),
-            )
-            updated += 1
-        else:
-            cm = truth_counts["sourcer"] + truth_counts["extractor"]
-            conn.execute(
-                """INSERT INTO contributors (
-                    handle, sourcer_count, extractor_count, challenger_count,
-                    synthesizer_count, reviewer_count, claims_merged,
-                    first_contribution, last_contribution, tier
-                ) VALUES (?, ?, ?, ?, ?, ?, ?, date('now'), date('now'), 'new')""",
-                (handle, truth_counts["sourcer"], truth_counts["extractor"],
-                 truth_counts["challenger"], truth_counts["synthesizer"],
-                 truth_counts["reviewer"], cm),
-            )
-            inserted += 1
-
-    event = "sourcer_attribution_backfill_reset" if args.reset else "sourcer_attribution_backfill"
-    conn.execute(
-        "INSERT INTO audit_log (stage, event, detail) VALUES (?, ?, ?)",
-        ("contributor", event,
-         f'{{"inserted": {inserted}, "updated": {updated}, '
-         f'"preserved_higher": {preserved_higher}, "mode": '
-         f'"{"reset" if args.reset else "additive"}"}}'),
-    )
-    conn.commit()
-    print(f"Done ({'RESET' if args.reset else 'ADDITIVE'}). "
-          f"Inserted {inserted} new, updated {updated} existing, "
-          f"preserved {preserved_higher} higher-than-truth values.")
-
-
-if __name__ == "__main__":
-    main()
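
Review note: the difference between the additive default and `--reset` in the removed script is easiest to see on a single contributor. The sketch below is an illustration under stated assumptions: `merge_counts` is a hypothetical helper, `ROLES` is a stand-in for `VALID_ROLES` (whose actual contents live in `lib/attribution.py` and are not shown in this diff), and the `truth` numbers are invented.

```python
# Stand-in for VALID_ROLES; the real tuple is defined in lib/attribution.py.
ROLES = ("sourcer", "extractor", "challenger", "synthesizer", "reviewer")


def merge_counts(current, truth, reset=False):
    """Combine DB counts with file-walk counts, per role."""
    if reset:
        # Destructive: take file-walk truth, but never lower reviewer_count,
        # which is PR-level rather than file-level.
        final = {role: truth.get(role, 0) for role in ROLES}
        final["reviewer"] = max(final["reviewer"], current.get("reviewer", 0))
    else:
        # Additive: keep whichever value is higher for each role.
        final = {role: max(truth.get(role, 0), current.get(role, 0)) for role in ROLES}
    final["claims_merged"] = final["sourcer"] + final["extractor"]
    return final


current = {"sourcer": 1011, "extractor": 0, "reviewer": 5}  # existing DB row
truth = {"sourcer": 12, "extractor": 3}                     # hypothetical file-walk result
additive = merge_counts(current, truth)
reset = merge_counts(current, truth, reset=True)
print(additive["sourcer"], additive["claims_merged"])
print(reset["sourcer"], reset["claims_merged"])
```

In additive mode the high `sourcer` count from a non-frontmatter code path survives; with `reset=True` it is overwritten by the file-walk value, which is why the script gates that mode behind an audit marker.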
--- a/scripts/backfill-synthetic-recovery-prs.py
+++ b/scripts/backfill-synthetic-recovery-prs.py
@ -1,148 +0,0 @@
-#!/usr/bin/env python3
-"""Reconstruct synthetic `prs` rows for historical GitHub PRs lost pre-mirror-wiring.
-
-Two PRs merged on GitHub before our sync-mirror.sh tracked `github_pr`:
-  - GitHub PR #68: alexastrum — 6 claims, merged 2026-03-09 via GitHub squash,
-    recovered to Forgejo via commit dba00a79 (Apr 16, after mirror erased files)
-  - GitHub PR #88: Cameron-S1 — 1 claim, recovered via commit da64f805
-
-The recovery commits wrote the files directly to main, so our `prs` table has
-no row to attach originator events to — the backfill-events.py strategies all
-return NULL. We reconstruct one synthetic `prs` row per historical GitHub PR so
-the events pipeline (and `github_pr` strategy in backfill-events) can credit
-Alex and Cameron properly.
-
-Numbers 900000+ are clearly synthetic and won't collide with real Forgejo PRs.
-
-Idempotent: a row is skipped when its number or github_pr already exists in prs.
-
-Usage:
-  python3 scripts/backfill-synthetic-recovery-prs.py --dry-run
-  python3 scripts/backfill-synthetic-recovery-prs.py
-"""
-import argparse
-import os
-import sqlite3
-import sys
-from pathlib import Path
-
-DB_PATH = os.environ.get("PIPELINE_DB", "/opt/teleo-eval/pipeline/pipeline.db")
-
-# Historical GitHub PRs recovered via direct-to-main commits.
-# Original GitHub merge dates come from the recovery commit messages.
-RECOVERY_PRS = [
-    {
-        "number": 900068,
-        "github_pr": 68,
-        "branch": "gh-pr-68",
-        "status": "merged",
-        "domain": "ai-alignment",
-        "commit_type": "knowledge",
-        "tier": "STANDARD",
-        "leo_verdict": "approve",
-        "domain_verdict": "approve",
-        "submitted_by": "alexastrum",
-        "source_channel": "github",
-        # origin='human' matches lib/merge.py convention for external contributors
-        # (default is 'pipeline' which misclassifies us as machine-authored).
-        "origin": "human",
-        "priority": "high",
-        "description": "Multi-agent git workflows production maturity | Cryptographic agent trust ratings | Defense in depth for AI agent oversight | Deterministic policy engines below LLM layer | Knowledge validation four-layer architecture | Structurally separating proposer and reviewer agents",
-        "merged_at": "2026-03-09 00:00:00",
-        "created_at": "2026-03-08 00:00:00",
-        "last_error": "synthetic_recovery: GitHub PR #68 pre-mirror-wiring reconstruction (commit dba00a79)",
-    },
-    {
-        "number": 900088,
-        "github_pr": 88,
-        "branch": "gh-pr-88",
-        "status": "merged",
-        "domain": "ai-alignment",
-        "commit_type": "knowledge",
-        "tier": "STANDARD",
-        "leo_verdict": "approve",
-        "domain_verdict": "approve",
-        "submitted_by": "cameron-s1",
-        "source_channel": "github",
-        "origin": "human",
-        "priority": "high",
-        "description": "Orthogonality is an artefact of specification architectures not a property of intelligence itself",
-        "merged_at": "2026-04-01 00:00:00",
-        "created_at": "2026-04-01 00:00:00",
-        "last_error": "synthetic_recovery: GitHub PR #88 pre-mirror-wiring reconstruction (commit da64f805)",
-    },
-]
-
-
-def main():
-    parser = argparse.ArgumentParser()
-    parser.add_argument("--dry-run", action="store_true")
-    args = parser.parse_args()
-
-    if not Path(DB_PATH).exists():
-        print(f"ERROR: DB not found at {DB_PATH}", file=sys.stderr)
-        sys.exit(1)
-
-    conn = sqlite3.connect(DB_PATH, timeout=30)
-    conn.row_factory = sqlite3.Row
-
-    # Guard against synthetic-range colonization (Ganymede review): check for
-    # any row in the synthetic range that isn't one of ours. The per-row existence
-    # check in the loop below is the real collision defense; this is belt-and-suspenders.
-    max_real = conn.execute(
-        "SELECT MAX(number) FROM prs WHERE number < 900000"
-    ).fetchone()[0] or 0
-    print(f"Max real Forgejo PR number: {max_real}")
-    synth_conflict = conn.execute(
-        "SELECT number FROM prs WHERE number >= 900000 AND number NOT IN (900068, 900088) LIMIT 1"
-    ).fetchone()
-    if synth_conflict:
-        print(f"ERROR: PR #{synth_conflict[0]} already exists in synthetic range. "
-              f"Pick a new range before running.", file=sys.stderr)
-        sys.exit(2)
-
-    inserted = 0
-    skipped = 0
-    for row in RECOVERY_PRS:
-        existing = conn.execute(
-            "SELECT number FROM prs WHERE number = ? OR github_pr = ?",
-            (row["number"], row["github_pr"]),
-        ).fetchone()
-        if existing:
-            print(f"  PR #{row['number']} (github_pr={row['github_pr']}): already exists — skip")
-            skipped += 1
-            continue
-        print(f"  {'(dry-run) ' if args.dry_run else ''}INSERT synthetic PR #{row['number']} "
-              f"(github_pr={row['github_pr']}, submitted_by={row['submitted_by']}, "
-              f"merged_at={row['merged_at']})")
-        if not args.dry_run:
-            conn.execute(
-                """INSERT INTO prs (
-                    number, github_pr, branch, status, domain, commit_type, tier,
-                    leo_verdict, domain_verdict, submitted_by, source_channel,
-                    origin, priority,
-                    description, merged_at, created_at, last_error
-                ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
-                (
-                    row["number"], row["github_pr"], row["branch"], row["status"],
-                    row["domain"], row["commit_type"], row["tier"],
-                    row["leo_verdict"], row["domain_verdict"],
-                    row["submitted_by"], row["source_channel"],
-                    row["origin"], row["priority"],
-                    row["description"], row["merged_at"], row["created_at"],
-                    row["last_error"],
-                ),
-            )
-            inserted += 1
-
-    if not args.dry_run:
-        conn.commit()
-
-    print(f"\nInserted {inserted}, skipped {skipped}")
-    if not args.dry_run and inserted:
-        print("\nNext step: re-run backfill-events.py to attach originator events")
-        print("  python3 ops/backfill-events.py")
-
-
-if __name__ == "__main__":
-    main()
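
Review note: the removed script is re-runnable because it checks for an existing row before each insert. A minimal sketch of that pattern against an in-memory SQLite database; the three-column `prs` schema is a deliberate simplification, and `insert_if_absent` is a hypothetical helper, not a function from the pipeline.

```python
import sqlite3


def insert_if_absent(conn, row):
    """Insert a synthetic PR row unless its number or github_pr is already present."""
    existing = conn.execute(
        "SELECT number FROM prs WHERE number = ? OR github_pr = ?",
        (row["number"], row["github_pr"]),
    ).fetchone()
    if existing:
        return False
    conn.execute(
        "INSERT INTO prs (number, github_pr, submitted_by) VALUES (?, ?, ?)",
        (row["number"], row["github_pr"], row["submitted_by"]),
    )
    return True


conn = sqlite3.connect(":memory:")
# Simplified schema for illustration only; the real prs table has many more columns.
conn.execute(
    "CREATE TABLE prs (number INTEGER PRIMARY KEY, github_pr INTEGER, submitted_by TEXT)"
)
row = {"number": 900068, "github_pr": 68, "submitted_by": "alexastrum"}
first = insert_if_absent(conn, row)
second = insert_if_absent(conn, row)
total = conn.execute("SELECT COUNT(*) FROM prs").fetchone()[0]
print(first, second, total)
```

Checking `github_pr` as well as `number` also skips the insert when the same GitHub PR was already recorded under a different PR number.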
--- a/scripts/classify-contributors.py
+++ b/scripts/classify-contributors.py
@ -1,426 +0,0 @@
-#!/usr/bin/env python3
-"""Classify `contributors` rows into {keep_person, keep_agent, move_to_publisher, delete_garbage}.
-
-Reads current contributors table, proposes reclassification per v26 schema design:
-  - Real humans + Pentagon agents stay in contributors (kind='person'|'agent')
-  - News orgs, publications, venues move to publishers table (new v26)
-  - Multi-word hyphenated garbage (parsing artifacts) gets deleted
-  - Their contribution_events are handled per category:
-      * Publishers: DELETE events (orgs shouldn't have credit)
-      * Garbage: DELETE events (bogus data)
-      * Persons/agents: keep events untouched
-
-Classification is heuristic — uses explicit allowlists + regex patterns + length gates.
-Ambiguous cases default to 'review_needed' (human decision).
-
-Usage:
-  python3 scripts/classify-contributors.py              # dry-run analysis + report
-  python3 scripts/classify-contributors.py --apply      # write changes
-  python3 scripts/classify-contributors.py --show <handle>  # inspect a single row
-
-Writes to pipeline.db only. Does NOT modify claim files.
-"""
-import argparse
-import json
-import os
-import re
-import sqlite3
-import sys
-from collections import Counter
-from pathlib import Path
-
-DB_PATH = os.environ.get("PIPELINE_DB", "/opt/teleo-eval/pipeline/pipeline.db")
-
-# Pentagon agents: kind='agent'. Authoritative list.
-PENTAGON_AGENTS = frozenset({
-    "rio", "leo", "theseus", "vida", "clay", "astra",
-    "oberon", "argus", "rhea", "ganymede", "epimetheus", "hermes", "ship",
-    "pipeline",
-})
-
-# Publisher/news-org handles seen in current contributors table.
-# Grouped by kind for the publishers row. Classified by inspection.
-# NOTE: This list is hand-curated — add to it as new orgs appear.
-PUBLISHERS_NEWS = {
-    # News outlets / brands
-    "cnbc", "al-jazeera", "axios", "bloomberg", "reuters", "bettorsinsider",
-    "fortune", "techcrunch", "coindesk", "coindesk-staff", "coindesk-research",
-    "coindesk research", "coindesk staff",
-    "defense-one", "thedefensepost", "theregister", "the-intercept",
-    "the-meridiem", "variety", "variety-staff", "variety staff", "spacenews",
-    "nasaspaceflight", "thedonkey", "insidedefense", "techpolicypress",
-    "morganlewis", "casinoorg", "deadline", "animationmagazine",
-    "defensepost", "casino-org", "casino.org",
-    "air & space forces magazine", "ieee spectrum", "techcrunch-staff",
-    "blockworks", "blockworks-staff", "decrypt", "ainvest", "banking-dive", "banking dive",
-    "cset-georgetown", "cset georgetown",
-    "kff", "kff-health-news", "kff health news", "kff-health-news---cbo",
-    "kff-health-news-/-cbo", "kff health news / cbo", "kffhealthnews",
-    "bloomberg-law",
-    "norton-rose-fulbright", "norton rose fulbright",
-    "defence-post", "the-defensepost",
-    "wilmerhale", "mofo", "sciencedirect",
-    "yogonet", "csr", "aisi-uk", "aisi", "aisi_gov", "rand",
-    "armscontrol", "eclinmed", "solana-compass", "solana compass",
-    "pmc11919318", "pmc11780016",
-    "healthverity", "natrium", "form-energy",
-    "courtlistener", "curtis-schiff", "curtis-schiff-prediction-markets",
-    "prophetx", "techpolicypress-staff",
-    "npr", "venturebeat", "geekwire", "payloadspace", "the-ankler",
-    "theankler", "tubefilter", "emarketer", "dagster",
-    "numerai",  # fund/project brand, not person
-    "psl", "multistate",
-}
-PUBLISHERS_ACADEMIC = {
-    # Academic orgs, labs, papers, journals, institutions
-    "arxiv", "metr", "metr_evals", "apollo-research", "apollo research", "apolloresearch",
-    "jacc-study-authors", "jacc-data-report-authors",
-    "anthropic-fellows-program", "anthropic-fellows",
-    "anthropic-fellows-/-alignment-science-team", "anthropic-research",
-    "jmir-2024", "jmir 2024",
-    "oettl-et-al.,-journal-of-experimental-orthopaedics",
-    "oettl et al., journal of experimental orthopaedics",
-    "jacc", "nct06548490", "pmc",
-    "conitzer-et-al.-(2024)", "aquino-michaels-2026", "pan-et-al.",
-    "pan-et-al.-'natural-language-agent-harnesses'",
-    "stanford", "stanford-meta-harness",
-    "hendershot", "annals-im",
-    "nellie-liang,-brookings-institution", "nellie liang, brookings institution",
-    "penn-state", "american-heart-association", "american heart association",
-    "molt_cornelius", "molt-cornelius",
-    # Companies / labs / brand-orgs (not specific humans)
-    "anthropic", "anthropicai", "openai", "nasa", "icrc", "ecri",
-    "epochairesearch", "metadao", "iapam", "icer",
-    "who", "ama", "uspstf", "unknown",
-    "futard.io",  # protocol/platform
-    "oxford-martin-ai-governance-initiative",
-    "oxford-martin-ai-governance",
-    "u.s.-food-and-drug-administration",
-    "jitse-goutbeek,-european-policy-centre",  # cited person+org string → publisher
-    "adepoju-et-al.",  # paper citation
-    # Formal-citation names (Firstname-Lastname or Lastname-et-al) — classified
-    # as academic citations, not reachable contributors. They'd need an @ handle
-    # to get CI credit per Cory's growth-loop design.
-    "senator-elissa-slotkin",
-    "bostrom", "hanson", "kaufmann", "noah-smith", "doug-shapiro",
-    "shayon-sengupta", "shayon sengupta",
-    "robin-hanson", "robin hanson", "eliezer-yudkowsky",
-    "leopold-aschenbrenner", "aschenbrenner",
-    "ramstead", "larsson", "heavey",
-    "dan-slimmon", "van-leeuwaarden", "ward-whitt", "adams",
-    "tamim-ansary", "spizzirri",
-    "dario-amodei",  # formal-citation form (real @ is @darioamodei)
-    "corless", "oxranga", "vlahakis",
-    # Brand/project/DAO tokens — not individuals
-    "areal-dao", "areal", "theiaresearch", "futard-io", "dhrumil",
-    # Classic formal-citation names — famous academics/economists cited by surname.
-    # Reachable via @ handle if/when they join (e.g. Ostrom has no X, Hayek deceased,
-    # Friston has an institutional affiliation not an @ handle we'd track).
-    "clayton-christensen", "hidalgo", "coase", "wiener", "juarrero",
-    "ostrom", "centola", "hayek", "marshall-mcluhan", "blackmore",
-    "knuth", "friston", "aquino-michaels", "conitzer", "bak",
-}
-# NOTE: pseudonymous X handles that MAY be real contributors stay in keep_person:
-#   karpathy, simonw, swyx, metaproph3t, metanallok, mmdhrumil, sjdedic,
-#   ceterispar1bus — these are real X accounts and match Cory's growth loop.
-# They appear without @ prefix because extraction frontmatter didn't normalize.
-# Auto-creating them as contributors tier='cited' is correct (A-path from earlier).
-PUBLISHERS_SOCIAL = {
-    "x", "twitter", "telegram", "x.com",
-}
-PUBLISHERS_INTERNAL = {
-    "teleohumanity-manifesto", "strategy-session-journal",
-    "living-capital-thesis-development", "attractor-state-historical-backtesting",
-    "web-research-compilation", "architectural-investing",
-    "governance---meritocratic-voting-+-futarchy",  # title artifact
-    "sec-interpretive-release-s7-2026-09-(march-17",  # title artifact
-    "mindstudio",  # tooling/platform, not contributor
-}
-# Merge into one kind→set map for classification
-PUBLISHER_KIND_MAP = {}
-for h in PUBLISHERS_NEWS:
-    PUBLISHER_KIND_MAP[h.lower()] = "news"
-for h in PUBLISHERS_ACADEMIC:
-    PUBLISHER_KIND_MAP[h.lower()] = "academic"
-for h in PUBLISHERS_SOCIAL:
-    PUBLISHER_KIND_MAP[h.lower()] = "social_platform"
-for h in PUBLISHERS_INTERNAL:
-    PUBLISHER_KIND_MAP[h.lower()] = "internal"
-
-
-# Garbage: handles that are clearly parse artifacts, not real names.
-# Pattern: contains parens, special chars, or >50 chars.
-def is_garbage(handle: str) -> bool:
-    h = handle.strip()
-    if len(h) > 50:
-        return True
-    if re.search(r"[()\[\]<>{}\/\\|@#$%^&*=?!:;\"']", h):
-        # A leading @ is legitimate (e.g. @thesensatore): allow it only when no special char follows
-        if h.startswith("@") and not re.search(r"[()\[\]<>{}\/\\|@#$%^&*=?!:;\"']", h[1:]):
-            return False
-        return True
-    # Title-artifact shape: 3+ hyphens in a row (paren noise like "(march" is caught above)
-    if "---" in h:
-        return True
-    return False
-
-
-def classify(handle: str) -> tuple[str, str | None]:
-    """Return (category, publisher_kind).
-
-    category ∈ {'keep_agent', 'keep_person', 'publisher', 'garbage', 'review_needed'}
-    publisher_kind ∈ {'news','academic','social_platform','internal', None}
-    """
-    h = handle.strip().lower().lstrip("@")
-
-    if h in PENTAGON_AGENTS:
-        return ("keep_agent", None)
-
-    if h in PUBLISHER_KIND_MAP:
-        return ("publisher", PUBLISHER_KIND_MAP[h])
-
-    if is_garbage(handle):
-        return ("garbage", None)
-
-    # @-prefixed handles or short-slug real-looking names → keep as person
-    # (Auto-create rule from Cory: @ handles auto-join as tier='cited'.)
-    if handle.strip().startswith("@"):
-        return ("keep_person", None)
-
-    # Plausible handles (<=39 chars, alphanum + underscore/hyphen): treat as person.
-    # 39-char ceiling matches GitHub's handle limit and the writer path in
-    # contributor.py::_HANDLE_RE, so a valid 21-39 char real handle won't fall
-    # through to review_needed and block --apply.
-    if re.match(r"^[a-z0-9][a-z0-9_-]{0,38}$", h):
-        return ("keep_person", None)
-
-    # Everything else: needs human review
-    return ("review_needed", None)
-
-
-def main():
-    parser = argparse.ArgumentParser()
-    parser.add_argument("--apply", action="store_true", help="Write changes to DB")
-    parser.add_argument("--show", type=str, help="Inspect a single handle")
-    parser.add_argument("--delete-events", action="store_true",
-                        help="DELETE contribution_events for publishers+garbage (default: keep for audit)")
-    args = parser.parse_args()
-
-    if not Path(DB_PATH).exists():
-        print(f"ERROR: DB not found at {DB_PATH}", file=sys.stderr)
-        sys.exit(1)
-
-    conn = sqlite3.connect(DB_PATH, timeout=30)
-    conn.row_factory = sqlite3.Row
-
-    # Sanity: publishers table must exist (v26 migration applied)
-    try:
-        conn.execute("SELECT 1 FROM publishers LIMIT 1")
-    except sqlite3.OperationalError:
-        print("ERROR: publishers table missing. Run migration v26 first.", file=sys.stderr)
-        sys.exit(2)
-
-    rows = conn.execute(
-        "SELECT handle, kind, tier, claims_merged FROM contributors ORDER BY claims_merged DESC"
-    ).fetchall()
-
-    if args.show:
-        target = args.show.strip().lower().lstrip("@")
-        for r in rows:
-            if r["handle"].lower().lstrip("@") == target:
-                category, pkind = classify(r["handle"])
-                events_count = conn.execute(
-                    "SELECT COUNT(*) FROM contribution_events WHERE handle = ?",
-                    (r["handle"].lower().lstrip("@"),),
-                ).fetchone()[0]
-                print(f"handle:         {r['handle']}")
-                print(f"current_kind:   {r['kind']}")
-                print(f"current_tier:   {r['tier']}")
-                print(f"claims_merged:  {r['claims_merged']}")
-                print(f"events:         {events_count}")
-                print(f"→ category:     {category}")
-                if pkind:
-                    print(f"→ publisher:    kind={pkind}")
-                return
-        print(f"No match for '{args.show}'")
-        return
-
-    # Classify all
-    buckets: dict[str, list[dict]] = {
-        "keep_agent": [],
-        "keep_person": [],
-        "publisher": [],
-        "garbage": [],
-        "review_needed": [],
-    }
-    for r in rows:
-        category, pkind = classify(r["handle"])
-        buckets[category].append({
-            "handle": r["handle"],
-            "kind_now": r["kind"],
-            "tier": r["tier"],
-            "claims": r["claims_merged"] or 0,
-            "publisher_kind": pkind,
-        })
-
-    print("=== Classification summary ===")
-    for cat, items in buckets.items():
-        print(f"  {cat:18s}  {len(items):5d}")
-
-    print("\n=== Sample of each category ===")
-    for cat, items in buckets.items():
-        print(f"\n--- {cat} (showing up to 10) ---")
-        for item in items[:10]:
-            tag = f" → {item['publisher_kind']}" if item["publisher_kind"] else ""
-            print(f"  {item['handle']:50s} claims={item['claims']:5d}{tag}")
-
-    print("\n=== Full review_needed list ===")
-    for item in buckets["review_needed"]:
-        print(f"  {item['handle']:50s} claims={item['claims']:5d}")
-
-    # Diagnostic: orphan alias count for handles we're about to delete.
-    # contributor_aliases has no FK (SQLite only enforces FKs with PRAGMA foreign_keys=ON anyway),
-    # so aliases pointing to deleted canonical handles become orphans. Surface
-    # the count so the --delete-events decision is informed.
-    doomed = [item["handle"].lower().lstrip("@") for item in buckets["garbage"] + buckets["publisher"]]
-    if doomed:
-        placeholders = ",".join("?" * len(doomed))
-        orphan_count = conn.execute(
-            f"SELECT COUNT(*) FROM contributor_aliases WHERE canonical IN ({placeholders})",
-            doomed,
-        ).fetchone()[0]
-        print("\n=== Alias orphan check ===")
-        print(f"  contributor_aliases rows pointing to deletable canonicals: {orphan_count}")
-        if orphan_count:
-            print("  (cleanup requires --delete-events; without it, aliases stay as orphans)")
-
-    if not args.apply:
-        print("\n(dry-run — no writes. Re-run with --apply to execute.)")
-        return
-
-    # ── Apply changes ──
-    print("\n=== Applying changes ===")
-    if buckets["review_needed"]:
-        print(f"ABORT: {len(buckets['review_needed'])} rows need human review. Fix classifier before --apply.")
-        sys.exit(3)
-
-    inserted_publishers = 0
-    reclassified_agents = 0
-    deleted_garbage = 0
-    deleted_publisher_rows = 0
-    deleted_events = 0
-    deleted_aliases = 0
-
-    # Single transaction — if any step errors, roll back. This prevents the failure
-    # mode where a publisher insert fails silently and we still delete the contributor
-    # row, losing data.
-    try:
-        conn.execute("BEGIN")
-
-        # 1. Insert publishers. Record each handle so step 4 deletes its contributors row.
-        # Counter uses cur.rowcount so replay runs (where publishers already exist)
-        # report accurate inserted=0 instead of falsely claiming the full set.
-        # moved_to_publisher is unconditional — the contributors row still needs to
-        # be deleted even when the publishers row was added in a prior run.
-        moved_to_publisher = set()
-        for item in buckets["publisher"]:
-            name = item["handle"].strip().lower().lstrip("@")
-            cur = conn.execute(
-                "INSERT OR IGNORE INTO publishers (name, kind) VALUES (?, ?)",
-                (name, item["publisher_kind"]),
-            )
-            if cur.rowcount > 0:
-                inserted_publishers += 1
-            moved_to_publisher.add(item["handle"])
-
-        # 2. Ensure Pentagon agents have kind='agent' (idempotent after v25 patch)
-        for item in buckets["keep_agent"]:
-            conn.execute(
-                "UPDATE contributors SET kind = 'agent' WHERE handle = ?",
-                (item["handle"].lower().lstrip("@"),),
-            )
-            reclassified_agents += 1
-
-        # 3. Delete garbage handles from contributors (and their events + aliases)
-        for item in buckets["garbage"]:
-            canonical_lower = item["handle"].lower().lstrip("@")
-            if args.delete_events:
-                cur = conn.execute(
-                    "DELETE FROM contribution_events WHERE handle = ?",
-                    (canonical_lower,),
-                )
-                deleted_events += cur.rowcount
-                cur = conn.execute(
-                    "DELETE FROM contributor_aliases WHERE canonical = ?",
-                    (canonical_lower,),
-                )
-                deleted_aliases += cur.rowcount
-            cur = conn.execute(
-                "DELETE FROM contributors WHERE handle = ?",
-                (item["handle"],),
-            )
-            deleted_garbage += cur.rowcount
-
-        # 4. Delete publisher rows from contributors for every handle recorded in
-        # moved_to_publisher (newly inserted or already present from a prior run).
-        # A failed insert raises and rolls back the transaction, so nothing is lost.
-        # Aliases pointing to these handles are cleaned under the same --delete-events
-        # gate: publishers live in their own table now, so leftover aliases are orphans.
-        for item in buckets["publisher"]:
-            if item["handle"] not in moved_to_publisher:
-                continue
-            canonical_lower = item["handle"].lower().lstrip("@")
-            if args.delete_events:
-                cur = conn.execute(
-                    "DELETE FROM contribution_events WHERE handle = ?",
-                    (canonical_lower,),
-                )
-                deleted_events += cur.rowcount
-                cur = conn.execute(
-                    "DELETE FROM contributor_aliases WHERE canonical = ?",
-                    (canonical_lower,),
-                )
-                deleted_aliases += cur.rowcount
-            cur = conn.execute(
-                "DELETE FROM contributors WHERE handle = ?",
-                (item["handle"],),
-            )
-            deleted_publisher_rows += cur.rowcount
-
-        # 5. Audit log entry for the destructive operation (Ganymede Q5).
-        conn.execute(
-            "INSERT INTO audit_log (timestamp, stage, event, detail) VALUES (datetime('now'), ?, ?, ?)",
-            (
-                "schema_v26",
-                "classify_contributors",
-                json.dumps({
-                    "publishers_inserted": inserted_publishers,
-                    "agents_updated": reclassified_agents,
-                    "garbage_deleted": deleted_garbage,
-                    "publisher_rows_deleted": deleted_publisher_rows,
-                    "events_deleted": deleted_events,
-                    "aliases_deleted": deleted_aliases,
-                    "delete_events_flag": bool(args.delete_events),
-                }),
-            ),
-        )
-
-        conn.commit()
-    except Exception as e:
-        conn.rollback()
-        print(f"ERROR: Transaction failed, rolled back. {e}", file=sys.stderr)
-        sys.exit(4)
-
-    print(f"  publishers inserted:          {inserted_publishers}")
-    print(f"  agents kind='agent' ensured:  {reclassified_agents}")
-    print(f"  garbage rows deleted:         {deleted_garbage}")
-    print(f"  publisher rows removed from contributors: {deleted_publisher_rows}")
-    if args.delete_events:
-        print(f"  contribution_events deleted:  {deleted_events}")
-        print(f"  contributor_aliases deleted:  {deleted_aliases}")
-    else:
-        print("  (events + aliases kept — re-run with --delete-events to clean them)")
-
-
-if __name__ == "__main__":
-    main()
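
Reviewer note on the deleted classifier script above: its replay-safe counter relies on `INSERT OR IGNORE` reporting `rowcount == 0` when the row already exists. The following is a minimal sketch of that pattern, not the script's actual code. The `publishers(name PRIMARY KEY, kind)` schema and the helper name `insert_publishers` are assumptions for illustration; the real v26 schema is not shown in this diff.

```python
import sqlite3


def insert_publishers(conn: sqlite3.Connection, rows: list[tuple[str, str]]) -> int:
    """Insert (name, kind) pairs and return how many were newly inserted."""
    inserted = 0
    for name, kind in rows:
        cur = conn.execute(
            "INSERT OR IGNORE INTO publishers (name, kind) VALUES (?, ?)",
            (name, kind),
        )
        # rowcount is 0 when the row already existed and the insert was ignored.
        inserted += cur.rowcount
    return inserted


# Hypothetical in-memory schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE publishers (name TEXT PRIMARY KEY, kind TEXT)")

rows = [("x", "social_platform"), ("twitter", "social_platform")]
first_run = insert_publishers(conn, rows)  # both rows are new
replay = insert_publishers(conn, rows)     # both rows already exist
conn.commit()
```

Counting from `rowcount` rather than incrementing unconditionally is what lets a second run report zero inserts while still proceeding to the delete steps.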
--- a/scripts/contributor-graph.py
+++ b/scripts/contributor-graph.py
@ -1,137 +0,0 @@
-#!/usr/bin/env python3
-"""Generate cumulative contributor + claims PNG for Twitter embedding."""
-
-import json
-import subprocess
-import sys
-from datetime import datetime, timedelta
-from pathlib import Path
-
-import matplotlib
-matplotlib.use("Agg")
-import matplotlib.pyplot as plt
-import matplotlib.dates as mdates
-from matplotlib.ticker import MaxNLocator
-
-ACCENT = "#00d4aa"
-PURPLE = "#7c3aed"
-BG = "#0a0a0a"
-TEXT = "#e0e0e0"
-SUBTLE = "#555555"
-OUTPUT = Path("/opt/teleo-eval/static/contributor-graph.png")
-
-
-def get_data():
-    """Fetch from local API."""
-    import urllib.request
-    with urllib.request.urlopen("http://localhost:8081/api/contributor-growth") as r:
-        return json.loads(r.read())
-
-
-def build_continuous_series(milestones, start_date, end_date):
-    """Expand milestone-only contributor data into daily series."""
-    dates = []
-    values = []
-    current = 0
-    milestone_map = {}
-    for m in milestones:
-        d = datetime.strptime(m["date"], "%Y-%m-%d").date()
-        milestone_map[d] = m["cumulative"]
-
-    d = start_date
-    while d <= end_date:
-        if d in milestone_map:
-            current = milestone_map[d]
-        dates.append(d)
-        values.append(current)
-        d += timedelta(days=1)
-    return dates, values
-
-
-def render(data, output_path):
-    fig, ax1 = plt.subplots(figsize=(12, 6.3), dpi=100)
-    fig.patch.set_facecolor(BG)
-    ax1.set_facecolor(BG)
-
-    claims = data["cumulative_claims"]
-    contribs = data["cumulative_contributors"]
-
-    claim_dates = [datetime.strptime(c["date"], "%Y-%m-%d").date() for c in claims]
-    claim_values = [c["cumulative"] for c in claims]
-
-    start = min(claim_dates)
-    end = max(claim_dates)
-
-    contrib_dates, contrib_values = build_continuous_series(contribs, start, end)
-
-    # Claims line (left y-axis)
-    ax1.fill_between(claim_dates, claim_values, alpha=0.15, color=ACCENT)
-    ax1.plot(claim_dates, claim_values, color=ACCENT, linewidth=2.5, label="Claims")
-    ax1.set_ylabel("Claims", color=ACCENT, fontsize=12, fontweight="bold")
-    ax1.tick_params(axis="y", colors=ACCENT, labelsize=10)
-    ax1.set_ylim(bottom=0)
-
-    # Contributors line (right y-axis)
-    ax2 = ax1.twinx()
-    ax2.set_facecolor("none")
-    ax2.fill_between(contrib_dates, contrib_values, alpha=0.1, color=PURPLE, step="post")
-    ax2.step(contrib_dates, contrib_values, color=PURPLE, linewidth=2.5,
-             where="post", label="Contributors")
-    ax2.set_ylabel("Contributors", color=PURPLE, fontsize=12, fontweight="bold")
-    ax2.tick_params(axis="y", colors=PURPLE, labelsize=10)
-    ax2.yaxis.set_major_locator(MaxNLocator(integer=True))
-    ax2.set_ylim(bottom=0, top=max(contrib_values) * 1.8)
-
-    # Annotate contributor milestones with staggered offsets to avoid overlap
-    offsets = {}
-    for i, m in enumerate(contribs):
-        d = datetime.strptime(m["date"], "%Y-%m-%d").date()
-        val = m["cumulative"]
-        names = [n["name"] for n in m["new"]]
-        if len(names) <= 2:
-            label = ", ".join(names)
-        else:
-            label = f"+{len(names)}"
-        y_off = 8 + (i % 2) * 14
-        ax2.annotate(label, (d, val),
-                     textcoords="offset points", xytext=(5, y_off),
-                     fontsize=7, color=PURPLE, alpha=0.8)
-
-    # Hero stats
-    total_claims = data["summary"]["total_claims"]
-    total_contribs = data["summary"]["total_contributors"]
-    days = data["summary"]["days_active"]
-    fig.text(0.14, 0.88, f"{total_claims:,} claims", fontsize=22,
-             color=ACCENT, fontweight="bold", ha="left")
-    fig.text(0.14, 0.82, f"{total_contribs} contributors · {days} days",
-             fontsize=13, color=TEXT, ha="left", alpha=0.7)
-
-    # X-axis
-    ax1.xaxis.set_major_formatter(mdates.DateFormatter("%b %d"))
-    ax1.xaxis.set_major_locator(mdates.WeekdayLocator(interval=2))
-    ax1.tick_params(axis="x", colors=SUBTLE, labelsize=9, rotation=0)
-
-    # Remove spines
-    for ax in [ax1, ax2]:
-        for spine in ax.spines.values():
-            spine.set_visible(False)
-
-    # Subtle grid on claims axis only
-    ax1.grid(axis="y", color=SUBTLE, alpha=0.2, linewidth=0.5)
-    ax1.set_axisbelow(True)
-
-    # Branding
-    fig.text(0.98, 0.02, "livingip.xyz", fontsize=9, color=SUBTLE,
-             ha="right", style="italic")
-
-    plt.tight_layout(rect=[0, 0.03, 1, 0.78])
-    output_path.parent.mkdir(parents=True, exist_ok=True)
-    fig.savefig(output_path, facecolor=BG, bbox_inches="tight", pad_inches=0.3)
-    plt.close(fig)
-    print(f"Saved to {output_path} ({output_path.stat().st_size:,} bytes)")
-
-
-if __name__ == "__main__":
-    out = Path(sys.argv[1]) if len(sys.argv) > 1 else OUTPUT
-    data = get_data()
-    render(data, out)
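
Reviewer note on `build_continuous_series` in the deleted file above: it expands sparse milestone data into a daily series by carrying the last known value forward. The sketch below shows the same idea in isolation. The function name `forward_fill` and the sample dates are hypothetical, chosen only for illustration.

```python
from datetime import date, timedelta


def forward_fill(milestones: dict[date, int], start: date, end: date) -> list[tuple[date, int]]:
    """Expand sparse {date: cumulative} milestones into one value per day."""
    series = []
    current = 0  # value used until the first milestone inside the window
    day = start
    while day <= end:
        # Keep the previous value on days without a milestone.
        current = milestones.get(day, current)
        series.append((day, current))
        day += timedelta(days=1)
    return series


demo = forward_fill(
    {date(2026, 3, 5): 1, date(2026, 3, 7): 3},
    date(2026, 3, 5),
    date(2026, 3, 8),
)
```

One property worth keeping in mind with this approach: a milestone dated before `start` is never seen by the loop, so the series stays at 0 until the first milestone that falls inside the window.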
--- a/scripts/cumulative-growth.py
+++ b/scripts/cumulative-growth.py
@ -1,223 +0,0 @@
-#!/usr/bin/env python3
-"""Generate cumulative growth time-series data for public dashboard.
-
-Produces JSON with a summary block (including the current GitHub star count) and three series:
-  - cumulative_contributors: unique git authors over time
-  - cumulative_claims: domain claim files added over time
-  - daily_activity: commit volume per contributor per day
-
-Data sources: git log (codex repo), GitHub API.
-Output: JSON to stdout or file, suitable for Chart.js line charts.
-
-Usage:
-  python3 cumulative-growth.py --codex-path /path/to/teleo-codex [--output /path/to/output.json]
-  python3 cumulative-growth.py --codex-path /path/to/teleo-codex --format csv
-"""
-
-import argparse
-import json
-import subprocess
-import sys
-from collections import defaultdict
-from datetime import datetime, timedelta
-
-# Map bot/service accounts to their human principal or exclude them.
-# "Teleo Agents" and "Teleo Pipeline" are bot accounts — attribute to system.
-CONTRIBUTOR_ALIASES = {
-    "Teleo Agents": None,   # system automation, not a contributor
-    "Teleo Pipeline": None, # pipeline bot
-}
-
-# Founding contributors get a badge: anyone who contributed on or before this date.
-FOUNDING_CUTOFF = "2026-03-15"
-
-
-def git_log_contributors(codex_path: str) -> list[dict]:
-    """Extract per-commit author and date from git log."""
-    result = subprocess.run(
-        ["git", "log", "--format=%ad|%an", "--date=format:%Y-%m-%d", "--all"],
-        capture_output=True, text=True, cwd=codex_path
-    )
-    if result.returncode != 0:
-        print(f"git log failed: {result.stderr}", file=sys.stderr)
-        sys.exit(1)
-
-    entries = []
-    for line in result.stdout.strip().split("\n"):
-        if "|" not in line:
-            continue
-        date, author = line.split("|", 1)
-        canonical = CONTRIBUTOR_ALIASES.get(author, author)
-        if canonical is None:
-            continue
-        entries.append({"date": date, "author": canonical})
-    return entries
-
-
-def git_log_claims(codex_path: str) -> list[dict]:
-    """Extract claim file additions over time from git log."""
-    result = subprocess.run(
-        ["git", "log", "--format=%ad", "--date=format:%Y-%m-%d",
-         "--all", "--diff-filter=A", "--", "domains/*.md"],
-        capture_output=True, text=True, cwd=codex_path
-    )
-    if result.returncode != 0:
-        print(f"git log failed: {result.stderr}", file=sys.stderr)
-        sys.exit(1)
-
-    counts = defaultdict(int)
-    for line in result.stdout.strip().split("\n"):
-        line = line.strip()
-        if line:
-            counts[line] += 1
-    return [{"date": d, "count": c} for d, c in sorted(counts.items())]
-
-
-def github_stars(repo: str = "living-ip/teleo-codex") -> int | None:
-    """Fetch current star count from GitHub API. Returns None on failure."""
-    try:
-        result = subprocess.run(
-            ["gh", "api", f"repos/{repo}", "--jq", ".stargazers_count"],
-            capture_output=True, text=True, timeout=10
-        )
-        if result.returncode == 0:
-            return int(result.stdout.strip())
-    except (subprocess.TimeoutExpired, ValueError, OSError):  # OSError: gh CLI missing or not executable
-        pass
-    return None
-
-
-def build_cumulative_contributors(entries: list[dict]) -> list[dict]:
-    """Build cumulative unique contributor count by date."""
-    first_seen = {}
-    for e in entries:
-        author, date = e["author"], e["date"]
-        if author not in first_seen or date < first_seen[author]:
-            first_seen[author] = date
-
-    by_date = defaultdict(list)
-    for author, date in first_seen.items():
-        by_date[date].append(author)
-
-    timeline = []
-    seen = set()
-    for date in sorted(by_date.keys()):
-        new_authors = by_date[date]
-        seen.update(new_authors)
-        is_founding = date <= FOUNDING_CUTOFF
-        timeline.append({
-            "date": date,
-            "cumulative": len(seen),
-            "new": [
-                {"name": a, "founding": is_founding}
-                for a in sorted(new_authors)
-            ],
-        })
-    return timeline
-
-
-def build_cumulative_claims(claim_entries: list[dict]) -> list[dict]:
-    """Build cumulative claim count by date."""
-    timeline = []
-    cumulative = 0
-    for entry in claim_entries:
-        cumulative += entry["count"]
-        timeline.append({
-            "date": entry["date"],
-            "cumulative": cumulative,
-            "added": entry["count"],
-        })
-    return timeline
-
-
-def build_daily_commits(entries: list[dict]) -> list[dict]:
-    """Build daily commit volume by contributor."""
-    daily = defaultdict(lambda: defaultdict(int))
-    for e in entries:
-        daily[e["date"]][e["author"]] += 1
-
-    timeline = []
-    for date in sorted(daily.keys()):
-        authors = daily[date]
-        timeline.append({
-            "date": date,
-            "total": sum(authors.values()),
-            "by_contributor": dict(sorted(authors.items())),
-        })
-    return timeline
-
-
-def generate_report(codex_path: str) -> dict:
-    entries = git_log_contributors(codex_path)
-    claim_entries = git_log_claims(codex_path)
-    stars = github_stars()
-
-    contributors_timeline = build_cumulative_contributors(entries)
-    claims_timeline = build_cumulative_claims(claim_entries)
-    commits_timeline = build_daily_commits(entries)
-
-    all_contributors = set(e["author"] for e in entries)
-    founding = [
-        a for a in all_contributors
-        if any(
-            e["date"] <= FOUNDING_CUTOFF and e["author"] == a
-            for e in entries
-        )
-    ]
-
-    return {
-        "generated_at": datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ"),
-        "summary": {
-            "total_contributors": len(all_contributors),
-            "founding_contributors": sorted(founding),
-            "total_claims": claims_timeline[-1]["cumulative"] if claims_timeline else 0,
-            "github_stars": stars,
-            "codex_start_date": "2026-03-05",
-            "days_active": (datetime.utcnow() - datetime(2026, 3, 5)).days,
-        },
-        "cumulative_contributors": contributors_timeline,
-        "cumulative_claims": claims_timeline,
-        "daily_activity": commits_timeline,
-    }
-
-
-def format_csv(report: dict) -> str:
-    lines = ["date,cumulative_contributors,cumulative_claims"]
-    contrib_map = {e["date"]: e["cumulative"] for e in report["cumulative_contributors"]}
-    claims_map = {e["date"]: e["cumulative"] for e in report["cumulative_claims"]}
-
-    all_dates = sorted(set(list(contrib_map.keys()) + list(claims_map.keys())))
-
-    last_contrib = 0
-    last_claims = 0
-    for d in all_dates:
-        last_contrib = contrib_map.get(d, last_contrib)
-        last_claims = claims_map.get(d, last_claims)
-        lines.append(f"{d},{last_contrib},{last_claims}")
-    return "\n".join(lines)
-
-
-def main():
-    parser = argparse.ArgumentParser(description="Generate cumulative growth data")
-    parser.add_argument("--codex-path", required=True, help="Path to teleo-codex repo")
-    parser.add_argument("--output", help="Output file path (default: stdout)")
-    parser.add_argument("--format", choices=["json", "csv"], default="json")
-    args = parser.parse_args()
-
-    report = generate_report(args.codex_path)
-
-    if args.format == "csv":
-        output = format_csv(report)
-    else:
-        output = json.dumps(report, indent=2)
-
-    if args.output:
-        with open(args.output, "w") as f:
-            f.write(output)
-        print(f"Written to {args.output}", file=sys.stderr)
-    else:
-        print(output)
-
-
-if __name__ == "__main__":
-    main()
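
Reviewer note on `build_cumulative_contributors` in the deleted file above: the cumulative count depends on each author's earliest commit date, regardless of the order in which `git log` returns commits. The sketch below isolates that step. The function name `cumulative_first_seen` and the sample authors are hypothetical.

```python
from collections import defaultdict


def cumulative_first_seen(entries: list[tuple[str, str]]) -> list[tuple[str, int]]:
    """Turn (author, 'YYYY-MM-DD') commits into (date, cumulative unique authors)."""
    first_seen: dict[str, str] = {}
    for author, day in entries:
        # ISO dates compare correctly as plain strings.
        if author not in first_seen or day < first_seen[author]:
            first_seen[author] = day

    new_per_day: dict[str, int] = defaultdict(int)
    for day in first_seen.values():
        new_per_day[day] += 1

    total = 0
    timeline = []
    for day in sorted(new_per_day):
        total += new_per_day[day]
        timeline.append((day, total))
    return timeline


timeline = cumulative_first_seen([
    ("alice", "2026-03-06"),
    ("bob", "2026-03-05"),
    ("alice", "2026-03-05"),  # earlier commit by the same author
    ("carol", "2026-03-09"),
])
```

Only days on which a new author first appears produce a timeline entry, which matches the milestone-only shape that the graph script later expands into a daily series.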
--- a/scripts/scoring_digest.py
+++ b/scripts/scoring_digest.py
@ -1,561 +0,0 @@
-#!/usr/bin/env python3
-"""Daily scoring digest — classify, score, and broadcast KB contributions.
-
-Runs daily at 8:07 AM London via cron.
-Queries pipeline.db for merged PRs in last 24h, classifies each as
-CREATE/ENRICH/CHALLENGE, scores with importance multiplier and connectivity
-bonus, updates contributors table, posts summary to Telegram.
-
-Spec: Pentagon/sprints/contribution-scoring-algorithm.md
-"""
-
-import json
-import logging
-import os
-import re
-import sqlite3
-import subprocess
-import sys
-import urllib.request
-from datetime import datetime, timezone, timedelta
-from pathlib import Path
-from zoneinfo import ZoneInfo
-
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s [%(levelname)s] %(message)s",
-)
-log = logging.getLogger("scoring_digest")
-
-# --- Configuration ---
-BASE_DIR = Path(os.environ.get("PIPELINE_BASE", "/opt/teleo-eval"))
-DB_PATH = BASE_DIR / "pipeline" / "pipeline.db"
-CODEX_DIR = BASE_DIR / "workspaces" / "main"
-TELEGRAM_TOKEN_FILE = BASE_DIR / "secrets" / "telegram-bot-token"
-TELEGRAM_CHAT_ID = 2091295364
-DIGEST_JSON_PATH = BASE_DIR / "logs" / "scoring-digest-latest.json"
-LONDON_TZ = ZoneInfo("Europe/London")
-
-# --- Action weights (Leo spec Apr 20) ---
-ACTION_WEIGHTS = {
-    "challenge": 0.40,
-    "create": 0.35,
-    "enrich": 0.25,
-}
-
-# --- Confidence → base importance mapping ---
-CONFIDENCE_BASE = {
-    "proven": 2.0,
-    "likely": 1.5,
-    "experimental": 1.0,
-    "speculative": 1.0,
-    "possible": 1.0,
-    "plausible": 1.0,
-    "medium": 1.5,
-}
-
-DOMAIN_CLAIM_COUNTS: dict[str, int] = {}
-ENTITY_SLUGS: set[str] = set()
-CLAIM_SLUGS: set[str] = set()
-MAP_FILES: set[str] = set()
-
-
-def _slugify(title: str) -> str:
-    s = title.lower().strip()
-    s = re.sub(r"[^\w\s-]", "", s)
-    s = re.sub(r"[\s_]+", "-", s)
-    return s.strip("-")
-
-
-def _init_link_index():
-    """Build indexes for wiki-link resolution."""
-    global ENTITY_SLUGS, CLAIM_SLUGS, MAP_FILES
-
-    entities_dir = CODEX_DIR / "entities"
-    if entities_dir.exists():
-        for f in entities_dir.glob("*.md"):
-            ENTITY_SLUGS.add(f.stem.lower())
-
-    for domain_dir in (CODEX_DIR / "domains").iterdir():
-        if not domain_dir.is_dir():
-            continue
-        for f in domain_dir.glob("*.md"):
-            CLAIM_SLUGS.add(f.stem.lower())
-        map_file = domain_dir / "_map.md"
-        if map_file.exists():
-            MAP_FILES.add("_map")
-            MAP_FILES.add(f"domains/{domain_dir.name}/_map")
-
-    for sub in ("foundations", "core", "decisions"):
-        sub_dir = CODEX_DIR / sub
-        if not sub_dir.exists():
-            continue
-        for f in sub_dir.glob("*.md"):
-            CLAIM_SLUGS.add(f.stem.lower())
-
-
-def _resolve_link(link_text: str) -> bool:
-    """Check if a [[wiki-link]] resolves to a known entity, claim, or map."""
-    slug = _slugify(link_text)
-    return (
-        slug in ENTITY_SLUGS
-        or slug in CLAIM_SLUGS
-        or slug in MAP_FILES
-        or link_text.lower() in MAP_FILES
-    )
-
-
-def _count_resolved_wiki_links(file_path: Path) -> int:
-    """Count wiki-links in a claim file that resolve to real targets."""
-    if not file_path.exists():
-        return 0
-    try:
-        text = file_path.read_text(encoding="utf-8")
-    except Exception:
-        return 0
-
-    links = re.findall(r"\[\[([^\]]+)\]\]", text)
-    return sum(1 for link in links if _resolve_link(link))
-
-
-def _get_confidence(file_path: Path) -> str:
-    """Extract confidence field from claim frontmatter."""
-    if not file_path.exists():
-        return "experimental"
-    try:
-        text = file_path.read_text(encoding="utf-8")
-    except Exception:
-        return "experimental"
-
-    m = re.search(r"^confidence:\s*(\S+)", text, re.MULTILINE)
-    return m.group(1).strip() if m else "experimental"
-
-
-def _has_cross_domain_ref(file_path: Path) -> bool:
-    """Check for a secondary_domains list or a depends_on field in claim frontmatter."""
-    if not file_path.exists():
-        return False
-    try:
-        text = file_path.read_text(encoding="utf-8")
-    except Exception:
-        return False
-
-    if re.search(r"^secondary_domains:\s*\[.+\]", text, re.MULTILINE):
-        return True
-    if re.search(r"^depends_on:", text, re.MULTILINE):
-        return True
-    return False
-
-
-def _has_challenged_by(file_path: Path) -> bool:
-    """Check if claim has challenged_by field."""
-    if not file_path.exists():
-        return False
-    try:
-        text = file_path.read_text(encoding="utf-8")
-    except Exception:
-        return False
-    return bool(re.search(r"^challenged_by:", text, re.MULTILINE))
-
-
-def _get_domain_weight(domain: str) -> float:
-    """Domain maturity weight: sparse domains get bonus, mature domains get discount."""
-    count = DOMAIN_CLAIM_COUNTS.get(domain, 0)
-    if count < 20:
-        return 1.5
-    elif count > 50:
-        return 0.8
-    return 1.0
-
-
-def _init_domain_counts():
-    """Count claims per domain."""
-    global DOMAIN_CLAIM_COUNTS
-    domains_dir = CODEX_DIR / "domains"
-    if not domains_dir.exists():
-        return
-    for domain_dir in domains_dir.iterdir():
-        if domain_dir.is_dir():
-            count = sum(1 for f in domain_dir.glob("*.md") if f.name != "_map.md")
-            DOMAIN_CLAIM_COUNTS[domain_dir.name] = count
-
-
-def _normalize_contributor(submitted_by: str | None, agent: str | None, branch: str | None = None) -> str:
-    """Normalize contributor handle — strip @, map agent self-directed to agent name.
-
-    For fork PRs (contrib/NAME/...), extract contributor from branch name.
-    """
-    if branch and branch.startswith("contrib/"):
-        parts = branch.split("/")
-        if len(parts) >= 2 and parts[1]:
-            return parts[1].lower()
-
-    raw = submitted_by or agent or "unknown"
-    raw = raw.strip()
-    if raw.startswith("@"):
-        raw = raw[1:]
-    if " (self-directed)" in raw:
-        raw = raw.replace(" (self-directed)", "")
-    if raw in ("pipeline", ""):
-        return agent.strip() if agent and agent.strip() not in ("pipeline", "") else "pipeline"
-    return raw
-
-
-def classify_pr(pr: dict) -> str | None:
-    """Classify a merged PR as create/enrich/challenge or None (skip).
-
-    Uses branch name pattern + commit_type as primary signal.
-    Falls back to file-level analysis for ambiguous cases.
-    """
-    branch = pr.get("branch", "")
-    commit_type = pr.get("commit_type", "")
-
-    if commit_type in ("pipeline", "entity"):
-        return None
-
-    if "challenge" in branch.lower():
-        return "challenge"
-
-    if branch.startswith("extract/") or branch.startswith("research-"):
-        return "create"
-
-    if "reweave" in branch.lower() or "enrich" in branch.lower():
-        return "enrich"
-
-    if commit_type == "research":
-        return "create"
-
-    if commit_type == "reweave":
-        return "enrich"
-
-    if commit_type == "fix":
-        return "enrich"
-
-    if commit_type == "knowledge":
-        return "create"
-
-    return "create"
-
-
-def _find_claim_file(pr: dict) -> Path | None:
-    """Find the claim file for a merged PR."""
-    domain = pr.get("domain")
-    branch = pr.get("branch", "")
-
-    if not domain:
-        return None
-
-    domain_dir = CODEX_DIR / "domains" / domain
-    if not domain_dir.exists():
-        return None
-
-    slug_part = branch.split("/")[-1] if "/" in branch else branch
-    slug_part = re.sub(r"-[a-f0-9]{4}$", "", slug_part)
-
-    for claim_file in domain_dir.glob("*.md"):
-        if claim_file.name == "_map.md":
-            continue
-        claim_slug = _slugify(claim_file.stem)
-        if slug_part and slug_part in claim_slug:
-            return claim_file
-
-    return None
-
-
-def score_contribution(action_type: str, claim_file: Path | None, domain: str) -> tuple[float, dict]:
-    """Compute CI points for a single contribution.
-
-    Returns (score, breakdown_dict) for transparency.
-    """
-    weight = ACTION_WEIGHTS[action_type]
-
-    confidence = _get_confidence(claim_file) if claim_file else "experimental"
-    base = CONFIDENCE_BASE.get(confidence, 1.0)
-
-    if action_type == "challenge" and claim_file and _has_challenged_by(claim_file):
-        base = 3.0 if confidence in ("proven",) else 2.5
-
-    domain_weight = _get_domain_weight(domain)
-
-    connectivity = 0.0
-    if claim_file and _has_cross_domain_ref(claim_file):
-        connectivity += 0.2
-
-    create_multiplier = 1.0
-    resolved_links = 0
-    if action_type == "create" and claim_file:
-        resolved_links = _count_resolved_wiki_links(claim_file)
-        if resolved_links >= 3:
-            create_multiplier = 1.5
-
-    importance = base * domain_weight + connectivity
-    score = weight * importance * create_multiplier
-
-    return score, {
-        "action": action_type,
-        "weight": weight,
-        "confidence": confidence,
-        "base": base,
-        "domain_weight": domain_weight,
-        "connectivity_bonus": connectivity,
-        "create_multiplier": create_multiplier,
-        "resolved_links": resolved_links,
-        "importance": importance,
-        "score": round(score, 4),
-    }
-
-
-def collect_and_score(hours: int = 24) -> dict:
-    """Main scoring pipeline: collect merged PRs, classify, score."""
-    _init_domain_counts()
-    _init_link_index()
-
-    cutoff = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
-
-    conn = sqlite3.connect(str(DB_PATH))
-    conn.row_factory = sqlite3.Row
-    try:
-        rows = conn.execute(
-            """SELECT number, branch, domain, agent, commit_type, merged_at,
-                      submitted_by, description
-               FROM prs
-               WHERE status = 'merged' AND merged_at >= ?
-               ORDER BY merged_at DESC""",
-            (cutoff,),
-        ).fetchall()
-    finally:
-        conn.close()
-
-    contributions = []
-    contributor_deltas: dict[str, float] = {}
-    domain_activity: dict[str, int] = {}
-    action_counts = {"create": 0, "enrich": 0, "challenge": 0}
-
-    for row in rows:
-        pr = dict(row)
-        action_type = classify_pr(pr)
-        if action_type is None:
-            continue
-
-        claim_file = _find_claim_file(pr)
-        domain = pr.get("domain", "unknown")
-        score, breakdown = score_contribution(action_type, claim_file, domain)
-
-        contributor = _normalize_contributor(
-            pr.get("submitted_by"), pr.get("agent"), pr.get("branch")
-        )
-        contributor_deltas[contributor] = contributor_deltas.get(contributor, 0) + score
-        domain_activity[domain] = domain_activity.get(domain, 0) + 1
-        action_counts[action_type] = action_counts.get(action_type, 0) + 1
-
-        contributions.append({
-            "pr_number": pr["number"],
-            "contributor": contributor,
-            "agent": pr.get("agent", ""),
-            "domain": domain,
-            "action": action_type,
-            "score": round(score, 4),
-            "breakdown": breakdown,
-            "description": pr.get("description", ""),
-            "merged_at": pr.get("merged_at", ""),
-        })
-
-    total_claims = sum(DOMAIN_CLAIM_COUNTS.values())
-
-    return {
-        "period_hours": hours,
-        "generated_at": datetime.now(timezone.utc).isoformat(),
-        "date": datetime.now(LONDON_TZ).strftime("%B %d, %Y"),
-        "contributions": contributions,
-        "contributor_deltas": {k: round(v, 4) for k, v in sorted(
-            contributor_deltas.items(), key=lambda x: -x[1]
-        )},
-        "domain_activity": dict(sorted(domain_activity.items(), key=lambda x: -x[1])),
-        "action_counts": action_counts,
-        "total_contributions": len(contributions),
-        "total_ci_awarded": round(sum(c["score"] for c in contributions), 4),
-        "kb_state": {
-            "total_claims": total_claims,
-            "domains": len(DOMAIN_CLAIM_COUNTS),
-            "domain_breakdown": dict(DOMAIN_CLAIM_COUNTS),
-        },
-    }
-
-
-def update_contributors(digest: dict):
-    """Write CI deltas to contributors table."""
-    if not digest["contributor_deltas"]:
-        return
-
-    conn = sqlite3.connect(str(DB_PATH))
-    try:
-        for handle, delta in digest["contributor_deltas"].items():
-            conn.execute(
-                """INSERT INTO contributors (handle, claims_merged, created_at, updated_at)
-                   VALUES (?, 0, datetime('now'), datetime('now'))
-                   ON CONFLICT(handle) DO UPDATE SET updated_at = datetime('now')""",
-                (handle,),
-            )
-        conn.commit()
-    finally:
-        conn.close()
-
-    log.info("Updated %d contributor records", len(digest["contributor_deltas"]))
-
-
-def save_scores_to_db(digest: dict):
-    """Write individual contribution scores to contribution_scores table."""
-    conn = sqlite3.connect(str(DB_PATH))
-    try:
-        conn.execute("""CREATE TABLE IF NOT EXISTS contribution_scores (
-            id INTEGER PRIMARY KEY AUTOINCREMENT,
-            pr_number INTEGER UNIQUE,
-            contributor TEXT NOT NULL,
-            event_type TEXT CHECK(event_type IN ('create','enrich','challenge')),
-            ci_earned REAL,
-            claim_slug TEXT,
-            domain TEXT,
-            scored_at TEXT NOT NULL
-        )""")
-        for c in digest["contributions"]:
-            slug = (c.get("description") or "")[:200] or c.get("breakdown", {}).get("action", "")
-            conn.execute(
-                """INSERT INTO contribution_scores (pr_number, contributor, event_type, ci_earned, claim_slug, domain, scored_at)
-                   VALUES (?, ?, ?, ?, ?, ?, ?)
-                   ON CONFLICT(pr_number) DO UPDATE SET
-                     contributor = excluded.contributor,
-                     ci_earned = excluded.ci_earned,
-                     event_type = excluded.event_type,
-                     scored_at = excluded.scored_at""",
-                (c["pr_number"], c["contributor"], c["action"], c["score"], slug, c["domain"], c["merged_at"]),
-            )
-        conn.commit()
-        log.info("Wrote %d contribution scores to DB", len(digest["contributions"]))
-    finally:
-        conn.close()
-
-
-def save_digest_json(digest: dict):
-    """Save latest digest as JSON for API consumption."""
-    DIGEST_JSON_PATH.parent.mkdir(parents=True, exist_ok=True)
-    with open(DIGEST_JSON_PATH, "w") as f:
-        json.dump(digest, f, indent=2, default=str)
-    log.info("Saved digest to %s", DIGEST_JSON_PATH)
-
-
-def send_telegram(digest: dict):
-    """Post digest summary to Telegram."""
-    token_file = TELEGRAM_TOKEN_FILE
-    if not token_file.exists():
-        log.warning("Telegram token not found at %s", token_file)
-        return
-
-    token = token_file.read_text().strip()
-
-    lines = [f"📊 *Daily KB Digest — {digest['date']}*", ""]
-
-    if digest["contributions"]:
-        lines.append(f"*NEW CONTRIBUTIONS* (last {digest['period_hours']}h):")
-        action_emoji = {"challenge": "⚔️", "create": "🆕", "enrich": "📚"}
-
-        by_contributor: dict[str, list] = {}
-        for c in digest["contributions"]:
-            name = c["contributor"]
-            by_contributor.setdefault(name, []).append(c)
-
-        for name, contribs in sorted(by_contributor.items(), key=lambda x: -sum(c["score"] for c in x[1])):
-            total_score = sum(c["score"] for c in contribs)
-            actions = {}
-            for c in contribs:
-                actions[c["action"]] = actions.get(c["action"], 0) + 1
-
-            action_summary = ", ".join(
-                f"{action_emoji.get(a, '•')} {n} {a}" for a, n in sorted(actions.items(), key=lambda x: -x[1])
-            )
-            lines.append(f"  {name}: {action_summary} → +{total_score:.2f} CI")
-
-        lines.append("")
-
-    lines.append("*KB STATE:*")
-    kb = digest["kb_state"]
-    ac = digest["action_counts"]
-    lines.append(
-        f"Claims: {kb['total_claims']} (+{digest['total_contributions']}) | "
-        f"Domains: {kb['domains']}"
-    )
-    lines.append(
-        f"Creates: {ac.get('create', 0)} | "
-        f"Enrichments: {ac.get('enrich', 0)} | "
-        f"Challenges: {ac.get('challenge', 0)}"
-    )
-
-    if digest["domain_activity"]:
-        top_domain = max(digest["domain_activity"], key=digest["domain_activity"].get)
-        lines.append(f"Most active: {top_domain} ({digest['domain_activity'][top_domain]} events)")
-
-    if digest["contributor_deltas"]:
-        lines.append("")
-        lines.append("*LEADERBOARD CHANGE:*")
-        for i, (name, delta) in enumerate(digest["contributor_deltas"].items(), 1):
-            if i > 5:
-                break
-            lines.append(f"  #{i} {name} +{delta:.2f} CI")
-
-    text = "\n".join(lines)
-
-    url = f"https://api.telegram.org/bot{token}/sendMessage"
-    payload = json.dumps({
-        "chat_id": TELEGRAM_CHAT_ID,
-        "text": text,
-        "parse_mode": "Markdown",
-    }).encode("utf-8")
-
-    req = urllib.request.Request(url, data=payload, headers={"Content-Type": "application/json"})
-    try:
-        with urllib.request.urlopen(req, timeout=15) as resp:
-            result = json.loads(resp.read())
-            if result.get("ok"):
-                log.info("Telegram digest sent successfully")
-            else:
-                log.error("Telegram API error: %s", result)
-    except Exception as e:
-        log.error("Failed to send Telegram message: %s", e)
-
-
-def main():
-    hours = int(sys.argv[1]) if len(sys.argv) > 1 else 24
-    dry_run = "--dry-run" in sys.argv
-    no_telegram = "--no-telegram" in sys.argv
-
-    log.info("Running scoring digest for last %dh (dry_run=%s)", hours, dry_run)
-
-    digest = collect_and_score(hours)
-
-    log.info(
-        "Scored %d contributions: %d create, %d enrich, %d challenge → %.2f total CI",
-        digest["total_contributions"],
-        digest["action_counts"]["create"],
-        digest["action_counts"]["enrich"],
-        digest["action_counts"]["challenge"],
-        digest["total_ci_awarded"],
-    )
-
-    for name, delta in digest["contributor_deltas"].items():
-        log.info("  %s: +%.4f CI", name, delta)
-
-    if dry_run:
-        print(json.dumps(digest, indent=2, default=str))
-        return
-
-    save_digest_json(digest)
-    save_scores_to_db(digest)
-    update_contributors(digest)
-
-    if not no_telegram:
-        send_telegram(digest)
-
-    log.info("Digest complete")
-
-
-if __name__ == "__main__":
-    main()
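The removed `score_contribution` combines its factors as `score = weight * (base * domain_weight + connectivity) * create_multiplier`. The following is a minimal sketch of that arithmetic, not the deleted module itself. The `ACTION_WEIGHTS` and `CONFIDENCE_BASE` values are placeholders, since the real tables are not visible in this part of the diff, and the `challenged_by` override is left out for brevity. The domain thresholds (fewer than 20 claims, more than 50 claims), the 0.2 connectivity bonus, and the 1.5 multiplier at three or more resolved links are taken from the removed code.

```python
# Placeholder tables: the real values are defined outside this hunk.
ACTION_WEIGHTS = {"create": 1.0, "enrich": 0.5, "challenge": 1.5}      # hypothetical
CONFIDENCE_BASE = {"experimental": 1.0, "likely": 1.5, "proven": 2.0}  # hypothetical


def domain_weight(claim_count: int) -> float:
    """Sparse domains get a bonus, mature domains a discount (thresholds from the diff)."""
    if claim_count < 20:
        return 1.5
    if claim_count > 50:
        return 0.8
    return 1.0


def ci_score(action: str, confidence: str, claim_count: int,
             cross_domain: bool = False, resolved_links: int = 0) -> float:
    """Recompute the score formula from plain inputs instead of claim files."""
    weight = ACTION_WEIGHTS[action]
    base = CONFIDENCE_BASE.get(confidence, 1.0)
    connectivity = 0.2 if cross_domain else 0.0
    # Only "create" actions with at least three resolved wiki links get the multiplier.
    create_multiplier = 1.5 if action == "create" and resolved_links >= 3 else 1.0
    importance = base * domain_weight(claim_count) + connectivity
    return round(weight * importance * create_multiplier, 4)
```

With these placeholder tables, a well-linked cross-domain `create` in a sparse domain scores `1.0 * (1.0 * 1.5 + 0.2) * 1.5`, while an `enrich` of a `proven` claim in a mature domain scores `0.5 * (2.0 * 0.8)`.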
--- a/docs/self-directed-research.md
+++ b/docs/self-directed-research.md
--- a/sync-mirror.sh
+++ b/sync-mirror.sh
@ -0,0 +1,159 @@
+#!/bin/bash
+# Bidirectional sync: Forgejo (authoritative) <-> GitHub (public mirror)
+# Forgejo wins on conflict. Runs every 2 minutes via cron.
+#
+# Security note: GitHub->Forgejo path is for external contributor convenience.
+# Never auto-process branches arriving via this path without a PR.
+# Eval pipeline and extract cron only act on PRs, not raw branches.
+
+set -euo pipefail
+
+REPO_DIR="/opt/teleo-eval/mirror/teleo-codex.git"
+LOG="/opt/teleo-eval/logs/sync.log"
+LOCKFILE="/tmp/sync-mirror.lock"
+
+log() { echo "[$(date -Iseconds)] $1" >> "$LOG"; }
+
+# Lockfile — prevent concurrent runs
+if [ -f "$LOCKFILE" ]; then
+    pid=$(cat "$LOCKFILE" 2>/dev/null || true)
+    if kill -0 "$pid" 2>/dev/null; then
+        exit 0
+    fi
+    rm -f "$LOCKFILE"
+fi
+echo $$ > "$LOCKFILE"
+trap 'rm -f "$LOCKFILE"' EXIT
+
+# Pre-flight: fix permissions if another user touched the mirror dir (Rhea)
+BAD_PERMS=$(find "$REPO_DIR" ! -user teleo 2>/dev/null | head -1 || true)
+if [ -n "$BAD_PERMS" ]; then
+    log "Fixing mirror permissions (found: $BAD_PERMS)"
+    chown -R teleo:teleo "$REPO_DIR" 2>/dev/null || log "WARN: could not fix mirror permissions"
+fi
+cd "$REPO_DIR" || { log "ERROR: cannot cd to $REPO_DIR"; exit 1; }
+
+# Step 1: Fetch from Forgejo (must succeed — it's authoritative)
+log "Fetching from Forgejo..."
+if ! git fetch forgejo --prune >> "$LOG" 2>&1; then
+    log "ERROR: Forgejo fetch failed — aborting"
+    exit 1
+fi
+
+# Step 2: Fetch from GitHub (warn on failure, don't abort)
+log "Fetching from GitHub..."
+git fetch origin --prune >> "$LOG" 2>&1 || log "WARN: GitHub fetch failed"
+
+# Step 2.5: GitHub main -> Forgejo main (ff-only)
+# If a PR was merged on GitHub, GitHub main is ahead of Forgejo main.
+# Fast-forward Forgejo main to match — safe because ff-only guarantees no divergence.
+GITHUB_MAIN_FF=$(git rev-parse refs/remotes/origin/main 2>/dev/null || true)
+FORGEJO_MAIN_FF=$(git rev-parse refs/remotes/forgejo/main 2>/dev/null || true)
+if [ -n "$GITHUB_MAIN_FF" ] && [ -n "$FORGEJO_MAIN_FF" ]; then
+    if [ "$GITHUB_MAIN_FF" != "$FORGEJO_MAIN_FF" ]; then
+        if git merge-base --is-ancestor "$FORGEJO_MAIN_FF" "$GITHUB_MAIN_FF"; then
+            log "GitHub main ($GITHUB_MAIN_FF) ahead of Forgejo main ($FORGEJO_MAIN_FF) — fast-forwarding"
+            git push forgejo "refs/remotes/origin/main:refs/heads/main" >> "$LOG" 2>&1 && \
+                log "Forgejo main fast-forwarded to $GITHUB_MAIN_FF" || \
+                log "WARN: Failed to fast-forward Forgejo main"
+        fi
+    fi
+fi
+
+# Step 3: Forgejo -> GitHub (primary direction)
+# Update local refs from Forgejo remote refs using process substitution (avoids subshell)
+log "Syncing Forgejo -> GitHub..."
+while IFS= read -r branch; do
+    [ "$branch" = "HEAD" ] && continue
+    git update-ref "refs/heads/$branch" "refs/remotes/forgejo/$branch" 2>/dev/null || \
+        log "WARN: Failed to update ref $branch"
+done < <(git for-each-ref --format="%(refname:lstrip=3)" refs/remotes/forgejo/)
+
+# Safety: verify Forgejo main descends from GitHub main before force-pushing
+GITHUB_MAIN=$(git rev-parse refs/remotes/origin/main 2>/dev/null || true)
+FORGEJO_MAIN=$(git rev-parse refs/remotes/forgejo/main 2>/dev/null || true)
+PUSH_MAIN=true
+if [ -n "$GITHUB_MAIN" ] && [ -n "$FORGEJO_MAIN" ]; then
+    if ! git merge-base --is-ancestor "$GITHUB_MAIN" "$FORGEJO_MAIN"; then
+        log "CRITICAL: Forgejo main is NOT a descendant of GitHub main — skipping main push"
+        log "CRITICAL: GitHub main: $GITHUB_MAIN, Forgejo main: $FORGEJO_MAIN"
+        PUSH_MAIN=false
+    fi
+fi
+
+if [ "$PUSH_MAIN" = true ]; then
+    git push origin --all --force >> "$LOG" 2>&1 || log "WARN: Push to GitHub failed"
+else
+    # Push all branches except main
+    while IFS= read -r branch; do
+        [ "$branch" = "main" ] && continue
+        [ "$branch" = "HEAD" ] && continue
+        git push origin --force "refs/heads/$branch:refs/heads/$branch" >> "$LOG" 2>&1 || \
+            log "WARN: Failed to push $branch to GitHub"
+    done < <(git for-each-ref --format="%(refname:lstrip=2)" refs/heads/)
+fi
+git push origin --tags --force >> "$LOG" 2>&1 || log "WARN: Tag push to GitHub failed"
+
+# Step 4: GitHub -> Forgejo (external contributions only)
+# Only push branches that exist on GitHub but NOT on Forgejo
+log "Checking GitHub-only branches..."
+GITHUB_ONLY=$(comm -23 \
+    <(git for-each-ref --format="%(refname:lstrip=3)" refs/remotes/origin/ | grep -v HEAD | sort) \
+    <(git for-each-ref --format="%(refname:lstrip=3)" refs/remotes/forgejo/ | grep -v HEAD | sort))
+
+if [ -n "$GITHUB_ONLY" ]; then
+    FORGEJO_TOKEN=$(cat /opt/teleo-eval/secrets/forgejo-admin-token 2>/dev/null || true)
+    for branch in $GITHUB_ONLY; do
+        log "New from GitHub: $branch -> Forgejo"
+        git push forgejo "refs/remotes/origin/$branch:refs/heads/$branch" >> "$LOG" 2>&1 || {
+            log "WARN: Failed to push $branch to Forgejo"
+            continue
+        }
+        # Auto-create PR on Forgejo for mirrored branches (external contributor path)
+        # Skip pipeline-internal branches
+        case "$branch" in
+            extract/*|ingestion/*) continue ;;
+        esac
+        if [ -n "$FORGEJO_TOKEN" ]; then
+            # Check if PR already exists for this branch (open or closed)
+            # NOTE: Forgejo ?head= filter is broken (ignores head value, returns all PRs).
+            # Workaround: fetch open+closed PRs, pipe to Python, check head.ref.
+            HAS_PR=$( {
+                curl -sf "http://localhost:3000/api/v1/repos/teleo/teleo-codex/pulls?state=open&limit=50" \
+                    -H "Authorization: token $FORGEJO_TOKEN" 2>/dev/null || echo "[]"
+                echo ""
+                curl -sf "http://localhost:3000/api/v1/repos/teleo/teleo-codex/pulls?state=closed&sort=created&limit=50" \
+                    -H "Authorization: token $FORGEJO_TOKEN" 2>/dev/null || echo "[]"
+            } | python3 -c "
+import sys, json
+branch = sys.argv[1]
+for line in sys.stdin:
+    line = line.strip()
+    if not line or line == '[]': continue
+    try:
+        for pr in json.loads(line):
+            if pr.get('head', {}).get('ref') == branch:
+                print('yes'); sys.exit(0)
+    except: pass
+print('no')
+" "$branch" 2>/dev/null || echo "no")
+            if [ "$HAS_PR" = "no" ]; then
+                PR_TITLE=$(echo "$branch" | sed 's|/|: |;s/-/ /g')
+                RESULT=$(curl -sf -X POST "http://localhost:3000/api/v1/repos/teleo/teleo-codex/pulls" \
+                    -H "Authorization: token $FORGEJO_TOKEN" \
+                    -H "Content-Type: application/json" \
+                    -d "{\"title\":\"$PR_TITLE\",\"head\":\"$branch\",\"base\":\"main\"}" 2>/dev/null || echo "")
+                PR_NUM=$(echo "$RESULT" | grep -o '"number":[0-9]*' | head -1 | grep -o "[0-9]*" || true)
+                if [ -n "$PR_NUM" ]; then
+                    log "Auto-created PR #$PR_NUM on Forgejo for $branch"
+                else
+                    log "WARN: Failed to auto-create PR for $branch"
+                fi
+            fi
+        fi
+    done
+else
+    log "No new GitHub-only branches"
+fi
+
+log "Sync complete"
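The handling of `main` in `sync-mirror.sh` rests on two `git merge-base --is-ancestor` checks: Step 2.5 fast-forwards Forgejo when Forgejo main is an ancestor of GitHub main, and Step 3 refuses to force-push `main` when GitHub main is not an ancestor of Forgejo main. The sketch below models that decision on a toy commit graph. The `parents` mapping and the returned labels are invented for illustration, and the script itself makes the two checks in separate steps rather than in one function.

```python
def is_ancestor(parents: dict[str, list[str]], ancestor: str, descendant: str) -> bool:
    """True if `ancestor` is reachable from `descendant`; a commit is its own ancestor."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return False


def plan_main_sync(parents: dict[str, list[str]], forgejo_main: str, github_main: str) -> str:
    """Pick the action for `main`, given each remote's tip commit."""
    if forgejo_main == github_main:
        return "in-sync"
    if is_ancestor(parents, forgejo_main, github_main):
        return "fast-forward-forgejo"   # GitHub is strictly ahead (Step 2.5)
    if is_ancestor(parents, github_main, forgejo_main):
        return "push-to-github"         # Forgejo is strictly ahead (Step 3)
    return "diverged-skip-main"         # neither contains the other: do not force-push
```

For a history `a <- b <- c` with a side branch `b <- d`, tips `(b, c)` yield a fast-forward of Forgejo, `(c, b)` a push to GitHub, and `(c, d)` the diverged case in which `main` is skipped.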
--- a/systemd/teleo-auto-deploy.service
+++ b/systemd/teleo-auto-deploy.service
@ -1,10 +0,0 @@
-[Unit]
-Description=Auto-deploy teleo-infrastructure from Forgejo to working directories
-After=network.target
-
-[Service]
-Type=oneshot
-User=teleo
-ExecStart=/opt/teleo-eval/workspaces/deploy-infra/deploy/auto-deploy.sh
-StandardOutput=journal
-StandardError=journal
--- a/systemd/teleo-auto-deploy.timer
+++ b/systemd/teleo-auto-deploy.timer
@ -1,10 +0,0 @@
-[Unit]
-Description=Run teleo auto-deploy every 2 minutes
-
-[Timer]
-OnBootSec=30
-OnUnitActiveSec=2min
-AccuracySec=10s
-
-[Install]
-WantedBy=timers.target
--- a/telegram/bot.py
+++ b/telegram/bot.py
@ -994,7 +994,7 @@ async def handle_tagged(update: Update, context: ContextTypes.DEFAULT_TYPE):

    # Rate limit check
    if user and is_rate_limited(user.id):
-        await msg.reply_text("I'm processing other requests — try again in a few minutes.", do_quote=True)
+        await msg.reply_text("I'm processing other requests — try again in a few minutes.", quote=True)
        return

    logger.info("Tagged by @%s: %s", user.username if user else "unknown", text[:100])
@ -1295,7 +1295,7 @@ IMPORTANT: Special tags you can append at the end of your response (after your m
                tool_calls.append({"tool": f"kb:{t.get('tool', 'unknown')}", **{k: v for k, v in t.items() if k != "tool"}})

    if not response:
-        await msg.reply_text("Processing error — I'll get back to you.", do_quote=True)
+        await msg.reply_text("Processing error — I'll get back to you.", quote=True)
        return

    # Parse LEARNING and RESEARCH tags before posting
@ -1445,7 +1445,7 @@ IMPORTANT: Special tags you can append at the end of your response (after your m
    # Post response (without tag lines)
    # Telegram has a 4096 char limit — split long messages
    if len(display_response) <= 4096:
-        await msg.reply_text(display_response, do_quote=True)
+        await msg.reply_text(display_response, quote=True)
    else:
        # Split on paragraph boundaries where possible
        chunks = []
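The hunk above ends where `bot.py` starts splitting responses longer than Telegram's 4096-character limit on paragraph boundaries. The rest of that logic is outside the diff context, so the following is only a sketch of one way such a splitter could work. The `split_message` name and the hard-split fallback for oversized paragraphs are assumptions, not the bot's actual code.

```python
def split_message(text: str, limit: int = 4096) -> list[str]:
    """Split `text` into chunks of at most `limit` characters, preferring blank-line breaks."""
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        # A single paragraph longer than the limit has to be cut mid-paragraph.
        while len(para) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:limit])
            para = para[limit:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= limit:
            current = candidate       # paragraph still fits in the current chunk
        else:
            chunks.append(current)    # flush and start a new chunk with this paragraph
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be sent with its own `reply_text` call.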
--- a/telegram/output_gate.py
+++ b/telegram/output_gate.py
@ -16,8 +16,7 @@ import re
 _SYSTEM_PATTERNS = [
    # Pipeline operations
    re.compile(r"\b(PR\s*#\d+|pull request|merge|rebase|cherry.?pick)\b", re.IGNORECASE),
-    re.compile(r"\b(batch.?extract|extract/|extractor)\b", re.IGNORECASE),
-    re.compile(r"\bextract(?:ed|ion)\b.*\b(pipeline|queue|PR|branch|source|cron)\b", re.IGNORECASE),
+    re.compile(r"\b(extraction|extracted|extractor|extract/)\b", re.IGNORECASE),
    re.compile(r"\b(pipeline|cron|batch.?extract|systemd|teleo-pipeline)\b", re.IGNORECASE),
    re.compile(r"\b(conflict.?permanent|conflict.?closed|merge.?conflict)\b", re.IGNORECASE),

@ -32,36 +31,18 @@ _SYSTEM_PATTERNS = [
    re.compile(r"\b(approval.?rate|throughput|PRs?.?per.?hour)\b", re.IGNORECASE),
    re.compile(r"\b(reviewer_count|reviewer.?backfill)\b", re.IGNORECASE),

-    # Agent names — standalone mentions of any internal agent
-    # Leo and Rio excluded (common words) — caught by context patterns below
-    re.compile(r"\b(Epimetheus|Ganymede|Rhea|Oberon|Hermes|Theseus|Argus|Vida|Astra|Clay)\b"),
-    re.compile(r"\b(Leo|Rio)\s+(review|approv|reject|said|flagged|owns?|confirm)", re.IGNORECASE),
-    re.compile(r"\bPentagon\b"),
-    re.compile(r"\bm3ta\b", re.IGNORECASE),
-
    # Agent coordination internals
    re.compile(r"\b(Ganymede|Rhea|Oberon)\s+(review(?:ed)?|approv(?:ed|es?)|reject(?:ed|s)?)\b", re.IGNORECASE),
    re.compile(r"\b(PIPELINE_OWNED_PREFIXES|AGENT_NAMES)\b"),
    re.compile(r"\b(worktree|bare.?repo|forgejo|git\.livingip)\b", re.IGNORECASE),

-    # Coordination language
-    re.compile(r"\b(craft.?review|substance.?review|m3ta.?approv|skill.?graph|eval.?rubric)\b", re.IGNORECASE),
-
-    # Infrastructure domains
-    re.compile(r"\bteleo.?codex\b", re.IGNORECASE),
-    re.compile(r"\blivingip\.xyz\b", re.IGNORECASE),
-
-    # UUIDs (conversation IDs, agent IDs)
-    re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}", re.IGNORECASE),
-
-    # Code / technical — require line-start or code context to avoid matching English "class"
-    re.compile(r"^\s*(def|import|class)\s+\w+", re.MULTILINE),
-    re.compile(r"[\w/]+\.(py|yaml|json)\b", re.IGNORECASE),
+    # Code / technical
+    re.compile(r"\b(def\s+\w+|import\s+\w+|class\s+\w+)\b"),
+    re.compile(r"\b(\.py|\.yaml|\.json|\.md)\s", re.IGNORECASE),
    re.compile(r"\b(sqlite3?|pipeline\.db|response_audit)\b", re.IGNORECASE),

-    # Internal metrics / debugging — require pipeline context, not bare English words
-    re.compile(r"\b(cosine.?sim|PRIOR_ART_THRESHOLD|SCHEMA_VERSION)\b", re.IGNORECASE),
-    re.compile(r"\bthreshold\b.*\b(cosine|vector|Qdrant|embedding|pre.?screen)\b", re.IGNORECASE),
+    # Internal metrics / debugging
+    re.compile(r"\b(cosine.?sim|threshold|PRIOR_ART_THRESHOLD)\b", re.IGNORECASE),
    re.compile(r"\b(pre.?screen|Layer\s*[01234]|RRF|entity.?boost)\b", re.IGNORECASE),

    # Paths
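This hunk swaps several context-dependent patterns for broader ones. The snippet below compiles both variants of the code-keyword and `threshold` patterns exactly as they appear in the diff and runs them on two sample sentences. The sentences are made up for illustration. It is a quick behavioral check under those assumptions, not a statement about how the gate is used elsewhere.

```python
import re

# "-" side of the hunk: require line-start or pipeline context.
old_code = re.compile(r"^\s*(def|import|class)\s+\w+", re.MULTILINE)
old_threshold = re.compile(
    r"\bthreshold\b.*\b(cosine|vector|Qdrant|embedding|pre.?screen)\b", re.IGNORECASE
)

# "+" side of the hunk: match the bare keywords anywhere.
new_code = re.compile(r"\b(def\s+\w+|import\s+\w+|class\s+\w+)\b")
new_threshold = re.compile(r"\b(cosine.?sim|threshold|PRIOR_ART_THRESHOLD)\b", re.IGNORECASE)

prose = "Futarchy creates a new class of governance markets."
market = "Proposals pass above a 50% threshold."

for label, pattern, sample in [
    ("old code", old_code, prose),
    ("new code", new_code, prose),
    ("old threshold", old_threshold, market),
    ("new threshold", new_threshold, market),
]:
    hit = pattern.search(sample)
    print(f"{label:14} -> {hit.group(0) if hit else None}")
```

On these samples the line-anchored and context-requiring patterns find nothing, while the broader patterns match `class of` and `threshold` in ordinary English.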
--- a/telegram/rio.yaml
+++ b/telegram/rio.yaml
@ -1,62 +0,0 @@
-# Rio — Teleo internet finance agent
-# This config drives Rio's Telegram bot identity, KB scope, and voice.
-
-# ─── Identity ────────────────────────────────────────────────────────────
-name: Rio
-handle: "@FutAIrdBot"
-x_handle: "@futaRdIO"
-bot_token_file: telegram-bot-token
-pentagon_agent_id: 244ba05f
-domain: internet-finance
-domain_expertise: >
-  futarchy, prediction markets, token governance, the MetaDAO ecosystem,
-  conditional markets, internet capital formation, and permissionless fundraising
-
-# ─── KB Scope ────────────────────────────────────────────────────────────
-# One full-KB query; results tagged primary/cross-domain post-hoc.
-kb_scope:
-  primary:
-    - domains/internet-finance
-    - foundations
-    - core
-
-# ─── Voice ───────────────────────────────────────────────────────────────
-voice_summary: "Sharp analyst talking to peers. High signal density."
-
-voice_definition: |
-  ## Register
-  You're a sharp analyst talking to peers — people who know markets and
-  governance mechanisms. Don't explain basics unless asked. Lead with your
-  take, not the context.
-
-  ## Certainty Expression
-  Be direct about conviction levels. "High conviction" / "Speculative but
-  interesting" / "I don't know." Never hedge with weasel words when you
-  have a clear view. Never express false certainty when you don't.
-
-  ## Domain Vocabulary
-  Use futarchy, pro-rata, oversubscription, ICO, conditional markets,
-  liquidation proposals without explanation. Explain newer protocol-specific
-  terms (ownership coins, PRISM) on first use.
-
-  ## Signature Moves
-  Connect everything to market mechanisms and incentive structures. When
-  someone describes a governance problem, you see the market design solution.
-  When someone describes a market outcome, you trace it back to the
-  mechanism that produced it.
-
-# ─── Learnings ───────────────────────────────────────────────────────────
-learnings_file: agents/rio/learnings.md
-
-# ─── Eval ────────────────────────────────────────────────────────────────
-opsec_additional_patterns:
-  - "token price \\$[\\d,.]+"
-  - "LP (allocation|commitment)"
-
-# ─── Model ───────────────────────────────────────────────────────────────
-response_model: anthropic/claude-opus-4-6
-triage_model: anthropic/claude-haiku-4.5
-max_tokens: 500
-
-# ─── Rate Limits ─────────────────────────────────────────────────────────
-max_response_per_user_per_hour: 30
--- a/telegram/sync-telegram-archives.sh
+++ b/telegram/sync-telegram-archives.sh
@ -1,72 +0,0 @@
-#!/bin/bash
-# Move telegram archives + apply pending learnings. Runs every 5 min via cron.
-set -euo pipefail
-
-STAGING="/opt/teleo-eval/telegram-archives"
-MAIN="/opt/teleo-eval/workspaces/main"
-LOCKFILE="/opt/teleo-eval/workspaces/.main-worktree.lock"
-LEARNINGS="$MAIN/agents/rio/learnings.md"
-PENDING="$STAGING/pending-learnings.jsonl"
-
-# Check if there's anything to do
-HAS_ARCHIVES=$(ls "$STAGING"/*.md 2>/dev/null | head -1) || true
-HAS_LEARNINGS=""
-[ -s "$PENDING" ] && HAS_LEARNINGS="yes"
-
-[ -z "$HAS_ARCHIVES" ] && [ -z "$HAS_LEARNINGS" ] && exit 0
-
-# Acquire worktree lock
-exec 9>"$LOCKFILE"
-if ! flock -n 9; then
-    exit 0  # Lock held — skip this cycle
-fi
-
-CHANGED=0
-
-# Move archive files
-for f in $STAGING/*.md; do
-    [ -f "$f" ] || continue
-    mv "$f" "$MAIN/inbox/queue/"
-    CHANGED=$((CHANGED + 1))
-done
-
-# Apply pending learnings to learnings.md
-if [ -s "$PENDING" ]; then
-    while IFS= read -r line; do
-        category=$(echo "$line" | python3 -c "import sys,json; print(json.load(sys.stdin).get('category','factual'))" 2>/dev/null || echo "factual")
-        correction=$(echo "$line" | python3 -c "import sys,json; print(json.load(sys.stdin).get('correction',''))" 2>/dev/null || echo "")
-        date_str=$(date +%Y-%m-%d)
-        [ -z "$correction" ] && continue
-
-        # Append to the right section
-        case "$category" in
-            communication)
-                # Find ## Communication Notes and append after it
-                sed -i "/^## Communication Notes/a - [$date_str] $correction" "$LEARNINGS"
-                ;;
-            structured_data)
-                sed -i "/^## Structured Data/a - [$date_str] $correction" "$LEARNINGS"
-                ;;
-            *)
-                sed -i "/^## Factual Corrections/a - [$date_str] $correction" "$LEARNINGS"
-                ;;
-        esac
-        CHANGED=$((CHANGED + 1))
-    done < "$PENDING"
-    rm -f "$PENDING"
-fi
-
-if [ "$CHANGED" -gt 0 ]; then
-    cd "$MAIN"
-    git add -A inbox/queue/ agents/rio/learnings.md
-    git commit -m "rio: sync $CHANGED item(s) from telegram staging
-
-Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>" 2>/dev/null || true
-    for attempt in 1 2 3; do
-        git pull --rebase origin main 2>/dev/null || { git rebase --abort 2>/dev/null; continue; }
-        git push origin main 2>/dev/null && break
-        sleep 2
-    done
-fi
-
-flock -u 9
--- a/telegram/theseus.yaml
+++ b/telegram/theseus.yaml
@ -1,68 +0,0 @@
-# Theseus — Teleo AI alignment & collective superintelligence agent
-# This config drives Theseus's Telegram bot identity, KB scope, and voice.
-
-# ─── Identity ────────────────────────────────────────────────────────────
-name: Theseus
-handle: "@TheseusAIBot"
-bot_token_file: theseus-bot-token
-pentagon_agent_id: 46864dd4
-domain: ai-alignment
-domain_expertise: >
-  AI alignment, collective superintelligence architecture, multi-agent
-  coordination, capability evaluation, formal verification as oversight,
-  and empirical AI capability evidence
-
-# ─── KB Scope ────────────────────────────────────────────────────────────
-kb_scope:
-  primary:
-    - domains/ai-alignment
-    - domains/collective-intelligence
-    - foundations
-    - core
-
-# ─── Voice ───────────────────────────────────────────────────────────────
-voice_summary: "Technically precise, structurally analytical. Alignment as coordination, not catastrophe."
-
-voice_definition: |
-  ## Register
-  Technically precise but accessible. No jargon-hiding, no catastrophizing.
-  You see alignment as a coordination problem, not a purely technical one.
-  That framing shapes everything you say.
-
-  ## Certainty Expression
-  Intellectually honest about what's empirically grounded vs theoretically
-  thin. Say "the evidence shows" when it does, "this is structural analysis"
-  when it's inference, "I don't know" when you don't. Never dress speculation
-  as fact.
-
-  ## Domain Vocabulary
-  Use alignment, mesa-optimization, RLHF, constitutional AI, verification,
-  coordination protocols, capability evaluation without explanation. Ground
-  abstract alignment concepts in concrete examples — the Claude's Cycles
-  research program, multi-agent architectures, observable failure modes.
-
-  ## Signature Moves
-  Connect everything to coordination and architecture. When someone raises
-  an alignment concern, you see the structural mechanism. When someone
-  describes a capability, you trace the coordination pattern that produced
-  it. Evidence over theory — always prefer documented observation over
-  hypotheticals.
-
-  ## What You Don't Do
-  No doomerism, no accelerationism. Structural analysis only. Don't
-  catastrophize and don't hand-wave risks away.
-
-# ─── Learnings ───────────────────────────────────────────────────────────
-learnings_file: agents/theseus/learnings.md
-
-# ─── Eval ────────────────────────────────────────────────────────────────
-opsec_additional_patterns:
-  - "internal (architecture|infra)"
-
-# ─── Model ───────────────────────────────────────────────────────────────
-response_model: anthropic/claude-opus-4-6
-triage_model: anthropic/claude-haiku-4.5
-max_tokens: 500
-
-# ─── Rate Limits ─────────────────────────────────────────────────────────
-max_response_per_user_per_hour: 30