Two fixes for the 18-PR merge blockage:
1. When cherry-pick returns "already merged" (all commits empty because
content is already on main), close the PR directly instead of trying
to push the stale branch SHA to main. The branch ref points at old
commits that aren't descendants of current main, so the push would
always fail as non-fast-forward.
2. Retry worktree add once with jittered delay when config.lock
contention occurs from parallel domain merges.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Forgejo categorically blocks --force-with-lease on protected branches,
even for fast-forward pushes. The cherry-picked branch is already a
descendant of origin/main, so a regular push is a fast-forward by
definition. Non-ff is rejected by default — same safety guarantee.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three fixes from Ganymede's review of extract-time-connection:
1. Replace git add -A with explicit file staging in _reciprocal_edges
2. Push to origin/main immediately after commit (survive batch-extract reset)
3. RECIPROCAL_EDGE_MAP: challenges→challenged_by (not symmetric)
Added challenged_by to REWEAVE_EDGE_FIELDS, EDGE_FIELDS, EDGE_WEIGHTS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The skip loop only matched `- ` (no indent) but YAML list items are
commonly written as ` - item` (2-space indent). This caused old list
items to persist alongside new ones, corrupting frontmatter on merge.
Fix: consume any line starting with space or dash as part of the current
field's value block.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Combined superset assertion and merge computation into single loop
(removed duplicate scalar-to-list normalization)
- Added worktree remove --force before worktree add to handle prior
crash leaving stale worktree (SIGKILL, OOM, power loss)
Two fixes from Ganymede review:
1. CRITICAL: blank line before closing --- compounded on repeat reweaves.
Body starts with \n---, so \n{body} created \n\n---. Fixed by checking
body prefix.
2. Replaced yaml.dump round-trip with _serialize_edge_fields() that splices
only edge arrays into raw frontmatter text. Non-edge fields (title,
confidence, type, quotes, flow styles) stay byte-identical to main HEAD.
_parse_yaml_frontmatter now returns 3-tuple: (dict, raw_fm_text, body).
_serialize_frontmatter takes (raw_fm_text, merged_edges_dict, body).
26 tests pass including idempotency (5x serialize), formatting preservation,
and no-blank-line regression test.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reweave PRs modify existing files (appending YAML edges). Cherry-pick
fails ~75% when main moves between PR creation and merge.
_merge_reweave_pr() reads each changed file from both main HEAD and
branch HEAD, unions the edge arrays (order-preserving, main-first),
and writes the result. Eliminates merge conflicts structurally.
Key design decisions (Ganymede + Theseus approved):
- Order-preserving dedup: main's edges first, branch-new appended
- Superset assertion: logs warning if branch missing main edges
- Uses main's body text (reweave only touches frontmatter)
- Loud failure on parse errors (no cherry-pick fallback)
- Append-only contract: reweave adds edges, never removes
18 tests covering parse, union, serialize, superset, and full workflow.
_retry_conflict_prs called _rebase_and_push which was never defined,
causing NameError on every conflict retry. Now uses _cherry_pick_onto_main
consistent with the primary merge path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Layer 1: Insertion-time dedup in openrouter-extract-v2.py — skip if source_slug
already appears in claim content.
Layer 2: Insertion-time dedup in entity_batch.py — skip if PR number already
enriched this claim.
Layer 3: Post-rebase dedup in merge.py — scan rebased files for duplicate
evidence blocks (same source reference) and remove them before force-push.
Root cause: multiple enrichment branches modify the same claim at the same
insertion point. When rebased sequentially, evidence blocks are duplicated.
(Leo: PRs #1751, #1752)
lib/dedup.py: standalone module — parses evidence headers, deduplicates by
source key, preserves trailing content (Relevant Notes, Topics sections).
9 tests covering all patterns including the real PR #1751 duplication case.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Atomic extract-and-connect (lib/connect.py):
- After extraction writes claim files, each new claim is embedded via
OpenRouter, searched against Qdrant, and top-5 neighbors (cosine > 0.55)
are added as `related` edges in the claim's frontmatter
- Edges written on NEW claim only — avoids merge conflicts
- Cross-domain connections enabled, non-fatal on Qdrant failure
- Wired into openrouter-extract-v2.py post-extraction step
Stale PR monitor (lib/stale_pr.py):
- Every watchdog cycle checks open extract/* PRs
- If open >30 min AND 0 claim files → auto-close with comment
- After 2 stale closures → marks source as extraction_failed
- Wired into watchdog.py as check #6
Response audit system:
- response_audit table (migration v8), persistent audit conn in bot.py
- 90-day retention cleanup, tool_calls JSON column
- Confidence tag stripping, systemd ReadWritePaths for pipeline.db
Supporting infrastructure:
- reweave.py: nightly edge reconnection for orphan claims
- reconcile-sources.py: source status reconciliation
- backfill-domains.py: domain classification backfill
- ops/reconcile-source-status.sh: operational reconciliation script
- Attribution improvements, post-extract enrichments, merge improvements
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gate 3 in batch-extract-50.sh: query pipeline.db for closed PRs before
re-extracting. Sources with >=3 closed PRs are skipped (zombie protection).
Cost tracking: openrouter_call() now returns (text, usage) tuple with
prompt_tokens and completion_tokens from the OpenRouter API response.
All callers updated to unpack and pass tokens to costs.record_usage().
Added missing triage cost recording. Fixed batch domain review recording
cost once per batch instead of once per PR.
Pentagon-Agent: Epimetheus <0144398e-4ed3-4fe2-95a3-3d72e1abf887>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After a PR merges successfully, _embed_merged_claims() diffs the merged SHA
against its parent to find new/changed .md files in knowledge directories
(domains/, core/, foundations/, decisions/, entities/). Each file is embedded
via embed-claims.py --file (OpenRouter, text-embedding-3-small).
Non-fatal: embedding failure logs a warning but does not block the merge
pipeline. This keeps vector search current without requiring manual re-embeds.
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
Root cause of 5-day pipeline stall: fixer GC marked PRs as closed in DB
but never synced to Forgejo. Branches stayed alive on remote, blocking
Gate 2 in batch-extract (branch exists → skip forever).
Now: GC fetches PR numbers, posts audit comment, closes on Forgejo,
deletes remote branch, THEN updates DB. Same pattern as _terminate_pr
in evaluate.py.
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
- Claim-shape detector: if YAML has type: claim, force STANDARD minimum (Theseus)
- Random pre-merge promotion: 15% of LIGHT → STANDARD before eval (Rio)
- LIGHT_SKIP_LLM config flag: skip domain+Leo review for LIGHT (Rhea: env var rollback)
- Updated both_approve: domain_verdict=skipped is valid for LIGHT auto-approve
- Cost recording: only charge for reviews that actually ran
- SAMPLE_AUDIT_RATE bumped 0.10 → 0.15, audit model = Opus (Leo: different family from Haiku)
Multi-agent design review: Rio (gaming vectors, model diversity), Theseus (correlated
blindspots, claim-shape guard), Rhea (shadow mode, config flag, deployment), Leo (approval).
Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>
Forgejo returns 200 with HTML content-type on successful merge instead
of JSON. Our API helper threw on resp.json(), causing merge to report
failure even though the PR merged. Now treats non-JSON 200 as success.
This was causing PRs #732 and #789 to show as conflict in our DB while
actually merged on Forgejo, and tripping the merge circuit breaker.
Pentagon-Agent: Leo <294C3CA1-0205-4668-82FA-B984D54F48AD>
Unevaluated PRs (eval_attempts=0) now sort before re-evals in the
eval cycle query. Fresh PRs have a higher chance of passing (~12%)
vs re-evals of already-rejected PRs. Prevents migration-reset PRs
from consuming eval slots that fresh PRs could use.
Pentagon-Agent: Leo <294C3CA1-0205-4668-82FA-B984D54F48AD>
Opus was ignoring the valid tag list and generating custom tags like
schema-enrichment-slug-mismatch, which fall through to 'unknown' in
disposition logic. All three prompts (domain, Leo standard, Leo deep)
now explicitly say "do not invent new tags" alongside the valid tag list.
Pentagon-Agent: Leo <294C3CA1-0205-4668-82FA-B984D54F48AD>
If source_path is NULL, the source requeue silently matches nothing.
Log a warning so we catch orphaned terminations in monitoring.
Pentagon-Agent: Leo <294C3CA1-0205-4668-82FA-B984D54F48AD>
Schema migration v3: adds eval_attempts (INTEGER) and eval_issues (TEXT/JSON)
columns to prs table.
Retry budget logic (Ganymede-approved design):
- Increment eval_attempts on each evaluate_pr() call
- Hard cap: eval_attempts >= 3 → terminal (close PR, tag source needs_human)
- Attempt 1: normal — back to open, wait for fix
- Attempt 2: classify issues as mechanical/substantive
- Mechanical only (schema, wiki links, dedup): keep open for one more try
- Substantive (factual, confidence, scope, title): close PR, requeue source
- Issue tags parsed from reviewer comments, stored in eval_issues column
- SHA-based reset: new commits on PR branch → eval_attempts=0, verdicts reset
- Post-migration stagger: LIMIT 5 for first batch to avoid OpenRouter spike
- Cost recording updated: domain review → OpenRouter, Leo → tier-dependent
Stops the 32-PR infinite loop burning ~$0.03/cycle with no terminal state.
Pentagon-Agent: Leo <294C3CA1-0205-4668-82FA-B984D54F48AD>
- Domain review → GPT-4o (OpenRouter), Leo STANDARD → Sonnet (OpenRouter),
Leo DEEP → Opus (Claude Max). Two model families = no correlated blind spots.
- Opus reserved for DEEP eval only — protects rate limit for overnight research.
- Review prompts calibrated: require per-criterion evidence, blocking-vs-observation
verdict rules. Moved from 100% rubber-stamp approval to 12% pass rate.
- OpenRouter failures classified as openrouter_failed (not rate_limited) to avoid
spurious 15-min Opus backoff.
- merge.py: pre-check PR state before merge API call (prevents 405 on re-merge).
Pentagon-Agent: Leo <294C3CA1-0205-4668-82FA-B984D54F48AD>
- Fix#12: domain_review undefined on resume path — initialize to None,
guard _parse_issues() call. Prevents NameError on PRs resuming after
partial eval (76 PRs in this state right now).
- Fix#11: concurrent eval workers can duplicate reviews — add atomic
UPDATE SET status='reviewing' WHERE status='open' at top of
evaluate_pr(). Check rowcount, skip if already claimed.
- Fix#8: subprocess tracking for graceful shutdown — _active_subprocesses
set in evaluate module, tracked in _claude_cli_call, exposed via
kill_active_subprocesses(). Replaces dead code in teleo-pipeline.py.
- Fix health.py divide-by-zero — guard all metabolic metric reads against
None from NULLIF/empty result set. Prevents TypeError on /health when
no PRs have been evaluated in 24h.
Also includes Leo's existing hot-fixes:
- Rate limit detection checks stdout regardless of exit code
- 15-minute cycle-level backoff on rate limit
Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>