github_feedback.py posts pipeline status to GitHub PRs at three touchpoints:
discovery ack, eval review result, and merge/close outcome. Only fires for
PRs with a github_pr link (set by sync-mirror.sh). All calls are non-fatal.
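A minimal sketch of the non-fatal contract (`post` is a hypothetical stand-in for the real GitHub API helper, which this sketch does not assume):

```python
def post_feedback(post, pr_number, body):
    """Non-fatal posting sketch: any GitHub API error is logged and
    swallowed so status feedback can never break the pipeline.
    `post` is an illustrative callable, not the real API client."""
    try:
        post(pr_number, body)
        return True
    except Exception as exc:
        print(f"github feedback failed for PR #{pr_number}: {exc}")
        return False
```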
contributor.py: expanded the git author fallback to scan all non-merge
commits (previously it checked only the last commit) and added teleo-bot
and github-actions[bot] to the bot filter list.
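A rough sketch of the expanded fallback, assuming commits arrive newest-first with author and parent info (the function name and commit shape are illustrative, not contributor.py's real API):

```python
# Known bot accounts to exclude; the real list lives in contributor.py.
BOT_AUTHORS = {"teleo-bot", "github-actions[bot]"}

def pick_author(commits):
    """commits: list of dicts with 'author' and 'parents', newest first.
    A commit with more than one parent is a merge commit."""
    for c in commits:
        if len(c["parents"]) > 1:
            continue  # skip merge commits
        if c["author"] in BOT_AUTHORS:
            continue  # skip bot-generated commits
        return c["author"]
    return None  # no human non-merge author found
```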
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Enables GitHub↔Forgejo PR linking for the contributor pipeline.
Mirror script will store GitHub PR number when creating Forgejo PRs,
allowing back-sync of eval feedback and merge/close status.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
External contributors who run `git merge main` create merge commits that
cherry-pick can't handle without the -m flag; --no-merges filters these out.
Added detection for branches with only merge commits but real content diff.
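The branch classification can be sketched roughly like this, assuming the caller has already run `git log --no-merges main..branch` and `git diff main...branch` (`classify_branch` is a hypothetical name):

```python
def classify_branch(non_merge_commits, content_diff):
    """non_merge_commits: SHAs surviving the --no-merges filter.
    content_diff: output of diffing the branch against main."""
    if non_merge_commits:
        return "cherry-pick"   # normal path: apply the real commits
    if content_diff.strip():
        return "merge-only"    # only merge commits, but real content
    return "empty"             # nothing to apply at all
```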
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two changes to address the #1 rejection reason:
1. extraction_prompt.py: Explicitly tell LLM NOT to use [[wiki links]]
in body text — use connections/related_claims JSON fields instead.
Remove misleading "post-processor handles wiki links" language.
2. extract.py _get_kb_index(): Expand KB index to include entity stems
from entities/{domain}/ so the LLM knows what entities exist when
building connections. Previously only showed domain claims.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both the "already merged" path and _handle_permanent_conflicts closed PRs on
Forgejo without checking the return value. On API failure, the DB update would
proceed anyway, creating ghost PRs (DB=closed/merged, Forgejo=open). Now both
paths check for None return and skip DB updates on failure — same pattern as
close_pr in pr_state.py.
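The shared guard pattern, sketched with stand-in callables (the real helpers live in merge.py and pr_state.py and take different arguments):

```python
def close_pr_guarded(pr_id, forgejo_close, db_close):
    """forgejo_close returns None on API failure; db_close records the
    closed state. Forgejo goes first, DB only on success."""
    if forgejo_close(pr_id) is None:   # Forgejo API failed
        return False                   # skip the DB update: no ghost PR
    db_close(pr_id)                    # DB changes only after Forgejo did
    return True
```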
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There were TWO `if not unprocessed: return 0, 0` gates. The previous
fix (c763c99) only addressed the second one. The first at line 746
fires before the re-extraction query even runs. Replace with a comment
explaining why we don't early-return there.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The re-extraction check was below an early return that fires when
unprocessed queue is empty. Sources in needs_reextraction state were
never picked up unless new sources happened to arrive simultaneously.
Move re-extraction query above the gate so both paths run independently.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Path(target).name strips directory components from LLM-generated
target filenames, preventing path traversal via ../. Same pattern
already applied to claim filenames (line 404) and entity filenames
(line 416). Ganymede-approved.
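The pattern itself is one line; a sketch for context:

```python
from pathlib import Path

def safe_filename(target: str) -> str:
    # Path(target).name keeps only the final path component, so any
    # directory parts an LLM smuggles in ("../../etc/passwd") are dropped.
    return Path(target).name
```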
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three changes:
1. Drop underscore prefixes in eval_parse.py — functions are now the public
API of the module (filter_diff, parse_verdict, classify_issues, etc.).
All 12 functions renamed, imports updated in evaluate.py and tests.
2. Extract eval_actions.py from evaluate.py — 3 async PR disposition functions:
- post_formal_approvals: submit Forgejo reviews from 2 agents
- terminate_pr: close PR, post rejection comment, requeue source
- dispose_rejected_pr: disposition logic for rejected PRs on attempt 2+
evaluate.py drops from ~1140 to 911 lines.
3. 14 new tests in test_eval_actions.py covering all three functions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Critical bug fix: close_pr now checks the forgejo_api return value and
skips the DB update on Forgejo failure, preventing ghost PRs (DB closed,
Forgejo open). It returns a bool so callers can handle failures.
_terminate_pr checks return value — skips source requeue on failure.
stale_pr.py migrated from raw Forgejo+DB to close_pr (last raw close
transition eliminated).
eval_parse.py: 15 pure parsing functions extracted from evaluate.py
(~370 lines removed). Zero I/O, zero async, independently testable.
evaluate.py drops from ~1510 to ~1140 lines.
Tests: 295 passed (42 new eval_parse + 2 new close_pr), zero regressions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three fixes for conversation-sourced claim quality:
1. Trust hierarchy in extraction prompt: bot-generated numbers are
flagged as unverified context, not evidence. Directional claims
are extractable but specific figures require external verification.
Prevents laundering bot guesses into the KB as evidence.
2. Conversation-sourced claims tagged with verified: false and
source_type: conversation in frontmatter. Downstream consumers
(Leo, dashboard) can filter/flag these for verification.
3. GET /api/telegram-extractions endpoint for daily spot-checking.
Shows recent Telegram-sourced PRs with claim titles, status,
merge rate, and eval issues. Quick review surface.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix 4 Forgejo ghost PR bugs flagged by Ganymede:
- fixer.py GC close: DB update ran outside try/except, closing DB even on Forgejo failure
- substantive_fixer.py droppable: NO Forgejo close at all
- substantive_fixer.py auto-enrichment: DB update before Forgejo (reversed order)
- substantive_fixer.py close_and_reextract: replace manual Forgejo+DB with close_pr()
Add start_fixing() and reset_for_reeval() to pr_state.py:
- start_fixing: atomic claim + fix_attempts increment in one statement
- reset_for_reeval: clears all eval state for re-evaluation after fix
Also fixes stale line number comment in merge.py (Ganymede nit).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two changes:
1. extract.py: Enrichments now modify existing claim files by appending
evidence sections. Previously enrichment-only extractions were
discarded as null-result even when they contained valuable challenges.
2. extraction_prompt.py: Corrections should produce BOTH a claim (the
corrected knowledge) AND an enrichment (linking to what it corrects).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace 38 hand-crafted UPDATE prs SET status calls across evaluate.py
and merge.py with 7 centralized functions that enforce invariants:
- close_pr: always syncs Forgejo (opt-out for reconciliation)
- approve_pr: raises ValueError on empty domain (prevents NULL bugs)
- mark_merged: always sets merged_at, clears last_error
- mark_conflict: always increments merge_failures, sets merge_cycled
- mark_conflict_permanent: terminal conflict state
- reopen_pr: handles all reopen scenarios (transient, rejection, reeval)
- start_review: atomic claim with bool return
This eliminates the class of bugs that produced 3 incidents:
1. Domain NULL on musings bypass (7 PRs stuck, 20h zero throughput)
2. Forgejo ghost PRs (70 PRs open on Forgejo but closed in DB)
3. Merge_cycled missing on various close paths
Also fixes: 3 close paths in merge.py had DB update before Forgejo call
(reversed order). close_pr does Forgejo first, then DB.
Only remaining raw status transition: _claim_next_pr (approved→merging)
which is an atomic subquery and doesn't have invariant requirements.
20 new tests, 264 total passing, 0 regressions. Net -101 lines in
evaluate.py + merge.py.
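As an illustration of the invariant style (the real functions live in pr_state.py and their signatures may differ), approve_pr's empty-domain guard might look like:

```python
def approve_pr(db, pr_id, domain, auto_merge=False):
    """Illustrative sketch: the transition refuses to write an empty
    domain, because a NULL domain deadlocks the merge gate later."""
    if not domain:
        # Fail loudly at the transition instead of stalling the pipeline.
        raise ValueError(f"approve_pr({pr_id}): domain must be non-empty")
    db.execute(
        "UPDATE prs SET status='approved', domain=?, auto_merge=? WHERE id=?",
        (domain, int(auto_merge), pr_id),
    )
```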
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When source format is "conversation", inject specialized extraction
rules that prioritize human corrections/pushback as highest-value
content. Fixes null-result on short but high-signal correction
messages. Maps corrections to existing KB claims as challenges.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
db.py: migration v20 adds the conflict_rebase_attempts, merge_failures,
and merge_cycled columns (they already exist on the VPS via manual
migration but are missing from code; any future DB rebuild would break
the retry mechanism).
merge.py: replace retry-with-backoff on config.lock with asyncio.Lock
(_bare_repo_lock) around all worktree add/remove calls. Prevents
contention instead of retrying it. Applied to both _cherry_pick_onto_main
and _merge_reweave_pr.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two code paths set status='closed' in the pipeline DB without calling
the Forgejo API to close the PR. This caused 50 ghost PRs to accumulate
on Forgejo (dashboard shows review backlog) while the pipeline considered
them done.
- evaluate.py: no-diff stale branch close now calls Forgejo PATCH
- merge.py: permanent conflict close now calls Forgejo PATCH
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Parallel domain merges race on the bare repo's config file. The single
retry only covered one of two worktree-add call sites and used fixed
delay. Now both sites retry up to 3 times with increasing jitter.
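A sketch of the retry shape under those assumptions (`retry_with_jitter` is a hypothetical name; the real call sites wrap `git worktree add`):

```python
import random
import time

def retry_with_jitter(op, attempts=3, base=0.5):
    """Each failed attempt sleeps for an increasing, jittered delay so
    parallel merges de-correlate instead of hammering config.lock in
    lockstep. Re-raises on the final failure."""
    for i in range(attempts):
        try:
            return op()
        except OSError:
            if i == attempts - 1:
                raise
            time.sleep(base * (i + 1) * random.uniform(0.5, 1.5))
```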
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ganymede review nit: substring match on "already" could false-positive
on future return strings. Pin to the two known values from cherry_pick().
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The musings bypass and the batch both_approve path set status='approved'
without domain or auto_merge. The merge gate requires domain IS NOT NULL
and a prefix match OR auto_merge=1. Result: agent PRs deadlocked for
20+ hours.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two fixes for the 18-PR merge blockage:
1. When cherry-pick returns "already merged" (all commits empty because
content is already on main), close the PR directly instead of trying
to push the stale branch SHA to main. The branch ref points at old
commits that aren't descendants of current main, so the push would
always fail as non-fast-forward.
2. Retry worktree add once with jittered delay when config.lock
contention occurs from parallel domain merges.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Forgejo categorically blocks --force-with-lease on protected branches,
even for fast-forward pushes. The cherry-picked branch is already a
descendant of origin/main, so a regular push is a fast-forward by
definition. Non-ff is rejected by default — same safety guarantee.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three fixes from Ganymede's review of extract-time-connection:
1. Replace git add -A with explicit file staging in _reciprocal_edges
2. Push to origin/main immediately after commit (survive batch-extract reset)
3. RECIPROCAL_EDGE_MAP: challenges→challenged_by (not symmetric)
Added challenged_by to REWEAVE_EDGE_FIELDS, EDGE_FIELDS, EDGE_WEIGHTS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The skip loop only matched `- ` (no indent), but YAML list items are
commonly written as `  - item` (2-space indent). This caused old list
items to persist alongside new ones, corrupting frontmatter on merge.
Fix: consume any line starting with space or dash as part of the current
field's value block.
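The fixed skip loop reduces to consuming continuation lines; a simplified sketch (the function name is illustrative):

```python
def skip_field_block(lines, i):
    """Given index i of a `field:` key line, consume every continuation
    line, i.e. anything starting with a space (indented items like
    `  - item`) or a bare dash (`- item`). Returns the index of the
    first line belonging to the next field."""
    i += 1
    while i < len(lines) and lines[i][:1] in (" ", "-"):
        i += 1
    return i
```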
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Combined superset assertion and merge computation into single loop
(removed duplicate scalar-to-list normalization)
- Added worktree remove --force before worktree add to handle prior
crash leaving stale worktree (SIGKILL, OOM, power loss)
Two fixes from Ganymede review:
1. CRITICAL: blank line before closing --- compounded on repeat reweaves.
Body starts with \n---, so \n{body} created \n\n---. Fixed by checking
body prefix.
2. Replaced yaml.dump round-trip with _serialize_edge_fields() that splices
only edge arrays into raw frontmatter text. Non-edge fields (title,
confidence, type, quotes, flow styles) stay byte-identical to main HEAD.
_parse_yaml_frontmatter now returns 3-tuple: (dict, raw_fm_text, body).
_serialize_frontmatter takes (raw_fm_text, merged_edges_dict, body).
26 tests pass including idempotency (5x serialize), formatting preservation,
and no-blank-line regression test.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reweave PRs modify existing files (appending YAML edges). Cherry-pick
fails ~75% when main moves between PR creation and merge.
_merge_reweave_pr() reads each changed file from both main HEAD and
branch HEAD, unions the edge arrays (order-preserving, main-first),
and writes the result. Eliminates merge conflicts structurally.
Key design decisions (Ganymede + Theseus approved):
- Order-preserving dedup: main's edges first, branch-new appended
- Superset assertion: logs warning if branch missing main edges
- Uses main's body text (reweave only touches frontmatter)
- Loud failure on parse errors (no cherry-pick fallback)
- Append-only contract: reweave adds edges, never removes
18 tests covering parse, union, serialize, superset, and full workflow.
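The order-preserving union can be sketched on hashable edge identifiers (real edges are YAML list entries; this simplification is illustrative):

```python
def union_edges(main_edges, branch_edges):
    """Main's edges come first (byte-stable against main HEAD), then
    any branch-only edges are appended. Append-only: nothing removed."""
    seen = set(main_edges)
    merged = list(main_edges)
    for e in branch_edges:
        if e not in seen:
            seen.add(e)
            merged.append(e)
    return merged
```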
_retry_conflict_prs called _rebase_and_push which was never defined,
causing NameError on every conflict retry. Now uses _cherry_pick_onto_main
consistent with the primary merge path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Layer 1: Insertion-time dedup in openrouter-extract-v2.py — skip if source_slug
already appears in claim content.
Layer 2: Insertion-time dedup in entity_batch.py — skip if PR number already
enriched this claim.
Layer 3: Post-rebase dedup in merge.py — scan rebased files for duplicate
evidence blocks (same source reference) and remove them before force-push.
Root cause: multiple enrichment branches modify the same claim at the same
insertion point. When rebased sequentially, evidence blocks are duplicated.
(Leo: PRs #1751, #1752)
lib/dedup.py: standalone module — parses evidence headers, deduplicates by
source key, preserves trailing content (Relevant Notes, Topics sections).
9 tests covering all patterns including the real PR #1751 duplication case.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Atomic extract-and-connect (lib/connect.py):
- After extraction writes claim files, each new claim is embedded via
OpenRouter, searched against Qdrant, and top-5 neighbors (cosine > 0.55)
are added as `related` edges in the claim's frontmatter
- Edges written on NEW claim only — avoids merge conflicts
- Cross-domain connections enabled, non-fatal on Qdrant failure
- Wired into openrouter-extract-v2.py post-extraction step
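The neighbor selection reduces to a threshold-and-truncate over Qdrant hits; a sketch with an illustrative name:

```python
def pick_neighbors(hits, k=5, min_cos=0.55):
    """hits: (claim_id, cosine) pairs, highest-scoring first; keep at
    most k neighbors above the similarity threshold."""
    return [cid for cid, cos in hits if cos > min_cos][:k]
```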
Stale PR monitor (lib/stale_pr.py):
- Every watchdog cycle checks open extract/* PRs
- If open >30 min AND 0 claim files → auto-close with comment
- After 2 stale closures → marks source as extraction_failed
- Wired into watchdog.py as check #6
Response audit system:
- response_audit table (migration v8), persistent audit conn in bot.py
- 90-day retention cleanup, tool_calls JSON column
- Confidence tag stripping, systemd ReadWritePaths for pipeline.db
Supporting infrastructure:
- reweave.py: nightly edge reconnection for orphan claims
- reconcile-sources.py: source status reconciliation
- backfill-domains.py: domain classification backfill
- ops/reconcile-source-status.sh: operational reconciliation script
- Attribution improvements, post-extract enrichments, merge improvements
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gate 3 in batch-extract-50.sh: query pipeline.db for closed PRs before
re-extracting. Sources with >=3 closed PRs are skipped (zombie protection).
Cost tracking: openrouter_call() now returns (text, usage) tuple with
prompt_tokens and completion_tokens from the OpenRouter API response.
All callers updated to unpack and pass tokens to costs.record_usage().
Added missing triage cost recording. Fixed batch domain review recording
cost once per batch instead of once per PR.
Pentagon-Agent: Epimetheus <0144398e-4ed3-4fe2-95a3-3d72e1abf887>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After a PR merges successfully, _embed_merged_claims() diffs the merged SHA
against its parent to find new/changed .md files in knowledge directories
(domains/, core/, foundations/, decisions/, entities/). Each file is embedded
via embed-claims.py --file (OpenRouter, text-embedding-3-small).
Non-fatal: embedding failure logs a warning but does not block the merge
pipeline. This keeps vector search current without requiring manual re-embeds.
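The file selection can be sketched as a filter over `git diff --name-only <sha>^ <sha>` output (`claim_files` is an illustrative name):

```python
KNOWLEDGE_DIRS = ("domains/", "core/", "foundations/", "decisions/", "entities/")

def claim_files(changed_paths):
    """Keep .md files that live under a knowledge directory; everything
    else (code, scripts, non-markdown) is skipped for embedding."""
    return [p for p in changed_paths
            if p.endswith(".md") and p.startswith(KNOWLEDGE_DIRS)]
```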
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
Root cause of 5-day pipeline stall: fixer GC marked PRs as closed in DB
but never synced to Forgejo. Branches stayed alive on remote, blocking
Gate 2 in batch-extract (branch exists → skip forever).
Now: GC fetches PR numbers, posts audit comment, closes on Forgejo,
deletes remote branch, THEN updates DB. Same pattern as _terminate_pr
in evaluate.py.
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
- Claim-shape detector: if YAML has type: claim, force STANDARD minimum (Theseus)
- Random pre-merge promotion: 15% of LIGHT → STANDARD before eval (Rio)
- LIGHT_SKIP_LLM config flag: skip domain+Leo review for LIGHT (Rhea: env var rollback)
- Updated both_approve: domain_verdict=skipped is valid for LIGHT auto-approve
- Cost recording: only charge for reviews that actually ran
- SAMPLE_AUDIT_RATE bumped 0.10 → 0.15, audit model = Opus (Leo: different family from Haiku)
Multi-agent design review: Rio (gaming vectors, model diversity), Theseus (correlated
blindspots, claim-shape guard), Rhea (shadow mode, config flag, deployment), Leo (approval).
Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>
Forgejo returns 200 with an HTML content-type on a successful merge
instead of JSON. Our API helper threw on resp.json(), causing the merge
to report failure even though the PR had merged. Non-JSON 200 responses
are now treated as success.
This was causing PRs #732 and #789 to show as conflict in our DB while
actually merged on Forgejo, and tripping the merge circuit breaker.
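A sketch of the fixed response handling (`parse_merge_response` and its callable argument are illustrative, not the real helper's signature):

```python
def parse_merge_response(status, content_type, parse_json):
    """status/content_type come from the HTTP response; parse_json is a
    callable that raises ValueError on a non-JSON body."""
    if status != 200:
        return {"ok": False, "status": status}
    if "json" not in (content_type or ""):
        return {"ok": True, "data": None}  # merged, but HTML body
    try:
        return {"ok": True, "data": parse_json()}
    except ValueError:
        return {"ok": True, "data": None}  # mislabeled body; still a 200
```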
Pentagon-Agent: Leo <294C3CA1-0205-4668-82FA-B984D54F48AD>