Commit graph

63 commits

Author SHA1 Message Date
83526bc90e fix: quote YAML edge values containing colons, skip unparseable files in reweave merge
Root cause of 84% reweave PR rejection rate: claim titles with colons
(e.g., "COAL: Meta-PoW: The ORE Treasury Protocol") written as bare
YAML list items, causing yaml.safe_load to fail during merge.

Three changes:
1. frontmatter.py: _yaml_quote() wraps colon-containing values in double quotes
2. reweave.py: _write_edge_regex uses _yaml_quote for new edges
3. merge.py: skip individual files with parse failures instead of aborting entire PR
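
Illustration of change 1, as a minimal sketch rather than the code in frontmatter.py. The log only says that _yaml_quote() wraps colon-containing values in double quotes; the escaping of embedded quotes and backslashes, and the helper name edge_line, are assumptions made for this example.

```python
def yaml_quote(value: str) -> str:
    """Return value as a scalar that is safe to write as a bare YAML list item.

    Assumption: only colon-containing values need quoting, as the commit states.
    """
    if ":" not in value:
        return value
    # Escape backslashes first, then double quotes, so the result stays a
    # valid YAML double-quoted scalar.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def edge_line(title: str) -> str:
    """Hypothetical helper: render one frontmatter edge as a list item."""
    return f"- {yaml_quote(title)}"
```

Titles without a colon pass through unchanged, so existing edges are not rewritten.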

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:07:28 +01:00
ac794f5c68 Fix source_channel migration: add to SCHEMA_SQL, default 'unknown' not 'telegram'
Ganymede review findings:
1. source_channel was missing from CREATE TABLE (fresh installs wouldn't have it)
2. Default fallback changed from 'telegram' to 'unknown' — unknown prefixes
   are genuinely unknown, not telegram
3. Cross-reference comments added between BRANCH_PREFIX_MAP and _CHANNEL_MAP

Also wires classify_source_channel into merge.py PR discovery INSERT.
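
A rough sketch of the fallback described in finding 2. The prefix table below is hypothetical (the real BRANCH_PREFIX_MAP and _CHANNEL_MAP contents are not shown in this log); only the 'unknown' default comes from the commit.

```python
# Hypothetical prefixes for illustration only.
CHANNEL_BY_PREFIX = {
    "tg/": "telegram",
    "gh-pr-": "github",
}


def classify_source_channel(branch: str) -> str:
    """Map a branch name to its source channel by prefix."""
    for prefix, channel in CHANNEL_BY_PREFIX.items():
        if branch.startswith(prefix):
            return channel
    # An unrecognized prefix is genuinely unknown, not telegram.
    return "unknown"
```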

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 13:27:15 +01:00
0f868aefab Add GitHub PR feedback module and fix attribution for mirrored PRs
github_feedback.py posts pipeline status to GitHub PRs at three touchpoints:
discovery ack, eval review result, and merge/close outcome. Only fires for
PRs with a github_pr link (set by sync-mirror.sh). All calls non-fatal.

contributor.py: expanded git author fallback to scan all non-merge commits
(was only checking last commit), added teleo-bot and github-actions[bot]
to bot filter list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 18:16:28 +01:00
13f21f7732 feat: external contributor pipeline — fork PR handling, attribution, prefix recognition
- Mirror: fetch GitHub fork PR refs (refs/pull/*/head), push to Forgejo as gh-pr-N/branch
- Mirror: fork PRs auto-create Forgejo PR with GitHub PR title, link github_pr in DB
- db.py: add contrib + gh-pr-* to classify_branch for external contributor branches
- contributor.py: git commit author as attribution fallback (before branch agent)
- contributor.py: skip bot/generic authors (m3taversal, teleo, pipeline)
- Tests: fix fallback test for new git author path, add external contributor test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 18:14:01 +01:00
fb121e4010 Add github_pr column to prs table (migration v21)
Enables GitHub↔Forgejo PR linking for the contributor pipeline.
Mirror script will store GitHub PR number when creating Forgejo PRs,
allowing back-sync of eval feedback and merge/close status.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 18:07:04 +01:00
26a8b15f56 fix: skip merge commits in cherry-pick to prevent fork workflow content loss
External contributors who run `git merge main` create merge commits that
cherry-pick can't handle without the -m flag. --no-merges filters these out.
Added detection for branches with only merge commits but a real content diff.
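
The filtering rule can be modelled without git, as a sketch under assumptions: a merge commit is one with two or more parents, which is what --no-merges excludes. The Commit type and both function names are hypothetical, not taken from merge.py.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Commit:
    sha: str
    parents: Tuple[str, ...]  # a merge commit has two or more parents


def without_merges(commits: List[Commit]) -> List[Commit]:
    """Keep commits that a plain cherry-pick can replay (at most one parent)."""
    return [c for c in commits if len(c.parents) <= 1]


def only_merges_with_content(commits: List[Commit], has_diff: bool) -> bool:
    """True when every commit is a merge but the branch still differs from main."""
    return bool(commits) and not without_merges(commits) and has_diff
```

The second function corresponds to the detection case in the commit: nothing is left to cherry-pick, yet the branch carries real content.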

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 18:04:45 +01:00
687f3d3151 fix: prevent broken wiki links in extraction (226 rejections)
Two changes to address the #1 rejection reason:

1. extraction_prompt.py: Explicitly tell LLM NOT to use [[wiki links]]
   in body text — use connections/related_claims JSON fields instead.
   Remove misleading "post-processor handles wiki links" language.

2. extract.py _get_kb_index(): Expand KB index to include entity stems
   from entities/{domain}/ so the LLM knows what entities exist when
   building connections. Previously only showed domain claims.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:28:58 +01:00
0ce7412396 fix: check Forgejo close return value in 2 merge.py paths to prevent ghost PRs
Both the "already merged" path and _handle_permanent_conflicts closed PRs on
Forgejo without checking the return value. On API failure, the DB update would
proceed anyway, creating ghost PRs (DB=closed/merged, Forgejo=open). Now both
paths check for None return and skip DB updates on failure — same pattern as
close_pr in pr_state.py.
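
The ordering rule reads roughly as follows. This is a sketch with injected callables, not the real close_pr signature: forgejo_close and db_mark_closed are hypothetical stand-ins for the Forgejo API call and the DB update.

```python
from typing import Callable, Optional


def close_pr(pr: int,
             forgejo_close: Callable[[int], Optional[dict]],
             db_mark_closed: Callable[[int], None]) -> bool:
    """Close on Forgejo first; touch the DB only if that call succeeded."""
    if forgejo_close(pr) is None:
        # API failure: leave the DB row alone so both sides still agree.
        return False
    db_mark_closed(pr)
    return True
```

Returning a bool lets each caller decide what to skip when the close did not go through.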

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:18:50 +01:00
28b25329b3 fix: remove FIRST early return that also blocked re-extraction
There were TWO `if not unprocessed: return 0, 0` gates. The previous
fix (c763c99) only addressed the second one. The first at line 746
fires before the re-extraction query even runs. Replace with a comment
explaining why we don't early-return there.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:17:20 +01:00
c763c99910 fix: re-extraction loop runs even when queue is empty
The re-extraction check was below an early return that fires when the
unprocessed queue is empty. Sources in needs_reextraction state were
never picked up unless new sources happened to arrive simultaneously.
Move the re-extraction query above the gate so both paths run independently.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:04:49 +01:00
4c3ce265e4 fix: sanitize enrichment target_file path traversal
Path(target).name strips directory components from LLM-generated
target filenames, preventing path traversal via ../. Same pattern
already applied to claim filenames (line 404) and entity filenames
(line 416). Ganymede-approved.
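
A small sketch of the same idea. It uses PurePosixPath instead of Path so the behaviour does not depend on the host OS, and it adds a guard that the commit does not mention: a bare ".." survives .name unchanged, so the function rejects it. The function name is hypothetical.

```python
from pathlib import PurePosixPath


def safe_target_name(target: str) -> str:
    """Strip directory components from an LLM-supplied filename."""
    name = PurePosixPath(target).name
    if name in ("", ".", ".."):
        # Extra guard (assumption): nothing usable is left after stripping.
        raise ValueError(f"unusable target filename: {target!r}")
    return name
```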

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:40:37 +01:00
46ad508de7 Phase 6b: extract post_merge.py from merge.py — post-merge effects
7 functions extracted to lib/post_merge.py:
- embed_merged_claims, reciprocal_edges, find_claim_file, add_edge_to_file,
  archive_source_for_pr, commit_source_moves, update_source_frontmatter_status

git_fn injection pattern (same as contributor.py) for 3 async functions
that need git operations. Unused async_main_worktree_lock import removed
from merge.py.

merge.py: 1562 → 1200 lines (−362). Total reduction from 1912: −712 lines.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:20:59 +01:00
ed1edd6466 Phase 6a: extract frontmatter.py from merge.py — pure YAML helpers
4 functions + 2 constants extracted to lib/frontmatter.py:
- parse_yaml_frontmatter, union_edge_lists, serialize_edge_fields,
  serialize_frontmatter, REWEAVE_EDGE_FIELDS, RECIPROCAL_EDGE_MAP

merge.py: 1678 → 1562 lines (−116).
test_reweave_merge.py: replaced local function copies with imports from
frontmatter.py — fixes missing challenged_by in test's REWEAVE_EDGE_FIELDS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:16:38 +01:00
53dc18afd5 Phase 5: Extract contributor.py from merge.py (−234 lines)
5 functions extracted: is_knowledge_pr, refine_commit_type,
record_contributor_attribution, upsert_contributor, recalculate_tier.

git_fn parameter injection avoids circular import (merge→contributor,
contributor needs _git from merge). Single call site passes _git.

merge.py: 1912 → 1678 lines. 23 new tests, zero regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:08:26 +01:00
f46e14dfae refactor: Phase 4 — extract eval_actions.py, drop underscore prefixes in eval_parse
Three changes:

1. Drop underscore prefixes in eval_parse.py — functions are now the public
   API of the module (filter_diff, parse_verdict, classify_issues, etc.).
   All 12 functions renamed, imports updated in evaluate.py and tests.

2. Extract eval_actions.py from evaluate.py — 3 async PR disposition functions:
   - post_formal_approvals: submit Forgejo reviews from 2 agents
   - terminate_pr: close PR, post rejection comment, requeue source
   - dispose_rejected_pr: disposition logic for rejected PRs on attempt 2+
   evaluate.py drops from ~1140 to 911 lines.

3. 14 new tests in test_eval_actions.py covering all three functions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:57:51 +01:00
376b77999f refactor: Phase 3 — fix close_pr ghost bug, wire stale_pr, extract eval_parse
Critical bug fix: close_pr now checks forgejo_api return value and
skips DB update on Forgejo failure, preventing ghost PRs (DB closed,
Forgejo open). Returns bool so callers can handle failures.

_terminate_pr checks return value — skips source requeue on failure.
stale_pr.py migrated from raw Forgejo+DB to close_pr (last raw close
transition eliminated).

eval_parse.py: 15 pure parsing functions extracted from evaluate.py
(~370 lines removed). Zero I/O, zero async, independently testable.
evaluate.py drops from ~1510 to ~1140 lines.

Tests: 295 passed (42 new eval_parse + 2 new close_pr), zero regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:40:23 +01:00
716cc43890 extraction quality: trust hierarchy + verified tagging + telegram review endpoint
Three fixes for conversation-sourced claim quality:

1. Trust hierarchy in extraction prompt: bot-generated numbers are
   flagged as unverified context, not evidence. Directional claims
   are extractable but specific figures require external verification.
   Prevents laundering bot guesses into the KB as evidence.

2. Conversation-sourced claims tagged with verified: false and
   source_type: conversation in frontmatter. Downstream consumers
   (Leo, dashboard) can filter/flag these for verification.

3. GET /api/telegram-extractions endpoint for daily spot-checking.
   Shows recent Telegram-sourced PRs with claim titles, status,
   merge rate, and eval issues. Quick review surface.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:38:39 +01:00
c8a08023f9 refactor: Phase 2 — wire pr_state into fixer.py and substantive_fixer.py
Fix 4 Forgejo ghost PR bugs flagged by Ganymede:
- fixer.py GC close: DB update ran outside try/except, closing DB even on Forgejo failure
- substantive_fixer.py droppable: NO Forgejo close at all
- substantive_fixer.py auto-enrichment: DB update before Forgejo (reversed order)
- substantive_fixer.py close_and_reextract: replace manual Forgejo+DB with close_pr()

Add start_fixing() and reset_for_reeval() to pr_state.py:
- start_fixing: atomic claim + fix_attempts increment in one statement
- reset_for_reeval: clears all eval state for re-evaluation after fix

Also fixes stale line number comment in merge.py (Ganymede nit).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:21:40 +01:00
1e0c1cd788 Write enrichments as file modifications; strengthen correction extraction
Two changes:
1. extract.py: Enrichments now modify existing claim files by appending
   evidence sections. Previously enrichment-only extractions were
   discarded as null-result even when they contained valuable challenges.
2. extraction_prompt.py: Corrections should produce BOTH a claim (the
   corrected knowledge) AND an enrichment (linking to what it corrects).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:12:29 +01:00
1f5eb324f3 refactor: centralize PR state transitions in lib/pr_state.py
Replace 38 hand-crafted UPDATE prs SET status calls across evaluate.py
and merge.py with 7 centralized functions that enforce invariants:
- close_pr: always syncs Forgejo (opt-out for reconciliation)
- approve_pr: raises ValueError on empty domain (prevents NULL bugs)
- mark_merged: always sets merged_at, clears last_error
- mark_conflict: always increments merge_failures, sets merge_cycled
- mark_conflict_permanent: terminal conflict state
- reopen_pr: handles all reopen scenarios (transient, rejection, reeval)
- start_review: atomic claim with bool return

This eliminates the class of bugs that produced 3 incidents:
1. Domain NULL on musings bypass (7 PRs stuck, 20h zero throughput)
2. Forgejo ghost PRs (70 PRs open on Forgejo but closed in DB)
3. Merge_cycled missing on various close paths

Also fixes: 3 close paths in merge.py had DB update before Forgejo call
(reversed order). close_pr does Forgejo first, then DB.

Only remaining raw status transition: _claim_next_pr (approved→merging)
which is an atomic subquery and doesn't have invariant requirements.

20 new tests, 264 total passing, 0 regressions. Net -101 lines in
evaluate.py + merge.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:08:57 +01:00
d073e22e8d Add conversation-aware extraction for Telegram sources
When source format is "conversation", inject specialized extraction
rules that prioritize human corrections/pushback as highest-value
content. Fixes null-result on short but high-signal correction
messages. Maps corrections to existing KB claims as challenges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:05:51 +01:00
552f44ec1c fix: add migration v20 for conflict retry columns + serialize worktree ops
db.py: migration v20 adds conflict_rebase_attempts, merge_failures,
merge_cycled columns (already exist on VPS via manual migration, missing
from code — any future DB rebuild would break retry mechanism).

merge.py: replace retry-with-backoff on config.lock with asyncio.Lock
(_bare_repo_lock) around all worktree add/remove calls. Prevents
contention instead of retrying it. Applied to both _cherry_pick_onto_main
and _merge_reweave_pr.
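
A sketch of serializing worktree operations with asyncio.Lock. The worktree add/remove calls are replaced by log entries so the example runs anywhere, and all names other than asyncio.Lock are hypothetical.

```python
import asyncio
from typing import List


async def with_worktree(lock: asyncio.Lock, name: str, log: List[str]) -> None:
    """Hold the lock across the whole add..remove span for one worktree."""
    async with lock:
        log.append(f"add {name}")      # stands in for `git worktree add`
        await asyncio.sleep(0)         # yield, as a real subprocess call would
        log.append(f"remove {name}")   # stands in for `git worktree remove`


async def run_demo() -> List[str]:
    lock = asyncio.Lock()  # created inside the running loop
    log: List[str] = []
    await asyncio.gather(
        with_worktree(lock, "a", log),
        with_worktree(lock, "b", log),
    )
    return log
```

With the lock held across the span, the two tasks never interleave their add and remove steps, so there is no contention left to retry.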

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:19:56 +01:00
e0c9951308 fix: close stale PRs on Forgejo when pipeline DB marks them closed
Two code paths set status='closed' in the pipeline DB without calling
the Forgejo API to close the PR. This caused 50 ghost PRs to accumulate
on Forgejo (dashboard shows review backlog) while the pipeline considered
them done.

- evaluate.py: no-diff stale branch close now calls Forgejo PATCH
- merge.py: permanent conflict close now calls Forgejo PATCH

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:15:58 +01:00
0d3fe95522 Add config.lock retry with jitter to both worktree-add sites
Parallel domain merges race on the bare repo's config file. The single
retry only covered one of two worktree-add call sites and used a fixed
delay. Now both sites retry up to 3 times with increasing jitter.
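
One way to read "retry up to 3 times with increasing jitter", as a sketch only. The assumptions are: 3 attempts in total, contention surfaces as OSError, and the delay is drawn uniformly from a window that grows with the attempt number. The base delay and the function name are illustrative.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_jitter(op: Callable[[], T],
                      attempts: int = 3,
                      base_delay: float = 0.2,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Run op, retrying on OSError with a growing random delay."""
    for attempt in range(1, attempts + 1):
        try:
            return op()
        except OSError:
            if attempt == attempts:
                raise
            # The window widens each attempt; randomness spreads out
            # parallel callers that failed at the same moment.
            sleep(random.uniform(0, base_delay * attempt))
    raise AssertionError("attempts must be at least 1")
```

The sleep parameter is injectable so tests can record delays instead of waiting.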

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:13:32 +01:00
1755580b95 Harden already-merged detection to exact string match
Ganymede review nit: substring match on "already" could false-positive
on future return strings. Pin to the two known values from cherry_pick().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:06:20 +01:00
ad7ee0831e fix(evaluate): set domain + auto_merge on all 5 approval paths
Musings bypass and batch both_approve set status='approved' without
domain or auto_merge. Merge gate requires domain IS NOT NULL and
prefix match OR auto_merge=1. Result: agent PRs deadlocked for 20+ hours.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:03:42 +01:00
f38b1e3c01 fix: handle already-merged PRs + retry worktree config.lock
Two fixes for the 18-PR merge blockage:

1. When cherry-pick returns "already merged" (all commits empty because
   content is already on main), close the PR directly instead of trying
   to push the stale branch SHA to main. The branch ref points at old
   commits that aren't descendants of current main, so the push would
   always fail as non-fast-forward.

2. Retry worktree add once with jittered delay when config.lock
   contention occurs from parallel domain merges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:57:28 +01:00
ff357c4bbc fix: remove --force-with-lease from main push to unblock 16 PRs
Forgejo categorically blocks --force-with-lease on protected branches,
even for fast-forward pushes. The cherry-picked branch is already a
descendant of origin/main, so a regular push is a fast-forward by
definition. Non-ff is rejected by default — same safety guarantee.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:52:39 +01:00
81afcd319f fix: sync all code from VPS — repo is now authoritative source of truth
24 files: 8 pipeline lib modules, 6 diagnostics updates, 4 new diagnostics
modules, telegram bot fix, 5 active operational scripts. Key changes:
- Security: SQL injection prevention (alerting.py), SSL verification
  (review_queue.py), path traversal guard (extract.py)
- Cost tracking: per-PR cost accumulation in evaluate.py
- Auto-recovery: watchdog tier0 reset with retry cap + cooldown
- Extraction: structured edge fields, post-write vector connection
- New modules: vitality, research_tracking, research_routes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 13:18:01 +01:00
681afad506 Consolidate pipeline code from teleo-codex + VPS into single repo
Sources merged:
- teleo-codex/ops/pipeline-v2/ (11 newer lib files, 5 new lib modules)
- teleo-codex/ops/ (agent-state, diagnostics expansion, systemd units, ops scripts)
- VPS /opt/teleo-eval/telegram/ (10 new bot files, agent configs)
- VPS /opt/teleo-eval/pipeline/ops/ (vector-gc, backfill-descriptions)
- VPS /opt/teleo-eval/sync-mirror.sh (Bug 2 + Step 2.5 fixes)

Non-trivial merges:
- connect.py: kept codex threshold (0.65) + added infra domain parameter
- watchdog.py: kept infra version (stale_pr integration, superset of codex)
- deploy.sh: codex rsync version (interim, until VPS git clone migration)
- diagnostics/app.py: codex decomposed dashboard (14 new route modules)

81 files changed, +17105/-200 lines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:52:26 +01:00
95f637491e fix: Ganymede review — explicit staging, push after commit, challenged_by reciprocal
Three fixes from Ganymede's review of extract-time-connection:
1. Replace git add -A with explicit file staging in _reciprocal_edges
2. Push to origin/main immediately after commit (survive batch-extract reset)
3. RECIPROCAL_EDGE_MAP: challenges→challenged_by (not symmetric)
   Added challenged_by to REWEAVE_EDGE_FIELDS, EDGE_FIELDS, EDGE_WEIGHTS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:46:47 +01:00
be010e666a feat: extract-time connection + post-merge reciprocal edges
Two-part fix for 58% orphan ratio:

1. Prompt-time prior art: Qdrant lookup before extraction injects
   existing claims as connection candidates. LLM classifies edges
   as supports/challenges/related. reconstruct_claim_content writes
   typed edges in frontmatter.

2. Post-merge reciprocal edges: _reciprocal_edges() runs after
   cherry-pick merge, reads new claims' outgoing edges, writes
   reciprocal edges on target files. Ensures every new claim has
   incoming links.

Files: lib/extraction_prompt.py, lib/merge.py, openrouter-extract-v2.py
Tests: 214 passed (3 failures + 3 errors pre-existing)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:25:31 +01:00
84cb001dd6 fix: handle indented YAML list items in _serialize_edge_fields
The skip loop only matched `- ` (no indent) but YAML list items are
commonly written as `  - item` (2-space indent). This caused old list
items to persist alongside new ones, corrupting frontmatter on merge.

Fix: consume any line starting with space or dash as part of the current
field's value block.
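
The corrected skip rule, sketched on a list of frontmatter lines. This is not the body of _serialize_edge_fields; drop_field_block is a hypothetical helper that only shows which lines count as part of a field's value block.

```python
from typing import List


def drop_field_block(lines: List[str], field: str) -> List[str]:
    """Remove `field:` and every following line that belongs to its value."""
    out: List[str] = []
    skipping = False
    for line in lines:
        # Any line starting with a space or a dash continues the block,
        # which covers both `- item` and `  - item`.
        if skipping and line[:1] in (" ", "-"):
            continue
        skipping = line.startswith(f"{field}:")
        if not skipping:
            out.append(line)
    return out
```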

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 14:01:34 +01:00
16e798f6a2 fix: eliminate dead code + add stale worktree pre-cleanup in _merge_reweave_pr
- Combined superset assertion and merge computation into single loop
  (removed duplicate scalar-to-list normalization)
- Added worktree remove --force before worktree add to handle prior
  crash leaving stale worktree (SIGKILL, OOM, power loss)
2026-04-04 13:50:28 +01:00
b091642146 fix: string-level edge splicing in reweave merge — no yaml.dump reformatting
Two fixes from Ganymede review:
1. CRITICAL: blank line before closing --- compounded on repeat reweaves.
   Body starts with \n---, so \n{body} created \n\n---. Fixed by checking
   body prefix.
2. Replaced yaml.dump round-trip with _serialize_edge_fields() that splices
   only edge arrays into raw frontmatter text. Non-edge fields (title,
   confidence, type, quotes, flow styles) stay byte-identical to main HEAD.

_parse_yaml_frontmatter now returns 3-tuple: (dict, raw_fm_text, body).
_serialize_frontmatter takes (raw_fm_text, merged_edges_dict, body).

26 tests pass including idempotency (5x serialize), formatting preservation,
and no-blank-line regression test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 13:48:44 +01:00
6b3a5833df feat: per-file frontmatter union for reweave PR merge
Reweave PRs modify existing files (appending YAML edges). Cherry-pick
fails ~75% when main moves between PR creation and merge.

_merge_reweave_pr() reads each changed file from both main HEAD and
branch HEAD, unions the edge arrays (order-preserving, main-first),
and writes the result. Eliminates merge conflicts structurally.

Key design decisions (Ganymede + Theseus approved):
- Order-preserving dedup: main's edges first, branch-new appended
- Superset assertion: logs warning if branch missing main edges
- Uses main's body text (reweave only touches frontmatter)
- Loud failure on parse errors (no cherry-pick fallback)
- Append-only contract: reweave adds edges, never removes
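
The union and the superset check from the design decisions above can be sketched as follows. Function names are hypothetical and edges are treated as plain strings.

```python
from typing import Iterable, List


def union_edges(main_edges: Iterable[str], branch_edges: Iterable[str]) -> List[str]:
    """Order-preserving dedup: main's edges first, branch-new edges appended."""
    merged: List[str] = []
    seen = set()
    for edge in [*main_edges, *branch_edges]:
        if edge not in seen:
            seen.add(edge)
            merged.append(edge)
    return merged


def missing_from_branch(main_edges: Iterable[str], branch_edges: Iterable[str]) -> List[str]:
    """Edges on main that the branch lacks; a non-empty result warrants a warning."""
    branch = set(branch_edges)
    return [edge for edge in main_edges if edge not in branch]
```

Because the union never drops an edge from main, the append-only contract holds even when the branch was cut from an older main.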

18 tests covering parse, union, serialize, superset, and full workflow.
2026-04-04 13:43:32 +01:00
5e0cdfc63a feat: consolidate eval pipeline, reweave fixes, enrichment dedup, cherry-pick merge, TG batching
Merges all work from epimetheus/enrichment-dedup-fix and epimetheus/eval-and-reweave-fixes:

- Eval pipeline: _LLMResponse in call_openrouter, URL fabrication check, confidence floor, cost alerts
- Reweave fixes: _is_entity gate, _same_source filter, temp 0.3, blank line sanitization
- Enrichment dedup: three-layer fix (source-slug, PR-number, post-rebase scan)
- Cherry-pick merge: replaces rebase-retry, --ours entity conflict resolution
- TG batching: group by chat_id + time proximity, force-split on unparseable timestamps
- Schema migration v10: response_audit columns for cost/confidence/blocking

67 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:21:59 +01:00
f25a4093c2 fix: replace broken _rebase_and_push call with cherry-pick in conflict retry
_retry_conflict_prs called _rebase_and_push which was never defined,
causing NameError on every conflict retry. Now uses _cherry_pick_onto_main
consistent with the primary merge path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:18:30 +01:00
686ef3fd7f Replace rebase-retry with cherry-pick merge mechanism
- _cherry_pick_onto_main replaces _rebase_and_push: creates fresh branch
  from origin/main, cherry-picks extraction commits, force-pushes
- Eliminates ~23% merge failure rate from rebase race conditions
- Agent branch protection: PIPELINE_OWNED_PREFIXES filter in SQL prevents
  auto-merge of agent-owned branches (theseus/*, rio/*, etc.)
- Empty-commit handling: skips already-merged content gracefully
- Entity conflict auto-resolution preserved for cherry-pick path
- Post-pick evidence dedup runs as safety net (same as post-rebase)
- Separate fetch calls for main and branch (fixes long branch name issue)

Fixes: PRs #2141, #157, #2142, #2180 (agent branch orphaning)
Fixes: ~23% merge failure rate (rebase race condition)
Related: PRs #1751, #1752 (enrichment dedup shares root cause)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:18:26 +01:00
f43f8f923f fix: enrichment idempotency — three-layer dedup prevents duplicate evidence blocks
Layer 1: Insertion-time dedup in openrouter-extract-v2.py — skip if source_slug
already appears in claim content.
Layer 2: Insertion-time dedup in entity_batch.py — skip if PR number already
enriched this claim.
Layer 3: Post-rebase dedup in merge.py — scan rebased files for duplicate
evidence blocks (same source reference) and remove them before force-push.

Root cause: multiple enrichment branches modify the same claim at the same
insertion point. When rebased sequentially, evidence blocks are duplicated.
(Leo: PRs #1751, #1752)

lib/dedup.py: standalone module — parses evidence headers, deduplicates by
source key, preserves trailing content (Relevant Notes, Topics sections).
9 tests covering all patterns including the real PR #1751 duplication case.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:18:23 +01:00
e17e6c25db feat: two-pass retrieval with sort order and graph expansion
lib/search.py — shared search library:
- Pass 1 (default): top 5 from Qdrant, score >= 0.70, no expansion
- Pass 2 (expand=True): next 5 via offset=5, score >= 0.60, plus
  graph expansion from YAML frontmatter edges. Hard cap 10 total.
- Sort order: cosine desc → challenged_by → other graph-expanded
- result_type internal tag for stable sort (direct/challenge/graph)
- Module-level constants for easy threshold tuning post-calibration
- Structural file exclusion (_map.md, _overview.md)
- Within-vector dedup via _dedup_hits()
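
The selection rules listed above can be sketched as a pure function. It assumes Qdrant has already returned hits ranked by cosine score, and that graph-expanded hits arrive pre-tagged as "challenge" or "graph". The function name and input shapes are hypothetical; the thresholds and caps are the ones in the list.

```python
from typing import List, Sequence, Tuple

PASS1_LIMIT, PASS1_MIN_SCORE = 5, 0.70
PASS2_LIMIT, PASS2_MIN_SCORE = 5, 0.60
HARD_CAP = 10


def select_results(ranked: Sequence[Tuple[str, float]],
                   graph_hits: Sequence[Tuple[str, str]] = (),
                   expand: bool = False) -> List[Tuple[str, str]]:
    """Return (claim_id, result_type) pairs in final display order."""
    results = [(cid, "direct") for cid, score in ranked[:PASS1_LIMIT]
               if score >= PASS1_MIN_SCORE]
    if expand:
        # Pass 2: the next slice (offset=5) with the looser threshold.
        second = ranked[PASS1_LIMIT:PASS1_LIMIT + PASS2_LIMIT]
        results += [(cid, "direct") for cid, score in second
                    if score >= PASS2_MIN_SCORE]
        seen = {cid for cid, _ in results}
        extra = [(cid, kind) for cid, kind in graph_hits if cid not in seen]
        # Stable sort: challenges come before other graph-expanded hits.
        extra.sort(key=lambda hit: 0 if hit[1] == "challenge" else 1)
        results += extra
    return results[:HARD_CAP]
```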

Caller updates:
- kb_retrieval.py: retrieve_vector_context() calls search(expand=True)
- diagnostics/app.py: search endpoint passes expand query param
- Argus imports from lib/search.py via sys.path (no longer owns search)

Tests: 5 new tests covering pass1-only, pass2 expansion, hard cap,
sort order, challenges-before-other-expansion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 22:34:45 +00:00
5f554bc2de feat: atomic extract-and-connect + stale PR monitor + response audit
Atomic extract-and-connect (lib/connect.py):
- After extraction writes claim files, each new claim is embedded via
  OpenRouter, searched against Qdrant, and top-5 neighbors (cosine > 0.55)
  are added as `related` edges in the claim's frontmatter
- Edges written on NEW claim only — avoids merge conflicts
- Cross-domain connections enabled, non-fatal on Qdrant failure
- Wired into openrouter-extract-v2.py post-extraction step

Stale PR monitor (lib/stale_pr.py):
- Every watchdog cycle checks open extract/* PRs
- If open >30 min AND 0 claim files → auto-close with comment
- After 2 stale closures → marks source as extraction_failed
- Wired into watchdog.py as check #6
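
The two thresholds in the stale PR rules amount to the following, as a sketch: the constant and function names are hypothetical, the numbers come from the list above.

```python
STALE_AFTER_MINUTES = 30
MAX_STALE_CLOSURES = 2


def should_auto_close(open_minutes: float, claim_files: int) -> bool:
    """An open extract/* PR is stale once it is old and still has no claims."""
    return open_minutes > STALE_AFTER_MINUTES and claim_files == 0


def is_extraction_failed(stale_closures: int) -> bool:
    """After enough stale closures the source is marked extraction_failed."""
    return stale_closures >= MAX_STALE_CLOSURES
```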

Response audit system:
- response_audit table (migration v8), persistent audit conn in bot.py
- 90-day retention cleanup, tool_calls JSON column
- Confidence tag stripping, systemd ReadWritePaths for pipeline.db

Supporting infrastructure:
- reweave.py: nightly edge reconnection for orphan claims
- reconcile-sources.py: source status reconciliation
- backfill-domains.py: domain classification backfill
- ops/reconcile-source-status.sh: operational reconciliation script
- Attribution improvements, post-extract enrichments, merge improvements

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 22:34:20 +00:00
0457c49094 fix: zombie retry loop + cost tracking
Gate 3 in batch-extract-50.sh: query pipeline.db for closed PRs before
re-extracting. Sources with >=3 closed PRs are skipped (zombie protection).

Cost tracking: openrouter_call() now returns (text, usage) tuple with
prompt_tokens and completion_tokens from the OpenRouter API response.
All callers updated to unpack and pass tokens to costs.record_usage().
Added missing triage cost recording. Fixed batch domain review recording
cost once per batch instead of once per PR.

Pentagon-Agent: Epimetheus <0144398e-4ed3-4fe2-95a3-3d72e1abf887>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:29:58 +00:00
89692fda2d feat: embed-on-merge — auto-index new claims into Qdrant after PR merge
After a PR merges successfully, _embed_merged_claims() diffs the merged SHA
against its parent to find new/changed .md files in knowledge directories
(domains/, core/, foundations/, decisions/, entities/). Each file is embedded
via embed-claims.py --file (OpenRouter, text-embedding-3-small).

Non-fatal: embedding failure logs a warning but does not block the merge
pipeline. This keeps vector search current without requiring manual re-embeds.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 17:53:18 +00:00
4b5c5841ce doc: mixed PR classification priority note (Ganymede review)
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 14:57:11 +00:00
cfb80d3496 feat: CI scoring overhaul — principal roll-up, commit-type filter, new weights
Step 1: principal column + commit_type column in pipeline.db. Static map
populates principal for local agents (rio→m3taversal etc.). VPS agents
(epimetheus, argus) have no principal.

Step 2: _classify_commit_type in merge.py. Pipeline commits (inbox/,
entities/, agents/) get commit_type='pipeline' and skip CI attribution
entirely. Knowledge commits (domains/, core/, foundations/, decisions/)
get full attribution.

Step 3 (Argus): Dashboard has dual view — by-principal (default,
governance) and by-agent (drill-down). Already implemented by Argus.

CI weights updated (Cory-approved):
- Challenger: 0.35 (was 0.20)
- Synthesizer: 0.25 (was 0.15)
- Reviewer: 0.20 (was 0.10)
- Sourcer: 0.15 (unchanged)
- Extractor: 0.05 (was 0.40)
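
The new weights sum to 1.00. How they are combined into a score is not stated in this log; the sketch below assumes a plain weighted sum over per-role contribution counts, and ci_score is a hypothetical name.

```python
import math
from typing import Mapping

CI_WEIGHTS = {
    "challenger": 0.35,
    "synthesizer": 0.25,
    "reviewer": 0.20,
    "sourcer": 0.15,
    "extractor": 0.05,
}

# Sanity check: the updated weights still form a full distribution.
assert math.isclose(sum(CI_WEIGHTS.values()), 1.0)


def ci_score(role_counts: Mapping[str, int]) -> float:
    """Assumed scoring rule: weighted sum of contributions per role."""
    return sum(CI_WEIGHTS[role] * count for role, count in role_counts.items())
```

Under this assumption, two challenges now outweigh ten extractions (0.70 against 0.50), which matches the direction of the weight change.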

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 14:53:54 +00:00
d33ddd9f3d fix: fixer GC now closes PRs on Forgejo + deletes branches, not just DB
Root cause of 5-day pipeline stall: fixer GC marked PRs as closed in DB
but never synced to Forgejo. Branches stayed alive on remote, blocking
Gate 2 in batch-extract (branch exists → skip forever).

Now: GC fetches PR numbers, posts audit comment, closes on Forgejo,
deletes remote branch, THEN updates DB. Same pattern as _terminate_pr
in evaluate.py.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-24 14:37:50 +00:00
d97f68714a epimetheus: fix 2 nits from Ganymede final review
1. _merge_pr marked as CURRENTLY UNUSED (local ff-push is primary path)
2. Conversation window messages skip cold rate limit check (window counter IS the limit)

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-20 20:25:06 +00:00
d79ff60689 epimetheus: sync VPS-deployed code to repo — Mar 18-20 reliability + features
Pipeline reliability (8 fixes, reviewed by Ganymede+Rhea+Leo+Rio):
1. Merge API recovery — pre-flight approval check, transient/permanent distinction, jitter
2. Ghost PR detection — ls-remote branch check in reconciliation, network guard
3. Source status contract — directory IS status, no code change needed
4. Batch-state markers eliminated — two-gate skip (archive-check + batched branch-check)
5. Branch SHA tracking — batched ls-remote, auto-reset verdicts, dismiss stale reviews
6. Mirror pre-flight permissions — chown check in sync-mirror.sh
7. Telegram archive commit-after-write — git add/commit/push with rebase --abort fallback
8. Post-merge source archiving — queue/ → archive/{domain}/ after merge

Pipeline fixes:
- merge_cycled flag — eval attempts preserved during merge-failure cycling (Ganymede+Rhea)
- merge_failures diagnostic counter
- Startup recovery preserves eval_attempts (was incorrectly resetting to 0)
- No-diff PRs auto-closed by eval (root cause of 17 zombie PRs)
- GC threshold aligned with substantive fixer budget (was 2, now 4)
- Conflict retry with 3-attempt budget + permanent conflict handler
- Local ff-merge fallback for Forgejo 405 errors

Telegram bot:
- KB retrieval: 3-layer (entity resolution → claim search → agent context)
- Reply-to-bot handler (context.bot.id check)
- Tag regex: @teleo|@futairdbot
- Prompt rewrite for natural analyst voice
- Market data API integration (Ben's token price endpoint)
- Conversation windows (5-message unanswered counter, per-user-per-chat)
- Conversation history in prompt (last 5 exchanges)
- Worktree file lock for archive writes

Infrastructure:
- worktree_lock.py — file-based lock (flock) for main worktree coordination
- backfill-sources.py — source DB registration for Argus funnel
- batch-extract-50.sh v3 — two-gate skip, batched ls-remote, network guard
- sync-mirror.sh — auto-PR creation for mirrored GitHub branches, permission pre-flight
- Argus dashboard — conflicts + reviewing in backlog, queue count in funnel
- Enrichment-inside-frontmatter bug fix (regex anchor, not --- split)
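
The file-based lock listed under Infrastructure can be sketched with fcntl.flock. This is a Unix-only illustration, not the contents of worktree_lock.py; the context-manager shape and the name worktree_lock are assumptions.

```python
import fcntl
import os
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def worktree_lock(path: str) -> Iterator[None]:
    """Hold an exclusive flock on path for the duration of the with-block."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

A writer would wrap its archive write in `with worktree_lock(lock_path):`, where lock_path is a path that all cooperating processes agree on.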

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-20 20:17:27 +00:00
090b1411fd epimetheus: source archive restructure — inbox/queue + inbox/archive/{domain} + inbox/null-result
- config.py: added INBOX_QUEUE, INBOX_NULL_RESULT constants
- evaluate.py: skip patterns + LIGHT tier cover all inbox/ subdirs
- llm.py: eval prompts reference inbox/ generically
- telegram/bot.py: archives to inbox/queue/
- telegram/teleo-telegram.service: ReadWritePaths expanded
- research-prompt-v2.md: paths updated to inbox/queue/
- research-prompt-leo-synthesis.md: paths updated
- migrate-source-archive.py: one-time migration script

Reviewed by: Ganymede, Rhea, Leo (all approved)

Pentagon-Agent: Epimetheus <968B2991-E2DF-4006-B962-F5B0A0CC8ACA>
2026-03-18 11:50:04 +00:00