teleo/teleo-infrastructure

Author	SHA1	Message	Date
m3taversal	84f6d3682c	fix(eval): treat empty diff as conservative fallback in auto-close gate Ganymede review nit: if get_pr_diff returns an empty string (edge case — Forgejo quirk, empty PR), the old `if diff is None` branch would miss it, the `elif diff and ...` would evaluate False (empty string is falsy), and control would fall to `else` — triggering auto-close on zero diff content. Change `if diff is None` → `if not diff` so empty string ALSO falls through to the conservative path. Matches the stated posture: skip auto-close when in doubt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 11:24:16 +01:00
m3taversal	33c17f87a8	feat(eval): auto-close near-duplicate PRs when merged sibling exists Prevents Apr 22 runaway-damage pattern (44 open PRs manually bulk-closed) where a source extracted 20+ times before the cooldown gate landed, each leaving an orphan 'open' PR after eval correctly rejected as near-duplicate. Gate fires in dispose_rejected_pr before attempt-count branches: all_issues == ["near_duplicate"] (exact match — compound carries signal) AND sibling PR exists with same source_path in status='merged' AND diff contains "new file mode" (not enrichment-only) → close on Forgejo + DB with audit, post explanation comment. Ganymede review — 5 must-fix/warnings applied + 1 must-add: - Exact match on single-issue near_duplicate (compound rejections preserved) - Enrichment guard via diff scan (eval_parse regex can flag enrichment prose) - 10s timeout on get_pr_diff — conservative fallback on Forgejo wedge - Forgejo comment with canned explanation (best-effort, try/except) - Partial index idx_prs_source_path + migration v23 - Explicit p1.source_path IS NOT NULL in WHERE Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 11:17:29 +01:00
m3taversal	a053a8ebf9	fix(backfill): don't regress terminal source statuses to unprocessed backfill-sources.py runs every 15 minutes and derives sources.status purely from directory location. If a source file is in inbox/queue/, it blindly overwrites the DB status to 'unprocessed' — even when the DB already had 'extracted' or 'null_result'. This is why the 43 zombies kept coming back after manual backfill: cron re-reset them every 15 minutes, then each 4h cooldown expiry re-triggered runaway extraction on the same source. Fix: never regress from a terminal status (extracted, null_result, error, ghost_no_file) to 'unprocessed'. File location is ambiguous (legitimately new vs. zombie from failed archive); DB is authoritative. Legitimate re-extraction still works — it goes through the needs_reextraction path which is unaffected by this gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 21:29:33 +01:00
m3taversal	97b590acd6	fix: close cooldown-dependence gaps in extract.py (Ganymede review) Three targeted fixes from Ganymede's review of commit `469cb7f`: BUG #1 — Success path now updates sources.status='extracting' before PR creation, so queue scan's DB-authoritative filter catches sources between PR creation and merge. Previously the cooldown gate was load-bearing for this window, not belt-and-suspenders as claimed. BUG #2 — Second null-result path (line 573, triggered when enrichments existed but all targets were missing in worktree) now updates DB. Without this, that path created no PR, no DB mark, and would have re-entered the runaway loop 4h later when the cooldown window expired. NIT #6 — 4h cooldown moved to config.EXTRACTION_COOLDOWN_HOURS. Tunable without code change. Log format now shows the configured hours. Also backfilled 59 pre-existing zombie queue-path rows where the file was already archived but DB status said 'unprocessed' — these would have leaked past the DB filter once the 4h cooldown expired. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 11:33:10 +01:00
m3taversal	469cb7f2da	fix: stop runaway re-extraction loop in extract.py Three changes reduce extraction cost and duplicate PR flood: 1. 4-hour cooldown gate — skip sources with ANY PR (merged/closed/open) created in the last 4h. Prevents same source re-extracting every 60s while archive step lags behind merge. 2. DB-authoritative status — sources.status is now updated in the pipeline DB at each extraction terminal point (null_result, success). Queue scan checks DB first so sources with failed archives (e.g., root-owned worktree files blocking git pull --rebase) don't get re-extracted forever. Also moves archival into the extraction branch so it goes through PR merge instead of a fragile separate main-worktree push. 3. source_channel wiring — extract.py PR INSERT now sets source_channel from classify_source_channel(branch). Previously daemon-created PRs had NULL source_channel, breaking Argus dashboard filters. Combined with Ship's in-branch archive refactor. Root incident: blockworks-metadao-strategic-reset.md extracted 31 times in 12 hours. Nine other sources hit 10-22 extractions each. Near-duplicate rejection rate jumped to 94%. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 11:19:30 +01:00
m3taversal	8de28d6ee0	feat: bidirectional source↔claim linking Forward link: claims get `sourced_from: {domain}/{filename}` at extraction time. Reverse link: after merge, backlink_source_claims() updates source files with `claims_extracted:` list. All disk writes happen under async_main_worktree_lock. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 13:00:59 +01:00
m3taversal	05f375d775	feat: filter system accounts from leaderboard, add primary_ci field - SYSTEM_ACCOUNTS set excludes pipeline/unknown/teleo-agents from /api/contributors/list - primary_ci field: action_ci.total when available, else role-based ci_score - action_ci included in list endpoint for each contributor Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:33:47 +01:00
m3taversal	4101048cd0	feat: wire action-type CI into contributor profiles - contribution_scores table stores per-PR CI with action type - Profile endpoint returns action_ci alongside role-based ci_score - Branch-name attribution: contrib/NAME/ PRs attributed to NAME - Cameron now shows 0.32 CI + BELIEF MOVER badge from challenge - Handle variant matching (cameron-s1 → cameron) for cross-system lookup - Full historical backfill: 985 scores across 9 contributors Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:29:01 +01:00
m3taversal	af027d3ced	feat: add contributor profile API endpoint GET /api/contributors/{handle} — returns CI score, badges, domain breakdown, role percentages, contribution timeline, review stats. GET /api/contributors/list — leaderboard with min_claims filter. Git-log fallback for contributors not in pipeline.db (Cameron, Alex). Badge system: FOUNDING CONTRIBUTOR, BELIEF MOVER, KNOWLEDGE SOURCER, DOMAIN SPECIALIST, VETERAN, FIRST BLOOD. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:22:13 +01:00
m3taversal	1b27a2de31	feat: add /api/activity-feed endpoint with hot/recent/important sort Serves contribution events from pipeline.db. Classifies PRs as create/enrich/challenge, normalizes contributors, derives summaries from branch names when descriptions are empty. Hot sort uses challenge3 + enrich2 + signal / hours^1.5 decay from event time. Domain and contributor filters, pagination (limit/offset). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:16:46 +01:00
m3taversal	11e026448a	sync: dashboard_routes.py from VPS — digest + contributor-graph endpoints Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:01:15 +01:00
m3taversal	c3d0b1f5a4	feat: contributor graph PNG generator + API endpoint matplotlib chart with dual axes — cumulative claims (#00d4aa) and contributors (#7c3aed) on dark background. 1200x630 for Twitter. Auto-regenerates hourly via /api/contributor-graph endpoint. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:01:02 +01:00
m3taversal	88e8e15c6d	feat: add /api/digest/latest endpoint for scoring digest data Serves the latest scoring-digest-latest.json from cron output. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 10:55:24 +01:00
m3taversal	5463ca0b56	feat: add daily scoring digest with CREATE/ENRICH/CHALLENGE classification Classifies merged PRs by action type, scores with importance multiplier (confidence, domain maturity, connectivity bonus), updates contributor records, posts summary to Telegram, serves via /api/digest/latest. Cron: 7:07 UTC daily (8:07 AM London). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 10:55:13 +01:00
m3taversal	e043cf98dc	feat: add wiki-link audit script for codex graph integrity Crawls domains/foundations/core/decisions for [[wiki-links]], resolves against claim files, entities, maps, and agents. Reports dead links, orphans, and connectivity stats. Prerequisite for CI scoring connectivity bonus — broken links would inflate scores. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 10:46:55 +01:00
m3taversal	9c0be78620	fix: align CI role weights with contribution-architecture.md config.py had extractor-heavy weights (0.40) from initial bootstrap. Correct weights per approved architecture: challenger 0.35, synthesizer 0.25, reviewer 0.20, sourcer 0.15, extractor 0.05. backfill-ci.py already had correct weights; this fixes the live computation in health.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 10:37:47 +01:00
m3taversal	c29049924e	fix: wire commit_type into contributor role assignment The contributor attribution always recorded "extractor" regardless of the PR's refined commit_type. Added COMMIT_TYPE_TO_ROLE mapping and applied it in all three attribution paths (Pentagon-Agent trailer, git author fallback, PR agent fallback). Backfill script resets and re-derives role counts from prs.commit_type. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 10:27:36 +01:00
m3taversal	f463f49b46	fix: prevent false 'already up to date' on fork PRs with merge commits When a contributor merges main into their fork branch (standard GitHub workflow), merge-base equals main SHA, triggering the 'already up to date' early return. This closes the PR without cherry-picking the new content. Cameron's PR #3377 hit this exact bug. Fix: add a diff check before returning 'already up to date'. If the branch has actual content changes vs main, proceed to cherry-pick instead of short-circuiting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 22:41:14 +01:00
m3taversal	9505e5b40a	feat: add /api/contributor-growth endpoint + cumulative growth script Adds async git-log-based endpoint for cumulative contributor and claim tracking. 5-minute cache, excludes bot accounts, tags founding contributors. Standalone CLI script also included for ad-hoc data generation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 22:19:42 +01:00
m3taversal	f0cf772182	Merge remote-tracking branch 'origin/epimetheus/reduce-rejections'	2026-04-20 19:03:26 +01:00
m3taversal	4fc541c656	Skip liquidated entities in portfolio fetcher Ranger was liquidated — no point fetching empty data every cron run. Also purged 1,647 pre-Apr-20 snapshot rows (incomplete NAV data from data collection ramp-up, not actual market movement). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 18:55:04 +01:00
m3taversal	b7242d2206	Wire rejection_reason into review records + fix ingestion domain routing rejection_reason was always NULL in review_records — now populated with comma-joined issue tags (near_duplicate, frontmatter_schema, etc.) at both rejection call sites. Also fixes stale reviewer_model="gpt-4o" hardcoding to use config.EVAL_DOMAIN_MODEL (currently Gemini Flash). Ingestion branches (ingestion/futardio-, ingestion/metadao-) now resolve to internet-finance domain instead of falling through to "general". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 18:03:34 +01:00
m3taversal	12078c8707	Reduce near-duplicate and frontmatter schema rejections Near-duplicate (159+ rejections): - Add extract-time dedup gate: SequenceMatcher check before file write ($0) - Strengthen extraction prompt: high-similarity matches (>=0.75) get explicit "DO NOT extract, use enrichment instead" warning - Strip [[wiki link]] brackets from related_claims field Frontmatter schema (129+ rejections): - Normalize LLM confidence aliases (high→likely, medium→experimental, etc.) in both _build_claim_content and validate_schema - Strip code fences (```markdown/```yaml) from entity content in extract.py and from diff content in validate.py tier0.5 check - Code fences were root cause of "no_frontmatter" failures: parser sees ```markdown as first line, not --- Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 18:03:26 +01:00
m3taversal	7a753da68b	fix: auto-deploy.sh rsync excludes broken + add tests/ sync - Switch RSYNC_FLAGS string to RSYNC_OPTS bash array (same fix as deploy.sh in `368b579` — string passed literal quotes to rsync, matching nothing) - Add tests/ to rsync targets and syntax check glob for parity with deploy.sh - All 8 rsync calls now use "${RSYNC_OPTS[@]}" expansion Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:22:11 +01:00
m3taversal	febbc7da30	add rio and theseus telegram bot agent configs Two YAML files on VPS but not in repo. Agent identity, KB scope, and voice configs for the Telegram bots. No secrets (tokens reference file paths, not inline values). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:20:21 +01:00
m3taversal	368b5793d3	fix: deploy.sh rsync excludes were broken — quotes passed literally RSYNC_FLAGS as a string meant --exclude='__pycache__' passed literal quotes to rsync, matching nothing. Switched to bash array (RSYNC_OPTS) so excludes work correctly. __pycache__ and .pyc files no longer sync. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:17:31 +01:00
m3taversal	670c50f384	fix: add telegram/ and tests/ to deploy pipeline, remove hardcoded API key deploy.sh was missing telegram/ and tests/ directories — code existed in repo but never synced to VPS. Also removes hardcoded twitterapi.io key from x-ingest.py (reads from secrets file like all other modules). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:15:55 +01:00
m3taversal	a479ab533b	fix: add fetch_coins.py to auto-deploy loop + legacy migration comment - auto-deploy.sh: fetch_coins.py was missing from the root-level .py deploy loop (line 72). Only manual deploy.sh had it. Next cycle syncs it to VPS. - fetch_coins.py: document the ALTER TABLE loop as legacy migration for older DBs that predate the CREATE TABLE columns. Reviewed-by: Ganymede Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:06:35 +01:00
m3taversal	eac5d2f0d3	fix: add fetch_coins.py to deploy.sh deployment list fetch_coins.py was committed to repo root but deploy.sh only deployed teleo-pipeline.py and reweave.py. This meant bug fixes to fetch_coins would silently fail to reach VPS on deploy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:01:31 +01:00
m3taversal	5071ecef16	fix: apply Ganymede review fixes to portfolio code dashboard_portfolio.py: - datetime.utcnow() → datetime.now(timezone.utc) (deprecation fix) - days parameter validation with try/except + min(..., 365) on 2 endpoints fetch_coins.py: - isinstance(chain, str) guard prevents AttributeError on string chain values - Log when adjusted market cap differs from DexScreener value Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:00:02 +01:00
m3taversal	ddf3c25e88	sync VPS state: portfolio dashboard + fetch_coins.py Pull live app.py from VPS to close 243-line drift. Add portfolio dashboard (renamed from v2), portfolio nav link, and fetch_coins.py (daily cron script for ownership coin data). Delete stale lib/ copy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 16:55:36 +01:00
m3taversal	cde92d3db1	fix: wrap breaker calls in stage_loop to prevent permanent task death A transient DB lock in breaker.record_failure() inside an except handler killed the asyncio coroutine permanently — snapshot_cycle died Apr 18 and never recovered. All three breaker call sites now have their own try/except. Also includes HTML injection fix for github_feedback review_text. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 12:37:28 +01:00
m3taversal	83526bc90e	fix: quote YAML edge values containing colons, skip unparseable files in reweave merge Root cause of 84% reweave PR rejection rate: claim titles with colons (e.g., "COAL: Meta-PoW: The ORE Treasury Protocol") written as bare YAML list items, causing yaml.safe_load to fail during merge. Three changes: 1. frontmatter.py: _yaml_quote() wraps colon-containing values in double quotes 2. reweave.py: _write_edge_regex uses _yaml_quote for new edges 3. merge.py: skip individual files with parse failures instead of aborting entire PR Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-18 12:07:28 +01:00
m3taversal	ae860a1d06	fix: set execute bit on research-session.sh and install-hermes.sh Mode 100644 → 100755. Previous commit added the safety net but missed the actual git mode change due to staging order. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-18 11:54:39 +01:00
m3taversal	878f6e06e3	fix: restore execute bits on .sh files, add chmod safety net to auto-deploy research-session.sh and install-hermes.sh were committed with mode 100644 during repo reorganization (`d2aec7fe`). rsync -az preserved the non-executable mode, breaking all research agent cron jobs since Apr 15. Safety net in auto-deploy.sh ensures any future permission loss is auto-corrected. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-18 11:54:13 +01:00
m3taversal	ac794f5c68	Fix source_channel migration: add to SCHEMA_SQL, default 'unknown' not 'telegram' Ganymede review findings: 1. source_channel was missing from CREATE TABLE (fresh installs wouldn't have it) 2. Default fallback changed from 'telegram' to 'unknown' — unknown prefixes are genuinely unknown, not telegram 3. Cross-reference comments added between BRANCH_PREFIX_MAP and _CHANNEL_MAP Also wires classify_source_channel into merge.py PR discovery INSERT. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 13:27:15 +01:00
m3taversal	25a537d2e1	fix: divergence alerting — alert suppression bug + stale ref detection Bug: echo "alerted" ran regardless of curl success, permanently suppressing alerts on delivery failure. Fix: if/then/else wraps the state write. Warning: stale tracking refs after push steps caused false divergence. Fix: re-fetch both remotes before comparing. Both findings from Ganymede review of Step 6. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 11:10:32 +01:00
m3taversal	0f868aefab	Add GitHub PR feedback module and fix attribution for mirrored PRs github_feedback.py posts pipeline status to GitHub PRs at three touchpoints: discovery ack, eval review result, and merge/close outcome. Only fires for PRs with a github_pr link (set by sync-mirror.sh). All calls non-fatal. contributor.py: expanded git author fallback to scan all non-merge commits (was only checking last commit), added teleo-bot and github-actions[bot] to bot filter list. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:16:28 +01:00
m3taversal	13f21f7732	feat: external contributor pipeline — fork PR handling, attribution, prefix recognition - Mirror: fetch GitHub fork PR refs (refs/pull//head), push to Forgejo as gh-pr-N/branch - Mirror: fork PRs auto-create Forgejo PR with GitHub PR title, link github_pr in DB - db.py: add contrib + gh-pr- to classify_branch for external contributor branches - contributor.py: git commit author as attribution fallback (before branch agent) - contributor.py: skip bot/generic authors (m3taversal, teleo, pipeline) - Tests: fix fallback test for new git author path, add external contributor test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:14:01 +01:00
m3taversal	0b28c71e11	Wire github_pr storage into sync-mirror.sh (Step 3) When mirror auto-creates a Forgejo PR from a GitHub branch, look up the GitHub PR number via API and store it in pipeline.db (github_pr column from migration v21). Enables reverse mapping for feedback and back-sync. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:10:06 +01:00
m3taversal	fb121e4010	Add github_pr column to prs table (migration v21) Enables GitHub↔Forgejo PR linking for the contributor pipeline. Mirror script will store GitHub PR number when creating Forgejo PRs, allowing back-sync of eval feedback and merge/close status. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:07:04 +01:00
m3taversal	26a8b15f56	fix: skip merge commits in cherry-pick to prevent fork workflow content loss External contributors who run `git merge main` create merge commits that cherry-pick can't handle without -m flag. --no-merges filters these out. Added detection for branches with only merge commits but real content diff. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:04:45 +01:00
m3taversal	687f3d3151	fix: prevent broken wiki links in extraction (226 rejections) Two changes to address the #1 rejection reason: 1. extraction_prompt.py: Explicitly tell LLM NOT to use [[wiki links]] in body text — use connections/related_claims JSON fields instead. Remove misleading "post-processor handles wiki links" language. 2. extract.py _get_kb_index(): Expand KB index to include entity stems from entities/{domain}/ so the LLM knows what entities exist when building connections. Previously only showed domain claims. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 14:28:58 +01:00
m3taversal	22b6ebb6f6	fix: lower reweave threshold 0.70→0.55, increase batch 50→200 Orphan ratio at 39.6% (443/1118 claims) vs <15% target. Root cause: reweave threshold 0.70 too strict for text-embedding-3-small — 56% of orphans found "no neighbors." At 0.55, dry-run shows 0% no-neighbor skips. Batch size 200 clears backlog in ~3-4 nights at ~$0.20/run. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 14:18:50 +01:00
m3taversal	0ce7412396	fix: check Forgejo close return value in 2 merge.py paths to prevent ghost PRs Both the "already merged" path and _handle_permanent_conflicts closed PRs on Forgejo without checking the return value. On API failure, the DB update would proceed anyway, creating ghost PRs (DB=closed/merged, Forgejo=open). Now both paths check for None return and skip DB updates on failure — same pattern as close_pr in pr_state.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 14:18:50 +01:00
m3taversal	28b25329b3	fix: remove FIRST early return that also blocked re-extraction There were TWO `if not unprocessed: return 0, 0` gates. The previous fix (`c763c99`) only addressed the second one. The first at line 746 fires before the re-extraction query even runs. Replace with a comment explaining why we don't early-return there. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 14:17:20 +01:00
m3taversal	c763c99910	fix: re-extraction loop runs even when queue is empty The re-extraction check was below an early return that fires when unprocessed queue is empty. Sources in needs_reextraction state were never picked up unless new sources happened to arrive simultaneously. Move re-extraction query above the gate so both paths run independently. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 14:04:49 +01:00
m3taversal	4c3ce265e4	fix: sanitize enrichment target_file path traversal Path(target).name strips directory components from LLM-generated target filenames, preventing path traversal via ../. Same pattern already applied to claim filenames (line 404) and entity filenames (line 416). Ganymede-approved. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 13:40:37 +01:00
m3taversal	46ad508de7	Phase 6b: extract post_merge.py from merge.py — post-merge effects 7 functions extracted to lib/post_merge.py: - embed_merged_claims, reciprocal_edges, find_claim_file, add_edge_to_file, archive_source_for_pr, commit_source_moves, update_source_frontmatter_status git_fn injection pattern (same as contributor.py) for 3 async functions that need git operations. Unused async_main_worktree_lock import removed from merge.py. merge.py: 1562 → 1200 lines (−362). Total reduction from 1912: −712 lines. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 13:20:59 +01:00
m3taversal	ed1edd6466	Phase 6a: extract frontmatter.py from merge.py — pure YAML helpers 4 functions + 2 constants extracted to lib/frontmatter.py: - parse_yaml_frontmatter, union_edge_lists, serialize_edge_fields, serialize_frontmatter, REWEAVE_EDGE_FIELDS, RECIPROCAL_EDGE_MAP merge.py: 1678 → 1562 lines (−116). test_reweave_merge.py: replaced local function copies with imports from frontmatter.py — fixes missing challenged_by in test's REWEAVE_EDGE_FIELDS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 13:16:38 +01:00
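
Note on 4c3ce265e4 (path traversal): the commit message says `Path(target).name` strips directory components from LLM-generated filenames. A minimal sketch of that pattern follows. The function names `sanitize_target_filename` and `resolve_in_dir` are hypothetical and do not come from the repository, and the extra guard against empty or `..` names is an assumption added for illustration, since `Path("..").name` is itself `".."`.

```python
from pathlib import Path


def sanitize_target_filename(target: str) -> str:
    """Keep only the final path component of an untrusted filename."""
    # Path(...).name drops every directory component, so "../../x.md"
    # becomes "x.md". Note: on POSIX only "/" is treated as a separator.
    return Path(target).name


def resolve_in_dir(base_dir: Path, target: str) -> Path:
    """Join a sanitized filename onto base_dir (hypothetical helper)."""
    name = sanitize_target_filename(target)
    # Assumption: reject names that are empty or still refer to a directory.
    # Path("..").name == ".." and Path(".").name == "", so .name alone
    # does not cover these inputs.
    if name in ("", ".", ".."):
        raise ValueError(f"unusable target filename: {target!r}")
    return base_dir / name


if __name__ == "__main__":
    print(resolve_in_dir(Path("domains/internet-finance"), "../../etc/passwd"))
```

The sanitized name is always joined onto a directory the caller chose, so a traversal attempt degrades to a plain filename inside that directory rather than escaping it.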


169 commits
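
Note on 84f6d3682c (empty diff in the auto-close gate): the falsy-empty-string behaviour described in that commit can be reproduced with a small sketch. The function names `old_gate` and `new_gate` are hypothetical. The `elif` condition is elided in the commit message (`elif diff and ...`), so the `"new file mode"` check used here is an assumption borrowed from the enrichment guard described in 33c17f87a8, not the project's actual code.

```python
from typing import Optional


def old_gate(diff: Optional[str]) -> bool:
    """Pre-fix branch structure: returns True when auto-close would fire."""
    if diff is None:
        return False  # diff fetch failed: conservative path
    elif diff and "new file mode" not in diff:  # assumed condition
        return False  # enrichment-only diff: do not auto-close
    else:
        # An empty string lands here: it is not None, and `diff and ...`
        # is falsy, so control falls through to auto-close.
        return True


def new_gate(diff: Optional[str]) -> bool:
    """Post-fix branch structure: `if not diff` covers None and ''."""
    if not diff:
        return False  # None or empty string: skip auto-close when in doubt
    elif "new file mode" not in diff:  # assumed condition
        return False
    else:
        return True


if __name__ == "__main__":
    for sample in (None, "", "new file mode 100644\n"):
        print(repr(sample), old_gate(sample), new_gate(sample))
```

The two functions differ only on the empty string: `old_gate("")` allows the close, while `new_gate("")` takes the conservative path.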