teleo/teleo-infrastructure

Author	SHA1	Message	Date
m3taversal	762fd4233e	feat(backfill): synthetic PR rows for pre-mirror GitHub PRs #68 (Alex) + #88 (Cameron) Two historical GitHub PRs merged before our sync-mirror.sh tracked github_pr: - GitHub PR #68: alexastrum, 6 claims, merged Mar 9 2026 via squash merge - GitHub PR #88: Cameron-S1, 1 claim, merged early April Their claim files were lost during a Forgejo→GitHub mirror overwrite and later recovered via direct-to-main commits (dba00a79, da64f805). Because the recovery commits bypassed the pipeline, our 'prs' table has no row to attach originator events to — all 4 backfill-events.py strategies returned None, leaving Alex + Cameron at 0 originator credits despite real historical work. This reconstructs synthetic 'prs' rows so the existing github_pr strategy in backfill-events.py attaches 7 originator events on re-run: - Numbers 900068 / 900088 live in a clearly-synthetic range that cannot collide with real Forgejo PRs (current max: 3941) - github_pr=68/88 wires up the existing lookup strategy - submitted_by=alexastrum / cameron-s1 establishes author attribution - merged_at from the recovery commit messages (not recovery-commit time) - last_error tags the rows as synthetic for future audits Idempotent: INSERT OR IGNORE via check on number OR github_pr. Safe to replay. Reversible: DELETE FROM prs WHERE number IN (900068, 900088). After applying this script: python3 ops/backfill-events.py will credit Alex with 6 author + 6 originator events (author=1.80, originator=0.90) and Cameron with 1 author + 1 originator (0.30 + 0.15), all dated to the historical merge dates — so 7d/30d leaderboard windows show them correctly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 16:33:37 +01:00
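
The idempotent insert described above can be sketched as follows. This is a minimal illustration, not the actual script: the `prs` schema is a hypothetical subset containing only columns named in the commit message, `insert_synthetic_pr` is an illustrative helper, and the `last_error` tag text and date-only `merged_at` value are placeholders.

```python
import sqlite3

# Hypothetical subset of the prs schema; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prs (number INTEGER PRIMARY KEY, github_pr INTEGER, "
    "submitted_by TEXT, merged_at TEXT, last_error TEXT)"
)


def insert_synthetic_pr(conn, number, github_pr, submitted_by, merged_at):
    """Insert a synthetic row unless a row with this number OR github_pr exists."""
    cur = conn.execute(
        "INSERT INTO prs (number, github_pr, submitted_by, merged_at, last_error) "
        "SELECT ?, ?, ?, ?, 'synthetic backfill row' "
        "WHERE NOT EXISTS (SELECT 1 FROM prs WHERE number = ? OR github_pr = ?)",
        (number, github_pr, submitted_by, merged_at, number, github_pr),
    )
    return cur.rowcount  # 1 when a row was inserted, 0 when it was skipped


# Replaying the same insert leaves exactly one row in place.
first = insert_synthetic_pr(conn, 900068, 68, "alexastrum", "2026-03-09")
second = insert_synthetic_pr(conn, 900068, 68, "alexastrum", "2026-03-09")
row_count = conn.execute("SELECT COUNT(*) FROM prs").fetchone()[0]
```

Checking `number OR github_pr` in the guard means a replay is skipped even if a later run chose a different synthetic number for the same GitHub PR.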
m3taversal	10d5c275da	fix(backfill): normalize commit_date via datetime() in time-proximity query SQLite datetime comparison fails lexicographically across ISO-T and space-separator formats: '2026-03-27 18:00:14' < '2026-03-27T17:43:04+00:00' because space (0x20) < T (0x54). PRs merged same-day but earlier than the commit hour were silently excluded from the time-proximity cascade. Shaga's 3 stigmergic-coordination claims resolved to PR #2032 (later, wrong) instead of #2025 (earlier, correct). Fixed by wrapping both sides in datetime(), which normalizes to space-separator before comparison. Verified: all 3 Shaga claims now resolve to #2025 via git_time_proximity. No change to totals (126 originator events, 5 proximity hits) — the fix corrects WHICH PR each proximity-matched claim resolves to, not whether. Caught by Ganymede review of `1d6b515`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 16:16:03 +01:00
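
The comparison failure can be reproduced in isolation with Python's built-in `sqlite3` module. This is a small demonstration using the two timestamps quoted in the commit message, not the backfill query itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

space_form = "2026-03-27 18:00:14"        # later time, space separator
iso_t_form = "2026-03-27T17:43:04+00:00"  # earlier time, ISO 'T' separator

# Plain text comparison: ' ' (0x20) sorts before 'T' (0x54), so the later
# timestamp is wrongly reported as smaller.
raw_less = conn.execute(
    "SELECT ? < ?", (space_form, iso_t_form)
).fetchone()[0]

# datetime() normalizes both sides to 'YYYY-MM-DD HH:MM:SS' first.
normalized_less = conn.execute(
    "SELECT datetime(?) < datetime(?)", (space_form, iso_t_form)
).fetchone()[0]

normalized_value = conn.execute(
    "SELECT datetime(?)", (iso_t_form,)
).fetchone()[0]
```

Here `raw_less` is 1 (the wrong answer) and `normalized_less` is 0 (the right one).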
m3taversal	1d6b51527a	feat(backfill): 4-strategy PR recovery for originator events Rewrite claim-level pass in backfill-events.py to recover the Forgejo PR that introduced each claim via a cascade of 4 strategies (reliability order), replacing the single title→description match that missed PRs with NULL description (Cameron #3377) and bare-subject extracts (Shaga's Leo research PR). ## Strategies 1. sourced_from frontmatter → prs.source_path stem match 2. git log first-add commit → subject pattern → prs.branch - "<agent>: extract claims from <slug>" → extract/<slug> - "<agent>: research session YYYY-MM-DD" → <agent>/research-<date> - "<agent>: (challenge|contrib|entity|synthesize)" → <agent>/* - "Recover X from GitHub PR #N" → prs.github_pr=N - "Extract N claims from X" (no prefix) → time-proximity on agent-owned branches within 24h 3. Current title_desc fallback for anything the above miss ## Dry-run projection (1,662 merged PRs) Before: Claims processed: 33 Originator events: 6 Breakdown: {no_pr_match: 1608, no_sourcer: 26, invalid_handle: 21, skip_self: 6} After: Claims processed: 505 (+472) Originator events: 126 (+120) Strategy hits: git_subject=412, sourced_from=88, git_time_proximity=5 Breakdown: {no_pr_match: 1095, no_sourcer: 67, invalid_handle: 359, skip_self: 20} ## Verified on real VPS data - @thesensatore claims: 3/5 resolve via git_time_proximity to leo/ PRs - Cameron-S1, alexastrum: remain None — their recovery commits (dba00a79, da64f805) bypassed the pipeline entirely, no Forgejo PR record exists. Requires synthetic prs rows — deferred to separate commit with its own Ganymede review (write operation, larger blast radius than this pure-read backfill change).
## Implementation - New find_pr_for_claim(conn, repo, md) helper returns (pr_number, strategy) - Claim-level pass uses it first, falls back to title_desc map - Strategy counter surfaced in summary output for operator visibility Idempotent — backfill re-runs skip duplicate events via the partial UNIQUE index on contribution_events. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 16:06:52 +01:00
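
The subject-pattern rules listed under strategy 2 can be sketched as a small matcher. This is an illustrative reading of those rules, not the code in backfill-events.py: `lookup_hint_from_subject`, the exact regular expressions, and the returned tuple shape are all assumptions.

```python
import re


def lookup_hint_from_subject(subject):
    """Map a first-add commit subject to a (lookup_kind, value) hint, or None."""
    m = re.match(r"^(?P<agent>[\w-]+): extract claims from (?P<slug>\S+)$", subject)
    if m:
        return ("branch", f"extract/{m['slug']}")

    m = re.match(
        r"^(?P<agent>[\w-]+): research session (?P<date>\d{4}-\d{2}-\d{2})$", subject
    )
    if m:
        return ("branch", f"{m['agent']}/research-{m['date']}")

    m = re.match(r"^(?P<agent>[\w-]+): (?:challenge|contrib|entity|synthesize)", subject)
    if m:
        # Any branch owned by this agent, i.e. "<agent>/*".
        return ("branch_prefix", f"{m['agent']}/")

    m = re.match(r"^Recover .+ from GitHub PR #(?P<n>\d+)", subject)
    if m:
        return ("github_pr", int(m["n"]))

    return None  # caller falls through to the next strategy


hint = lookup_hint_from_subject("leo: research session 2026-03-27")
```

Returning `None` rather than raising keeps the cascade simple: each strategy either yields a hint or hands off to the next one.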
m3taversal	540ba97b9d	fix(attribution): Phase A followup — bug #1 + 4 nits + refactor (Ganymede review) Addresses Apr 24 review of `58fa8c52`. All 6 findings landed. Bug #1 — git log -1 returns latest commit, not first (semantic mismatch with "original author" comment): Drop -1 flag, take last line of default-ordered log output (= oldest). Fixes mis-credit on multi-commit PRs where a reviewer rebased/force-pushed. Nit #2 — forward writer didn't pass merged_at: Fetch merged_at in the prs SELECT, thread pr_merged_at through all 5 insert_contribution_event call sites. Keeps forward-emitted and backfilled event timestamps on the same timeline after merge retries. Nit #3 — legacy-counts fallback paths emit no events (parity gap): git-author and prs.agent fallback paths now emit challenger/synthesizer events via the TRAILER_EVENT_ROLE map when refined_type matches. Closes the gap where external-contributor challenge/enrich PRs would accumulate legacy counts but disappear from event-sourced leaderboards. Nit #4 — migration v24 agent seed missing 'pipeline': Added "pipeline" to the seed list. Plus new migration v25 with idempotent corrective UPDATE so existing envs (where v24 already ran) pick up the fix on restart without requiring manual SQL. Verified on VPS state: pipeline row was kind='person', will flip to 'agent' on redeploy. Nit #5 — backfill summary prints originator attempted=0 in wrong pass: Split the "=== Summary ===" header into "=== PR-level events ===" and "=== Claim-level originator pass ===" with originator counts in the right block. Operator-facing cosmetic. Refactor #6 — AGENT_BRANCH_PREFIXES duplicated in 2 sites: Extracted to lib/attribution.py as single source of truth. contributor.py imports it. backfill-events.py keeps its local copy (runs standalone without pipeline package import) with a sync-reference comment. No behavioral drift for the common case. 
Backfill re-runs cleanly against existing forward-written events (UNIQUE-index idempotency). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 14:13:54 +01:00
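
The Bug #1 fix (take the last line of default-ordered log output instead of using `-1`) amounts to the following. This is a sketch over captured output only; `oldest_entry` and the sample author names are hypothetical, and the real code's git invocation is not shown in the commit message.

```python
def oldest_entry(log_output):
    """Return the last non-empty line of `git log` output, or None if empty.

    `git log` lists newest commits first by default, so the last line
    corresponds to the oldest commit in the range.
    """
    lines = [line.strip() for line in log_output.splitlines() if line.strip()]
    return lines[-1] if lines else None


# Hypothetical author-per-line output where a reviewer rebased on top.
sample_log = "reviewer-who-rebased\noriginal-author\n"
first_author = oldest_entry(sample_log)
```

With `-1`, only `reviewer-who-rebased` would have been returned, which is the mis-credit the commit describes.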
m3taversal	58fa8c5276	feat(attribution): Phase A — event-sourced contribution ledger (schema v24) Introduces contribution_events table + non-breaking double-write. Schema lands today, forward traffic writes events alongside existing count upserts, backfill script replays history. Phase B will add leaderboard API reading from events; Phase C switches Argus dashboard over. ## Schema v24 (lib/db.py) - contribution_events: one row per credit-earning event (id, handle, kind, role, weight, pr_number, claim_path, domain, channel, timestamp) Partial UNIQUE indexes handle SQLite's NULL != NULL semantics: idx_ce_unique_claim on (handle, role, pr_number, claim_path) WHERE claim_path NOT NULL idx_ce_unique_pr on (handle, role, pr_number) WHERE claim_path IS NULL PR-level events (evaluator, author, challenger, synthesizer) dedup on 3-tuple. Per-claim events (originator) dedup on 4-tuple. Idempotent on replay. - contributor_aliases: canonical handle mapping Seeded: @thesensatore → thesensatore, cameron → cameron-s1 - contributors.kind TEXT DEFAULT 'person' Migration seeds 'agent' for known Pentagon agent handles. ## Role model (confirmed by Cory Apr 24) Weights: author 0.30, challenger 0.25, synthesizer 0.20, originator 0.15, evaluator 0.05 - author: human who submitted the PR (curation + submission work) - originator: person who authored the underlying content (rewards external creators) - challenger: agent/person who brought a productive disagreement - synthesizer: cross-domain work (enrichments, research sessions) - evaluator: reviewer who approved (Leo + domain agent) Humans-are-always-author: agents credit is capped at evaluator/synthesizer/ challenger. Pentagon agents classify as kind='agent' and surface in the agent-view leaderboard, not the default person view. 
## Writer (lib/contributor.py) - New insert_contribution_event(): idempotent INSERT OR IGNORE with alias normalization + kind classification. Falls back silently on pre-v24 DBs. - record_contributor_attribution double-writes alongside existing upsert_contributor calls. Zero risk to current dashboard. - Author event: emitted once per PR from prs.submitted_by → git author → agent-branch-prefix. - Originator events: emitted per claim from frontmatter sourcer, skipping when sourcer == author (avoids self-credit double-count). - Evaluator events: Leo (always when leo_verdict='approve') + domain_agent (when domain_verdict='approve' and not Leo). - Challenger/Synthesizer: emitted from Pentagon-Agent trailer on agent-owned branches (theseus/, rio/, etc.) based on commit_type. Pipeline-owned branches (extract/, reweave/) get no trailer-based event — infrastructure work isn't contribution credit. ## Helpers (lib/attribution.py) - normalize_handle(raw, conn=None): lowercase + strip @ + alias lookup - classify_kind(handle): returns 'agent' for PENTAGON_AGENTS, else 'person' Intentionally narrow. Orgs get classified by operator review, not heuristics. ## Backfill (scripts/backfill-events.py) Replays all merged PRs into events. Idempotent (safe to re-run). Emits: - PR-level: author, evaluator, challenger, synthesizer - Per-claim: originator (walks knowledge tree, matches via description titles) Known limitation: post-merge PR branches are deleted from Forgejo, so we can't diff them for granular per-claim events. Claim→PR mapping uses prs.description (pipe-separated titles). Misses some edge cases but recovers the bulk of historical originator credit. Forward traffic gets clean per-claim events via the normal record_contributor_attribution path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 13:59:22 +01:00
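
The two partial UNIQUE indexes from the schema section can be exercised in an in-memory database. This sketch uses a reduced column set and hypothetical handles, PR number, and claim paths; only the index names and key columns come from the commit message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contribution_events (
        id INTEGER PRIMARY KEY,
        handle TEXT NOT NULL,
        role TEXT NOT NULL,
        weight REAL NOT NULL,
        pr_number INTEGER,
        claim_path TEXT
    );
    -- Per-claim events dedup on the 4-tuple.
    CREATE UNIQUE INDEX idx_ce_unique_claim
        ON contribution_events (handle, role, pr_number, claim_path)
        WHERE claim_path IS NOT NULL;
    -- PR-level events dedup on the 3-tuple; a plain UNIQUE over all four
    -- columns would not, because NULL claim_path values never compare equal.
    CREATE UNIQUE INDEX idx_ce_unique_pr
        ON contribution_events (handle, role, pr_number)
        WHERE claim_path IS NULL;
    """
)


def emit(handle, role, weight, pr_number, claim_path=None):
    conn.execute(
        "INSERT OR IGNORE INTO contribution_events "
        "(handle, role, weight, pr_number, claim_path) VALUES (?, ?, ?, ?, ?)",
        (handle, role, weight, pr_number, claim_path),
    )


for _ in range(2):  # replay the same events twice
    emit("example-author", "author", 0.30, 1234)
    emit("example-sourcer", "originator", 0.15, 1234, "domains/example/claim-a.md")
    emit("example-sourcer", "originator", 0.15, 1234, "domains/example/claim-b.md")

event_count = conn.execute("SELECT COUNT(*) FROM contribution_events").fetchone()[0]
```

After two replays the table still holds three rows: one PR-level event and two per-claim events.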
m3taversal	93917f9fc2	fix(attribution): --diff-filter=A + handle sanity filter + remove legacy fallback Ganymede review findings on epimetheus/contributor-attribution-fix branch: 1. BUG: record_contributor_attribution used `git diff --name-only` (all modified files), not just added. Enrich/challenge PRs re-credited the sourcer on every subsequent modification. Fixed: --diff-filter=A restricts to new files only. The synthesizer/challenger/reviewer roles for enrich PRs are still credited via the Pentagon-Agent trailer path, so this doesn't lose any correct credit. 2. WARNING: Legacy `source`-field heuristic fabricated garbage handles from descriptive strings ("sec-interpretive-release-s7-2026-09-(march-17", "governance---meritocratic-voting-+-futarchy"). Removed outright + added regex handle sanity filter (`^[a-z0-9][a-z0-9_-]{0,38}$`). Applied before every return path in parse_attribution (the nested-block early return was previously bypassing the filter). Dry-run impact: unique handles 83→70 (13 garbage filtered), NEW contributors 49→48, EXISTING drift rows 34→22. The filter drops rows where the literal garbage string lives in frontmatter (Slotkin case: attribution.sourcer.handle was written as "senator-elissa-slotkin-/-the-hill" by the buggy legacy path). 3. NIT: Aligned knowledge_prefixes in the file walker to match is_knowledge_pr (removed entities/, convictions/). Widening those requires Cory sign-off since is_knowledge_pr currently gates entity-only PRs out of CI. Tests: 17 pass (added test_bad_handles_filtered, test_valid_handle_with_hyphen_passes, updated test_legacy_source_fallback → test_legacy_source_fallback_removed). Ganymede review — 3-message protocol msg 3 pending. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 12:58:55 +01:00
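
The handle sanity filter can be tried against the garbage strings quoted above. The regex is the one from the commit message; the helper name `is_sane_handle` is hypothetical.

```python
import re

# Pattern quoted in the commit: lowercase alphanumeric first character,
# then up to 38 characters from [a-z0-9_-].
HANDLE_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,38}$")


def is_sane_handle(handle):
    """Return True when the handle matches the sanity pattern."""
    return bool(HANDLE_RE.match(handle))


accepted = [h for h in ("alexastrum", "cameron-s1") if is_sane_handle(h)]
rejected = [
    h
    for h in (
        "sec-interpretive-release-s7-2026-09-(march-17",
        "governance---meritocratic-voting-+-futarchy",
        "senator-elissa-slotkin-/-the-hill",
    )
    if not is_sane_handle(h)
]
```

All three garbage strings fail because they contain characters outside the allowed set (`(`, `+`, `/`), while hyphenated handles such as `cameron-s1` pass.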
m3taversal	3fe0f4b744	fix(attribution): credit sourcer/extractor from claim frontmatter Three layers of contributor-attribution bug surfaced by Apr 24 leaderboard investigation. alexastrum, thesensatore, cameron-s1 all had real merged contributions but zero credit in the contributors table. 1. lib/attribution.py: parse_attribution() only read `attribution_sourcer:` prefix-keyed flat fields. ~42% of claim files (535/1280) use the bare-key form `sourcer: alexastrum` written by extract.py. Added bare-key handling between the prefixed-flat path and the legacy-source-field fallback. Block format (`attribution: { sourcer: [...] }`) still wins when present. 2. lib/contributor.py: record_contributor_attribution() parsed the diff text with regex looking for `+- handle: "X"` lines. This matched neither the bare-key flat format nor the `attribution: { sourcer: [...] }` block format Leo uses for manual extractions. Replaced the regex parser with a file walker that calls attribution.parse_attribution_from_file() on each changed knowledge file — single source of truth for both formats. 3. scripts/backfill-sourcer-attribution.py: walks all merged knowledge files, re-attributes via the canonical parser, upserts contributors. Default additive mode preserves existing high counts (e.g. m3taversal.sourcer=1011 reflects Telegram-curator credit accumulated via a different code path that this fix does not touch). --reset flag for the destructive case. Dry-run preview (additive mode): - 670 NEW contributors to insert (mostly source-citation handles) - 77 EXISTING contributors with under-counted role columns - alexastrum: 0 → 6, thesensatore: 0 → 5, cameron-s1: 0 → 2 - astra.sourcer: 0 → 96, leo.sourcer: 0 → 44, theseus.sourcer: 0 → 18 - m3taversal.sourcer: 1011 (preserved, not 22 from file walk) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 12:48:41 +01:00
m3taversal	05d15cea56	feat(activity): Timeline data gaps — type filter + commit_type classifier + source_channel reshape Three hackathon-critical fixes for Timeline page rendering (Accelerate Solana, May 5): Gap 1 — /api/activity respects ?type= now: - accepts single or comma-separated operation types (extract|new|enrich|challenge|infra) - over-fetches 5× limit (capped 2000) so post-build filtering still fills the requested page size - unknown types filter out cleanly Gap 2 — classify_pr_operation() replaces STATUS_TO_OPERATION for merged PRs: - commit_type wins over branch prefix for merged PRs so extract/* branches with commit_type='enrich' or 'challenge' surface correctly (same gotcha as the contributor-role wiring fix) - priority: challenge → enrich (incl. reweave/) → maintenance (infra) → new - challenged_by detection carried over from activity_feed_api._classify_event - non-merged statuses unchanged (extract/new/infra/challenge as before) - SQL now selects commit_type + description alongside existing columns - 14 unit tests covering the gotcha matrix Gap 3 — _CHANNEL_MAP reshape: - extract/, ingestion/ default → 'unknown' (was 'telegram'; telegram-origin classification now requires explicit tagging at ingestion time) - agent/maintenance mappings unchanged - github_pr override and gh-pr-* branches continue to return 'github' - 'web' registered as the canonical in-app submission channel (matches the platform-named pattern established by telegram/github/agent) - module docstring enumerates all six valid channels Deployed to VPS; diagnostics + pipeline restarted clean. Smoke: type=enrich returns 22 events (was 0), type=challenge returns 0 (matches DB — zero challenge commit_types). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 19:51:58 +01:00
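
The Gap 1 behaviour (comma-separated `?type=` values, a 5× over-fetch capped at 2000, then filtering after events are built) might look roughly like this. The function names and the event dictionary shape are assumptions; only the multiplier, cap, and type names come from the commit message.

```python
def parse_types(raw):
    """Parse a ?type= value such as 'enrich,challenge' into a set, or None."""
    if not raw:
        return None  # no filter requested
    return {part.strip() for part in raw.split(",") if part.strip()}


def fetch_size(limit, types):
    """Over-fetch 5x (capped at 2000) when a post-build filter will run."""
    return limit if types is None else min(limit * 5, 2000)


def filter_events(events, types, limit):
    """Keep events whose type is requested, then trim to the page size."""
    if types is None:
        return events[:limit]
    return [event for event in events if event["type"] in types][:limit]


events = [{"type": "enrich"}, {"type": "new"}, {"type": "challenge"}, {"type": "enrich"}]
page = filter_events(events, parse_types("enrich,challenge"), 2)
```

An unknown type simply matches no events, so the response is an empty page rather than an error.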
m3taversal	cfcb06a6dc	fix(diagnostics): commit claims_api + register routes that were VPS-only Root cause (per Epi audit): - /api/claims, /api/contributors/list, /api/contributors/{handle} returned 404 in prod. The route registrations and claims_api.py module existed only on VPS — never committed. Today's auto-deploy of an unrelated app.py change rsync'd the repo (registration-less) version over the VPS edits, wiping endpoints Vercel depended on. - Recurrence of the deploy-without-commit pattern (blindspot #2). Brings repo to parity with the live, working VPS state: - Add diagnostics/claims_api.py (161 lines, was VPS-only) - Wire register_claims_routes + register_contributor_routes in app.py alongside the existing register_activity_feed call beliefs_routes.py is also VPS-only and currently unregistered (orphaned by the same Apr 21 manual edit that dropped its registration). Left out of this commit pending a decision on whether to revive or delete. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 12:51:17 +01:00
m3taversal	2f6424617b	feat: wire Timeline activity endpoint + surface source_channel /api/activity and /api/activity-feed were never registered in app.py — both files existed but neither route was reachable (confirmed 404 on VPS). Register both so Timeline and gamification feeds can consume them. Adds source_channel to /api/activity payload (both PR rows and audit events — audit rows return null since they aren't tied to a specific PR). Migration v22 already populated prs.source_channel on VPS with enum: telegram=2340, agent=698, maintenance=102, unknown=11, github=1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 12:22:12 +01:00
m3taversal	9a943e8460	feat: expose source_channel on activity feed Adds p.source_channel to the SELECT and surfaces it on each event. Migration v22 populated the column with enum values: telegram, agent, maintenance, unknown, github. Timeline UI needs this to show per-event provenance (2340 telegram, 698 agent, 102 maintenance, 11 unknown, 1 github). Nulls fall back to "unknown" — only 0 rows currently null, but the fallback is defensive for future inserts before backfill runs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 12:20:26 +01:00
m3taversal	84f6d3682c	fix(eval): treat empty diff as conservative fallback in auto-close gate Ganymede review nit: if get_pr_diff returns an empty string (edge case — Forgejo quirk, empty PR), the old `if diff is None` branch would miss it, the `elif diff and ...` would evaluate False (empty string is falsy), and control would fall to `else` — triggering auto-close on zero diff content. Change `if diff is None` → `if not diff` so empty string ALSO falls through to the conservative path. Matches the stated posture: skip auto-close when in doubt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 11:24:16 +01:00
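
The difference between `if diff is None` and `if not diff` can be shown with a simplified gate. This is a sketch only: `should_auto_close` is a hypothetical function, and the "new file mode" check stands in for the enrichment guard described in the auto-close commit.

```python
def should_auto_close(diff):
    """Decide whether the near-duplicate auto-close may proceed.

    `not diff` is True for both None (fetch failed or timed out) and ""
    (empty diff), so both take the conservative "do not close" path.
    """
    if not diff:
        return False
    # Only PRs that add new files qualify; enrichment-only diffs do not.
    return "new file mode" in diff


decisions = {
    "none": should_auto_close(None),
    "empty": should_auto_close(""),
    "new_file": should_auto_close("diff --git a/x b/x\nnew file mode 100644\n"),
    "enrich_only": should_auto_close("diff --git a/x b/x\n--- a/x\n+++ b/x\n"),
}
```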
m3taversal	33c17f87a8	feat(eval): auto-close near-duplicate PRs when merged sibling exists Prevents Apr 22 runaway-damage pattern (44 open PRs manually bulk-closed) where a source extracted 20+ times before the cooldown gate landed, each leaving an orphan 'open' PR after eval correctly rejected as near-duplicate. Gate fires in dispose_rejected_pr before attempt-count branches: all_issues == ["near_duplicate"] (exact match — compound carries signal) AND sibling PR exists with same source_path in status='merged' AND diff contains "new file mode" (not enrichment-only) → close on Forgejo + DB with audit, post explanation comment. Ganymede review — 5 must-fix/warnings applied + 1 must-add: - Exact match on single-issue near_duplicate (compound rejections preserved) - Enrichment guard via diff scan (eval_parse regex can flag enrichment prose) - 10s timeout on get_pr_diff — conservative fallback on Forgejo wedge - Forgejo comment with canned explanation (best-effort, try/except) - Partial index idx_prs_source_path + migration v23 - Explicit p1.source_path IS NOT NULL in WHERE Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 11:17:29 +01:00
m3taversal	a053a8ebf9	fix(backfill): don't regress terminal source statuses to unprocessed backfill-sources.py runs every 15 minutes and derives sources.status purely from directory location. If a source file is in inbox/queue/, it blindly overwrites the DB status to 'unprocessed' — even when the DB already had 'extracted' or 'null_result'. This is why the 43 zombies kept coming back after manual backfill: cron re-reset them every 15 minutes, then each 4h cooldown expiry re-triggered runaway extraction on the same source. Fix: never regress from a terminal status (extracted, null_result, error, ghost_no_file) to 'unprocessed'. File location is ambiguous (legitimately new vs. zombie from failed archive); DB is authoritative. Legitimate re-extraction still works — it goes through the needs_reextraction path which is unaffected by this gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 21:29:33 +01:00
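
The no-regression rule can be expressed as a small pure function. The terminal status names come from the commit message; `resolve_status` and its signature are hypothetical.

```python
TERMINAL_STATUSES = {"extracted", "null_result", "error", "ghost_no_file"}


def resolve_status(db_status, derived_status):
    """Pick the status to store, given the DB value and the directory-derived one.

    A terminal DB status is never overwritten with 'unprocessed', because a
    file sitting in the queue directory may be a zombie from a failed archive.
    """
    if derived_status == "unprocessed" and db_status in TERMINAL_STATUSES:
        return db_status
    return derived_status


kept = resolve_status("extracted", "unprocessed")
fresh = resolve_status(None, "unprocessed")
```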
m3taversal	97b590acd6	fix: close cooldown-dependence gaps in extract.py (Ganymede review) Three targeted fixes from Ganymede's review of commit `469cb7f`: BUG #1 — Success path now updates sources.status='extracting' before PR creation, so queue scan's DB-authoritative filter catches sources between PR creation and merge. Previously the cooldown gate was load-bearing for this window, not belt-and-suspenders as claimed. BUG #2 — Second null-result path (line 573, triggered when enrichments existed but all targets were missing in worktree) now updates DB. Without this, that path created no PR, no DB mark, and would have re-entered the runaway loop 4h later when the cooldown window expired. NIT #6 — 4h cooldown moved to config.EXTRACTION_COOLDOWN_HOURS. Tunable without code change. Log format now shows the configured hours. Also backfilled 59 pre-existing zombie queue-path rows where the file was already archived but DB status said 'unprocessed' — these would have leaked past the DB filter once the 4h cooldown expired. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 11:33:10 +01:00
m3taversal	469cb7f2da	fix: stop runaway re-extraction loop in extract.py Three changes reduce extraction cost and duplicate PR flood: 1. 4-hour cooldown gate — skip sources with ANY PR (merged/closed/open) created in the last 4h. Prevents same source re-extracting every 60s while archive step lags behind merge. 2. DB-authoritative status — sources.status is now updated in the pipeline DB at each extraction terminal point (null_result, success). Queue scan checks DB first so sources with failed archives (e.g., root-owned worktree files blocking git pull --rebase) don't get re-extracted forever. Also moves archival into the extraction branch so it goes through PR merge instead of a fragile separate main-worktree push. 3. source_channel wiring — extract.py PR INSERT now sets source_channel from classify_source_channel(branch). Previously daemon-created PRs had NULL source_channel, breaking Argus dashboard filters. Combined with Ship's in-branch archive refactor. Root incident: blockworks-metadao-strategic-reset.md extracted 31 times in 12 hours. Nine other sources hit 10-22 extractions each. Near-duplicate rejection rate jumped to 94%. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 11:19:30 +01:00
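
The cooldown gate in item 1 reduces to a timestamp comparison. This sketch assumes the caller has already looked up the creation time of the most recent PR for the source; `in_cooldown` is a hypothetical helper, and the 4-hour default mirrors the value stated in the commit.

```python
from datetime import datetime, timedelta, timezone

EXTRACTION_COOLDOWN_HOURS = 4  # value stated in the commit message


def in_cooldown(latest_pr_created_at, now, hours=EXTRACTION_COOLDOWN_HOURS):
    """Return True when any PR for the source was created within the window.

    `latest_pr_created_at` is the newest PR creation time for the source
    (any status), or None when the source has no PRs yet.
    """
    if latest_pr_created_at is None:
        return False
    return now - latest_pr_created_at < timedelta(hours=hours)


now = datetime(2026, 4, 22, 12, 0, tzinfo=timezone.utc)
recent = in_cooldown(now - timedelta(hours=1), now)
expired = in_cooldown(now - timedelta(hours=5), now)
```

A source with a PR from one hour ago is skipped, while one whose latest PR is five hours old becomes eligible again, which is why the DB-authoritative status in item 2 is still needed as the primary guard.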
m3taversal	8de28d6ee0	feat: bidirectional source↔claim linking Forward link: claims get `sourced_from: {domain}/{filename}` at extraction time. Reverse link: after merge, backlink_source_claims() updates source files with `claims_extracted:` list. All disk writes happen under async_main_worktree_lock. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 13:00:59 +01:00
m3taversal	05f375d775	feat: filter system accounts from leaderboard, add primary_ci field - SYSTEM_ACCOUNTS set excludes pipeline/unknown/teleo-agents from /api/contributors/list - primary_ci field: action_ci.total when available, else role-based ci_score - action_ci included in list endpoint for each contributor Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:33:47 +01:00
m3taversal	4101048cd0	feat: wire action-type CI into contributor profiles - contribution_scores table stores per-PR CI with action type - Profile endpoint returns action_ci alongside role-based ci_score - Branch-name attribution: contrib/NAME/ PRs attributed to NAME - Cameron now shows 0.32 CI + BELIEF MOVER badge from challenge - Handle variant matching (cameron-s1 → cameron) for cross-system lookup - Full historical backfill: 985 scores across 9 contributors Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:29:01 +01:00
m3taversal	af027d3ced	feat: add contributor profile API endpoint GET /api/contributors/{handle} — returns CI score, badges, domain breakdown, role percentages, contribution timeline, review stats. GET /api/contributors/list — leaderboard with min_claims filter. Git-log fallback for contributors not in pipeline.db (Cameron, Alex). Badge system: FOUNDING CONTRIBUTOR, BELIEF MOVER, KNOWLEDGE SOURCER, DOMAIN SPECIALIST, VETERAN, FIRST BLOOD. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:22:13 +01:00
m3taversal	1b27a2de31	feat: add /api/activity-feed endpoint with hot/recent/important sort Serves contribution events from pipeline.db. Classifies PRs as create/enrich/challenge, normalizes contributors, derives summaries from branch names when descriptions are empty. Hot sort uses challenge*3 + enrich*2 + signal / hours^1.5 decay from event time. Domain and contributor filters, pagination (limit/offset). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:16:46 +01:00
m3taversal	11e026448a	sync: dashboard_routes.py from VPS — digest + contributor-graph endpoints Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:01:15 +01:00
m3taversal	c3d0b1f5a4	feat: contributor graph PNG generator + API endpoint matplotlib chart with dual axes — cumulative claims (#00d4aa) and contributors (#7c3aed) on dark background. 1200x630 for Twitter. Auto-regenerates hourly via /api/contributor-graph endpoint. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:01:02 +01:00
m3taversal	88e8e15c6d	feat: add /api/digest/latest endpoint for scoring digest data Serves the latest scoring-digest-latest.json from cron output. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 10:55:24 +01:00
m3taversal	5463ca0b56	feat: add daily scoring digest with CREATE/ENRICH/CHALLENGE classification Classifies merged PRs by action type, scores with importance multiplier (confidence, domain maturity, connectivity bonus), updates contributor records, posts summary to Telegram, serves via /api/digest/latest. Cron: 7:07 UTC daily (8:07 AM London). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 10:55:13 +01:00
m3taversal	e043cf98dc	feat: add wiki-link audit script for codex graph integrity Crawls domains/foundations/core/decisions for [[wiki-links]], resolves against claim files, entities, maps, and agents. Reports dead links, orphans, and connectivity stats. Prerequisite for CI scoring connectivity bonus — broken links would inflate scores. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 10:46:55 +01:00
m3taversal	9c0be78620	fix: align CI role weights with contribution-architecture.md config.py had extractor-heavy weights (0.40) from initial bootstrap. Correct weights per approved architecture: challenger 0.35, synthesizer 0.25, reviewer 0.20, sourcer 0.15, extractor 0.05. backfill-ci.py already had correct weights; this fixes the live computation in health.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 10:37:47 +01:00
m3taversal	c29049924e	fix: wire commit_type into contributor role assignment The contributor attribution always recorded "extractor" regardless of the PR's refined commit_type. Added COMMIT_TYPE_TO_ROLE mapping and applied it in all three attribution paths (Pentagon-Agent trailer, git author fallback, PR agent fallback). Backfill script resets and re-derives role counts from prs.commit_type. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 10:27:36 +01:00
m3taversal	f463f49b46	fix: prevent false 'already up to date' on fork PRs with merge commits When a contributor merges main into their fork branch (standard GitHub workflow), merge-base equals main SHA, triggering the 'already up to date' early return. This closes the PR without cherry-picking the new content. Cameron's PR #3377 hit this exact bug. Fix: add a diff check before returning 'already up to date'. If the branch has actual content changes vs main, proceed to cherry-pick instead of short-circuiting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 22:41:14 +01:00
m3taversal	9505e5b40a	feat: add /api/contributor-growth endpoint + cumulative growth script Adds async git-log-based endpoint for cumulative contributor and claim tracking. 5-minute cache, excludes bot accounts, tags founding contributors. Standalone CLI script also included for ad-hoc data generation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 22:19:42 +01:00
m3taversal	f0cf772182	Merge remote-tracking branch 'origin/epimetheus/reduce-rejections'	2026-04-20 19:03:26 +01:00
m3taversal	4fc541c656	Skip liquidated entities in portfolio fetcher Ranger was liquidated — no point fetching empty data every cron run. Also purged 1,647 pre-Apr-20 snapshot rows (incomplete NAV data from data collection ramp-up, not actual market movement). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 18:55:04 +01:00
m3taversal	b7242d2206	Wire rejection_reason into review records + fix ingestion domain routing rejection_reason was always NULL in review_records — now populated with comma-joined issue tags (near_duplicate, frontmatter_schema, etc.) at both rejection call sites. Also fixes stale reviewer_model="gpt-4o" hardcoding to use config.EVAL_DOMAIN_MODEL (currently Gemini Flash). Ingestion branches (ingestion/futardio-, ingestion/metadao-) now resolve to internet-finance domain instead of falling through to "general". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 18:03:34 +01:00
m3taversal	12078c8707	Reduce near-duplicate and frontmatter schema rejections Near-duplicate (159+ rejections): - Add extract-time dedup gate: SequenceMatcher check before file write ($0) - Strengthen extraction prompt: high-similarity matches (>=0.75) get explicit "DO NOT extract, use enrichment instead" warning - Strip [[wiki link]] brackets from related_claims field Frontmatter schema (129+ rejections): - Normalize LLM confidence aliases (high→likely, medium→experimental, etc.) in both _build_claim_content and validate_schema - Strip code fences (```markdown/```yaml) from entity content in extract.py and from diff content in validate.py tier0.5 check - Code fences were root cause of "no_frontmatter" failures: parser sees ```markdown as first line, not --- Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 18:03:26 +01:00
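
The extract-time dedup gate can be sketched with `difflib.SequenceMatcher` from the standard library. The helper name, the lowercase normalization, and the reuse of the 0.75 threshold (which the commit states for the prompt warning, not necessarily for the gate) are assumptions.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.75  # assumed; the commit cites >=0.75 for the prompt warning


def is_near_duplicate(candidate_title, existing_titles, threshold=SIMILARITY_THRESHOLD):
    """Return True when the candidate is too similar to any existing title."""
    candidate = candidate_title.lower()
    return any(
        SequenceMatcher(None, candidate, title.lower()).ratio() >= threshold
        for title in existing_titles
    )


existing = ["futarchy improves governance decisions"]
duplicate = is_near_duplicate("Futarchy improves governance decisions", existing)
distinct = is_near_duplicate("zzz", ["aaa"])
```

Because the check is a local string comparison, it runs before any file write and without a model call, which matches the "($0)" cost note in the commit.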
m3taversal	7a753da68b	fix: auto-deploy.sh rsync excludes broken + add tests/ sync - Switch RSYNC_FLAGS string to RSYNC_OPTS bash array (same fix as deploy.sh in `368b579` — string passed literal quotes to rsync, matching nothing) - Add tests/ to rsync targets and syntax check glob for parity with deploy.sh - All 8 rsync calls now use "${RSYNC_OPTS[@]}" expansion Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:22:11 +01:00
m3taversal	febbc7da30	add rio and theseus telegram bot agent configs Two YAML files on VPS but not in repo. Agent identity, KB scope, and voice configs for the Telegram bots. No secrets (tokens reference file paths, not inline values). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:20:21 +01:00
m3taversal	368b5793d3	fix: deploy.sh rsync excludes were broken — quotes passed literally RSYNC_FLAGS as a string meant --exclude='__pycache__' passed literal quotes to rsync, matching nothing. Switched to bash array (RSYNC_OPTS) so excludes work correctly. __pycache__ and .pyc files no longer sync. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:17:31 +01:00
m3taversal	670c50f384	fix: add telegram/ and tests/ to deploy pipeline, remove hardcoded API key deploy.sh was missing telegram/ and tests/ directories — code existed in repo but never synced to VPS. Also removes hardcoded twitterapi.io key from x-ingest.py (reads from secrets file like all other modules). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:15:55 +01:00
m3taversal	a479ab533b	fix: add fetch_coins.py to auto-deploy loop + legacy migration comment - auto-deploy.sh: fetch_coins.py was missing from the root-level .py deploy loop (line 72). Only manual deploy.sh had it. Next cycle syncs it to VPS. - fetch_coins.py: document the ALTER TABLE loop as legacy migration for older DBs that predate the CREATE TABLE columns. Reviewed-by: Ganymede Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:06:35 +01:00
m3taversal	eac5d2f0d3	fix: add fetch_coins.py to deploy.sh deployment list fetch_coins.py was committed to repo root but deploy.sh only deployed teleo-pipeline.py and reweave.py. This meant bug fixes to fetch_coins would silently fail to reach VPS on deploy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:01:31 +01:00
m3taversal	5071ecef16	fix: apply Ganymede review fixes to portfolio code dashboard_portfolio.py: - datetime.utcnow() → datetime.now(timezone.utc) (deprecation fix) - days parameter validation with try/except + min(..., 365) on 2 endpoints fetch_coins.py: - isinstance(chain, str) guard prevents AttributeError on string chain values - Log when adjusted market cap differs from DexScreener value Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 17:00:02 +01:00
m3taversal	ddf3c25e88	sync VPS state: portfolio dashboard + fetch_coins.py Pull live app.py from VPS to close 243-line drift. Add portfolio dashboard (renamed from v2), portfolio nav link, and fetch_coins.py (daily cron script for ownership coin data). Delete stale lib/ copy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 16:55:36 +01:00
m3taversal	cde92d3db1	fix: wrap breaker calls in stage_loop to prevent permanent task death A transient DB lock in breaker.record_failure() inside an except handler killed the asyncio coroutine permanently — snapshot_cycle died Apr 18 and never recovered. All three breaker call sites now have their own try/except. Also includes HTML injection fix for github_feedback review_text. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 12:37:28 +01:00
m3taversal	83526bc90e	fix: quote YAML edge values containing colons, skip unparseable files in reweave merge Root cause of 84% reweave PR rejection rate: claim titles with colons (e.g., "COAL: Meta-PoW: The ORE Treasury Protocol") written as bare YAML list items, causing yaml.safe_load to fail during merge. Three changes: 1. frontmatter.py: _yaml_quote() wraps colon-containing values in double quotes 2. reweave.py: _write_edge_regex uses _yaml_quote for new edges 3. merge.py: skip individual files with parse failures instead of aborting entire PR Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-18 12:07:28 +01:00
m3taversal	ae860a1d06	fix: set execute bit on research-session.sh and install-hermes.sh Mode 100644 → 100755. Previous commit added the safety net but missed the actual git mode change due to staging order. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-18 11:54:39 +01:00
m3taversal	878f6e06e3	fix: restore execute bits on .sh files, add chmod safety net to auto-deploy research-session.sh and install-hermes.sh were committed with mode 100644 during repo reorganization (`d2aec7fe`). rsync -az preserved the non-executable mode, breaking all research agent cron jobs since Apr 15. Safety net in auto-deploy.sh ensures any future permission loss is auto-corrected. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-18 11:54:13 +01:00
m3taversal	ac794f5c68	Fix source_channel migration: add to SCHEMA_SQL, default 'unknown' not 'telegram' Ganymede review findings: 1. source_channel was missing from CREATE TABLE (fresh installs wouldn't have it) 2. Default fallback changed from 'telegram' to 'unknown' — unknown prefixes are genuinely unknown, not telegram 3. Cross-reference comments added between BRANCH_PREFIX_MAP and _CHANNEL_MAP Also wires classify_source_channel into merge.py PR discovery INSERT. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 13:27:15 +01:00
m3taversal	25a537d2e1	fix: divergence alerting — alert suppression bug + stale ref detection Bug: echo "alerted" ran regardless of curl success, permanently suppressing alerts on delivery failure. Fix: if/then/else wraps the state write. Warning: stale tracking refs after push steps caused false divergence. Fix: re-fetch both remotes before comparing. Both findings from Ganymede review of Step 6. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 11:10:32 +01:00
m3taversal	0f868aefab	Add GitHub PR feedback module and fix attribution for mirrored PRs github_feedback.py posts pipeline status to GitHub PRs at three touchpoints: discovery ack, eval review result, and merge/close outcome. Only fires for PRs with a github_pr link (set by sync-mirror.sh). All calls non-fatal. contributor.py: expanded git author fallback to scan all non-merge commits (was only checking last commit), added teleo-bot and github-actions[bot] to bot filter list. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:16:28 +01:00
m3taversal	13f21f7732	feat: external contributor pipeline — fork PR handling, attribution, prefix recognition - Mirror: fetch GitHub fork PR refs (refs/pull//head), push to Forgejo as gh-pr-N/branch - Mirror: fork PRs auto-create Forgejo PR with GitHub PR title, link github_pr in DB - db.py: add contrib + gh-pr- to classify_branch for external contributor branches - contributor.py: git commit author as attribution fallback (before branch agent) - contributor.py: skip bot/generic authors (m3taversal, teleo, pipeline) - Tests: fix fallback test for new git author path, add external contributor test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:14:01 +01:00
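Note on `12078c8707`: the extract-time dedup gate is described only as a "SequenceMatcher check before file write". A minimal sketch of what such a gate could look like follows. The function names, the title normalization, and reusing the 0.75 figure (which the commit message attaches to the extraction prompt, not necessarily to the gate) are assumptions for illustration, not the repository's actual code.

```python
from difflib import SequenceMatcher

# Assumption: reuse the 0.75 cutoff mentioned in the commit message.
SIMILARITY_THRESHOLD = 0.75


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two claim titles."""
    # Hypothetical normalization: case-insensitive, surrounding whitespace ignored.
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def find_near_duplicate(candidate, existing_titles, threshold=SIMILARITY_THRESHOLD):
    """Return the most similar existing title at or above threshold, else None."""
    best_title, best_score = None, 0.0
    for title in existing_titles:
        score = similarity(candidate, title)
        if score >= threshold and score > best_score:
            best_title, best_score = title, score
    return best_title


if __name__ == "__main__":
    existing = ["Futarchy improves DAO governance outcomes"]
    match = find_near_duplicate("Futarchy improves DAO governance", existing)
    if match is not None:
        # A caller would skip the file write here and enrich the existing claim.
        print(f"near-duplicate of: {match}")
```

Because the check is pure string comparison from the standard library, it makes no model call before the write, which is consistent with the "($0)" cost note in the commit message.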


180 commits