teleo/teleo-infrastructure

Author	SHA1	Message	Date
m3taversal	58fa8c5276	feat(attribution): Phase A — event-sourced contribution ledger (schema v24) Introduces contribution_events table + non-breaking double-write. Schema lands today, forward traffic writes events alongside existing count upserts, backfill script replays history. Phase B will add leaderboard API reading from events; Phase C switches Argus dashboard over. ## Schema v24 (lib/db.py) - contribution_events: one row per credit-earning event (id, handle, kind, role, weight, pr_number, claim_path, domain, channel, timestamp) Partial UNIQUE indexes handle SQLite's NULL != NULL semantics: idx_ce_unique_claim on (handle, role, pr_number, claim_path) WHERE claim_path NOT NULL idx_ce_unique_pr on (handle, role, pr_number) WHERE claim_path IS NULL PR-level events (evaluator, author, challenger, synthesizer) dedup on 3-tuple. Per-claim events (originator) dedup on 4-tuple. Idempotent on replay. - contributor_aliases: canonical handle mapping Seeded: @thesensatore → thesensatore, cameron → cameron-s1 - contributors.kind TEXT DEFAULT 'person' Migration seeds 'agent' for known Pentagon agent handles. ## Role model (confirmed by Cory Apr 24) Weights: author 0.30, challenger 0.25, synthesizer 0.20, originator 0.15, evaluator 0.05 - author: human who submitted the PR (curation + submission work) - originator: person who authored the underlying content (rewards external creators) - challenger: agent/person who brought a productive disagreement - synthesizer: cross-domain work (enrichments, research sessions) - evaluator: reviewer who approved (Leo + domain agent) Humans-are-always-author: agents credit is capped at evaluator/synthesizer/ challenger. Pentagon agents classify as kind='agent' and surface in the agent-view leaderboard, not the default person view.
## Writer (lib/contributor.py) - New insert_contribution_event(): idempotent INSERT OR IGNORE with alias normalization + kind classification. Falls back silently on pre-v24 DBs. - record_contributor_attribution double-writes alongside existing upsert_contributor calls. Zero risk to current dashboard. - Author event: emitted once per PR from prs.submitted_by → git author → agent-branch-prefix. - Originator events: emitted per claim from frontmatter sourcer, skipping when sourcer == author (avoids self-credit double-count). - Evaluator events: Leo (always when leo_verdict='approve') + domain_agent (when domain_verdict='approve' and not Leo). - Challenger/Synthesizer: emitted from Pentagon-Agent trailer on agent-owned branches (theseus/, rio/, etc.) based on commit_type. Pipeline-owned branches (extract/, reweave/) get no trailer-based event — infrastructure work isn't contribution credit. ## Helpers (lib/attribution.py) - normalize_handle(raw, conn=None): lowercase + strip @ + alias lookup - classify_kind(handle): returns 'agent' for PENTAGON_AGENTS, else 'person' Intentionally narrow. Orgs get classified by operator review, not heuristics. ## Backfill (scripts/backfill-events.py) Replays all merged PRs into events. Idempotent (safe to re-run). Emits: - PR-level: author, evaluator, challenger, synthesizer - Per-claim: originator (walks knowledge tree, matches via description titles) Known limitation: post-merge PR branches are deleted from Forgejo, so we can't diff them for granular per-claim events. Claim→PR mapping uses prs.description (pipe-separated titles). Misses some edge cases but recovers the bulk of historical originator credit. Forward traffic gets clean per-claim events via the normal record_contributor_attribution path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 13:59:22 +01:00
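The dedup scheme described in commit 58fa8c5276 (two partial UNIQUE indexes plus `INSERT OR IGNORE`) can be sketched as follows. This is a minimal illustration and not the code in lib/db.py: the table is reduced to four of the listed columns, and `make_db` and `insert_event` are hypothetical helper names. Only the table name, the index names, and the indexed column sets come from the commit message.

```python
import sqlite3


def make_db():
    """Create an in-memory DB with a reduced contribution_events table (assumed subset of columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE contribution_events (
            id INTEGER PRIMARY KEY,
            handle TEXT NOT NULL,
            role TEXT NOT NULL,
            pr_number INTEGER,
            claim_path TEXT
        );
        -- Per-claim events dedup on the 4-tuple.
        CREATE UNIQUE INDEX idx_ce_unique_claim
            ON contribution_events (handle, role, pr_number, claim_path)
            WHERE claim_path IS NOT NULL;
        -- PR-level events dedup on the 3-tuple. A plain UNIQUE index over
        -- all four columns would not catch these, because SQLite treats
        -- NULL claim_path values as distinct from one another.
        CREATE UNIQUE INDEX idx_ce_unique_pr
            ON contribution_events (handle, role, pr_number)
            WHERE claim_path IS NULL;
        """
    )
    return conn


def insert_event(conn, handle, role, pr_number, claim_path=None):
    """Hypothetical writer: a replayed event hits a unique index and is silently ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO contribution_events "
        "(handle, role, pr_number, claim_path) VALUES (?, ?, ?, ?)",
        (handle, role, pr_number, claim_path),
    )


def count_events(conn):
    """Return the number of stored events."""
    return conn.execute("SELECT COUNT(*) FROM contribution_events").fetchone()[0]
```

Under these assumptions, replaying the same PR-level or per-claim event leaves the row count unchanged, which is what makes a backfill safe to re-run.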
m3taversal	3fe0f4b744	fix(attribution): credit sourcer/extractor from claim frontmatter Three layers of contributor-attribution bug surfaced by Apr 24 leaderboard investigation. alexastrum, thesensatore, cameron-s1 all had real merged contributions but zero credit in the contributors table. 1. lib/attribution.py: parse_attribution() only read `attribution_sourcer:` prefix-keyed flat fields. ~42% of claim files (535/1280) use the bare-key form `sourcer: alexastrum` written by extract.py. Added bare-key handling between the prefixed-flat path and the legacy-source-field fallback. Block format (`attribution: { sourcer: [...] }`) still wins when present. 2. lib/contributor.py: record_contributor_attribution() parsed the diff text with regex looking for `+- handle: "X"` lines. This matched neither the bare-key flat format nor the `attribution: { sourcer: [...] }` block format Leo uses for manual extractions. Replaced the regex parser with a file walker that calls attribution.parse_attribution_from_file() on each changed knowledge file — single source of truth for both formats. 3. scripts/backfill-sourcer-attribution.py: walks all merged knowledge files, re-attributes via the canonical parser, upserts contributors. Default additive mode preserves existing high counts (e.g. m3taversal.sourcer=1011 reflects Telegram-curator credit accumulated via a different code path that this fix does not touch). --reset flag for the destructive case. Dry-run preview (additive mode): - 670 NEW contributors to insert (mostly source-citation handles) - 77 EXISTING contributors with under-counted role columns - alexastrum: 0 → 6, thesensatore: 0 → 5, cameron-s1: 0 → 2 - astra.sourcer: 0 → 96, leo.sourcer: 0 → 44, theseus.sourcer: 0 → 18 - m3taversal.sourcer: 1011 (preserved, not 22 from file walk) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 12:48:41 +01:00
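The format precedence described in commit 3fe0f4b744 (block format first, then the prefixed flat key, then the bare key) can be sketched like this. It assumes the frontmatter has already been parsed into a dict; `resolve_sourcers` and `_as_list` are hypothetical names, not the functions in lib/attribution.py, and the legacy source-field fallback is left out because the commit message does not describe how it parses.

```python
def _as_list(value):
    """Normalize a scalar or list frontmatter value to a list of handles."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def resolve_sourcers(frontmatter):
    """Hypothetical resolver: pick sourcer handles using the precedence from the commit message."""
    # 1. Block format: attribution: { sourcer: [...] } wins when present.
    block = frontmatter.get("attribution")
    if isinstance(block, dict) and block.get("sourcer"):
        return _as_list(block["sourcer"])
    # 2. Prefixed flat field: attribution_sourcer: handle
    if frontmatter.get("attribution_sourcer"):
        return _as_list(frontmatter["attribution_sourcer"])
    # 3. Bare key: sourcer: handle (the form the commit says extract.py writes)
    if frontmatter.get("sourcer"):
        return _as_list(frontmatter["sourcer"])
    # The legacy source-field fallback would go here; omitted in this sketch.
    return []
```

For example, a file carrying only `sourcer: alexastrum` resolves to `["alexastrum"]` in this sketch, while a file that also has an `attribution` block uses the block's handles.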
m3taversal	a053a8ebf9	fix(backfill): don't regress terminal source statuses to unprocessed backfill-sources.py runs every 15 minutes and derives sources.status purely from directory location. If a source file is in inbox/queue/, it blindly overwrites the DB status to 'unprocessed' — even when the DB already had 'extracted' or 'null_result'. This is why the 43 zombies kept coming back after manual backfill: cron re-reset them every 15 minutes, then each 4h cooldown expiry re-triggered runaway extraction on the same source. Fix: never regress from a terminal status (extracted, null_result, error, ghost_no_file) to 'unprocessed'. File location is ambiguous (legitimately new vs. zombie from failed archive); DB is authoritative. Legitimate re-extraction still works — it goes through the needs_reextraction path which is unaffected by this gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 21:29:33 +01:00
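The gate described in commit a053a8ebf9 amounts to a small status-resolution rule. The sketch below shows one way to express it. The four terminal status names come from the commit message, while `TERMINAL_STATUSES` and `resolve_status` are hypothetical names and not the code in backfill-sources.py.

```python
# Terminal statuses listed in the commit message.
TERMINAL_STATUSES = frozenset({"extracted", "null_result", "error", "ghost_no_file"})


def resolve_status(db_status, derived_status):
    """Return the status to store, given the DB value and the directory-derived value.

    The DB is treated as authoritative: a terminal status is never
    overwritten with 'unprocessed', even if the file sits in inbox/queue/.
    """
    if derived_status == "unprocessed" and db_status in TERMINAL_STATUSES:
        return db_status
    return derived_status
```

With this rule a source already marked `extracted` stays `extracted` on every cron pass, while a source with no prior status still becomes `unprocessed`. Re-extraction is assumed to go through the separate needs_reextraction path named in the commit, which this sketch does not model.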
m3taversal	4101048cd0	feat: wire action-type CI into contributor profiles - contribution_scores table stores per-PR CI with action type - Profile endpoint returns action_ci alongside role-based ci_score - Branch-name attribution: contrib/NAME/ PRs attributed to NAME - Cameron now shows 0.32 CI + BELIEF MOVER badge from challenge - Handle variant matching (cameron-s1 → cameron) for cross-system lookup - Full historical backfill: 985 scores across 9 contributors Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:29:01 +01:00
m3taversal	c3d0b1f5a4	feat: contributor graph PNG generator + API endpoint matplotlib chart with dual axes — cumulative claims (#00d4aa) and contributors (#7c3aed) on dark background. 1200x630 for Twitter. Auto-regenerates hourly via /api/contributor-graph endpoint. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:01:02 +01:00
m3taversal	5463ca0b56	feat: add daily scoring digest with CREATE/ENRICH/CHALLENGE classification Classifies merged PRs by action type, scores with importance multiplier (confidence, domain maturity, connectivity bonus), updates contributor records, posts summary to Telegram, serves via /api/digest/latest. Cron: 7:07 UTC daily (8:07 AM London). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 10:55:13 +01:00
m3taversal	e043cf98dc	feat: add wiki-link audit script for codex graph integrity Crawls domains/foundations/core/decisions for [[wiki-links]], resolves against claim files, entities, maps, and agents. Reports dead links, orphans, and connectivity stats. Prerequisite for CI scoring connectivity bonus — broken links would inflate scores. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 10:46:55 +01:00
m3taversal	9505e5b40a	feat: add /api/contributor-growth endpoint + cumulative growth script Adds async git-log-based endpoint for cumulative contributor and claim tracking. 5-minute cache, excludes bot accounts, tags founding contributors. Standalone CLI script also included for ad-hoc data generation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 22:19:42 +01:00
m3taversal	22b6ebb6f6	fix: lower reweave threshold 0.70→0.55, increase batch 50→200 Orphan ratio at 39.6% (443/1118 claims) vs <15% target. Root cause: reweave threshold 0.70 too strict for text-embedding-3-small — 56% of orphans found "no neighbors." At 0.55, dry-run shows 0% no-neighbor skips. Batch size 200 clears backlog in ~3-4 nights at ~$0.20/run. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 14:18:50 +01:00
m3taversal	81afcd319f	fix: sync all code from VPS — repo is now authoritative source of truth 24 files: 8 pipeline lib modules, 6 diagnostics updates, 4 new diagnostics modules, telegram bot fix, 5 active operational scripts. Key changes: - Security: SQL injection prevention (alerting.py), SSL verification (review_queue.py), path traversal guard (extract.py) - Cost tracking: per-PR cost accumulation in evaluate.py - Auto-recovery: watchdog tier0 reset with retry cap + cooldown - Extraction: structured edge fields, post-write vector connection - New modules: vitality, research_tracking, research_routes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 13:18:01 +01:00
m3taversal	d2aec7fee3	feat: reorganize repo with clear directory boundaries and agent ownership Move scattered root-level files into categorized directories: - deploy/ — deployment + mirror scripts (Ship) - scripts/ — one-off backfills + migrations (Ship) - research/ — nightly research + prompts (Ship) - docs/ — all operational documentation (shared) Delete 3 dead cron scripts replaced by pipeline daemon: - batch-extract-50.sh, evaluate-trigger.sh, extract-cron.sh Add CODEOWNERS mapping every path to its owning agent. Add README with directory structure, ownership table, and VPS layout. Update deploy.sh paths to match new structure. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 18:20:13 +01:00

11 commits