teleo-infrastructure/lib
m3taversal 58fa8c5276
Some checks are pending
CI / lint-and-test (push) Waiting to run
feat(attribution): Phase A — event-sourced contribution ledger (schema v24)
Introduces contribution_events table + non-breaking double-write. Schema
lands today, forward traffic writes events alongside existing count upserts,
backfill script replays history. Phase B will add leaderboard API reading
from events; Phase C switches Argus dashboard over.

## Schema v24 (lib/db.py)

- contribution_events: one row per credit-earning event
  (id, handle, kind, role, weight, pr_number, claim_path, domain, channel, timestamp)
  Partial UNIQUE indexes handle SQLite's NULL != NULL semantics:
    idx_ce_unique_claim on (handle, role, pr_number, claim_path) WHERE claim_path NOT NULL
    idx_ce_unique_pr    on (handle, role, pr_number)             WHERE claim_path IS NULL
  PR-level events (evaluator, author, challenger, synthesizer) dedup on 3-tuple.
  Per-claim events (originator) dedup on 4-tuple. Idempotent on replay.
- contributor_aliases: canonical handle mapping
  Seeded: @thesensatore → thesensatore, cameron → cameron-s1
- contributors.kind TEXT DEFAULT 'person'
  Migration seeds 'agent' for known Pentagon agent handles.

## Role model (confirmed by Cory Apr 24)

Weights: author 0.30, challenger 0.25, synthesizer 0.20, originator 0.15, evaluator 0.05
- author:     human who submitted the PR (curation + submission work)
- originator: person who authored the underlying content (rewards external creators)
- challenger: agent/person who brought a productive disagreement
- synthesizer: cross-domain work (enrichments, research sessions)
- evaluator:  reviewer who approved (Leo + domain agent)

Humans-are-always-author: agents credit is capped at evaluator/synthesizer/
challenger. Pentagon agents classify as kind='agent' and surface in the
agent-view leaderboard, not the default person view.

## Writer (lib/contributor.py)

- New insert_contribution_event(): idempotent INSERT OR IGNORE with alias
  normalization + kind classification. Falls back silently on pre-v24 DBs.
- record_contributor_attribution double-writes alongside existing
  upsert_contributor calls. Zero risk to current dashboard.
- Author event: emitted once per PR from prs.submitted_by → git author →
  agent-branch-prefix.
- Originator events: emitted per claim from frontmatter sourcer, skipping
  when sourcer == author (avoids self-credit double-count).
- Evaluator events: Leo (always when leo_verdict='approve') + domain_agent
  (when domain_verdict='approve' and not Leo).
- Challenger/Synthesizer: emitted from Pentagon-Agent trailer on
  agent-owned branches (theseus/*, rio/*, etc.) based on commit_type.
  Pipeline-owned branches (extract/*, reweave/*) get no trailer-based event —
  infrastructure work isn't contribution credit.

## Helpers (lib/attribution.py)

- normalize_handle(raw, conn=None): lowercase + strip @ + alias lookup
- classify_kind(handle): returns 'agent' for PENTAGON_AGENTS, else 'person'
  Intentionally narrow. Orgs get classified by operator review, not heuristics.

## Backfill (scripts/backfill-events.py)

Replays all merged PRs into events. Idempotent (safe to re-run). Emits:
- PR-level: author, evaluator, challenger, synthesizer
- Per-claim: originator (walks knowledge tree, matches via description titles)

Known limitation: post-merge PR branches are deleted from Forgejo, so we
can't diff them for granular per-claim events. Claim→PR mapping uses
prs.description (pipe-separated titles). Misses some edge cases but
recovers the bulk of historical originator credit. Forward traffic gets
clean per-claim events via the normal record_contributor_attribution path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:59:22 +01:00
..
__init__.py Initial commit: Pipeline v2 daemon + infrastructure docs 2026-03-12 14:11:18 +00:00
analytics.py epimetheus: sync VPS-deployed code to repo — Mar 18-20 reliability + features 2026-03-20 20:17:27 +00:00
attribution.py feat(attribution): Phase A — event-sourced contribution ledger (schema v24) 2026-04-24 13:59:22 +01:00
breaker.py ganymede: add dev infrastructure — pyproject.toml, CI, deploy script 2026-03-13 14:24:27 +00:00
cascade.py fix: sync all code from VPS — repo is now authoritative source of truth 2026-04-15 13:18:01 +01:00
claim_index.py epimetheus: sync VPS-deployed code to repo — Mar 18-20 reliability + features 2026-03-20 20:17:27 +00:00
config.py fix: close cooldown-dependence gaps in extract.py (Ganymede review) 2026-04-22 11:33:10 +01:00
connect.py fix: sync all code from VPS — repo is now authoritative source of truth 2026-04-15 13:18:01 +01:00
contributor.py feat(attribution): Phase A — event-sourced contribution ledger (schema v24) 2026-04-24 13:59:22 +01:00
costs.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
cross_domain.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
db.py feat(attribution): Phase A — event-sourced contribution ledger (schema v24) 2026-04-24 13:59:22 +01:00
dedup.py fix: enrichment idempotency — three-layer dedup prevents duplicate evidence blocks 2026-03-31 13:18:23 +01:00
digest.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
domains.py Wire rejection_reason into review records + fix ingestion domain routing 2026-04-20 18:03:34 +01:00
entity_batch.py fix: enrichment idempotency — three-layer dedup prevents duplicate evidence blocks 2026-03-31 13:18:23 +01:00
entity_queue.py epimetheus: sync VPS-deployed code to repo — Mar 18-20 reliability + features 2026-03-20 20:17:27 +00:00
eval_actions.py fix(eval): treat empty diff as conservative fallback in auto-close gate 2026-04-23 11:24:16 +01:00
eval_parse.py refactor: Phase 4 — extract eval_actions.py, drop underscore prefixes in eval_parse 2026-04-16 12:57:51 +01:00
evaluate.py Wire rejection_reason into review records + fix ingestion domain routing 2026-04-20 18:03:34 +01:00
extract.py fix: close cooldown-dependence gaps in extract.py (Ganymede review) 2026-04-22 11:33:10 +01:00
extraction_prompt.py Reduce near-duplicate and frontmatter schema rejections 2026-04-20 18:03:26 +01:00
feedback.py epimetheus: sync VPS-deployed code to repo — Mar 18-20 reliability + features 2026-03-20 20:17:27 +00:00
fixer.py refactor: Phase 2 — wire pr_state into fixer.py and substantive_fixer.py 2026-04-16 12:21:40 +01:00
forgejo.py leo: handle non-JSON 200 from Forgejo merge API 2026-03-13 17:38:00 +00:00
frontmatter.py fix: quote YAML edge values containing colons, skip unparseable files in reweave merge 2026-04-18 12:07:28 +01:00
github_feedback.py fix: wrap breaker calls in stage_loop to prevent permanent task death 2026-04-20 12:37:28 +01:00
health.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
llm.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
log.py Initial commit: Pipeline v2 daemon + infrastructure docs 2026-03-12 14:11:18 +00:00
merge.py feat: bidirectional source↔claim linking 2026-04-21 13:00:59 +01:00
post_extract.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
post_merge.py feat: bidirectional source↔claim linking 2026-04-21 13:00:59 +01:00
pr_state.py refactor: Phase 3 — fix close_pr ghost bug, wire stale_pr, extract eval_parse 2026-04-16 12:40:23 +01:00
pre_screen.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
search.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
stale_pr.py refactor: Phase 3 — fix close_pr ghost bug, wire stale_pr, extract eval_parse 2026-04-16 12:40:23 +01:00
substantive_fixer.py refactor: Phase 2 — wire pr_state into fixer.py and substantive_fixer.py 2026-04-16 12:21:40 +01:00
validate.py Reduce near-duplicate and frontmatter schema rejections 2026-04-20 18:03:26 +01:00
watchdog.py fix: sync all code from VPS — repo is now authoritative source of truth 2026-04-15 13:18:01 +01:00
worktree_lock.py epimetheus: sync VPS-deployed code to repo — Mar 18-20 reliability + features 2026-03-20 20:17:27 +00:00