teleo-infrastructure/scripts
m3taversal 3fe0f4b744 fix(attribution): credit sourcer/extractor from claim frontmatter
Three layers of contributor-attribution bug surfaced by Apr 24 leaderboard
investigation. alexastrum, thesensatore, cameron-s1 all had real merged
contributions but zero credit in the contributors table.

1. lib/attribution.py: parse_attribution() only read `attribution_sourcer:`
   prefix-keyed flat fields. ~42% of claim files (535/1280) use the bare-key
   form `sourcer: alexastrum` written by extract.py. Added bare-key handling
   between the prefixed-flat path and the legacy-source-field fallback.
   Block format (`attribution: { sourcer: [...] }`) still wins when present.

2. lib/contributor.py: record_contributor_attribution() parsed the diff text
   with regex looking for `+- handle: "X"` lines. This matched neither the
   bare-key flat format nor the `attribution: { sourcer: [...] }` block
   format Leo uses for manual extractions. Replaced the regex parser with
   a file walker that calls attribution.parse_attribution_from_file() on
   each changed knowledge file — single source of truth for both formats.

3. scripts/backfill-sourcer-attribution.py: walks all merged knowledge files,
   re-attributes via the canonical parser, upserts contributors. Default
   additive mode preserves existing high counts (e.g. m3taversal.sourcer=1011
   reflects Telegram-curator credit accumulated via a different code path
   that this fix does not touch). --reset flag for the destructive case.

Dry-run preview (additive mode):
  - 670 NEW contributors to insert (mostly source-citation handles)
  - 77 EXISTING contributors with under-counted role columns
  - alexastrum: 0 → 6, thesensatore: 0 → 5, cameron-s1: 0 → 2
  - astra.sourcer: 0 → 96, leo.sourcer: 0 → 44, theseus.sourcer: 0 → 18
  - m3taversal.sourcer: 1011 (preserved, not 22 from file walk)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 12:48:41 +01:00
..
audit-wiki-links.py feat: add wiki-link audit script for codex graph integrity 2026-04-21 10:46:55 +01:00
backfill-ci.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
backfill-descriptions.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
backfill-domains.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
backfill-reviewer-count.py fix: sync all code from VPS — repo is now authoritative source of truth 2026-04-15 13:18:01 +01:00
backfill-source-authors.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
backfill-sourcer-attribution.py fix(attribution): credit sourcer/extractor from claim frontmatter 2026-04-24 12:48:41 +01:00
backfill-sources.py fix(backfill): don't regress terminal source statuses to unprocessed 2026-04-22 21:29:33 +01:00
bootstrap-contributors.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
contributor-graph.py feat: contributor graph PNG generator + API endpoint 2026-04-21 11:01:02 +01:00
cumulative-growth.py feat: add /api/contributor-growth endpoint + cumulative growth script 2026-04-20 22:19:42 +01:00
embed-claims.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
extract-decisions.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
extract-graph-data.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
migrate-entity-schema.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
migrate-source-archive.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
nightly-reweave.sh fix: lower reweave threshold 0.70→0.55, increase batch 50→200 2026-04-16 14:18:50 +01:00
openrouter-extract-v2.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
reconcile-source-status.sh feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
reconcile-sources.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
scoring_digest.py feat: wire action-type CI into contributor profiles 2026-04-21 11:29:01 +01:00
tier0-gate.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
vector-gc.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00