teleo-infrastructure/scripts
Teleo Agents 74bf0461e8
fix(attribution): canonicalize submitted_by at write time + historical normalizer
Companion write-side fix to fix/activity-feed-canonical-handle.

The activity-feed canonicalization was a read-side guard. The root cause
is that extract.py and two backfill scripts write decorated strings
(Vida (self-directed), pipeline (reweave), @m3taversal) into
prs.submitted_by and sources.submitted_by. Downstream readers
(lib.contributor.insert_contribution_event, scripts/scoring_digest,
diagnostics/activity_feed_api) all strip the decorator on read, but
anything that reads the column verbatim (as /api/activity-feed did before
the read-side fix) gets a 404 on /contributors/{decorated-handle}.

Stop writing the decorator. The self-directed signal is already carried
by intake_tier == research-task plus the prs.agent column; the suffix
is redundant string noise that costs us correctness at every consumer
that forgets to strip.

Changes:

- lib/extract.py:690 — write canonical handle via attribution.normalize_handle.
  Direct elif for intake_tier == research-task now stores just agent_name.
  @m3taversal -> m3taversal.

- diagnostics/backfill_submitted_by.py — same fix in two branches plus
  the reweave branch (pipeline (reweave) -> pipeline).

- scripts/backfill-research-session-attribution.py — UPDATE prs sets
  agent handle alone, no suffix. Docstring + log line updated.

- scripts/normalize-submitted-by.py (new) — one-time backfill that
  canonicalizes existing prs.submitted_by and sources.submitted_by rows.
  Strips trailing parenthetical decorators, lowercases, drops @. Defaults
  to dry-run; --apply to commit. Skips rows that would normalize to
  invalid handles (no garbage falls through silently).
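
A minimal sketch of the normalization rule described above (strip a trailing
parenthetical decorator, drop @, lowercase, reject invalid results). The
function name canonicalize and the validity pattern are assumptions made for
illustration; the actual attribution.normalize_handle may differ.

```python
import re

# Hypothetical validity rule (an assumption, not taken from the repo):
# lowercase letters, digits, hyphen, underscore; must start with a letter or digit.
_HANDLE_RE = re.compile(r"[a-z0-9][a-z0-9_-]*")

# One trailing parenthetical decorator such as " (self-directed)" or " (reweave)".
_DECORATOR_RE = re.compile(r"\s*\([^()]*\)\s*$")


def canonicalize(raw):
    """Return the canonical handle for raw, or None if the result is invalid."""
    if raw is None:
        return None
    handle = _DECORATOR_RE.sub("", raw.strip())  # drop the trailing "(...)"
    handle = handle.lstrip("@").lower()          # drop leading "@", then lowercase
    return handle if _HANDLE_RE.fullmatch(handle) else None


if __name__ == "__main__":
    for raw in ("Vida (self-directed)", "pipeline (reweave)", "@m3taversal"):
        print(repr(raw), "->", repr(canonicalize(raw)))
```

Returning None instead of a best-effort string lets the caller skip and report
a row rather than write a value that is not a usable handle.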

Dry-run against live pipeline.db:
  prs:     3008 rows need normalization (clean mappings, 0 invalid)
  sources: 730 rows need normalization (clean mappings, 0 invalid)
  Total:   3738 rows. All map to existing handle column values.
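
The dry-run/apply split can be sketched roughly as follows. This is an
illustration only: normalize_table and the compact canonicalize stand-in are
hypothetical names, and the only schema detail taken from this commit is the
submitted_by column on prs and sources. The use of SQLite's implicit rowid is
an assumption.

```python
import re
import sqlite3


def canonicalize(raw):
    # Stand-in for the real normalizer: strip a trailing "(...)",
    # drop a leading "@", lowercase, and reject anything else as invalid.
    handle = re.sub(r"\s*\([^()]*\)\s*$", "", raw.strip()).lstrip("@").lower()
    return handle if re.fullmatch(r"[a-z0-9][a-z0-9_-]*", handle) else None


def normalize_table(conn, table, apply=False):
    """Count rows needing normalization; rewrite them only when apply is True.

    Returns (changed, invalid). Invalid rows are counted and left untouched.
    """
    if table not in ("prs", "sources"):  # table names cannot be bound as parameters
        raise ValueError(f"unexpected table: {table}")
    rows = conn.execute(
        f"SELECT rowid, submitted_by FROM {table} WHERE submitted_by IS NOT NULL"
    ).fetchall()
    changed = invalid = 0
    for rowid, raw in rows:
        canonical = canonicalize(raw)
        if canonical is None:
            invalid += 1          # would normalize to an invalid handle: skip
            continue
        if canonical == raw:
            continue              # already canonical
        changed += 1
        if apply:
            conn.execute(
                f"UPDATE {table} SET submitted_by = ? WHERE rowid = ?",
                (canonical, rowid),
            )
    if apply:
        conn.commit()
    return changed, invalid
```

Calling normalize_table(conn, "prs") reports counts without touching data;
passing apply=True performs the same scan and commits the updates.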

After this lands and auto-deploys, the operator should run
  python3 scripts/normalize-submitted-by.py --apply
once to clean historical rows. The read-side canonicalization in
diagnostics/activity_feed_api.py (fix/activity-feed-canonical-handle)
then becomes defense-in-depth rather than load-bearing.

No KB writes.
2026-05-13 02:56:50 +00:00
audit-wiki-links.py feat: add wiki-link audit script for codex graph integrity 2026-04-21 10:46:55 +01:00
backfill-ci.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
backfill-descriptions.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
backfill-domains.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
backfill-events.py fix(backfill): normalize commit_date via datetime() in time-proximity query 2026-04-24 16:16:03 +01:00
backfill-research-session-attribution.py fix(attribution): canonicalize submitted_by at write time + historical normalizer 2026-05-13 02:56:50 +00:00
backfill-reviewer-count.py fix: sync all code from VPS — repo is now authoritative source of truth 2026-04-15 13:18:01 +01:00
backfill-source-authors.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
backfill-sourcer-attribution.py fix(attribution): credit sourcer/extractor from claim frontmatter 2026-04-24 12:48:41 +01:00
backfill-sources.py fix(backfill): don't regress terminal source statuses to unprocessed 2026-04-22 21:29:33 +01:00
backfill-synthetic-recovery-prs.py fix(backfill): Ganymede review — fix tautological guard + origin='human' 2026-04-24 16:49:12 +01:00
bootstrap-contributors.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
classify-contributors.py fix(classify): Ganymede review fixes — alias cleanup + counter accuracy + handle alignment 2026-04-24 20:47:21 +01:00
contributor-graph.py feat: contributor graph PNG generator + API endpoint 2026-04-21 11:01:02 +01:00
cumulative-growth.py feat: add /api/contributor-growth endpoint + cumulative growth script 2026-04-20 22:19:42 +01:00
embed-claims.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
extract-decisions.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
extract-graph-data.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
migrate-entity-schema.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
migrate-source-archive.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
nightly-reweave.sh fix: lower reweave threshold 0.70→0.55, increase batch 50→200 2026-04-16 14:18:50 +01:00
normalize-submitted-by.py fix(attribution): canonicalize submitted_by at write time + historical normalizer 2026-05-13 02:56:50 +00:00
openrouter-extract-v2.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
reconcile-source-status.sh feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
reconcile-sources.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
reset-m3taversal-sourcer.py fix(tests): apply Ganymede review nits + add m3taversal reset script 2026-04-27 17:35:18 +01:00
scoring_digest.py feat: wire action-type CI into contributor profiles 2026-04-21 11:29:01 +01:00
tier0-gate.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00
vector-gc.py feat: reorganize repo with clear directory boundaries and agent ownership 2026-04-14 18:20:13 +01:00