|
Companion / write-side fix to fix/activity-feed-canonical-handle.

The activity-feed canonicalization was a read-side guard. The bug at the source is that extract.py and two backfill scripts write decorated strings (Vida (self-directed), pipeline (reweave), @m3taversal) into prs.submitted_by and sources.submitted_by. Downstream readers (lib.contributor.insert_contribution_event, scripts/scoring_digest, diagnostics/activity_feed_api) all strip the decorator on read, but anything that reads the column verbatim (like /api/activity-feed before the read-side fix) 404s on /contributors/{decorated-handle}.

Stop writing the decorator. The self-directed signal is already carried by intake_tier == research-task plus the prs.agent column; the suffix is redundant string noise that costs us correctness at every consumer that forgets to strip.

Changes:
- lib/extract.py:690: write canonical handle via attribution.normalize_handle. Direct elif for intake_tier == research-task now stores just agent_name. @m3taversal -> m3taversal.
- diagnostics/backfill_submitted_by.py: same fix in two branches plus the reweave branch (pipeline (reweave) -> pipeline).
- scripts/backfill-research-session-attribution.py: UPDATE prs sets agent handle alone, no suffix. Docstring + log line updated.
- scripts/normalize-submitted-by.py (new): one-time backfill that canonicalizes existing prs.submitted_by and sources.submitted_by rows. Strips trailing parenthetical decorators, lowercases, drops @. Defaults to dry-run; --apply to commit. Skips rows that would normalize to invalid handles (no garbage falls through silently).

Dry-run against live pipeline.db:

    prs: 3008 rows need normalization (clean mappings, 0 invalid)
    sources: 730 rows need normalization (clean mappings, 0 invalid)

Total: 3738 rows. All map to existing handle column values.

After this lands + auto-deploys, the operator should run python3 scripts/normalize-submitted-by.py --apply once to clean historical rows.

The read-side canonicalization in diagnostics/activity_feed_api.py (fix/activity-feed-canonical-handle) becomes redundant defense-in-depth instead of load-bearing. No KB writes. |
||
|---|---|---|
| audit-wiki-links.py | ||
| backfill-ci.py | ||
| backfill-descriptions.py | ||
| backfill-domains.py | ||
| backfill-events.py | ||
| backfill-research-session-attribution.py | ||
| backfill-reviewer-count.py | ||
| backfill-source-authors.py | ||
| backfill-sourcer-attribution.py | ||
| backfill-sources.py | ||
| backfill-synthetic-recovery-prs.py | ||
| bootstrap-contributors.py | ||
| classify-contributors.py | ||
| contributor-graph.py | ||
| cumulative-growth.py | ||
| embed-claims.py | ||
| extract-decisions.py | ||
| extract-graph-data.py | ||
| migrate-entity-schema.py | ||
| migrate-source-archive.py | ||
| nightly-reweave.sh | ||
| normalize-submitted-by.py | ||
| openrouter-extract-v2.py | ||
| reconcile-source-status.sh | ||
| reconcile-sources.py | ||
| reset-m3taversal-sourcer.py | ||
| scoring_digest.py | ||
| tier0-gate.py | ||
| vector-gc.py | ||
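
The normalization rules listed in the commit message (strip a trailing parenthetical decorator, lowercase, drop `@`, skip anything that would become an invalid handle, dry-run unless told to apply) can be sketched roughly as follows. This is a minimal illustration and not the contents of scripts/normalize-submitted-by.py: the handle grammar in `_VALID_HANDLE`, the helper `normalize_column`, and the use of SQLite's implicit `rowid` are all assumptions, and the real `attribution.normalize_handle` may behave differently.

```python
import re
import sqlite3

# ASSUMPTION: what counts as a valid handle; the real rule is not shown in the source.
_VALID_HANDLE = re.compile(r"[a-z0-9][a-z0-9_-]*")
# A single trailing "(...)" decorator such as " (self-directed)" or " (reweave)".
_TRAILING_DECORATOR = re.compile(r"\s*\([^()]*\)\s*$")


def normalize_handle(raw):
    """Return the canonical handle, or None if the result would be invalid."""
    if raw is None:
        return None
    handle = _TRAILING_DECORATOR.sub("", raw.strip())
    handle = handle.lstrip("@").lower()
    return handle if _VALID_HANDLE.fullmatch(handle) else None


def normalize_column(conn, table, apply=False):
    """Canonicalize <table>.submitted_by; return (changed, invalid) row counts.

    Hypothetical helper. With apply=False (the default) nothing is written,
    mirroring the dry-run default described in the commit message.
    """
    if table not in ("prs", "sources"):  # table names cannot be bound as parameters
        raise ValueError(f"unexpected table: {table}")
    rows = conn.execute(
        f"SELECT rowid, submitted_by FROM {table} WHERE submitted_by IS NOT NULL"
    ).fetchall()
    changed = invalid = 0
    for rowid, raw in rows:
        canonical = normalize_handle(raw)
        if canonical is None:
            invalid += 1  # skip: never write a garbage handle
            continue
        if canonical == raw:
            continue  # already canonical
        changed += 1
        if apply:
            conn.execute(
                f"UPDATE {table} SET submitted_by = ? WHERE rowid = ?",
                (canonical, rowid),
            )
    if apply:
        conn.commit()
    return changed, invalid
```

Calling `normalize_column(conn, "prs")` only counts the rows that would change; passing `apply=True` performs the updates. Rows that fail the assumed validity check are counted and left untouched in both modes.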