319e03e2c6 test(attribution): prove research-backfill replay is idempotent
Five tests against the real contribution_events schema (lib/db.py:181-209):
- pr-level dedup with NULL claim_path via idx_ce_unique_pr partial index
- per-claim dedup with non-NULL claim_path via idx_ce_unique_claim partial index
- pr-level and per-claim events coexist on the same pr_number
- backfill (INSERT correct + DELETE wrong) is a true no-op on replay
- replay against already-backfilled state preserves unrelated events
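The replay property in the last two bullets can be sketched as follows. This is a minimal illustration, not the repository's backfill: the columns `handle`, `pr_number`, and `claim_path`, the helper names `backfill` and `snapshot`, and the sample rows are all assumptions; only the table name and `idx_ce_unique_pr` come from the message above.

```python
import sqlite3

# Hypothetical, simplified schema; the real one is in lib/db.py.
SCHEMA = """
CREATE TABLE contribution_events (
    id         INTEGER PRIMARY KEY,
    handle     TEXT NOT NULL,
    pr_number  INTEGER NOT NULL,
    claim_path TEXT
);
CREATE UNIQUE INDEX idx_ce_unique_pr
    ON contribution_events (handle, pr_number)
    WHERE claim_path IS NULL;
"""


def backfill(conn, pr_number, wrong, correct):
    """Move a pr-level event from `wrong` to `correct` (hypothetical helper)."""
    with conn:  # one transaction: both statements commit or neither does
        # INSERT correct: ignored on replay because the unique index conflicts.
        conn.execute(
            "INSERT OR IGNORE INTO contribution_events (handle, pr_number) "
            "VALUES (?, ?)",
            (correct, pr_number),
        )
        # DELETE wrong: matches zero rows on replay.
        conn.execute(
            "DELETE FROM contribution_events "
            "WHERE handle = ? AND pr_number = ? AND claim_path IS NULL",
            (wrong, pr_number),
        )


def snapshot(conn):
    """Return every event in a stable order so two states can be compared."""
    return conn.execute(
        "SELECT handle, pr_number, claim_path FROM contribution_events "
        "ORDER BY handle, pr_number, claim_path"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO contribution_events (handle, pr_number) VALUES ('bot', 7)")
conn.execute("INSERT INTO contribution_events (handle, pr_number) VALUES ('carol', 9)")
conn.commit()

backfill(conn, 7, wrong="bot", correct="alice")
first = snapshot(conn)
backfill(conn, 7, wrong="bot", correct="alice")  # replay
second = snapshot(conn)
```

Comparing full snapshots rather than row counts also catches a replay that deletes one row and inserts another, which a count alone would miss.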

Schema case identified: case 2, with the partial-index split solution already
in place. Two partial UNIQUE indexes target disjoint row sets (claim_path IS
NULL vs. IS NOT NULL), sidestepping SQLite's rule that NULL values never
compare equal in a UNIQUE index.
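To make the NULL behaviour concrete, the sketch below compares a single composite UNIQUE constraint with the two-partial-index split. The tables `naive` and `split`, the column `handle`, and the sample rows are hypothetical; the index names are taken from the message above, but their column lists here are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables for comparison; not the real contribution_events schema.
CREATE TABLE naive (
    handle TEXT, pr_number INTEGER, claim_path TEXT,
    UNIQUE (handle, pr_number, claim_path)
);
CREATE TABLE split (handle TEXT, pr_number INTEGER, claim_path TEXT);
CREATE UNIQUE INDEX idx_ce_unique_pr
    ON split (handle, pr_number)
    WHERE claim_path IS NULL;
CREATE UNIQUE INDEX idx_ce_unique_claim
    ON split (handle, pr_number, claim_path)
    WHERE claim_path IS NOT NULL;
""")

rows = [
    ("alice", 7, None),           # pr-level event
    ("alice", 7, None),           # duplicate pr-level event
    ("alice", 7, "claims/a.md"),  # per-claim event
    ("alice", 7, "claims/a.md"),  # duplicate per-claim event
]
for table in ("naive", "split"):
    conn.executemany(f"INSERT OR IGNORE INTO {table} VALUES (?, ?, ?)", rows)


def count(table):
    """Number of rows that survived INSERT OR IGNORE."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


naive_count = count("naive")  # duplicate NULL row is kept: NULLs are distinct
split_count = count("split")  # one pr-level row and one per-claim row
```

In `naive` the two rows with a NULL `claim_path` both survive, because SQLite treats NULLs as distinct for uniqueness. In `split` the first partial index leaves `claim_path` out of its key, so the duplicate pr-level row conflicts and is ignored, while the pr-level and per-claim rows still coexist on the same `pr_number`.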

Production replay verified: re-running backfill --apply against the live DB
reports "misattributed PRs found: 0" because the first run's UPDATE changed the
affected rows so that they no longer match the WHERE predicate. Total
contribution_events count is unchanged: 3839 → 3839.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 12:50:17 +01:00