fix(attribution): narrow exception + document gate asymmetry (Ganymede review)
Two follow-up fixes from Ganymede's review of d0fb4c9:
1. is_publisher_handle: narrow `except Exception` to sqlite3.OperationalError.
Pre-v26 DB fallback only needs to catch the "table doesn't exist" case;
broader exceptions (programming errors, locks, corruption) should propagate.
2. upsert_contributor gate: add comment documenting the alias-resolution
asymmetry between insert_contribution_event (alias-resolved via
normalize_handle) and upsert_contributor (bare lower+lstrip-@). Today this
is fine because the v26 classifier produced one publisher row per canonical
handle. Branch 3 will normalize alias→canonical at writer entry points,
tightening this gate transparently.
Unit tests for the gates (positive + negative + alias resolution) deferred to
Branch 3 alongside the auto-create flow tests.
Smoke-tested:
- pre-v26 fallback (no publishers table) → None (correct)
- case-insensitive match (CNBC → id=1) → correct
- @ prefix strip (@cnbc → id=1) → correct
- non-publisher handle (alexastrum) → None (correct)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
d0fb4c96e3
commit
dea1b02aa6
2 changed files with 13 additions and 2 deletions
|
|
@ -15,6 +15,7 @@ Epimetheus owns this module. Leo reviews changes.
|
||||||
|
|
||||||
import logging
|
import logging
|
||||||
import re
|
import re
|
||||||
|
import sqlite3
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
logger = logging.getLogger("pipeline.attribution")
|
logger = logging.getLogger("pipeline.attribution")
|
||||||
|
|
@ -130,8 +131,11 @@ def is_publisher_handle(handle: str, conn) -> int | None:
|
||||||
).fetchone()
|
).fetchone()
|
||||||
if row:
|
if row:
|
||||||
return row["id"] if hasattr(row, "keys") else row[0]
|
return row["id"] if hasattr(row, "keys") else row[0]
|
||||||
except Exception:
|
except sqlite3.OperationalError:
|
||||||
logger.debug("is_publisher_handle: lookup failed for %r", h, exc_info=True)
|
# Pre-v26 DB: publishers table doesn't exist yet. Fall through to None
|
||||||
|
# so writer behaves as before. Any other exception class is real signal
|
||||||
|
# (programming error, lock contention, corruption) — let it propagate.
|
||||||
|
logger.debug("is_publisher_handle: publishers table not present (pre-v26?)", exc_info=True)
|
||||||
return None
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -428,6 +428,13 @@ def upsert_contributor(
|
||||||
# Schema v26 gate: orgs/citations live in publishers table, not contributors.
|
# Schema v26 gate: orgs/citations live in publishers table, not contributors.
|
||||||
# Skip without writing so the v26 classifier cleanup isn't undone by every
|
# Skip without writing so the v26 classifier cleanup isn't undone by every
|
||||||
# merge that has `sourcer: cnbc` (or similar) in claim frontmatter.
|
# merge that has `sourcer: cnbc` (or similar) in claim frontmatter.
|
||||||
|
#
|
||||||
|
# Note: bare normalization (lower + lstrip @), no alias resolution. This is
|
||||||
|
# consistent with the existing `SELECT handle FROM contributors WHERE handle = ?`
|
||||||
|
# below — both look up by canonical-form-as-stored. Today's classifier produces
|
||||||
|
# one publisher row per canonical handle, so bare lookup hits. Branch 3 will
|
||||||
|
# normalize alias→canonical at writer entry points (extract.py, post_extract);
|
||||||
|
# at that point this gate auto-tightens because callers pass canonical handles.
|
||||||
canonical_handle = handle.strip().lower().lstrip("@") if handle else ""
|
canonical_handle = handle.strip().lower().lstrip("@") if handle else ""
|
||||||
if canonical_handle and is_publisher_handle(canonical_handle, conn) is not None:
|
if canonical_handle and is_publisher_handle(canonical_handle, conn) is not None:
|
||||||
logger.debug("upsert_contributor: %r is a publisher — skipping contributor row", canonical_handle)
|
logger.debug("upsert_contributor: %r is a publisher — skipping contributor row", canonical_handle)
|
||||||
|
|
|
||||||
Loading…
Reference in a new issue