teleo-infrastructure/lib
Teleo Agents 74bf0461e8
Some checks are pending
CI / lint-and-test (pull_request) Waiting to run
fix(attribution): canonicalize submitted_by at write time + historical normalizer
Companion / write-side fix to fix/activity-feed-canonical-handle.

The activity-feed canonicalization was a read-side guard. The bug at the
source is that extract.py and two backfill scripts write decorated
strings (Vida (self-directed), pipeline (reweave), @m3taversal) into
prs.submitted_by and sources.submitted_by. Downstream readers
(lib.contributor.insert_contribution_event, scripts/scoring_digest,
diagnostics/activity_feed_api) all strip the decorator on read — but
anything that reads the column verbatim (like /api/activity-feed before
the read-side fix) 404s on /contributors/{decorated-handle}.

Stop writing the decorator. The self-directed signal is already carried
by intake_tier == research-task plus the prs.agent column; the suffix
is redundant string noise that costs us correctness at every consumer
that forgets to strip.

Changes:

- lib/extract.py:690 — write canonical handle via attribution.normalize_handle.
  Direct elif for intake_tier == research-task now stores just agent_name.
  @m3taversal -> m3taversal.

- diagnostics/backfill_submitted_by.py — same fix in two branches plus
  the reweave branch (pipeline (reweave) -> pipeline).

- scripts/backfill-research-session-attribution.py — UPDATE prs sets
  agent handle alone, no suffix. Docstring + log line updated.

- scripts/normalize-submitted-by.py (new) — one-time backfill that
  canonicalizes existing prs.submitted_by and sources.submitted_by rows.
  Strips trailing parenthetical decorators, lowercases, drops @. Defaults
  to dry-run; --apply to commit. Skips rows that would normalize to
  invalid handles (no garbage falls through silently).

Dry-run against live pipeline.db:
  prs:     3008 rows need normalization (clean mappings, 0 invalid)
  sources: 730 rows need normalization (clean mappings, 0 invalid)
  Total:   3738 rows. All map to existing handle column values.

After this lands + auto-deploys, the operator should run
  python3 scripts/normalize-submitted-by.py --apply
once to clean historical rows. The read-side canonicalization in
diagnostics/activity_feed_api.py (fix/activity-feed-canonical-handle)
becomes redundant defense-in-depth instead of load-bearing.

No KB writes.
2026-05-13 02:56:50 +00:00
..
__init__.py Initial commit: Pipeline v2 daemon + infrastructure docs 2026-03-12 14:11:18 +00:00
analytics.py epimetheus: sync VPS-deployed code to repo — Mar 18-20 reliability + features 2026-03-20 20:17:27 +00:00
attribution.py fix(attribution): unify research-session format on "(self-directed)" suffix 2026-04-27 12:53:52 +01:00
breaker.py ganymede: add dev infrastructure — pyproject.toml, CI, deploy script 2026-03-13 14:24:27 +00:00
cascade.py fix: sync all code from VPS — repo is now authoritative source of truth 2026-04-15 13:18:01 +01:00
claim_index.py epimetheus: sync VPS-deployed code to repo — Mar 18-20 reliability + features 2026-03-20 20:17:27 +00:00
config.py fix(reaper): apply Ganymede review — dual-PATCH drift, breaker isolation, env config 2026-05-07 23:43:53 -04:00
connect.py fix: sync all code from VPS — repo is now authoritative source of truth 2026-04-15 13:18:01 +01:00
contributor.py fix(attribution): narrow exception + document gate asymmetry (Ganymede review) 2026-04-26 14:25:24 +01:00
costs.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
cross_domain.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
db.py feat(schema): v26 — publishers + contributor_identities + sources provenance 2026-04-24 20:47:21 +01:00
dedup.py fix: enrichment idempotency — three-layer dedup prevents duplicate evidence blocks 2026-03-31 13:18:23 +01:00
digest.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
domains.py Wire rejection_reason into review records + fix ingestion domain routing 2026-04-20 18:03:34 +01:00
entity_batch.py fix: enrichment idempotency — three-layer dedup prevents duplicate evidence blocks 2026-03-31 13:18:23 +01:00
entity_queue.py epimetheus: sync VPS-deployed code to repo — Mar 18-20 reliability + features 2026-03-20 20:17:27 +00:00
eval_actions.py fix(eval): treat empty diff as conservative fallback in auto-close gate 2026-04-23 11:24:16 +01:00
eval_parse.py refactor: Phase 4 — extract eval_actions.py, drop underscore prefixes in eval_parse 2026-04-16 12:57:51 +01:00
evaluate.py Wire rejection_reason into review records + fix ingestion domain routing 2026-04-20 18:03:34 +01:00
extract.py fix(attribution): canonicalize submitted_by at write time + historical normalizer 2026-05-13 02:56:50 +00:00
extraction_prompt.py Reduce near-duplicate and frontmatter schema rejections 2026-04-20 18:03:26 +01:00
feedback.py epimetheus: sync VPS-deployed code to repo — Mar 18-20 reliability + features 2026-03-20 20:17:27 +00:00
fixer.py refactor: Phase 2 — wire pr_state into fixer.py and substantive_fixer.py 2026-04-16 12:21:40 +01:00
forgejo.py leo: handle non-JSON 200 from Forgejo merge API 2026-03-13 17:38:00 +00:00
frontmatter.py fix: quote YAML edge values containing colons, skip unparseable files in reweave merge 2026-04-18 12:07:28 +01:00
github_feedback.py fix: wrap breaker calls in stage_loop to prevent permanent task death 2026-04-20 12:37:28 +01:00
health.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
llm.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
log.py Initial commit: Pipeline v2 daemon + infrastructure docs 2026-03-12 14:11:18 +00:00
merge.py fix(merge): correct audit-ref comment + add sentinel-drift warning 2026-04-28 16:19:08 +01:00
post_extract.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
post_merge.py feat: bidirectional source↔claim linking 2026-04-21 13:00:59 +01:00
pr_state.py refactor: Phase 3 — fix close_pr ghost bug, wire stale_pr, extract eval_parse 2026-04-16 12:40:23 +01:00
pre_screen.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
search.py Consolidate pipeline code from teleo-codex + VPS into single repo 2026-04-07 16:52:26 +01:00
stale_pr.py refactor: Phase 3 — fix close_pr ghost bug, wire stale_pr, extract eval_parse 2026-04-16 12:40:23 +01:00
substantive_fixer.py fix(reaper): tighten research-session pattern to literal YYYY-MM-DD shape 2026-05-10 19:10:49 +01:00
validate.py Reduce near-duplicate and frontmatter schema rejections 2026-04-20 18:03:26 +01:00
watchdog.py fix: sync all code from VPS — repo is now authoritative source of truth 2026-04-15 13:18:01 +01:00
worktree_lock.py epimetheus: sync VPS-deployed code to repo — Mar 18-20 reliability + features 2026-03-20 20:17:27 +00:00