Commit graph

1 commit

Author SHA1 Message Date
8c51e47c4e feat: extraction pre-screening via Qdrant semantic search
Before extraction, the pipeline now:
1. Identifies 3-5 themes from the source (Haiku, ~$0.002/source)
2. Searches Qdrant for each theme and for the title (plus an author-stripped variant of the title)
3. Injects a "Prior Art" section into the extraction prompt showing existing KB claims
4. Requires every ENRICHMENT/CHALLENGE to cite a specific target_claim (hard gate)
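
A minimal sketch of steps 2 and 4 above, in Python. Everything named here (strip_author, build_queries, Extraction, passes_gate, the "NEW" kind) is hypothetical and only illustrates the idea; the actual pipeline code is not shown in this commit message. The sketch assumes the author name is known separately from the title, and that a cited target_claim must be one of the claims surfaced as prior art. The Qdrant search itself is left out.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


def strip_author(title: str, author: str) -> str:
    """Remove the author's name, plus a dangling 'by' or separator, from a title."""
    if not author:
        return title.strip()
    stripped = re.sub(re.escape(author), "", title, flags=re.IGNORECASE)
    stripped = re.sub(r"\s+", " ", stripped)
    # Trim leading separators, and a trailing "by" / separator left behind.
    stripped = re.sub(
        r"^[\s\-:|,]+|(?:\s+by)?[\s\-:|,]+$", "", stripped, flags=re.IGNORECASE
    )
    return stripped


def build_queries(themes: Iterable[str], title: str, author: str) -> List[str]:
    """Search queries: each theme, the title, and the author-stripped title."""
    candidates = [*themes, title, strip_author(title, author)]
    seen: Set[str] = set()
    queries: List[str] = []
    for candidate in candidates:
        query = candidate.strip()
        key = query.lower()
        if key and key not in seen:  # skip blanks and case-insensitive repeats
            seen.add(key)
            queries.append(query)
    return queries


# Kinds named in the commit message; any other kind is ungated in this sketch.
GATED_KINDS = {"ENRICHMENT", "CHALLENGE"}


@dataclass
class Extraction:
    kind: str                           # e.g. "ENRICHMENT", "CHALLENGE", or hypothetical "NEW"
    text: str
    target_claim: Optional[str] = None  # ID of the KB claim being enriched or challenged


def passes_gate(extraction: Extraction, prior_art_ids: Set[str]) -> bool:
    """Hard gate: gated kinds must cite a target_claim found in the prior art."""
    if extraction.kind not in GATED_KINDS:
        return True
    return bool(extraction.target_claim) and extraction.target_claim in prior_art_ids
```

Each query returned by build_queries would be sent to the semantic search, and an extraction that fails passes_gate would be rejected outright rather than flagged for review.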

Reduces near-duplicate extractions (our #1 rejection cause) by showing
the extractor what the KB already knows before it starts.

Prior art is also persisted to .prior-art/ sidecar files and included in
the PR body for reviewer verification.
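
As a rough illustration of the sidecar step, the sketch below renders search hits into a "Prior Art" block and writes it under .prior-art/. The Hit type, the block layout, the one-file-per-source naming, and the .txt extension are all assumptions made for the example; only the .prior-art/ directory name comes from this commit message.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Iterable


@dataclass
class Hit:
    claim_id: str   # identifier of an existing KB claim (hypothetical field)
    text: str       # claim text shown to the extractor and the reviewer
    score: float    # similarity score returned by the search


def render_prior_art(hits: Iterable[Hit], limit: int = 5) -> str:
    """Format hits as a plain-text block, best score first, one line per claim."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        # Several queries can return the same claim; keep its highest score.
        if hit.claim_id not in best or hit.score > best[hit.claim_id].score:
            best[hit.claim_id] = hit
    if not best:
        return "Prior Art\n(no similar claims found)\n"
    ranked = sorted(best.values(), key=lambda h: h.score, reverse=True)[:limit]
    lines = ["Prior Art"]
    for hit in ranked:
        lines.append(f"- [{hit.claim_id}] (score {hit.score:.2f}) {hit.text}")
    return "\n".join(lines) + "\n"


def write_sidecar(repo_root: Path, source_name: str, block: str) -> Path:
    """Write the block to <repo_root>/.prior-art/<source stem>.txt and return the path."""
    sidecar_dir = repo_root / ".prior-art"
    sidecar_dir.mkdir(parents=True, exist_ok=True)
    path = sidecar_dir / f"{Path(source_name).stem}.txt"
    path.write_text(block, encoding="utf-8")
    return path
```

The same rendered block could be reused for the extraction prompt, the sidecar file, and the PR body, so the reviewer sees exactly what the extractor saw.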

Design: Leo. Owner: Epimetheus.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 11:17:38 +01:00