|
Background:
- futard.io retired its /api/graphql endpoint between Apr 17–20
- Cloud Scheduler ingest-futard has been firing into 500s ever since (the AttributeError on e.url masked the real 404 for 5 days; fixed in living-ip/teleo-api@b8eb441, which surfaced the actual root cause)
- The ecosystem migrated to metadao.fi, which is Vercel-protected
- Direct curl is blocked by Vercel's anti-bot challenge regardless of headers; a real headless browser passes it cleanly

Approach:
- Playwright-driven scraper, runs as a one-shot
- Discovery: scrape the /projects DOM for project slugs, then each /projects/{slug} for proposal addresses
- For each NEW proposal: visit the page for the prose body, and call /api/decode-proposal/{addr} via in-browser fetch (bypasses the challenge via the primed Vercel cookies in the browser context) for structured on-chain instructions
- Idempotent: dedup against existing proposal addresses in archive frontmatter AND filename basenames
- Filename embeds an 8-char address fragment for stable cross-run dedup, even on projects that don't use the DP-NNNNN naming convention

Tested locally against 6 active projects (p2p-protocol, paystream, zklsol, loyal, ranger, solomon). Captured 13 new proposals, including the Solomon Gigabus DP-00003 that triggered this work, with proper titles, status, on-chain instruction decoding (Squads transactions, SPL transfers, memos), and project metadata.

Output schema matches existing futardio source files (type: source, event_type: proposal, domain: internet-finance, status: unprocessed), so the existing extract pipeline picks them up unchanged.

Architectural note: this script is intentionally NOT wired to systemd yet. VPS deploy needs Playwright + Chromium system libs, which require apt sudo (currently scoped to teleo-* services only). Reviewing the script first; deploy path is a separate decision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| audit-wiki-links.py | ||
| backfill-ci.py | ||
| backfill-descriptions.py | ||
| backfill-domains.py | ||
| backfill-events.py | ||
| backfill-reviewer-count.py | ||
| backfill-source-authors.py | ||
| backfill-sourcer-attribution.py | ||
| backfill-sources.py | ||
| backfill-synthetic-recovery-prs.py | ||
| bootstrap-contributors.py | ||
| classify-contributors.py | ||
| contributor-graph.py | ||
| cumulative-growth.py | ||
| embed-claims.py | ||
| extract-decisions.py | ||
| extract-graph-data.py | ||
| metadao-scrape.py | ||
| migrate-entity-schema.py | ||
| migrate-source-archive.py | ||
| nightly-reweave.sh | ||
| openrouter-extract-v2.py | ||
| reconcile-source-status.sh | ||
| reconcile-sources.py | ||
| scoring_digest.py | ||
| tier0-gate.py | ||
| vector-gc.py | ||
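
The dedup rule in the commit message (skip a proposal if its address already appears in archive frontmatter OR its 8-char fragment appears in a filename basename) can be sketched roughly as below. This is not the code in metadao-scrape.py: the frontmatter key `proposal_address`, the filename layout `{project}-{title-slug}-{fragment}.md`, and every function name here are assumptions made for illustration only.

```python
import re
from pathlib import Path

# The commit message states that filenames embed an 8-char address fragment.
ADDR_FRAGMENT_LEN = 8

# Hypothetical frontmatter key; the real archive may use a different name.
ADDRESS_KEY = "proposal_address"


def slugify(text: str) -> str:
    """Lowercase the text and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project_slug: str, title: str, address: str) -> str:
    """Assumed layout: project, title slug, then the address fragment last."""
    fragment = address[:ADDR_FRAGMENT_LEN]  # kept case-sensitive (base58)
    return f"{project_slug}-{slugify(title)}-{fragment}.md"


def frontmatter_address(text: str):
    """Return the address from a leading '---' frontmatter block, or None."""
    if not text.startswith("---"):
        return None
    parts = text.split("---", 2)
    if len(parts) < 3:
        return None  # opening fence without a closing one
    for line in parts[1].splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == ADDRESS_KEY:
            return value.strip().strip("'\"") or None
    return None


def load_known(archive_dir: Path):
    """Collect full addresses (frontmatter) and fragments (frontmatter + basenames)."""
    addresses, fragments = set(), set()
    for path in archive_dir.glob("*.md"):
        addr = frontmatter_address(path.read_text(encoding="utf-8"))
        if addr:
            addresses.add(addr)
            fragments.add(addr[:ADDR_FRAGMENT_LEN])
        # Under the assumed layout the fragment is the last hyphen-separated token.
        tail = path.stem.rsplit("-", 1)[-1]
        if len(tail) == ADDR_FRAGMENT_LEN:
            fragments.add(tail)
    return addresses, fragments


def is_new(address: str, addresses: set, fragments: set) -> bool:
    """A proposal is new only if neither the full address nor its fragment is known."""
    return address not in addresses and address[:ADDR_FRAGMENT_LEN] not in fragments
```

Checking both sources means a re-run stays idempotent even if an archived file lost its frontmatter or was written by an older version of the scraper. The trade-off in this sketch is that two distinct addresses sharing the same first 8 characters would be treated as duplicates.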