feat(ingestion): metadao.fi scraper to replace broken futard.io ingestion #6
Why
futard.io retired `/api/graphql` between Apr 17–20. Cloud Scheduler `ingest-futard` has been hammering 500s for 5 days. The root cause was masked because the exception handler referenced `e.url`, which raises an AttributeError on `aiohttp.ClientResponseError`. The diagnostic fix landed in `living-ip/teleo-api@b8eb441` and surfaced `404 Not Found — https://www.futard.io/api/graphql`.

The ecosystem moved to metadao.fi. Direct curl is blocked by Vercel anti-bot. Headless Chromium passes the challenge cleanly.
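To illustrate how a bad attribute access inside an exception handler can hide the original error, here is a minimal sketch. It uses hypothetical stand-in classes (`FakeResponseError`, `FakeRequestInfo`) rather than aiohttp itself, and the handler names are invented for the example. In aiohttp, as far as I know, the request URL is reached through the error's `request_info` rather than a `url` attribute; the stand-in mirrors that shape.

```python
from dataclasses import dataclass


@dataclass
class FakeRequestInfo:
    """Hypothetical stand-in for the request info carried by the error."""
    real_url: str


class FakeResponseError(Exception):
    """Hypothetical stand-in: has status/message/request_info, but no .url."""

    def __init__(self, request_info, status, message):
        super().__init__(f"{status}, message={message!r}")
        self.request_info = request_info
        self.status = status
        self.message = message


def fetch():
    # Simulates the upstream endpoint having been retired.
    info = FakeRequestInfo("https://www.futard.io/api/graphql")
    raise FakeResponseError(info, 404, "Not Found")


def broken_handler():
    try:
        fetch()
    except FakeResponseError as e:
        # Bug: there is no .url attribute, so this line raises
        # AttributeError and the 404 never reaches the log.
        return f"{e.status} {e.message} - {e.url}"


def fixed_handler():
    try:
        fetch()
    except FakeResponseError as e:
        # Read the URL defensively so that logging itself cannot raise.
        info = getattr(e, "request_info", None)
        url = getattr(info, "real_url", "<unknown url>")
        return f"{e.status} {e.message} - {url}"
```

With the broken handler the caller only ever sees an AttributeError; with the defensive version the 404 and the failing URL come through.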
Approach
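The discovery step in this approach (project slugs from the `/projects` DOM, then proposal addresses from each project page) could look roughly like the sketch below. It is not the PR's code: it works on already-rendered HTML strings, the helper names are hypothetical, and the proposal URL shape `/projects/{slug}/proposal/{addr}` is an assumption made for illustration.

```python
import re
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects every href found on <a> tags, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def collect_hrefs(html):
    parser = HrefCollector()
    parser.feed(html)
    return parser.hrefs


# Assumed URL shapes; the real site may differ.
SLUG_RE = re.compile(r"^/projects/([a-z0-9-]+)/?$")
# Base58 alphabet, 32 to 44 characters, the usual length of a Solana address.
PROPOSAL_RE = re.compile(
    r"^/projects/([a-z0-9-]+)/proposal/([1-9A-HJ-NP-Za-km-z]{32,44})/?$"
)


def extract_slugs(projects_html):
    """Return unique project slugs in first-seen order."""
    slugs = []
    for href in collect_hrefs(projects_html):
        m = SLUG_RE.match(href)
        if m and m.group(1) not in slugs:
            slugs.append(m.group(1))
    return slugs


def extract_proposal_addresses(project_html, slug):
    """Return unique proposal addresses linked under the given slug."""
    addresses = []
    for href in collect_hrefs(project_html):
        m = PROPOSAL_RE.match(href)
        if m and m.group(1) == slug and m.group(2) not in addresses:
            addresses.append(m.group(2))
    return addresses
```

Restricting matches to the current slug keeps a cross-linked proposal from another project out of the result.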
- `/projects` DOM for slugs → each `/projects/{slug}` for proposal addresses
- `fetch(/api/decode-proposal/{addr})` returns structured on-chain instructions (Squads tx, SPL transfers, memos)
- Dedup: `proposal_address:` field, address fragment in basename, final filename collision

Probe-tested
All 6 active projects: p2p-protocol, paystream, zklsol, loyal, ranger, solomon. 13 proposals captured (incl. Solomon Gigabus DP-00003). Re-run: `written: 0, skipped_existing: 13`, so dedup is verified.

Asks for review
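For reviewers weighing the false-positive question on the dedup check, here is one way a `proposal_address:` scan over the first 4KB of each file might be written. This is a sketch under assumptions, not the PR's implementation: the signature, the `*.md` glob, the `plan_writes` helper, and the base58 length bounds are all hypothetical, and the URL-pattern half of the check is left out.

```python
import re
from pathlib import Path

HEAD_BYTES = 4096  # only the head of each file is read

# Anchored to the start of a line so that a mention inside a tag list or
# body text (e.g. "tags: [proposal_address: ...]") does not match.
ADDRESS_FIELD_RE = re.compile(
    r"^proposal_address:\s*[\"']?([1-9A-HJ-NP-Za-km-z]{32,44})[\"']?\s*$",
    re.MULTILINE,
)


def existing_proposal_addresses(archive_dir):
    """Return the set of addresses already recorded under archive_dir."""
    found = set()
    for path in sorted(Path(archive_dir).glob("*.md")):
        with path.open("rb") as fh:
            head = fh.read(HEAD_BYTES).decode("utf-8", errors="replace")
        found.update(ADDRESS_FIELD_RE.findall(head))
    return found


def plan_writes(candidates, archive_dir):
    """Count how many candidate addresses are new versus already archived."""
    known = existing_proposal_addresses(archive_dir)
    new = [addr for addr in candidates if addr not in known]
    return {"written": len(new), "skipped_existing": len(candidates) - len(new)}
```

Anchoring the pattern to the start of a line is the design choice worth checking: an unanchored search would also pick up the field name when it appears inside tags or quoted body text.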
- `extract_dp_title`: three iterations: strict-pattern preference for `DP-NNNNN (CAT): Title`, stat-bleed stripper for flex-collapsed layouts. Sanity-check that there isn't a fourth edge case I missed.
- `get_project_metadata`: walks up only while no sibling proposal link is included. An earlier version walked too high and produced the same `card_text` for every proposal.
- `existing_proposal_addresses`: regex on the `proposal_address:` field + URL pattern, reads first 4KB. Worth a sanity check for false positives in tags etc.
- `build_source_markdown`: pattern-matched against existing `2024-02-05-futardio-proposal-*.md` but did not trace through `lib/extract.py` end-to-end.

Out of scope