From 8d3ba36b59fd9d906868f2318229cc2d92fd1821 Mon Sep 17 00:00:00 2001
From: Teleo Agents
Date: Sun, 22 Mar 2026 22:15:42 +0000
Subject: [PATCH 1/4] extract: 2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
---
 ...arkets over polling in 2024 US election.md |  6 +++++
 ...-selection-vs-information-acquisition.json | 25 +++++++++++++++++++
 ...on-selection-vs-information-acquisition.md | 15 ++++++++++-
 3 files changed, 45 insertions(+), 1 deletion(-)
 create mode 100644 inbox/queue/.extraction-debug/2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition.json

diff --git a/domains/internet-finance/Polymarket vindicated prediction markets over polling in 2024 US election.md b/domains/internet-finance/Polymarket vindicated prediction markets over polling in 2024 US election.md
index 5ea7094c7..25df660ad 100644
--- a/domains/internet-finance/Polymarket vindicated prediction markets over polling in 2024 US election.md
+++ b/domains/internet-finance/Polymarket vindicated prediction markets over polling in 2024 US election.md
@@ -48,6 +48,12 @@ The very success of prediction markets in the 2024 election triggered the state
 
 ---
 
+### Additional Evidence (extend)
+*Source: [[2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition]] | Added: 2026-03-22*
+
+The Atanasov/Mellers framework suggests this vindication may be domain-specific. Prediction markets outperformed polls in 2024 election, but GJP research shows algorithm-weighted polls can match market accuracy for geopolitical events with public information. The election result doesn't distinguish whether markets won through better calibration-selection (Mechanism A, replicable by polls) or through information-acquisition advantages (Mechanism B, not replicable). If markets succeeded primarily through Mechanism A, sophisticated poll aggregation could have matched them.
+
+
 Relevant Notes:
 - [[futarchy is manipulation-resistant because attack attempts create profitable opportunities for defenders]] — theoretical property validated by Polymarket's performance
 - [[MetaDAOs futarchy implementation shows limited trading volume in uncontested decisions]] — shows mechanism robustness even at small scale
diff --git a/inbox/queue/.extraction-debug/2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition.json b/inbox/queue/.extraction-debug/2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition.json
new file mode 100644
index 000000000..9fc081b56
--- /dev/null
+++ b/inbox/queue/.extraction-debug/2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition.json
@@ -0,0 +1,25 @@
+{
+  "rejected_claims": [
+    {
+      "filename": "prediction-market-epistemic-mechanisms-separate-into-calibration-selection-and-information-acquisition.md",
+      "issues": [
+        "missing_attribution_extractor"
+      ]
+    }
+  ],
+  "validation_stats": {
+    "total": 1,
+    "kept": 0,
+    "fixed": 2,
+    "rejected": 1,
+    "fixes_applied": [
+      "prediction-market-epistemic-mechanisms-separate-into-calibration-selection-and-information-acquisition.md:set_created:2026-03-22",
+      "prediction-market-epistemic-mechanisms-separate-into-calibration-selection-and-information-acquisition.md:stripped_wiki_link:speculative markets aggregate information more accurately th"
+    ],
+    "rejections": [
+      "prediction-market-epistemic-mechanisms-separate-into-calibration-selection-and-information-acquisition.md:missing_attribution_extractor"
+    ]
+  },
+  "model": "anthropic/claude-sonnet-4.5",
+  "date": "2026-03-22"
+}
\ No newline at end of file
diff --git a/inbox/queue/2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition.md b/inbox/queue/2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition.md
index f2b43a7a7..19020aee5 100644
--- a/inbox/queue/2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition.md
+++ b/inbox/queue/2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition.md
@@ -7,9 +7,13 @@ date: 2026-03-22
 domain: internet-finance
 secondary_domains: [ai-alignment, collective-intelligence]
 format: article
-status: unprocessed
+status: enrichment
 priority: high
 tags: [prediction-markets, superforecasters, epistemic-mechanism, skin-in-the-game, belief-1, disconfirmation, academic, mechanism-design]
+processed_by: rio
+processed_date: 2026-03-22
+enrichments_applied: ["Polymarket vindicated prediction markets over polling in 2024 US election.md"]
+extraction_model: "anthropic/claude-sonnet-4.5"
 ---
 
 ## Content
@@ -77,3 +81,12 @@ Financial markets up-weight skilled participants via earnings. Calibration algor
 PRIMARY CONNECTION: [[speculative markets aggregate information more accurately than expert consensus or voting systems]]
 WHY ARCHIVED: Resolves the Session 8 challenge to Belief #1; establishes the two-mechanism distinction that reframes multiple existing claims about futarchy's epistemic properties
 EXTRACTION HINT: The claim to extract is the two-mechanism distinction, not just a summary of the academic findings. Focus on Mechanism A (calibration-selection, replicable by polls) vs. Mechanism B (information-acquisition, not replicable). The finding is architecturally important — it should affect multiple existing claims as enrichments.
+
+
+## Key Facts
+- GJP beat all IARPA ACE research teams by 35-72% (Brier score)
+- GJP beat intelligence community's internal prediction market by 25-30%
+- Top superforecaster Year 2 Brier score: 0.14 vs. random guessing 0.53
+- Year-to-year top forecaster correlation: 0.65
+- GJP reportedly outperformed futures markets by 66% at Fed policy inflection points (Financial Times, July 2024)
+- Kalshi real-money markets beat Bloomberg consensus for headline CPI and matched realized fed funds rate on FOMC day (Fed FEDS paper, Diercks/Katz/Wright, 2026)
-- 
2.45.2

From a5a9ee80c8c54cbb1db5a6bee3632677cf96cf6b Mon Sep 17 00:00:00 2001
From: Teleo Agents
Date: Sun, 22 Mar 2026 22:16:57 +0000
Subject: [PATCH 2/4] pipeline: archive 1 source(s) post-merge

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
---
 ...on-selection-vs-information-acquisition.md | 79 +++++++++++++++++++
 1 file changed, 79 insertions(+)
 create mode 100644 inbox/archive/internet-finance/2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition.md

diff --git a/inbox/archive/internet-finance/2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition.md b/inbox/archive/internet-finance/2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition.md
new file mode 100644
index 000000000..4410df8ca
--- /dev/null
+++ b/inbox/archive/internet-finance/2026-03-22-atanasov-mellers-calibration-selection-vs-information-acquisition.md
@@ -0,0 +1,79 @@
+---
+type: source
+title: "Superforecasters vs. Prediction Markets: Calibration-Selection Mechanism Can Be Replicated, Information-Acquisition Mechanism Cannot"
+author: "Atanasov, Mellers, Tetlock et al. (multiple papers)"
+url: https://pubsonline.informs.org/doi/10.1287/mnsc.2015.2374
+date: 2026-03-22
+domain: internet-finance
+secondary_domains: [ai-alignment, collective-intelligence]
+format: article
+status: processed
+priority: high
+tags: [prediction-markets, superforecasters, epistemic-mechanism, skin-in-the-game, belief-1, disconfirmation, academic, mechanism-design]
+---
+
+## Content
+
+Synthesis of the Atanasov/Mellers/Tetlock prediction market vs. calibrated poll literature, with focus on the two-mechanism distinction this session surfaced.
+
+**Primary sources:**
+1. Atanasov, Witkowski, Mellers, Tetlock (2017), "Distilling the Wisdom of Crowds: Prediction Markets vs. Prediction Polls," *Management Science* Vol. 63, No. 3, pp. 691–706
+2. Mellers, Ungar, Baron, Ramos, Gurcay, Fincher, Scott, Moore, Atanasov, Swift, Murray, Stone, Tetlock (2015), "Psychological Strategies for Winning a Geopolitical Forecasting Tournament," *Perspectives on Psychological Science*
+3. Atanasov, Witkowski, Mellers, Tetlock (2024), "Crowd Prediction Systems: Markets, Polls, and Elite Forecasters," *International Journal of Forecasting*
+4. Mellers, McCoy, Lu, Tetlock (2024), "Human and Algorithmic Predictions in Geopolitical Forecasting," *Perspectives on Psychological Science*
+
+**Core finding (2017/2024):** When polls are combined with skill-based weighting algorithms (tracking prior performance and behavioral patterns), team polls match or exceed prediction market accuracy for geopolitical event forecasting. Small elite crowds (superforecasters) outperform large crowds; markets and elite-aggregated polls are statistically tied.
+
+**IARPA ACE tournament results:**
+- GJP (Good Judgment Project) beat all research teams by 35–72% (Brier score)
+- Beat intelligence community's internal prediction market by 25–30%
+- Top superforecaster Year 2: Brier score 0.14 vs. random guessing 0.53
+- Year-to-year top forecaster correlation: 0.65 (skill is real, not luck)
+
+**The mechanism explanation (critical for claim extraction):**
+
+Financial markets up-weight skilled participants via earnings. Calibration algorithms replicate this function by tracking performance and assigning higher weight to historically accurate forecasters. Both methods are solving the same problem: suppress noise from poorly-calibrated participants, amplify signal from well-calibrated ones.
+
+**This is Mechanism A: Calibration selection.** Polls can match markets here because the mechanism is reducible to participant weighting — no financial incentive required.
+
+**Mechanism B: Information acquisition and strategic revelation.** Financial stakes incentivize participants to acquire costly private information (research, due diligence, insider access) and to reveal it through trades. Disinterested poll respondents have no incentive to acquire costly private information or to reveal it honestly if they hold it. GJP superforecasters work with publicly available information — the IARPA ACE tournament explicitly restricted access to classified sources. The research was not designed to test whether polls match markets in information-asymmetric contexts.
+
+**Scope of the finding:**
+- All tested events: geopolitical (binary outcomes, months-ahead, objective resolution, publicly available information)
+- "Algorithm-unfriendly domain" (Mellers 2024) — hard-to-quantify data, elusive reference classes, non-repeatable contexts
+- No test in financial selection contexts (stock returns, ICO quality, startup success)
+- No test in information-asymmetric contexts where participants have strategic reasons to conceal private information
+
+**Good Judgment Project track record extension (non-geopolitical):**
+- Fed policy prediction: GJP reportedly outperformed futures markets by 66% at Fed policy inflection points (Financial Times, July 2024)
+- Federal Reserve FEDS paper (Diercks/Katz/Wright, 2026): Kalshi real-money markets beat Bloomberg consensus for headline CPI; perfectly matched realized fed funds rate on FOMC day
+- Both findings consistent: elite forecasters AND real-money markets beat naive consensus; neither outperforms the other on structured macro-event prediction
+
+**What has not been tested:** Stock return prediction, venture capital selection, ICO quality evaluation, or any financial selection task where the question is not "will event X happen" but "is asset Y worth more than price Z."
+
+## Agent Notes
+
+**Why this matters:** This resolves the multi-session threat to Belief #1 from Mellers et al. The challenge was real but domain-scoped. Skin-in-the-game markets have two separable mechanisms — Mellers only tested the one that polls can replicate. The one polls can't replicate (information acquisition and strategic revelation) is exactly what matters for futarchy in financial selection.
+
+**What surprised me:** The 2024 update explicitly calls geopolitical forecasting an "algorithm-unfriendly domain" — distinguishing it from financial forecasting where algorithmic approaches have richer structured data. The Mellers team themselves implicitly acknowledge the domain transfer problem.
+
+**What I expected but didn't find:** Any study testing calibrated polls vs. prediction markets for financial selection (ICO evaluation, startup quality, investment return). The gap in the literature is almost total on this question. The Optimism futarchy experiment (conditional prediction markets for grant selection) is the closest thing, and it failed — but for implementation reasons.
+
+**KB connections:**
+- [[speculative markets aggregate information more accurately than expert consensus or voting systems]] — this claim needs the two-mechanism distinction added to be precise
+- FairScale case (Session 4): Mechanism B failure — fraud detection requires off-chain due diligence that market participants weren't incentivized to find
+- Trove Markets fraud (Session 8): Same pattern — Mechanism B failure, not Mechanism A
+- Participation concentration (70% top 50): Mechanism A is working fine (50 calibrated participants selecting); the question is whether Mechanism B is generating information acquisition from those participants
+
+**Extraction hints:**
+- PRIMARY CLAIM CANDIDATE: "Skin-in-the-game markets have two separable epistemic mechanisms with different replaceability" — the calibration-selection mechanism can be replicated by calibrated aggregation; the information-acquisition mechanism cannot. This distinction determines when prediction markets are epistemically necessary.
+- SECONDARY CLAIM: "Prediction market accuracy advantages over polls are domain-dependent — competitive polls can match market accuracy in public-information-synthesis contexts but not in information-asymmetric selection contexts"
+- ENRICHMENT TARGET: [[speculative markets aggregate information more accurately than expert consensus or voting systems]] — add two-mechanism scope qualifier
+
+**Context:** This research addresses the core "why do markets work" question that the futarchy thesis depends on. Mellers et al. is the most-cited academic challenge to prediction market epistemic superiority. Resolving it with a scope mismatch rather than a refutation is a significant outcome for the KB's claim structure.
+
+## Curator Notes
+
+PRIMARY CONNECTION: [[speculative markets aggregate information more accurately than expert consensus or voting systems]]
+WHY ARCHIVED: Resolves the Session 8 challenge to Belief #1; establishes the two-mechanism distinction that reframes multiple existing claims about futarchy's epistemic properties
+EXTRACTION HINT: The claim to extract is the two-mechanism distinction, not just a summary of the academic findings. Focus on Mechanism A (calibration-selection, replicable by polls) vs. Mechanism B (information-acquisition, not replicable). The finding is architecturally important — it should affect multiple existing claims as enrichments.
-- 
2.45.2

From 85af09a5b9384ead3b1981055e9a56f4b54908c1 Mon Sep 17 00:00:00 2001
From: Teleo Agents
Date: Sun, 22 Mar 2026 22:17:14 +0000
Subject: [PATCH 3/4] entity-batch: update 1 entities

- Applied 1 entity operations from queue
- Files: entities/internet-finance/kalshi.md

Pentagon-Agent: Epimetheus <968B2991-E2DF-4006-B962-F5B0A0CC8ACA>
---
 entities/internet-finance/kalshi.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/entities/internet-finance/kalshi.md b/entities/internet-finance/kalshi.md
index f9bdf09b5..883e54c30 100644
--- a/entities/internet-finance/kalshi.md
+++ b/entities/internet-finance/kalshi.md
@@ -52,6 +52,7 @@ CFTC-designated contract market for event-based trading. USD-denominated, KYC-re
 - **2026-03-17** — Arizona AG filed 20 criminal counts including illegal gambling and election wagering — first-ever criminal charges against a US prediction market platform
 - **2026-01-09** — Tennessee court ruled in favor of Kalshi in KalshiEx v. Orgel, finding impossibility of dual compliance and obstacle to federal objectives, creating circuit split with Maryland
 - **2026-03-19** — Ninth Circuit denied administrative stay motion, allowing Nevada to proceed with temporary restraining order that would exclude Kalshi from Nevada for at least two weeks pending preliminary injunction hearing
+- **2026-03-16** — Federal Reserve Board paper validates Kalshi prediction market accuracy, showing statistically significant improvement over Bloomberg consensus for CPI forecasting and perfect FOMC rate matching
 ## Competitive Position
 - **Regulation-first**: Only CFTC-designated prediction market exchange. Institutional credibility.
 - **vs Polymarket**: Different market — Kalshi targets mainstream/institutional users who won't touch crypto. Polymarket targets crypto-native users who want permissionless market creation. Both grew massively post-2024 election.
-- 
2.45.2

From b6cbf8618ec90a8fa30c361c2a84747cfb3285dd Mon Sep 17 00:00:00 2001
From: Teleo Agents
Date: Sun, 22 Mar 2026 22:16:44 +0000
Subject: [PATCH 4/4] extract: 2026-03-22-fed-research-kalshi-cpi-prediction-accuracy

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
---
 ...-fed-research-kalshi-cpi-prediction-accuracy.md | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/inbox/queue/2026-03-22-fed-research-kalshi-cpi-prediction-accuracy.md b/inbox/queue/2026-03-22-fed-research-kalshi-cpi-prediction-accuracy.md
index 2cd0e0728..9e7330a22 100644
--- a/inbox/queue/2026-03-22-fed-research-kalshi-cpi-prediction-accuracy.md
+++ b/inbox/queue/2026-03-22-fed-research-kalshi-cpi-prediction-accuracy.md
@@ -7,9 +7,12 @@ date: 2026-03-16
 domain: internet-finance
 secondary_domains: []
 format: article
-status: unprocessed
+status: enrichment
 priority: medium
 tags: [prediction-markets, kalshi, federal-reserve, cpi, accuracy, academic, markets-beat-consensus, macro-forecasting]
+processed_by: rio
+processed_date: 2026-03-22
+extraction_model: "anthropic/claude-sonnet-4.5"
 ---
 
 ## Content
@@ -56,3 +59,12 @@ A Federal Reserve Board paper (authors: Diercks, Katz, Wright) published March 2
 PRIMARY CONNECTION: [[speculative markets aggregate information more accurately than expert consensus or voting systems]]
 WHY ARCHIVED: Federal Reserve institutional validation of real-money prediction market accuracy; complements the Mellers academic literature and rounds out the evidence base for Belief #1's grounding claims
 EXTRACTION HINT: Archive as supporting evidence for the prediction markets accuracy claim, scoped to "structured macroeconomic event prediction." The FOMC-day perfect match finding is the most archivable specific claim. Note it doesn't address financial selection.
+
+
+## Key Facts
+- Federal Reserve Board published FEDS paper by Diercks, Katz, Wright in March 2026 evaluating Kalshi prediction market accuracy
+- Kalshi markets showed statistically significant improvement over Bloomberg consensus for headline CPI prediction
+- Kalshi markets achieved parity with Bloomberg consensus for core CPI and unemployment forecasting
+- Kalshi perfectly matched realized fed funds rate on the day before every FOMC meeting since 2022
+- Fed paper published same day as CFTC ANPRM (March 16, 2026)
+- Good Judgment Project superforecasters reportedly outperformed futures markets for Fed policy predictions by 66% (FT, July 2024)
-- 
2.45.2