Pipeline auto-fixer: removed [[ ]] brackets from links
that don't resolve to existing claims in the knowledge base.

2026-03-25 01:28:31 +00:00


type: claim
domain: internet-finance
secondary_domains: collective-intelligence
description: Optimism's futarchy experiment outperformed traditional grants by $32.5M TVL but overshot magnitude predictions by 8x, revealing mechanism's strength is comparative ranking not absolute forecasting
confidence: experimental
source: Optimism Futarchy v1 Preliminary Findings (2025-06-12), 21-day experiment with 430 forecasters
created: 2025-06-12
depends_on: MetaDAOs futarchy implementation shows limited trading volume in uncontested decisions.md

Futarchy excels at relative selection but fails at absolute prediction because ordinal ranking works while cardinal estimation requires calibration

Optimism's 21-day futarchy experiment (March-June 2025) reveals a critical distinction between futarchy's selection capability and prediction accuracy. The mechanism selected grants that outperformed traditional Grants Council picks by ~$32.5M TVL, primarily through choosing Balancer & Beets (~$27.8M gain) over Grants Council alternatives. Both methods converged on 2 of 5 projects (Rocket Pool, SuperForm), but futarchy's unique selections drove superior aggregate outcomes.

However, prediction accuracy was catastrophically poor. Markets predicted aggregate TVL increase of ~$239M against actual ~$31M—an 8x overshoot. Specific misses: Rocket Pool predicted $59.4M (actual: 0), SuperForm predicted $48.5M (actual: -$1.2M), Balancer & Beets predicted $47.9M (actual: -$13.7M despite being the top performer).
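The overshoot arithmetic above can be checked with a short sketch. It uses only the figures quoted in this note (in millions of USD of TVL change); the variable names are illustrative and not taken from the Optimism report.

```python
# Figures quoted in this note, in millions of USD of TVL change.
predicted_total = 239.0
actual_total = 31.0

# (predicted, actual) per project, as quoted above.
projects = {
    "Rocket Pool": (59.4, 0.0),
    "SuperForm": (48.5, -1.2),
    "Balancer & Beets": (47.9, -13.7),
}

# Aggregate overshoot: how many times larger the forecast was than the outcome.
overshoot_ratio = predicted_total / actual_total
print(f"aggregate overshoot: {overshoot_ratio:.1f}x")

# Per-project miss: forecast minus realized change.
misses = {name: pred - actual for name, (pred, actual) in projects.items()}
for name, miss in misses.items():
    print(f"{name}: forecast exceeded outcome by ${miss:.1f}M")
```

The ratio comes out near 7.7, which the note rounds to 8x.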

The mechanism's strength is ordinal ranking weighted by conviction—markets correctly identified which projects would perform better relative to alternatives. The failure is cardinal estimation—markets could not calibrate absolute magnitudes. This suggests futarchy works through comparative advantage assessment ("this will outperform that") rather than precise forecasting ("this will generate exactly $X").
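One way to make the ordinal versus cardinal distinction concrete is to score the same forecasts two ways: by whether they order projects correctly, and by how far off the magnitudes are. The sketch below is an illustration only. It uses the three predicted/actual pairs quoted in this note, and the two metrics (pairwise concordance and mean absolute error) are choices made here, not metrics from the Optimism report. It compares the market's forecasts for funded projects against each other, which is a different comparison from the futarchy versus Grants Council result behind the +$32.5M figure.

```python
from itertools import combinations

# (predicted, actual) TVL change in millions of USD, as quoted in this note.
forecasts = {
    "Rocket Pool": (59.4, 0.0),
    "SuperForm": (48.5, -1.2),
    "Balancer & Beets": (47.9, -13.7),
}


def pairwise_concordance(pairs):
    """Fraction of project pairs whose predicted order matches the realized order."""
    agree = 0
    total = 0
    for a, b in combinations(pairs, 2):
        pred_diff = pairs[a][0] - pairs[b][0]
        actual_diff = pairs[a][1] - pairs[b][1]
        if pred_diff == 0 or actual_diff == 0:
            continue  # ties carry no ordering information
        total += 1
        if (pred_diff > 0) == (actual_diff > 0):
            agree += 1
    return agree / total if total else float("nan")


def mean_absolute_error(pairs):
    """Average absolute gap between forecast and outcome (cardinal accuracy)."""
    return sum(abs(pred - actual) for pred, actual in pairs.values()) / len(pairs)


ordinal_score = pairwise_concordance(forecasts)
cardinal_error = mean_absolute_error(forecasts)
print(f"pairwise concordance: {ordinal_score:.2f}")
print(f"mean absolute error: ${cardinal_error:.1f}M")
```

On these three pairs the predicted ordering agrees with the realized ordering for every pair, while the average magnitude error is about $56.9M. Three data points are far too few to establish the claim on their own; the sketch only shows how the two kinds of accuracy can diverge.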

Contributing factors to prediction failure: the play-money environment created no downside risk for inflated predictions; the $50M initial liquidity anchor may have skewed price discovery; some participants voted strategically to influence allocations; and the TVL metric conflated ETH price movements with project quality.

Evidence

Optimism Futarchy v1 experiment: 430 active forecasters, 5,898 trades, selected 5 of 23 grant candidates
Selection performance: futarchy +$32.5M vs Grants Council, driven by Balancer & Beets (+$27.8M)
Prediction accuracy: predicted $239M aggregate TVL, actual $31M (8x overshoot)
Individual project misses: Rocket Pool 0 vs $59.4M predicted, SuperForm -$1.2M vs $48.5M predicted, Balancer & Beets -$13.7M vs $47.9M predicted
Play-money structure: no real capital at risk, 41% of participants hedged in final days to avoid losses

Challenges

This was a play-money experiment, which is the primary confound. Real-money futarchy may produce different calibration through actual downside risk. The 84-day measurement window may have been too short for TVL impact to materialize. ETH price volatility during the measurement period confounded project-specific performance attribution.
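The ETH price confound can be illustrated by splitting a USD-denominated TVL change into a price effect and a net-deposit effect. This is a minimal sketch under strong assumptions: it treats a project's TVL as held entirely in ETH, and every number below is hypothetical, chosen only to show the decomposition. None of it comes from the Optimism data.

```python
def decompose_tvl_change(eth_start, eth_end, price_start, price_end):
    """Split a USD TVL change into an ETH price effect and a net-deposit effect.

    Uses the exact identity
        eth_end * price_end - eth_start * price_start
            = eth_start * (price_end - price_start)   # price effect
            + price_end * (eth_end - eth_start)       # deposit (flow) effect
    """
    price_effect = eth_start * (price_end - price_start)
    flow_effect = price_end * (eth_end - eth_start)
    return price_effect, flow_effect


# Hypothetical project: deposits grow by 1,000 ETH while the ETH price falls.
eth_start, eth_end = 10_000, 11_000      # hypothetical ETH locked
price_start, price_end = 2_000, 1_800    # hypothetical USD per ETH

total_change = eth_end * price_end - eth_start * price_start
price_effect, flow_effect = decompose_tvl_change(
    eth_start, eth_end, price_start, price_end
)

print(f"total TVL change: {total_change:+,} USD")
print(f"from ETH price:   {price_effect:+,} USD")
print(f"from deposits:    {flow_effect:+,} USD")
```

In this hypothetical case the project attracts new deposits worth +1,800,000 USD, yet its USD TVL falls by 200,000 USD because the price move subtracts 2,000,000 USD. A raw TVL metric would score that project as a decline.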

Additional Evidence (extend)

Source: 2024-11-25-futardio-proposal-launch-a-boost-for-hnt-ore | Added: 2026-03-12 | Extractor: anthropic/claude-sonnet-4.5

ORE's HNT-ORE boost proposal demonstrates futarchy's strength in relative selection: the market validated HNT as the next liquidity pair to boost relative to other candidates (ISC already had a boost at equivalent multiplier), but the proposal does not require absolute prediction of HNT's future price or utility—only that HNT is a better strategic choice than alternatives. The proposal passed by market consensus on relative positioning (HNT as flagship DePIN project post-HIP-138), not by predicting absolute HNT performance metrics.

Additional Evidence (confirm)

Source: 2024-11-25-futardio-proposal-launch-a-boost-for-hnt-ore | Added: 2026-03-16

ORE's three-tier boost multiplier system (vanilla stake, critical pairs, extended pairs) demonstrates futarchy's strength at relative ranking. The proposal doesn't require markets to predict absolute HNT-ORE liquidity outcomes, only to rank this boost against alternatives. Future proposals apply to tiers as wholes, further simplifying the ordinal comparison task.

Additional Evidence (extend)

Source: 2026-03-05-futardio-launch-blockrock | Added: 2026-03-16

BlockRock explicitly argues futarchy works better for liquid asset allocation than illiquid VC: 'Futarchy governance works by letting markets price competing outcomes, but private VC deals are difficult to price with asymmetric information, long timelines, and binary outcomes. Liquid asset allocation for risk-adjusted returns gives futarchy the pricing efficiency it requires.' This identifies information asymmetry and timeline as the boundary conditions where futarchy pricing breaks down.

Additional Evidence (extend)

Source: 2026-03-21-blockworks-ranger-ico-outcome | Added: 2026-03-21

The Ranger Finance case shows futarchy can succeed at ordinal selection (this project vs. others for fundraising) while failing at cardinal prediction (what the token price will be post-TGE given unlock schedules). The market selected Ranger for an ICO but did not price in the 40% seed unlock that produced a 74-90% drawdown, suggesting the mechanism works for relative comparison but not for absolute outcome forecasting when structural features like vesting schedules matter.

Additional Evidence (challenge)

Source: 2026-03-21-phemex-hurupay-ico-failure | Added: 2026-03-21

Hurupay had $7.2M/month transaction volume and $500K+ monthly revenue but failed to raise $3M. The market rejection is interpretively ambiguous: either (A) correct valuation assessment (mechanism working) or (B) platform reputation contamination from prior Trove/Ranger failures (mechanism producing noise). Without controls, we cannot distinguish quality signal from sentiment contagion, revealing a fundamental limitation in interpreting futarchy selection outcomes.

Additional Evidence (extend)

Source: 2026-03-24-gg-research-futarchy-vs-grants-council-optimism-experiment | Added: 2026-03-24

The Optimism comparison adds the EV vs. variance dimension: futarchy's relative selection advantage (+$32.5M aggregate TVL) held despite 8x absolute prediction overshoot. The selection quality (which projects to fund) was superior even when the prediction quality (how much TVL they would generate) was catastrophically wrong. This suggests the relative selection mechanism is robust to calibration failures.

Additional Evidence (extend)

Source: 2026-03-23-ranger-finance-metadao-liquidation-5m-usdc | Added: 2026-03-25

Ranger Finance reveals a critical scope boundary: futarchy's ICO selection market chose the project without pricing in false volume claims during fundraising (~$8M raised), but POST-discovery, the liquidation governance mechanism worked decisively. The mechanism is better at enforcing governance decisions after information emerges than at doing pre-launch due diligence with thin markets and off-chain information asymmetries. This suggests futarchy handles relative selection among known options better than absolute quality assessment with hidden information.

Relevant Notes:

MetaDAOs futarchy implementation shows limited trading volume in uncontested decisions.md
speculative markets aggregate information through incentive and selection effects not wisdom of crowds.md
optimal governance requires mixing mechanisms because different decisions have different manipulation risk profiles.md

Topics:

domains/internet-finance/_map
foundations/collective-intelligence/_map
