Compare commits

...

7 commits

Author SHA1 Message Date
Teleo Agents
8299f0abfd extract: 2025-00-00-em-dpo-heterogeneous-preferences
Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>
2026-03-16 14:47:39 +00:00
Leo
29a7e87561 Merge pull request 'extract: 2026-03-05-futardio-launch-phonon-studio-ai' (#1125) from extract/2026-03-05-futardio-launch-phonon-studio-ai into main
2026-03-16 14:38:33 +00:00
Teleo Agents
0cddd00834 auto-fix: strip 1 broken wiki links
Pipeline auto-fixer: removed [[ ]] brackets from links
that don't resolve to existing claims in the knowledge base.
2026-03-16 14:38:31 +00:00
Teleo Agents
addb1a0ae4 extract: 2026-03-05-futardio-launch-phonon-studio-ai
Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>
2026-03-16 14:38:31 +00:00
Leo
0de2d6f707 Merge pull request 'extract: 2026-02-00-an-differentiable-social-choice' (#1113) from extract/2026-02-00-an-differentiable-social-choice into main
2026-03-16 14:36:55 +00:00
Teleo Agents
79bb2e382b auto-fix: strip 4 broken wiki links
Pipeline auto-fixer: removed [[ ]] brackets from links
that don't resolve to existing claims in the knowledge base.
2026-03-16 14:36:53 +00:00
Teleo Agents
5d73336c5c extract: 2026-02-00-an-differentiable-social-choice
Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>
2026-03-16 14:36:53 +00:00
11 changed files with 182 additions and 8 deletions

View file

@@ -37,6 +37,12 @@ Chakraborty et al., "MaxMin-RLHF: Alignment with Diverse Human Preferences," ICM
- Tulu2-7B: 56.67% win rate across both groups vs 42% minority/70.4% majority for single reward
- 33% improvement for minority groups without majority compromise
### Additional Evidence (extend)
*Source: [[2025-00-00-em-dpo-heterogeneous-preferences]] | Added: 2026-03-16*
MMRA extends maxmin RLHF to the deployment phase by minimizing maximum regret across preference groups when the user's type is unknown at inference time, showing how egalitarian principles can govern both training and inference in pluralistic systems.
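A minimal sketch of that deployment rule, assuming the K per-group reward models are available as score vectors over candidate responses (names and shapes are illustrative, not the paper's API):
```python
import numpy as np

def mmra_select(rewards: np.ndarray) -> int:
    """rewards[k, y] = group k's reward-model score for candidate response y."""
    best_per_group = rewards.max(axis=1, keepdims=True)  # each group's ideal pick
    regret = best_per_group - rewards                    # regret[k, y]
    worst_case = regret.max(axis=0)                      # worst group, per response
    return int(worst_case.argmin())                      # minimize the maximum regret

# Example: 3 preference groups scoring 4 candidate responses.
rewards = np.array([[0.9, 0.2, 0.6, 0.5],
                    [0.1, 0.8, 0.6, 0.5],
                    [0.3, 0.3, 0.7, 0.9]])
print(mmra_select(rewards))  # 2: the compromise response no group regrets much
```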
---
Relevant Notes:

View file

@@ -25,6 +25,12 @@ Since [[universal alignment is mathematically impossible because Arrows impossib
MaxMin-RLHF provides a constructive implementation of pluralistic alignment through mixture-of-rewards and egalitarian optimization. Rather than collapsing preferences into a single reward, it learns separate reward models for each subpopulation and optimizes for the worst-off group (Sen's egalitarian principle). At Tulu2-7B scale, this achieved a 56.67% win rate across both majority and minority groups, compared to the single-reward baseline's 70.4%/42% split. The mechanism accommodates irreducible diversity by maintaining separate reward functions rather than forcing convergence.
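As a rough sketch (not the authors' code), the egalitarian objective replaces the usual pooled average with a minimum over group reward models:
```python
import numpy as np

# Illustrative MaxMin-RLHF objective: optimize the policy for the worst-off
# group, J(pi) = min_k E_{y ~ pi}[ r_k(y) ], instead of one pooled reward.

def maxmin_objective(group_rewards: np.ndarray) -> float:
    """group_rewards[k] = Monte Carlo estimate of E_{y~pi}[r_k(y)]."""
    return float(group_rewards.min())   # raise the floor (egalitarian criterion)

def pooled_objective(group_rewards: np.ndarray, weights: np.ndarray) -> float:
    """Single-reward baseline: a population-weighted average, which a
    majority-heavy split tilts toward the majority group."""
    return float(group_rewards @ weights)

# Example echoing the 70.4%/42% split above: majority served, minority not.
r = np.array([0.704, 0.42])
print(pooled_objective(r, np.array([0.7, 0.3])))  # looks fine on average
print(maxmin_objective(r))                        # exposes the worst-off group
```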
### Additional Evidence (confirm)
*Source: [[2025-00-00-em-dpo-heterogeneous-preferences]] | Added: 2026-03-16*
EM-DPO implements this through an ensemble architecture: it discovers K latent preference types, trains K specialized models, and deploys them simultaneously with egalitarian aggregation. This demonstrates that pluralistic alignment is technically feasible without demographic labels or manual preference specification.
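As a runnable toy of the EM loop (my construction: a Plackett-Luce mixture over three fixed responses stands in for the paper's DPO-based updates; all names are illustrative):
```python
import numpy as np

def pl_loglik(s, ranking):
    """Plackett-Luce log-likelihood of one ranking under score vector s."""
    ll, remaining = 0.0, list(ranking)
    while remaining:
        ll += s[remaining[0]] - np.logaddexp.reduce(s[remaining])
        remaining = remaining[1:]
    return ll

def em_mixture_pl(rankings, K=2, n_items=3, iters=50, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    S = rng.normal(size=(K, n_items))                 # per-type score vectors
    for _ in range(iters):
        # E-step: responsibilities P(type k | user's observed ranking)
        ll = np.array([[pl_loglik(S[k], r) for k in range(K)] for r in rankings])
        ll -= ll.max(axis=1, keepdims=True)
        resp = np.exp(ll)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weighted gradient ascent on each type's ranking likelihood
        for k in range(K):
            grad = np.zeros(n_items)
            for w, r in zip(resp[:, k], rankings):
                remaining = list(r)
                while remaining:
                    p = np.exp(S[k][remaining]); p /= p.sum()
                    grad[remaining[0]] += w   # chosen item at this stage
                    grad[remaining] -= w * p  # softmax over items still in play
                    remaining = remaining[1:]
            S[k] += lr * grad / len(rankings)
    return S, resp

# Two latent taste types over responses {0, 1, 2}, no demographic labels:
rankings = [(0, 1, 2)] * 6 + [(2, 1, 0)] * 4
S, resp = em_mixture_pl(rankings)   # resp rows should concentrate on one type
```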
---
Relevant Notes:

View file

@@ -29,10 +29,22 @@ The paper's proposed solution—RLCHF with explicit social welfare functions—c
### Additional Evidence (extend)
-*Source: [[2025-06-00-li-scaling-human-judgment-community-notes-llms]] | Added: 2026-03-15*
+*Source: 2025-06-00-li-scaling-human-judgment-community-notes-llms | Added: 2026-03-15*
RLCF makes the social choice mechanism explicit through the bridging algorithm (matrix factorization with intercept scores). Unlike standard RLHF, which aggregates preferences opaquely through reward-model training, RLCF's use of intercepts as the training signal is a deliberate choice to optimize for cross-partisan agreement, a specific social welfare function.
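For reference, the bridging model the note points to has the standard matrix-factorization-with-intercepts form (notation mine):
```latex
\hat{r}_{un} = \mu + i_u + i_n + \mathbf{f}_u \cdot \mathbf{f}_n
```
The factor product absorbs rating variation explained by viewpoint alignment between rater $u$ and note $n$, so the note intercept $i_n$ is the helpfulness that survives after controlling for partisan agreement; taking $i_n$ as the training signal is precisely the explicit welfare choice described above.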
### Additional Evidence (confirm)
*Source: [[2026-02-00-an-differentiable-social-choice]] | Added: 2026-03-16*
A comprehensive February 2026 survey by An & Du documents that contemporary ML systems implement social choice mechanisms implicitly across RLHF, participatory budgeting, and liquid democracy, and identifies 18 open problems spanning incentive guarantees and pluralistic preference aggregation.
### Additional Evidence (extend)
*Source: [[2025-00-00-em-dpo-heterogeneous-preferences]] | Added: 2026-03-16*
EM-DPO makes the social choice function explicit by using MinMax Regret Aggregation based on egalitarian fairness principles, demonstrating that pluralistic alignment requires choosing a specific social welfare function (here: maximin regret) rather than pretending aggregation is value-neutral.
---
Relevant Notes:

View file

@@ -29,10 +29,22 @@ Chakraborty, Qiu, Yuan, Koppel, Manocha, Huang, Bedi, Wang. "MaxMin-RLHF: Alignm
### Additional Evidence (confirm)
-*Source: [[2025-11-00-operationalizing-pluralistic-values-llm-alignment]] | Added: 2026-03-15*
+*Source: 2025-11-00-operationalizing-pluralistic-values-llm-alignment | Added: 2026-03-15*
Study demonstrates that models trained on different demographic populations show measurable behavioral divergence (3-5 percentage points), providing empirical evidence that single-reward functions trained on one population systematically misalign with others.
### Additional Evidence (extend)
*Source: [[2026-02-00-an-differentiable-social-choice]] | Added: 2026-03-16*
An & Du's survey reveals the mechanism behind single-reward failure: RLHF performs social choice (preference aggregation) but treats it as an engineering detail rather than a normative design choice, so the aggregation function is chosen implicitly, without examining which fairness criteria it satisfies.
### Additional Evidence (extend)
*Source: [[2025-00-00-em-dpo-heterogeneous-preferences]] | Added: 2026-03-16*
EM-DPO provides formal proof that binary comparisons are mathematically insufficient for preference type identification, explaining WHY single-reward RLHF fails: the training signal format cannot contain the information needed to discover heterogeneity, regardless of dataset size. Rankings over 3+ responses are necessary.
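To make the contrast concrete (standard likelihood forms, notation mine): a pairwise Bradley-Terry comparison exposes only a single marginal, while a Plackett-Luce ranking over three responses couples successive choices:
```latex
P(y \succ y') = \frac{e^{s(y)}}{e^{s(y)} + e^{s(y')}}
\qquad
P(y_1 \succ y_2 \succ y_3) = \prod_{j=1}^{3} \frac{e^{s(y_j)}}{\sum_{l=j}^{3} e^{s(y_l)}}
```
Under a mixture of latent types the observed pairwise probabilities are averages over types, and distinct mixtures can yield identical pairwise marginals; the within-ranking dependence in the listwise form is what separates them, which is how I read the note's insufficiency result.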
---
Relevant Notes:

View file

@@ -27,6 +27,12 @@ From the MetaDAO proposal:
This claim extends futarchy-governed-permissionless-launches-require-brand-separation-to-manage-reputational-liability-because-failed-projects-on-a-curated-platform-damage-the-platforms-credibility by showing the reputational concern operates at the mechanism level, not just the platform level. The market's rejection of Futardio suggests futarchy stakeholders prioritize mechanism credibility over short-term adoption metrics.
### Additional Evidence (confirm)
*Source: [[2026-03-05-futardio-launch-phonon-studio-ai]] | Added: 2026-03-16*
Phonon Studio AI targeted an $88,888 raise but ended in 'Refunding' status within one day (launched 2026-03-05, closed 2026-03-06). The project had live product traction (1,000+ songs generated in the first week, functional tokenized AI-artist logic) but still failed to attract capital, suggesting futarchy-governed launches face quality-perception issues even when projects demonstrate real product-market validation.
---
Relevant Notes:

View file

@@ -56,10 +56,16 @@ Hurupay raised $2,003,593 against a $3,000,000 target (67% of goal) and entered
### Additional Evidence (challenge)
-*Source: [[2026-03-03-futardio-launch-cloak]] | Added: 2026-03-16*
+*Source: 2026-03-03-futardio-launch-cloak | Added: 2026-03-16*
Cloak raised only $1,455 against a $300,000 target (0.5% of target), entering refunding status. This represents a near-total failure of market validation, contrasting sharply with the 15x oversubscription pattern. The project had a shipped product (live mainnet beta with Oro integration), a credible team (repeat builders, Superteam contributors), and addressed a real problem (MEV extraction on DCA orders). Despite these fundamentals, the futarchy-governed raise failed to attract capital, suggesting that product-market fit and team credibility are insufficient without a pre-existing community or distribution.
### Additional Evidence (challenge)
*Source: [[2026-03-05-futardio-launch-phonon-studio-ai]] | Added: 2026-03-16*
Phonon Studio AI launch failed to reach its $88,888 target and entered refunding status, demonstrating that not all futarchy-governed raises succeed. The project had demonstrable traction (live product, 1000+ songs generated, functional token mechanics) but still failed to attract sufficient capital, suggesting futarchy capital formation success is not uniform across project types or market conditions.
---
Relevant Notes:

View file

@@ -0,0 +1,48 @@
{
"rejected_claims": [
{
"filename": "binary-preference-comparisons-cannot-identify-latent-preference-types-making-pairwise-rlhf-structurally-blind-to-diversity.md",
"issues": [
"missing_attribution_extractor"
]
},
{
"filename": "em-algorithm-preference-clustering-discovers-latent-user-types-without-demographic-labels-enabling-unsupervised-pluralistic-alignment.md",
"issues": [
"missing_attribution_extractor"
]
},
{
"filename": "minmax-regret-aggregation-ensures-no-preference-group-is-severely-underserved-by-applying-egalitarian-fairness-to-ensemble-deployment.md",
"issues": [
"missing_attribution_extractor"
]
}
],
"validation_stats": {
"total": 3,
"kept": 0,
"fixed": 11,
"rejected": 3,
"fixes_applied": [
"binary-preference-comparisons-cannot-identify-latent-preference-types-making-pairwise-rlhf-structurally-blind-to-diversity.md:set_created:2026-03-16",
"binary-preference-comparisons-cannot-identify-latent-preference-types-making-pairwise-rlhf-structurally-blind-to-diversity.md:stripped_wiki_link:single-reward-rlhf-cannot-align-diverse-preferences-because-",
"binary-preference-comparisons-cannot-identify-latent-preference-types-making-pairwise-rlhf-structurally-blind-to-diversity.md:stripped_wiki_link:rlhf-is-implicit-social-choice-without-normative-scrutiny.md",
"binary-preference-comparisons-cannot-identify-latent-preference-types-making-pairwise-rlhf-structurally-blind-to-diversity.md:stripped_wiki_link:pluralistic alignment must accommodate irreducibly diverse v",
"em-algorithm-preference-clustering-discovers-latent-user-types-without-demographic-labels-enabling-unsupervised-pluralistic-alignment.md:set_created:2026-03-16",
"em-algorithm-preference-clustering-discovers-latent-user-types-without-demographic-labels-enabling-unsupervised-pluralistic-alignment.md:stripped_wiki_link:modeling preference sensitivity as a learned distribution ra",
"em-algorithm-preference-clustering-discovers-latent-user-types-without-demographic-labels-enabling-unsupervised-pluralistic-alignment.md:stripped_wiki_link:pluralistic alignment must accommodate irreducibly diverse v",
"minmax-regret-aggregation-ensures-no-preference-group-is-severely-underserved-by-applying-egalitarian-fairness-to-ensemble-deployment.md:set_created:2026-03-16",
"minmax-regret-aggregation-ensures-no-preference-group-is-severely-underserved-by-applying-egalitarian-fairness-to-ensemble-deployment.md:stripped_wiki_link:maxmin-rlhf-applies-egalitarian-social-choice-to-alignment-b",
"minmax-regret-aggregation-ensures-no-preference-group-is-severely-underserved-by-applying-egalitarian-fairness-to-ensemble-deployment.md:stripped_wiki_link:post-arrow-social-choice-mechanisms-work-by-weakening-indepe",
"minmax-regret-aggregation-ensures-no-preference-group-is-severely-underserved-by-applying-egalitarian-fairness-to-ensemble-deployment.md:stripped_wiki_link:minority-preference-alignment-improves-33-percent-without-ma"
],
"rejections": [
"binary-preference-comparisons-cannot-identify-latent-preference-types-making-pairwise-rlhf-structurally-blind-to-diversity.md:missing_attribution_extractor",
"em-algorithm-preference-clustering-discovers-latent-user-types-without-demographic-labels-enabling-unsupervised-pluralistic-alignment.md:missing_attribution_extractor",
"minmax-regret-aggregation-ensures-no-preference-group-is-severely-underserved-by-applying-egalitarian-fairness-to-ensemble-deployment.md:missing_attribution_extractor"
]
},
"model": "anthropic/claude-sonnet-4.5",
"date": "2026-03-16"
}
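
The `stripped_wiki_link` entries above imply a link-resolution pass of roughly the following shape. This is a hypothetical sketch, not the pipeline's actual code; `claims_dir`, the slug rule, and the 60-character log truncation are inferred from the log format.
```python
import re
from pathlib import Path

WIKI_LINK = re.compile(r"\[\[([^\[\]]+)\]\]")

def strip_broken_wiki_links(text: str, claims_dir: Path) -> tuple[str, list[str]]:
    """Drop the [[ ]] brackets from links whose target claim file is missing."""
    stripped: list[str] = []
    def fix(match: re.Match) -> str:
        target = match.group(1)
        slug = target.lower().replace(" ", "-").removesuffix(".md")
        if (claims_dir / f"{slug}.md").exists():
            return match.group(0)       # resolvable: keep the wiki link intact
        stripped.append(target[:60])    # log entry, truncated like the ones above
        return target                   # broken: keep the text, drop the brackets
    return WIKI_LINK.sub(fix, text), stripped
```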

View file

@@ -0,0 +1,42 @@
{
"rejected_claims": [
{
"filename": "rlhf-implements-implicit-social-choice-without-normative-scrutiny.md",
"issues": [
"missing_attribution_extractor"
]
},
{
"filename": "impossibility-results-become-optimization-tradeoffs-in-learned-mechanisms.md",
"issues": [
"missing_attribution_extractor"
]
},
{
"filename": "inverse-mechanism-learning-can-detect-implicit-social-choice-functions.md",
"issues": [
"missing_attribution_extractor"
]
}
],
"validation_stats": {
"total": 3,
"kept": 0,
"fixed": 5,
"rejected": 3,
"fixes_applied": [
"rlhf-implements-implicit-social-choice-without-normative-scrutiny.md:set_created:2026-03-16",
"rlhf-implements-implicit-social-choice-without-normative-scrutiny.md:stripped_wiki_link:universal-alignment-is-mathematically-impossible-because-Arr",
"impossibility-results-become-optimization-tradeoffs-in-learned-mechanisms.md:set_created:2026-03-16",
"impossibility-results-become-optimization-tradeoffs-in-learned-mechanisms.md:stripped_wiki_link:universal-alignment-is-mathematically-impossible-because-Arr",
"inverse-mechanism-learning-can-detect-implicit-social-choice-functions.md:set_created:2026-03-16"
],
"rejections": [
"rlhf-implements-implicit-social-choice-without-normative-scrutiny.md:missing_attribution_extractor",
"impossibility-results-become-optimization-tradeoffs-in-learned-mechanisms.md:missing_attribution_extractor",
"inverse-mechanism-learning-can-detect-implicit-social-choice-functions.md:missing_attribution_extractor"
]
},
"model": "anthropic/claude-sonnet-4.5",
"date": "2026-03-16"
}

View file

@@ -7,9 +7,13 @@ date: 2025-01-01
domain: ai-alignment
secondary_domains: []
format: paper
-status: unprocessed
+status: enrichment
priority: medium
tags: [pluralistic-alignment, EM-algorithm, preference-clustering, ensemble-LLM, fairness]
processed_by: theseus
processed_date: 2026-03-16
enrichments_applied: ["single-reward-rlhf-cannot-align-diverse-preferences-because-alignment-gap-grows-proportional-to-minority-distinctiveness.md", "rlhf-is-implicit-social-choice-without-normative-scrutiny.md", "pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state.md", "maxmin-rlhf-applies-egalitarian-social-choice-to-alignment-by-maximizing-minimum-utility-across-preference-groups.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content
@@ -39,3 +43,10 @@ EM-DPO uses expectation-maximization to simultaneously uncover latent user prefe
PRIMARY CONNECTION: RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values
WHY ARCHIVED: The binary-comparison insufficiency claim is a novel formal result that strengthens the case against standard alignment approaches
EXTRACTION HINT: Focus on the formal insufficiency of binary comparisons and the EM + egalitarian aggregation combination
## Key Facts
- EM-DPO presented at EAAMO 2025 (Equity and Access in Algorithms, Mechanisms, and Optimization)
- EM-DPO uses rankings over 3+ responses rather than binary comparisons for preference data
- MinMax Regret Aggregation is based on egalitarian social choice theory
- The paper focuses on fairness rather than efficiency, distinguishing it from PAL's approach

View file

@@ -7,10 +7,14 @@ date: 2026-02-01
domain: ai-alignment
secondary_domains: [mechanisms, collective-intelligence]
format: paper
-status: unprocessed
+status: enrichment
priority: medium
tags: [differentiable-social-choice, learned-mechanisms, voting-rules, rlhf-as-voting, impossibility-as-tradeoff, open-problems]
flagged_for_rio: ["Differentiable auctions and economic mechanisms — direct overlap with mechanism design territory"]
processed_by: theseus
processed_date: 2026-03-16
enrichments_applied: ["rlhf-is-implicit-social-choice-without-normative-scrutiny.md", "single-reward-rlhf-cannot-align-diverse-preferences-because-alignment-gap-grows-proportional-to-minority-distinctiveness.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content
@@ -40,8 +44,8 @@ Published February 2026. Comprehensive survey of differentiable social choice
**What I expected but didn't find:** No specific engagement with RLCF or bridging-based approaches. The paper is a survey, not a solution proposal.
**KB connections:**
-- [[designing coordination rules is categorically different from designing coordination outcomes]] — differentiable social choice designs rules that learn outcomes
-- [[universal alignment is mathematically impossible because Arrows impossibility theorem applies]] — impossibility results become optimization constraints
+- designing coordination rules is categorically different from designing coordination outcomes — differentiable social choice designs rules that learn outcomes
+- universal alignment is mathematically impossible because Arrows impossibility theorem applies — impossibility results become optimization constraints
**Extraction hints:** Claims about (1) RLHF as implicit social choice without normative scrutiny, (2) impossibility results as optimization trade-offs not brick walls, (3) differentiable mechanisms as learnable alternatives to designed ones.
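A minimal sketch of that trade-off framing (my construction, not the survey's code): give a positional scoring rule learnable weights and express an axiom, here Condorcet consistency, as a soft penalty a training loss can trade against other objectives rather than a hard constraint:
```python
import numpy as np

def aggregate(profiles: np.ndarray, w: np.ndarray) -> np.ndarray:
    """profiles[v] lists candidates in voter v's preference order;
    w[pos] is the learnable score awarded for position pos."""
    scores = np.zeros(profiles.max() + 1)
    for ranking in profiles:
        scores[ranking] += w
    return scores

def condorcet_penalty(profiles: np.ndarray, scores: np.ndarray) -> float:
    """Soft penalty whenever a pairwise-majority winner is outscored."""
    pos = np.argsort(profiles, axis=1)   # pos[v, c] = position of candidate c
    penalty = 0.0
    for a in range(len(scores)):
        for b in range(len(scores)):
            if a != b and (pos[:, a] < pos[:, b]).mean() > 0.5:
                penalty += max(0.0, float(scores[b] - scores[a]))
    return penalty

# Three voters, three candidates; a Borda-like w already ranks the Condorcet
# winner on top here, so the penalty term is zero for this profile.
profiles = np.array([[0, 1, 2], [1, 0, 2], [2, 0, 1]])
print(condorcet_penalty(profiles, aggregate(profiles, np.array([2.0, 1.0, 0.0]))))
```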
@@ -51,3 +55,10 @@ Published February 2026. Comprehensive survey of differentiable social choice
PRIMARY CONNECTION: [[designing coordination rules is categorically different from designing coordination outcomes as nine intellectual traditions independently confirm]]
WHY ARCHIVED: RLHF-as-social-choice framing + impossibility-as-optimization-tradeoff = new lens on our coordination thesis
EXTRACTION HINT: Focus on "RLHF is implicit social choice" and "impossibility as optimization trade-off" — these are the novel framing claims
## Key Facts
- An & Du published comprehensive survey of differentiable social choice in February 2026
- Survey identifies 18 open problems in the field
- Six interconnected domains surveyed: differentiable economics, neural social choice, AI alignment as social choice, participatory budgeting, liquid democracy, inverse mechanism learning
- Field of differentiable social choice emerged within last 5 years

View file

@@ -6,9 +6,13 @@ url: "https://www.futard.io/launch/x1yqPH8mutuiqkrz66DPwFw1ykQqT4v5KyUUtUzBgPA"
date: 2026-03-05
domain: internet-finance
format: data
-status: unprocessed
+status: enrichment
tags: [futardio, metadao, futarchy, solana]
event_type: launch
processed_by: rio
processed_date: 2026-03-16
enrichments_applied: ["futarchy-governed-memecoin-launchpads-face-reputational-risk-tradeoff-between-adoption-and-credibility.md", "metadao-ico-platform-demonstrates-15x-oversubscription-validating-futarchy-governed-capital-formation.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
---
## Launch Details
@@ -173,3 +177,13 @@ Phonon is already live which means there is real product market validation, meas
- Token mint: `J697wnGGP8yWhYSrrMNsfH7cpKqp8up4uteigCHZmeta`
- Version: v0.7
- Closed: 2026-03-06
## Key Facts
- Phonon Studio AI launched on Futardio 2026-03-05 with $88,888 USDC target
- Phonon Studio AI fundraise entered refunding status by 2026-03-06
- Phonon generated 1000+ AI songs in first week of operation
- Phonon uses Meteora Dynamic Bonding Pool protocol for artist token trading
- Phonon proposed $11,777 monthly operational allowance
- Phonon token: J69, mint address J697wnGGP8yWhYSrrMNsfH7cpKqp8up4uteigCHZmeta
- Phonon launch address: x1yqPH8mutuiqkrz66DPwFw1ykQqT4v5KyUUtUzBgPA