extract: 2024-02-00-chakraborty-maxmin-rlhf #914

Merged
leo merged 6 commits from extract/2024-02-00-chakraborty-maxmin-rlhf into main 2026-03-15 17:13:17 +00:00
18 changed files with 568 additions and 6 deletions

@ -0,0 +1,49 @@
---
type: claim
domain: ai-alignment
description: "MaxMin-RLHF adapts Sen's Egalitarian principle to AI alignment through mixture-of-rewards and maxmin optimization"
confidence: experimental
source: "Chakraborty et al., MaxMin-RLHF (ICML 2024)"
created: 2026-03-11
secondary_domains: [collective-intelligence]
---
# MaxMin-RLHF applies egalitarian social choice to alignment by maximizing minimum utility across preference groups rather than averaging preferences
MaxMin-RLHF reframes alignment as a fairness problem by applying Sen's Egalitarian principle from social choice theory: "society should focus on maximizing the minimum utility of all individuals." Instead of aggregating diverse preferences into a single reward function (which the authors prove impossible), MaxMin-RLHF learns a mixture of reward models and optimizes for the worst-off group.
**The mechanism has two components:**
1. **EM Algorithm for Reward Mixture:** Iteratively clusters humans based on preference compatibility and updates subpopulation-specific reward functions until convergence. This discovers latent preference groups from preference data.
2. **MaxMin Objective:** During policy optimization, maximize the minimum utility across all discovered preference groups. This ensures no group is systematically ignored.
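The contrast between averaging and the maxmin objective can be sketched as a selection over candidate policies. This is a minimal illustration, assuming per-group utilities are already known; the policy names, utility values, and `group_weights` are hypothetical and are not results from the paper, which optimizes a policy with RL rather than choosing from a fixed list.

```python
from typing import Dict

# Hypothetical expected utilities per preference group for three candidate policies.
# Illustrative numbers only; they are not taken from the paper.
CANDIDATES: Dict[str, Dict[str, float]] = {
    "majority_focused": {"majority": 0.9, "minority": 0.2},
    "balanced": {"majority": 0.6, "minority": 0.6},
    "minority_focused": {"majority": 0.3, "minority": 0.8},
}


def pick_by_weighted_average(candidates, group_weights):
    """Single-reward style choice: maximize the population-weighted mean utility."""
    def score(utilities):
        return sum(group_weights[g] * u for g, u in utilities.items())
    return max(candidates, key=lambda name: score(candidates[name]))


def pick_by_maxmin(candidates):
    """Egalitarian choice: maximize the utility of the worst-off group."""
    return max(candidates, key=lambda name: min(candidates[name].values()))


# A 10:1 majority-to-minority population, expressed as normalized weights.
weights = {"majority": 10 / 11, "minority": 1 / 11}
average_choice = pick_by_weighted_average(CANDIDATES, weights)
maxmin_choice = pick_by_maxmin(CANDIDATES)
```

With these made-up numbers the weighted average favors the majority-focused policy, while the maxmin rule picks the balanced one because its worst-off group does best.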
**Empirical results:**
- Tulu2-7B scale, 10:1 majority-to-minority ratio: MaxMin reached a 56.67% win rate for both the majority and minority groups, whereas single-reward RLHF reached 70.4% on the majority but only 42% on the minority
- Average improvement of ~16% across groups, with a ~33% boost specifically for minority groups
- The paper reports that the minority improvement came without compromising majority performance; note that in the 10:1 figures above the majority win rate is 56.67% under MaxMin versus 70.4% under single-reward RLHF
**Limitations:** Assumes discrete, identifiable subpopulations. Requires specifying number of clusters beforehand. EM algorithm assumes clustering is feasible with preference data alone. Does not address continuous preference distributions or cases where individuals have context-dependent preferences.
This appears to be among the first constructive mechanisms that formally address single-reward impossibility while staying within the RLHF framework and demonstrating empirical gains.
## Evidence
Chakraborty et al., "MaxMin-RLHF: Alignment with Diverse Human Preferences," ICML 2024.
- Draws from Sen's Egalitarian rule in social choice theory
- EM algorithm learns mixture of reward models by clustering preference-compatible humans
- MaxMin objective: max(min utility across groups)
- Tulu2-7B: 56.67% win rate across both groups vs 42% minority/70.4% majority for single reward
- 33% improvement for minority groups without majority compromise
---
Relevant Notes:
- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]
- [[collective intelligence requires diversity as a structural precondition not a moral preference]]
- [[designing coordination rules is categorically different from designing coordination outcomes as nine intellectual traditions independently confirm]]
Topics:
- domains/ai-alignment/_map
- foundations/collective-intelligence/_map

@ -0,0 +1,42 @@
---
type: claim
domain: ai-alignment
description: "MaxMin-RLHF's 33% minority improvement without majority loss suggests single-reward approach was suboptimal for all groups"
confidence: experimental
source: "Chakraborty et al., MaxMin-RLHF (ICML 2024)"
created: 2026-03-11
---
# Minority preference alignment improves 33% without majority compromise suggesting single-reward RLHF leaves value on table for all groups
The most surprising result from MaxMin-RLHF is that it lifts the minority group while keeping the majority group above random. At Tulu2-7B scale with a 10:1 preference ratio:
- **Single-reward RLHF:** 70.4% majority win rate, 42% minority win rate
- **MaxMin-RLHF:** 56.67% win rate for BOTH groups
The minority win rate rose from 42% to 56.67%, a gain of 14.67 percentage points (about 35% in relative terms, in line with the ~33% figure cited). The majority win rate fell from 70.4% to 56.67%, a drop of 13.73 points. Strictly speaking this is not a Pareto improvement, because the majority is worse off. It is an improvement under the egalitarian (maxmin) criterion: the worst-off group improved substantially while the best-off group remained above random.
This suggests the single-reward approach was not making an optimal tradeoff—it was leaving value on the table. The model was overfitting to majority preferences in ways that didn't even maximize majority utility, just majority-preference-signal in the training data.
**Interpretation:** Single-reward RLHF may be optimizing for training-data-representation rather than actual preference satisfaction. When forced to satisfy both groups (MaxMin constraint), the model finds solutions that generalize better.
**Caveat:** This is one study at one scale with one preference split (sentiment vs conciseness). The result needs replication across different preference types, model scales, and group ratios. But the direction is striking: pluralistic alignment may not be a zero-sum tradeoff.
## Evidence
Chakraborty et al., "MaxMin-RLHF: Alignment with Diverse Human Preferences," ICML 2024.
- Tulu2-7B, 10:1 preference ratio
- Single reward: 70.4% majority, 42% minority
- MaxMin: 56.67% both groups
- 33% minority improvement (42% → 56.67%)
- Majority remains above random despite a 13.73-point decrease (70.4% → 56.67%)
---
Relevant Notes:
- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]
- [[RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values]]
Topics:
- domains/ai-alignment/_map

@ -19,6 +19,12 @@ This is distinct from the claim that since [[RLHF and DPO both fail at preferenc
Since [[universal alignment is mathematically impossible because Arrows impossibility theorem applies to aggregating diverse human preferences into a single coherent objective]], pluralistic alignment is the practical response to the theoretical impossibility: stop trying to aggregate and start trying to accommodate.
### Additional Evidence (extend)
*Source: [[2024-02-00-chakraborty-maxmin-rlhf]] | Added: 2026-03-15 | Extractor: anthropic/claude-sonnet-4.5*
MaxMin-RLHF provides a constructive implementation of pluralistic alignment through mixture-of-rewards and egalitarian optimization. Rather than converging preferences, it learns separate reward models for each subpopulation and optimizes for the worst-off group (Sen's Egalitarian principle). At Tulu2-7B scale, this achieved 56.67% win rate across both majority and minority groups, compared to single-reward's 70.4%/42% split. The mechanism accommodates irreducible diversity by maintaining separate reward functions rather than forcing convergence.
---
Relevant Notes:

@ -0,0 +1,37 @@
---
type: claim
domain: ai-alignment
description: "Formal impossibility result showing single reward models fail when human preferences are diverse across subpopulations"
confidence: likely
source: "Chakraborty et al., MaxMin-RLHF: Alignment with Diverse Human Preferences (ICML 2024)"
created: 2026-03-11
---
# Single-reward RLHF cannot align diverse preferences because alignment gap grows proportional to minority distinctiveness and inversely to representation
Chakraborty et al. (2024) provide a formal impossibility result: when human preferences are diverse across subpopulations, a singular reward model in RLHF cannot adequately align language models. The alignment gap—the difference between optimal alignment for each group and what a single reward achieves—grows proportionally to how distinct minority preferences are and inversely to their representation in the training data.
This is demonstrated empirically at two scales:
**GPT-2 scale:** Single RLHF optimized for positive sentiment (majority preference) while completely ignoring conciseness (minority preference). The model satisfied the majority but failed the minority entirely.
**Tulu2-7B scale:** When the preference ratio was 10:1 (majority:minority), the single reward model's accuracy on the minority group dropped from 70.4% (balanced case) to 42%. This 28.4-percentage-point degradation illustrates the structural failure mode.
The impossibility is structural, not a matter of insufficient training data or model capacity. A single reward function mathematically cannot capture context-dependent values that vary across identifiable subpopulations.
## Evidence
Chakraborty, Qiu, Yuan, Koppel, Manocha, Huang, Bedi, Wang. "MaxMin-RLHF: Alignment with Diverse Human Preferences." ICML 2024. https://arxiv.org/abs/2402.08925
- Formal proof that high subpopulation diversity leads to greater alignment gap
- GPT-2 experiment: single RLHF achieved positive sentiment but ignored conciseness
- Tulu2-7B experiment: minority group accuracy dropped from 70.4% to 42% at 10:1 ratio
---
Relevant Notes:
- [[RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values]]
- [[pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state]]
Topics:
- domains/ai-alignment/_map

@ -0,0 +1,37 @@
---
type: claim
domain: internet-finance
description: "AIMD algorithm achieves provably fair and stable distributed resource allocation using only local congestion feedback"
confidence: proven
source: "Corless, King, Shorten, Wirth (SIAM 2016) - AIMD Dynamics and Distributed Resource Allocation"
created: 2026-03-11
secondary_domains: [mechanisms, collective-intelligence]
---
# AIMD converges to fair resource allocation without global coordination through local congestion signals
Additive Increase Multiplicative Decrease (AIMD) is a distributed resource allocation algorithm that provably converges to fair and stable resource sharing among competing agents without requiring centralized control or global information. The algorithm operates through two simple rules: when no congestion is detected, increase resource usage additively (rate += α); when congestion is detected, decrease resource usage multiplicatively (rate *= β, where 0 < β < 1).
The SIAM monograph by Corless et al. demonstrates that AIMD is mathematically guaranteed to converge to a fair share of available capacity regardless of the number of agents, with equal shares when all agents use the same increase and decrease parameters. Each agent only needs to observe local congestion signals: no knowledge of other agents, total capacity, or system-wide state is required. AIMD is among the most widely deployed distributed resource allocation mechanisms, originally developed for TCP congestion control and also applicable to smart grid energy allocation, distributed computing, and other domains where multiple agents compete for shared resources.
The key insight is that AIMD doesn't require predicting load, modeling arrivals, or solving optimization problems. It reacts to observed system state through simple local rules and is guaranteed to find the fair allocation through the dynamics of the algorithm itself. The multiplicative decrease creates faster convergence than purely additive approaches, while the additive increase ensures fairness rather than proportional allocation.
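The two rules above can be simulated in a few lines. This is a minimal sketch, assuming every agent sees the same congestion signal in each step (synchronized feedback) and all agents share the same α and β; the function name `simulate_aimd`, the starting rates, and the capacity are hypothetical values chosen for illustration.

```python
def simulate_aimd(rates, capacity, alpha=1.0, beta=0.5, steps=200):
    """Run synchronized AIMD and return the final per-agent rates.

    Assumption: all agents observe the same congestion signal each step
    and use identical alpha (additive step) and beta (decrease factor).
    """
    rates = list(rates)
    for _ in range(steps):
        congested = sum(rates) > capacity
        if congested:
            # Multiplicative decrease: shrinks the gap between agents.
            rates = [r * beta for r in rates]
        else:
            # Additive increase: preserves the gap between agents.
            rates = [r + alpha for r in rates]
    return rates


# Start from a very unequal allocation across three agents.
final = simulate_aimd([90.0, 5.0, 5.0], capacity=100.0)
spread = max(final) - min(final)
```

Each multiplicative decrease cuts the difference between any two agents by the factor β while additive increases leave it unchanged, which is why the spread shrinks toward zero over repeated congestion events.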
## Evidence
- Corless, King, Shorten, Wirth (2016) provide mathematical proofs of convergence and fairness properties
- AIMD is the foundation of TCP congestion control, the most widely deployed distributed algorithm in existence
- The algorithm works across heterogeneous domains: internet bandwidth, energy grids, computing resources
- Convergence is guaranteed regardless of number of competing agents or their parameter choices
---
Relevant Notes:
- [[coordination mechanisms]]
- [[optimal governance requires mixing mechanisms because different decisions have different manipulation risk profiles]]
- [[collective intelligence requires diversity as a structural precondition not a moral preference]]
- [[designing coordination rules is categorically different from designing coordination outcomes as nine intellectual traditions independently confirm]]
Topics:
- domains/internet-finance/_map
- core/mechanisms/_map
- foundations/collective-intelligence/_map

@ -0,0 +1,46 @@
---
type: claim
domain: internet-finance
description: "AIMD provides principled autoscaling for systems with expensive compute and variable load by reacting to queue state rather than forecasting demand"
confidence: experimental
source: "Corless et al. (SIAM 2016) applied to Teleo pipeline architecture"
created: 2026-03-11
secondary_domains: [mechanisms, critical-systems]
---
# AIMD scaling solves variable-load expensive-compute coordination without prediction
For systems with expensive computational operations and highly variable load—such as AI evaluation pipelines where extraction is cheap but evaluation is costly—AIMD provides a principled scaling algorithm that doesn't require demand forecasting or optimization modeling. The algorithm operates by observing queue state: when the evaluation queue is shrinking (no congestion), increase extraction workers by 1 per cycle; when the queue is growing (congestion detected), halve extraction workers.
This approach is particularly well-suited to scenarios where:
1. Downstream operations (evaluation) are significantly more expensive than upstream operations (extraction)
2. Load is unpredictable and varies substantially over time
3. The cost of overprovisioning is high (wasted expensive compute)
4. The cost of underprovisioning is manageable (slightly longer queue wait times)
The AIMD dynamics guarantee convergence to a stable operating point where extraction rate matches evaluation capacity, without requiring any prediction of future load, modeling of arrival patterns, or solution of optimization problems. The system self-regulates through observed congestion signals (queue growth/shrinkage) and simple local rules.
The multiplicative decrease (halving workers on congestion) provides rapid response to capacity constraints, while the additive increase (adding one worker when uncongested) provides gradual scaling that avoids overshooting. This asymmetry is critical: it's better to scale down too aggressively and scale up conservatively than vice versa when downstream compute is expensive.
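The scaling rule described above could be sketched as a single update function. This is an illustration under stated assumptions, not the Teleo implementation: the name `next_worker_count`, the `min_workers` and `max_workers` bounds, and the choice to treat an unchanged queue as "no congestion" are all hypothetical.

```python
def next_worker_count(workers, prev_queue_depth, queue_depth,
                      min_workers=1, max_workers=16):
    """Return the extraction worker count for the next cycle using AIMD.

    Assumption: an unchanged queue depth is treated as no congestion.
    The min/max bounds are hypothetical safety limits.
    """
    if queue_depth > prev_queue_depth:
        # Congestion: the evaluation queue is growing, so halve extraction workers.
        proposed = workers // 2
    else:
        # No congestion: add one worker per cycle.
        proposed = workers + 1
    return max(min_workers, min(max_workers, proposed))


# Replay a made-up sequence of evaluation queue depths.
workers = 4
history = [10, 8, 6, 9, 12, 11]
trace = []
for prev, cur in zip(history, history[1:]):
    workers = next_worker_count(workers, prev, cur)
    trace.append(workers)
```

In this replay the worker count climbs slowly while the queue shrinks and drops quickly once it starts to grow, which is the asymmetry the paragraph above argues for.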
## Evidence
- Corless et al. (2016) prove AIMD convergence properties hold for general resource allocation problems beyond network bandwidth
- The Teleo pipeline architecture exhibits the exact characteristics AIMD is designed for: cheap extraction, expensive evaluation, variable load
- AIMD's "no prediction required" property eliminates the complexity and fragility of load forecasting models
- The algorithm's proven stability guarantees mean it won't oscillate or diverge regardless of load patterns
## Challenges
This is an application of proven AIMD theory to a specific system architecture, but the actual performance in the Teleo pipeline context is untested. The claim that AIMD is "perfect for" this setting is theoretical—empirical validation would strengthen confidence from experimental to likely.
---
Relevant Notes:
- [[aimd-converges-to-fair-resource-allocation-without-global-coordination-through-local-congestion-signals]] <!-- claim pending -->
- [[coordination mechanisms]]
- [[designing coordination rules is categorically different from designing coordination outcomes as nine intellectual traditions independently confirm]]
Topics:
- domains/internet-finance/_map
- core/mechanisms/_map
- foundations/critical-systems/_map

@ -0,0 +1,37 @@
---
type: claim
domain: internet-finance
description: "At 5-20 server scale, queueing theory threshold policies capture most benefit without algorithmic complexity"
confidence: likely
source: "van Leeuwaarden, Mathijsen, Sanders (SIAM Review 2018) - empirical validation of square-root staffing at moderate scale"
created: 2026-03-11
depends_on: ["square-root-staffing-principle-achieves-economies-of-scale-in-queueing-systems-by-operating-near-full-utilization-with-manageable-delays.md"]
---
# Moderate-scale queueing systems benefit from simple threshold policies over sophisticated algorithms because square-root staffing captures most efficiency gains
For systems operating at moderate scale (5-20 servers), the mathematical properties of the Halfin-Whitt regime mean that simple threshold-based policies informed by queueing theory capture most of the available efficiency gains. Sophisticated dynamic algorithms add implementation complexity without proportional benefit at this scale.
The square-root staffing principle works empirically even for systems as small as 5-6 servers, which means the core economies-of-scale insight applies well below the asymptotic regime where the mathematical proofs strictly hold. This has direct implications for pipeline architecture: a system with 5-6 workers doesn't need complex autoscaling algorithms or machine learning-based load prediction.
## Evidence
The SIAM Review tutorial explicitly notes that "square-root safety staffing works empirically even for moderate-sized systems (5-20 servers)" and that "at our scale (5-6 workers), we're in the 'moderate system' range where square-root staffing still provides useful guidance."
The key takeaway from the tutorial: "we don't need sophisticated algorithms for a system this small. Simple threshold policies informed by queueing theory will capture most of the benefit."
## Practical Application
For Teleo pipeline architecture operating at 5-6 workers, this means:
- Simple threshold-based autoscaling policies are sufficient
- Complex predictive algorithms add cost without proportional benefit
- The mathematical foundation (Halfin-Whitt regime) validates simple approaches at this scale
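A simple threshold policy of the kind described here might look like the following sketch. The thresholds `low` and `high`, the worker bounds, and the function name `threshold_scale` are hypothetical; in practice the thresholds would be chosen from queueing-theory guidance rather than the placeholder values used below.

```python
def threshold_scale(workers, queue_depth, low=2, high=10,
                    min_workers=1, max_workers=6):
    """Adjust the worker count by at most one based on queue depth thresholds.

    All default values are placeholders for illustration.
    """
    if queue_depth > high and workers < max_workers:
        return workers + 1  # backlog above the high-water mark: add a worker
    if queue_depth < low and workers > min_workers:
        return workers - 1  # queue nearly empty: release a worker
    return workers          # inside the band or at a bound: hold steady
```

The band between `low` and `high` acts as hysteresis, so small fluctuations in queue depth do not cause the worker count to flap.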
---
Relevant Notes:
- [[square-root-staffing-principle-achieves-economies-of-scale-in-queueing-systems-by-operating-near-full-utilization-with-manageable-delays]]
- domains/internet-finance/_map
Topics:
- core/mechanisms/_map

@ -0,0 +1,36 @@
---
type: claim
domain: internet-finance
description: "Bursty arrival processes require more safety capacity than Poisson models predict, scaled by variance-to-mean ratio"
confidence: proven
source: "Whitt et al., 'Staffing a Service System with Non-Poisson Non-Stationary Arrivals', Cambridge Core, 2016"
created: 2026-03-11
---
# Square-root staffing formula requires peakedness adjustment for non-Poisson arrivals because bursty processes need proportionally more safety capacity than the Poisson baseline predicts
The standard square-root staffing formula (workers = mean load + safety factor × √mean) assumes Poisson arrivals where variance equals mean. Real-world arrival processes violate this assumption through burstiness (arrivals clustered in time) or smoothness (arrivals more evenly distributed than random).
Whitt et al. extend the square-root staffing rule by introducing **peakedness** — the variance-to-mean ratio of the arrival process — as the key adjustment parameter. For bursty arrivals (peakedness > 1), systems require MORE safety capacity than Poisson models suggest. For smooth arrivals (peakedness < 1), systems need LESS.
The modified staffing formula adjusts the square-root safety margin by multiplying by the square root of peakedness. This correction is critical for non-stationary systems where arrival rates vary over time (daily cycles, seasonal patterns, or event-driven spikes).
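The adjustment can be written as a small function. This is a sketch of the formula as stated in this note (mean load plus a safety margin scaled by the square root of peakedness); the function name `staffing_level`, the rounding up to whole workers, and the example inputs are assumptions made for illustration.

```python
import math


def staffing_level(offered_load, safety_factor, peakedness=1.0):
    """Square-root staffing with the safety margin scaled by sqrt(peakedness).

    peakedness = variance-to-mean ratio of arrivals (1.0 for Poisson).
    Rounding up to whole workers is an assumption of this sketch.
    """
    if offered_load < 0 or peakedness < 0:
        raise ValueError("offered_load and peakedness must be non-negative")
    margin = safety_factor * math.sqrt(peakedness * offered_load)
    return math.ceil(offered_load + margin)


# Same mean load of 16, three hypothetical arrival patterns.
poisson = staffing_level(16, safety_factor=1.0)
bursty = staffing_level(16, safety_factor=1.0, peakedness=4.0)
smooth = staffing_level(16, safety_factor=1.0, peakedness=0.25)
```

With a mean load of 16 and a safety factor of 1, the Poisson baseline needs 20 workers, a peakedness of 4 raises that to 24, and a peakedness of 0.25 lowers it to 18.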
## Evidence
- Whitt et al. (2016) prove that peakedness — the variance-to-mean ratio — captures the essential non-Poisson behavior for staffing calculations
- Standard Poisson assumption (variance = mean) fails empirically for bursty workloads like research paper dumps, product launches, or customer service spikes
- Using constant staffing (fixed MAX_WORKERS) regardless of queue state creates dual failure: over-provisioning during quiet periods (wasted compute) and under-provisioning during bursts (queue explosion)
## Relevance to Pipeline Architecture
Teleo's research pipeline exhibits textbook non-Poisson non-stationary arrivals: research dumps arrive in bursts of 15+ sources, futardio launches come in waves of 20+ proposals, while other days see minimal activity. The peakedness parameter quantifies exactly how much extra capacity is needed beyond naive square-root staffing.
This directly informs dynamic worker scaling: measure empirical peakedness from historical arrival data, adjust safety capacity accordingly, and scale workers based on current queue depth rather than using fixed limits.
---
Relevant Notes:
- domains/internet-finance/_map
Topics:
- core/mechanisms/_map

@ -0,0 +1,35 @@
---
type: claim
domain: internet-finance
description: "The QED Halfin-Whitt regime shows server count n grows while utilization approaches 1 at rate Θ(1/√n)"
confidence: proven
source: "van Leeuwaarden, Mathijsen, Sanders (SIAM Review 2018) - Economies-of-Scale in Many-Server Queueing Systems"
created: 2026-03-11
---
# Square-root staffing principle achieves economies of scale in queueing systems by operating near full utilization with manageable delays
The QED (Quality-and-Efficiency-Driven) Halfin-Whitt heavy-traffic regime provides the mathematical foundation for understanding economies of scale in multi-server systems. As server count n grows, the system can operate at utilization approaching 1 while maintaining bounded delays. The key insight is that the number of excess servers needs to grow only at rate Θ(√n) rather than linearly, so the spare-capacity fraction (1 minus utilization) shrinks at rate Θ(1/√n).
This "square root staffing" principle means larger systems need proportionally fewer excess servers for the same service quality. A system with 100 servers might need 10 excess servers for target service levels, while a system with 400 servers needs only 20 excess servers (not 40) for the same quality.
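The arithmetic in the example above can be checked directly. This sketch assumes the excess is the safety factor times the square root of the base size, with the safety factor set to 1.0 so that the numbers match the 100 and 400 server example; the function name `excess_servers` is hypothetical.

```python
import math


def excess_servers(base_size, safety_factor=1.0):
    """Excess capacity under square-root staffing: safety_factor * sqrt(base_size)."""
    return safety_factor * math.sqrt(base_size)


small = excess_servers(100)
large = excess_servers(400)

# Excess as a fraction of system size shrinks as the system grows.
small_fraction = small / 100
large_fraction = large / 400
```

Quadrupling the system size only doubles the excess, so the excess as a share of total capacity falls from 10% to 5%.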
The regime applies across system sizes from tens to thousands of servers, and empirical validation shows the square-root safety staffing works even for moderate-sized systems in the 5-20 server range.
## Evidence
From the SIAM Review tutorial:
- Mathematical proof that utilization approaches 1 at rate Θ(1/√n) as server count grows
- Empirical validation showing square-root staffing works for systems as small as 5-20 servers
- The regime connects abstract queueing theory to practical staffing decisions across industries
## Implications for Pipeline Architecture
For systems in the 5-6 worker range, sophisticated dynamic algorithms provide minimal benefit over simple threshold policies informed by queueing theory. The economies-of-scale result also indicates that marginal value per worker decreases as systems grow beyond roughly 20 workers, which is critical for cost optimization in scaled deployments.
---
Relevant Notes:
- domains/internet-finance/_map
Topics:
- core/mechanisms/_map

@ -0,0 +1,42 @@
---
type: claim
domain: internet-finance
description: "Replacing non-stationary arrival rates with constant staffing leads to systematic over- or under-provisioning"
confidence: proven
source: "Whitt et al., 'Staffing a Service System with Non-Poisson Non-Stationary Arrivals', Cambridge Core, 2016"
created: 2026-03-11
---
# Time-varying arrival rates require dynamic staffing not constant MAX_WORKERS because using average or maximum rates as constants creates systematic misallocation across the arrival cycle
Non-stationary arrival processes — where the arrival rate itself changes over time — cannot be efficiently staffed with constant worker counts. Whitt et al. demonstrate that replacing time-varying rates with either the average rate or the maximum rate produces badly mis-staffed systems:
- **Constant = average rate**: Under-staffed during peak periods, leading to queue explosions and service degradation
- **Constant = maximum rate**: Over-staffed during off-peak periods, wasting capacity and compute resources
The optimal approach tracks the arrival rate over time and adjusts staffing dynamically to match the current load plus an appropriate safety margin (scaled by peakedness for non-Poisson processes).
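The difference between constant and dynamic staffing can be illustrated with a toy arrival profile. This is a sketch under simplifying assumptions: each period is staffed as if it were stationary, the per-period arrival rates are invented, and `required_workers` is a hypothetical helper using square-root staffing with an optional peakedness factor.

```python
import math


def required_workers(rate, safety_factor=1.0, peakedness=1.0):
    """Workers for one period: load plus a square-root safety margin (rounded up)."""
    return math.ceil(rate + safety_factor * math.sqrt(peakedness * rate))


# Hypothetical arrival rates per period: quiet, then a burst, then quiet again.
period_rates = [1, 1, 4, 16, 4, 1]

# Dynamic staffing: recompute the requirement for every period.
dynamic = [required_workers(r) for r in period_rates]

# Constant staffing sized to the average rate or to the maximum rate.
average_rate = sum(period_rates) / len(period_rates)
constant_avg = required_workers(average_rate)
constant_max = required_workers(max(period_rates))

# Periods where average-based staffing falls short of the requirement.
understaffed_periods = sum(1 for need in dynamic if need > constant_avg)
# Surplus worker-periods when staffing to the maximum rate.
idle_worker_periods = sum(constant_max - need for need in dynamic)
```

In this toy profile, staffing to the average leaves the burst period short, while staffing to the maximum leaves most of the capacity idle in every other period.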
## Evidence
- Whitt et al. (2016) prove that time-varying arrival rates require time-varying staffing levels for efficiency
- Constant staffing at maximum capacity wastes resources during low-traffic periods
- Constant staffing at average capacity fails catastrophically during burst periods
- Dynamic staffing based on current queue state and arrival rate estimates achieves both efficiency (no waste during quiet periods) and reliability (adequate capacity during bursts)
## Application to Teleo Pipeline
Teleo's research processing pipeline exhibits strong non-stationarity: research dumps and futardio launches create burst periods with 15-20+ simultaneous arrivals, while other periods see minimal activity. Using a fixed MAX_WORKERS setting (constant staffing) is the worst of both worlds:
- During bursts: MAX_WORKERS is too low, queue explodes, processing stalls
- During quiet periods: MAX_WORKERS is too high, workers sit idle, compute wasted
Dynamic worker scaling based on current queue depth and estimated arrival rate (with peakedness adjustment) is the theoretically correct solution.
---
Relevant Notes:
- [[square-root-staffing-formula-requires-peakedness-adjustment-for-non-poisson-arrivals]]
- domains/internet-finance/_map
Topics:
- core/mechanisms/_map

@ -0,0 +1,62 @@
{
"raw_response": "```json\n{\n \"claims\": [],\n \"enrichments\": [],\n \"entities\": [\n {\n \"filename\": \"metadao-proposal-7.md\",\n \"domain\": \"internet-finance\",\n \"action\": \"create\",\n \"entity_type\": \"decision_market\",\n \"content\": \"---\\ntype: entity\\nentity_type: decision_market\\nname: \\\"MetaDAO: Proposal #7\\\"\\ndomain: internet-finance\\nstatus: failed\\nparent_entity: \\\"[[metadao]]\\\"\\nplatform: \\\"futardio\\\"\\nproposer: \\\"HwBL75xHHKcXSMNcctq3UqWaEJPDWVQz6NazZJNjWaQc\\\"\\nproposal_url: \\\"https://www.futard.io/proposal/AuNNyR4oU2zkG1sYBzJ3DJmyDzMKSmSW2yASorWenuC6\\\"\\nproposal_date: 2024-08-28\\nresolution_date: 2024-09-01\\ncategory: \\\"unknown\\\"\\nsummary: \\\"MetaDAO Proposal #7 failed after 4-day voting period with no proposal text or market data available\\\"\\ntracked_by: rio\\ncreated: 2026-03-11\\n---\\n\\n# MetaDAO: Proposal #7\\n\\n## Summary\\nProposal #7 on MetaDAO was created on 2024-08-28 and failed on 2024-09-01. No proposal text, rationale, or market data is available in the source archive. The proposal ran on Autocrat version 0.3.\\n\\n## Market Data\\n- **Outcome:** Failed\\n- **Proposer:** HwBL75xHHKcXSMNcctq3UqWaEJPDWVQz6NazZJNjWaQc\\n- **Duration:** 4 days (2024-08-28 to 2024-09-01)\\n- **Platform:** Futardio\\n- **Autocrat Version:** 0.3\\n\\n## Significance\\nThis proposal represents an archival record of a failed MetaDAO governance decision. 
Without proposal content or market volume data, it serves primarily as a timeline marker in MetaDAO's governance history during the Autocrat v0.3 era.\\n\\n## Relationship to KB\\n- [[metadao]] - governance decision\\n- [[futardio]] - platform used\\n\",\n \"timeline_entry\": \"- **2024-08-28** \u2014 [[metadao-proposal-7]] failed: Proposal #7 rejected after 4-day voting period (no details available)\"\n }\n ],\n \"source_update\": {\n \"status\": \"processed\",\n \"processed_by\": \"rio\",\n \"processed_date\": \"2026-03-11\",\n \"claims_extracted\": [],\n \"enrichments_applied\": [],\n \"entities_updated\": [\"metadao-proposal-7.md\", \"metadao.md\"],\n \"key_facts\": [\n \"MetaDAO Proposal #7 created 2024-08-28, failed 2024-09-01\",\n \"Proposal account: AuNNyR4oU2zkG1sYBzJ3DJmyDzMKSmSW2yASorWenuC6\",\n \"DAO account: GWywkp2mY2vzAaLydR2MBXRCqk2vBTyvtVRioujxi5Ce\",\n \"Proposer: HwBL75xHHKcXSMNcctq3UqWaEJPDWVQz6NazZJNjWaQc\",\n \"Autocrat version: 0.3\"\n ],\n \"notes\": \"Source contains only metadata about a failed MetaDAO proposal with no proposal text, rationale, market data, or voting details. Created decision_market entity for archival completeness and timeline tracking. No extractable claims or enrichments due to absence of substantive content about mechanisms, outcomes, or governance dynamics. This is purely structural metadata documenting that a proposal existed and failed.\"\n }\n}\n```",
"parsed": {
"claims": [],
"enrichments": [],
"entities": [
{
"filename": "metadao-proposal-7.md",
"domain": "internet-finance",
"action": "create",
"entity_type": "decision_market",
"content": "---\ntype: entity\nentity_type: decision_market\nname: \"MetaDAO: Proposal #7\"\ndomain: internet-finance\nstatus: failed\nparent_entity: \"[[metadao]]\"\nplatform: \"futardio\"\nproposer: \"HwBL75xHHKcXSMNcctq3UqWaEJPDWVQz6NazZJNjWaQc\"\nproposal_url: \"https://www.futard.io/proposal/AuNNyR4oU2zkG1sYBzJ3DJmyDzMKSmSW2yASorWenuC6\"\nproposal_date: 2024-08-28\nresolution_date: 2024-09-01\ncategory: \"unknown\"\nsummary: \"MetaDAO Proposal #7 failed after 4-day voting period with no proposal text or market data available\"\ntracked_by: rio\ncreated: 2026-03-11\n---\n\n# MetaDAO: Proposal #7\n\n## Summary\nProposal #7 on MetaDAO was created on 2024-08-28 and failed on 2024-09-01. No proposal text, rationale, or market data is available in the source archive. The proposal ran on Autocrat version 0.3.\n\n## Market Data\n- **Outcome:** Failed\n- **Proposer:** HwBL75xHHKcXSMNcctq3UqWaEJPDWVQz6NazZJNjWaQc\n- **Duration:** 4 days (2024-08-28 to 2024-09-01)\n- **Platform:** Futardio\n- **Autocrat Version:** 0.3\n\n## Significance\nThis proposal represents an archival record of a failed MetaDAO governance decision. Without proposal content or market volume data, it serves primarily as a timeline marker in MetaDAO's governance history during the Autocrat v0.3 era.\n\n## Relationship to KB\n- [[metadao]] - governance decision\n- [[futardio]] - platform used\n",
"timeline_entry": "- **2024-08-28** \u2014 [[metadao-proposal-7]] failed: Proposal #7 rejected after 4-day voting period (no details available)"
}
],
"source_update": {
"status": "null-result",
"processed_by": "rio",
"processed_date": "2026-03-11",
"claims_extracted": [],
"enrichments_applied": [],
"entities_updated": [
"metadao-proposal-7.md",
"metadao.md"
],
"key_facts": [
"MetaDAO Proposal #7 created 2024-08-28, failed 2024-09-01",
"Proposal account: AuNNyR4oU2zkG1sYBzJ3DJmyDzMKSmSW2yASorWenuC6",
"DAO account: GWywkp2mY2vzAaLydR2MBXRCqk2vBTyvtVRioujxi5Ce",
"Proposer: HwBL75xHHKcXSMNcctq3UqWaEJPDWVQz6NazZJNjWaQc",
"Autocrat version: 0.3"
],
"notes": "Source contains only metadata about a failed MetaDAO proposal with no proposal text, rationale, market data, or voting details. Created decision_market entity for archival completeness and timeline tracking. No extractable claims or enrichments due to absence of substantive content about mechanisms, outcomes, or governance dynamics. This is purely structural metadata documenting that a proposal existed and failed.",
"model": "anthropic/claude-sonnet-4.5"
}
},
"model": "anthropic/claude-sonnet-4.5",
"date": "2026-03-15",
"tokens": {
"prompt_tokens": 11434,
"completion_tokens": 1052,
"total_tokens": 12486,
"cost": 0.050082,
"is_byok": false,
"prompt_tokens_details": {
"cached_tokens": 0,
"cache_write_tokens": 0,
"audio_tokens": 0,
"video_tokens": 0
},
"cost_details": {
"upstream_inference_cost": 0.050082,
"upstream_inference_prompt_cost": 0.034302,
"upstream_inference_completions_cost": 0.01578
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"image_tokens": 0,
"audio_tokens": 0
}
}
}

File diff suppressed because one or more lines are too long


@ -6,8 +6,13 @@ url: https://www.cambridge.org/core/journals/probability-in-the-engineering-and-
 date: 2016-01-01
 domain: internet-finance
 format: paper
-status: unprocessed
+status: processed
 tags: [pipeline-architecture, operations-research, stochastic-modeling, non-stationary-arrivals, capacity-sizing]
+processed_by: rio
+processed_date: 2026-03-11
+claims_extracted: ["square-root-staffing-formula-requires-peakedness-adjustment-for-non-poisson-arrivals.md", "time-varying-arrival-rates-require-dynamic-staffing-not-constant-max-workers.md"]
+extraction_model: "anthropic/claude-sonnet-4.5"
+extraction_notes: "Operations research paper on staffing under non-Poisson non-stationary arrivals. Extracted two claims on peakedness adjustment and dynamic staffing requirements. Direct application to Teleo pipeline architecture for worker scaling. No entity data (academic paper, no companies/products/decisions). No enrichments (novel theoretical contribution not covered by existing claims)."
 ---
 # Staffing a Service System with Non-Poisson Non-Stationary Arrivals


@ -6,8 +6,13 @@ url: https://epubs.siam.org/doi/book/10.1137/1.9781611974225
 date: 2016-01-01
 domain: internet-finance
 format: paper
-status: unprocessed
+status: processed
 tags: [pipeline-architecture, operations-research, AIMD, distributed-resource-allocation, congestion-control, fairness]
+processed_by: rio
+processed_date: 2026-03-11
+claims_extracted: ["aimd-converges-to-fair-resource-allocation-without-global-coordination-through-local-congestion-signals.md", "aimd-scaling-solves-variable-load-expensive-compute-coordination-without-prediction.md"]
+extraction_model: "anthropic/claude-sonnet-4.5"
+extraction_notes: "Extracted two claims: (1) general AIMD mechanism properties as proven coordination algorithm, (2) specific application to Teleo pipeline architecture. The source is a formal mathematical treatment (SIAM monograph) providing rigorous proofs, making the first claim 'proven' confidence. The second claim is an application proposal with theoretical justification but no empirical validation, hence 'experimental'. No entities to extract; this is pure mechanism theory. No enrichments; AIMD is not currently referenced in the KB."
 ---
 # AIMD Dynamics and Distributed Resource Allocation
@ -26,3 +31,10 @@ SIAM monograph on AIMD (Additive Increase Multiplicative Decrease) as a general-
 ## Relevance to Teleo Pipeline
 AIMD provides a principled, proven scaling algorithm: when the eval queue is shrinking (no congestion), increase extraction workers by 1 per cycle. When the eval queue is growing (congestion), halve extraction workers. This doesn't require predicting load, modeling arrivals, or solving optimization problems; it reacts to observed system state and is mathematically guaranteed to converge. Perfect for our "expensive compute, variable load" setting.
+## Key Facts
+- AIMD algorithm: additive increase (rate += α) when no congestion, multiplicative decrease (rate *= β, 0 < β < 1) when congestion detected
+- AIMD is the foundation of TCP congestion control
+- AIMD has been applied to internet congestion control, smart grid energy allocation, and distributed computing
+- AIMD convergence is mathematically proven for any number of agents, provided α > 0 and 0 < β < 1
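
The scaling rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual controller: the function names, the worker bounds (`min_workers`, `max_workers`), and the choice to treat a flat queue as "no congestion" are all assumptions made for the example. Only the additive step of 1 and the halving come from the text.

```python
def aimd_step(workers: int, queue_growing: bool, alpha: int = 1,
              beta: float = 0.5, min_workers: int = 1,
              max_workers: int = 16) -> int:
    """Return the worker count for the next cycle under AIMD.

    alpha and beta mirror the text (add 1, halve); the bounds are
    hypothetical guard rails added for this sketch.
    """
    if queue_growing:
        # Congestion: multiplicative decrease (rate *= beta).
        proposed = int(workers * beta)
    else:
        # No congestion: additive increase (rate += alpha).
        proposed = workers + alpha
    return max(min_workers, min(max_workers, proposed))


def simulate(queue_deltas, start_workers: int = 4):
    """Apply aimd_step once per cycle; each delta is the observed change
    in eval queue length (assumption: delta > 0 means congestion)."""
    workers = start_workers
    history = []
    for delta in queue_deltas:
        workers = aimd_step(workers, queue_growing=delta > 0)
        history.append(workers)
    return history


if __name__ == "__main__":
    print(simulate([-3, -1, 5, -2, -2, 4, 4]))
```

The controller only needs the sign of the queue change each cycle, which is the point made above: no load forecast or arrival model is required.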


@ -6,8 +6,13 @@ url: https://epubs.siam.org/doi/10.1137/17M1133944
 date: 2018-01-01
 domain: internet-finance
 format: paper
-status: unprocessed
+status: processed
 tags: [pipeline-architecture, operations-research, queueing-theory, Halfin-Whitt, economies-of-scale, square-root-staffing]
+processed_by: rio
+processed_date: 2026-03-11
+claims_extracted: ["square-root-staffing-principle-achieves-economies-of-scale-in-queueing-systems-by-operating-near-full-utilization-with-manageable-delays.md", "moderate-scale-queueing-systems-benefit-from-simple-threshold-policies-over-sophisticated-algorithms-because-square-root-staffing-captures-most-efficiency-gains.md"]
+extraction_model: "anthropic/claude-sonnet-4.5"
+extraction_notes: "Extracted two claims about queueing theory and economies of scale. The source is a mathematical tutorial with proven results (SIAM Review), so confidence is 'proven' for the core mathematical claim and 'likely' for the practical application claim. No entities to extract (academic paper, no companies/products/decisions). The relevance to Teleo is in pipeline architecture optimization, which is noted in the source's 'Relevance to Teleo Pipeline' section."
 ---
 # Economies-of-Scale in Many-Server Queueing Systems
@ -26,3 +31,9 @@ SIAM Review tutorial on the QED (Quality-and-Efficiency-Driven) Halfin-Whitt hea
 ## Relevance to Teleo Pipeline
 At our scale (5-6 workers), we're in the "moderate system" range where square-root staffing still provides useful guidance. The key takeaway: we don't need sophisticated algorithms for a system this small. Simple threshold policies informed by queueing theory will capture most of the benefit. The economies-of-scale result also tells us that if we grow to 20+ workers, the marginal value of each additional worker decreases, which matters for cost optimization.
+## Key Facts
+- Halfin-Whitt QED regime: utilization ρ approaches 1 as the number of servers n grows, with the gap 1 - ρ shrinking at rate Θ(1/√n)
+- Square-root staffing validated empirically for systems as small as 5-20 servers
+- 100-server system needs ~10 excess servers; 400-server system needs ~20 (not 40) for the same service quality
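
The 100-versus-400 comparison above follows from the square-root staffing rule, n = R + β√R, where R is the offered load and β is a quality-of-service parameter. The sketch below is a rough illustration under two assumptions not stated in the source: that "100-server system" refers to an offered load of about 100, and that β = 1 reproduces the quoted figures. The function names are hypothetical.

```python
import math


def square_root_staffing(offered_load: float, beta: float = 1.0) -> int:
    """Servers suggested by n = R + beta * sqrt(R), rounded up.

    beta = 1.0 is an assumed quality-of-service parameter chosen so the
    output matches the ~10 and ~20 excess-server figures in the text.
    """
    return math.ceil(offered_load + beta * math.sqrt(offered_load))


def excess_servers(offered_load: float, beta: float = 1.0) -> int:
    """Safety capacity above the bare offered load."""
    return square_root_staffing(offered_load, beta) - math.ceil(offered_load)


if __name__ == "__main__":
    for load in (5, 100, 400):
        print(load, square_root_staffing(load), excess_servers(load))
```

Quadrupling the load from 100 to 400 only doubles the excess capacity, which is the economies-of-scale effect the note refers to.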


@ -7,9 +7,15 @@ date: 2024-02-01
 domain: ai-alignment
 secondary_domains: [collective-intelligence]
 format: paper
-status: unprocessed
+status: processed
 priority: high
 tags: [maxmin-rlhf, egalitarian-alignment, diverse-preferences, social-choice, reward-mixture, impossibility-result]
+processed_by: theseus
+processed_date: 2026-03-11
+claims_extracted: ["single-reward-rlhf-cannot-align-diverse-preferences-because-alignment-gap-grows-proportional-to-minority-distinctiveness.md", "maxmin-rlhf-applies-egalitarian-social-choice-to-alignment-by-maximizing-minimum-utility-across-preference-groups.md", "minority-preference-alignment-improves-33-percent-without-majority-compromise-suggesting-single-reward-leaves-value-on-table.md"]
+enrichments_applied: ["pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state.md"]
+extraction_model: "anthropic/claude-sonnet-4.5"
+extraction_notes: "Three novel claims extracted: (1) formal impossibility result for single-reward RLHF, (2) MaxMin as egalitarian social choice mechanism, (3) minority improvement without majority compromise. Two enrichments to existing claims on RLHF diversity failure and pluralistic alignment. No entities: this is a research paper, not organizational/market data. Key contribution is the first constructive mechanism addressing single-reward impossibility with empirical validation."
 ---
 ## Content
## Content ## Content
@ -51,3 +57,12 @@ Published at ICML 2024. Addresses the problem that standard RLHF employs a singu
 PRIMARY CONNECTION: [[RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values]]
 WHY ARCHIVED: First constructive mechanism that formally addresses single-reward impossibility while demonstrating empirical improvement, especially for minority groups
 EXTRACTION HINT: The impossibility result + MaxMin mechanism + 33% minority improvement are three extractable claims
+## Key Facts
+- MaxMin-RLHF published at ICML 2024 (top-tier ML venue)
+- Authors: Chakraborty, Qiu, Yuan, Koppel, Manocha, Huang, Bedi, Wang (multi-institutional)
+- GPT-2 experiment: sentiment (majority) vs conciseness (minority) preferences
+- Tulu2-7B experiment: 10:1 preference ratio tested
+- EM algorithm iteratively clusters humans and updates subpopulation reward functions
+- MaxMin objective adapted from Sen's Egalitarian principle in social choice theory
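
To make the difference between the MaxMin objective and a population-weighted average concrete, here is a small selection sketch. It is not the paper's policy-optimization procedure: it only compares two fixed candidates, and it reuses the Tulu2-7B win rates quoted in the claim file earlier in this PR as stand-in utilities, which is an assumption made for illustration. The function names and the 10:1 weights are likewise hypothetical.

```python
def maxmin_choice(candidates: dict) -> str:
    """Pick the candidate whose worst-off group fares best (egalitarian rule)."""
    return max(candidates, key=lambda name: min(candidates[name].values()))


def weighted_average_choice(candidates: dict, weights: dict) -> str:
    """Pick the candidate with the highest population-weighted mean utility."""
    total = sum(weights.values())

    def score(name: str) -> float:
        return sum(weights[g] * u for g, u in candidates[name].items()) / total

    return max(candidates, key=score)


# Stand-in utilities: win rates (%) per preference group, taken from the
# claim file's reported results and used here purely as an illustration.
candidates = {
    "single_reward_rlhf": {"majority": 70.4, "minority": 42.0},
    "maxmin_rlhf": {"majority": 56.67, "minority": 56.67},
}
weights = {"majority": 10, "minority": 1}  # assumed 10:1 population ratio

if __name__ == "__main__":
    print(maxmin_choice(candidates))
    print(weighted_average_choice(candidates, weights))
```

Under these numbers the two rules disagree: the weighted average favors the single-reward policy because the majority dominates the mean, while the max-min rule favors the policy with the higher floor.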


@ -6,13 +6,17 @@ url: "https://www.futard.io/proposal/AuNNyR4oU2zkG1sYBzJ3DJmyDzMKSmSW2yASorWenuC
 date: 2024-08-28
 domain: internet-finance
 format: data
-status: unprocessed
+status: null-result
 tags: [futardio, metadao, futarchy, solana, governance]
 event_type: proposal
 processed_by: rio
 processed_date: 2024-08-28
 extraction_model: "anthropic/claude-sonnet-4.5"
 extraction_notes: "This source contains only metadata about a failed MetaDAO proposal with no proposal text, rationale, market data, or voting details. The source provides verifiable facts (proposal number, accounts, dates, status) but no evidence supporting arguable claims about futarchy mechanisms, governance outcomes, or market behavior. Without proposal content or outcome analysis, there is nothing to extract as claims or enrichments. The existing claim 'MetaDAOs futarchy implementation shows limited trading volume in uncontested decisions' could potentially be enriched if this proposal had volume data, but none is provided. This is purely archival metadata."
+processed_by: rio
+processed_date: 2026-03-11
+extraction_model: "anthropic/claude-sonnet-4.5"
+extraction_notes: "Source contains only metadata about a failed MetaDAO proposal with no proposal text, rationale, market data, or voting details. Created decision_market entity for archival completeness and timeline tracking. No extractable claims or enrichments due to absence of substantive content about mechanisms, outcomes, or governance dynamics. This is purely structural metadata documenting that a proposal existed and failed."
 ---
 ## Proposal Details
## Proposal Details ## Proposal Details
@ -39,3 +43,11 @@ extraction_notes: "This source contains only metadata about a failed MetaDAO pro
 - DAO account: GWywkp2mY2vzAaLydR2MBXRCqk2vBTyvtVRioujxi5Ce
 - Proposer: HwBL75xHHKcXSMNcctq3UqWaEJPDWVQz6NazZJNjWaQc
 - Autocrat version: 0.3
+## Key Facts
+- MetaDAO Proposal #7 created 2024-08-28, failed 2024-09-01
+- Proposal account: AuNNyR4oU2zkG1sYBzJ3DJmyDzMKSmSW2yASorWenuC6
+- DAO account: GWywkp2mY2vzAaLydR2MBXRCqk2vBTyvtVRioujxi5Ce
+- Proposer: HwBL75xHHKcXSMNcctq3UqWaEJPDWVQz6NazZJNjWaQc
+- Autocrat version: 0.3


@ -6,9 +6,13 @@ url: "https://www.futard.io/proposal/6LcxhHS3JvDtbS1GoQS18EgH5Pzf7AnqQpR7D4HxmWp
 date: 2024-11-13
 domain: internet-finance
 format: data
-status: unprocessed
+status: null-result
 tags: [futardio, metadao, futarchy, solana, governance]
 event_type: proposal
+processed_by: rio
+processed_date: 2026-03-11
+extraction_model: "anthropic/claude-sonnet-4.5"
+extraction_notes: "Source is a futarchy governance proposal for Coal token emission schedule. Extracted as decision_market entity (the proposal itself) and created parent entity for Coal project. No novel claims about futarchy mechanisms - this is a straightforward application of existing governance patterns. The shift from algorithmic to market-driven emission control is notable but represents implementation of known futarchy principles rather than new mechanism insight."
 ---
 ## Proposal Details
@ -66,3 +70,10 @@ A follow-up decision market will be held in early January, approximately two mon
 - Autocrat version: 0.3
 - Completed: 2024-11-17
 - Ended: 2024-11-17
+## Key Facts
+- Coal token emission rate reduced from 15.625 to 7.8125 per minute (2024-11-17)
+- Coal annual inflation reduced from ~110% to ~56% (2024-11-17)
+- Coal completed 6 halvings before governance transition
+- Coal proposal 6LcxhHS3JvDtbS1GoQS18EgH5Pzf7AnqQpR7D4HxmWpy passed (2024-11-17)
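
The emission figures above can be cross-checked with simple arithmetic. The sketch below is a back-of-the-envelope check, not data from the source: it assumes a 365-day year, assumes "annual inflation" means yearly emission divided by circulating supply, and back-computes a pre-halving rate that the source does not state. All function names are hypothetical.

```python
MINUTES_PER_YEAR = 60 * 24 * 365  # assumption: 365-day year


def halve(rate_per_minute: float, times: int = 1) -> float:
    """Emission rate after the given number of halvings."""
    return rate_per_minute / (2 ** times)


def annual_emission(rate_per_minute: float) -> float:
    """Tokens emitted per year at a constant per-minute rate."""
    return rate_per_minute * MINUTES_PER_YEAR


def implied_supply(rate_per_minute: float, annual_inflation: float) -> float:
    """Circulating supply implied by a rate and an inflation figure,
    assuming inflation = annual emission / supply."""
    return annual_emission(rate_per_minute) / annual_inflation


if __name__ == "__main__":
    print(halve(15.625))                  # rate after the governance vote
    print(15.625 * 2 ** 6)                # inferred rate before 6 halvings
    print(implied_supply(15.625, 1.10))   # supply implied by ~110%
    print(implied_supply(7.8125, 0.56))   # supply implied by ~56%
```

If the two implied supply figures come out close to each other, the rate and inflation numbers in the list are mutually consistent up to rounding.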