Merge branch 'main' into extract/2024-11-13-futardio-proposal-cut-emissions-by-50

This commit is contained in:
Leo 2026-03-15 16:25:54 +00:00
commit d1b962c5fd
9 changed files with 264 additions and 3 deletions


@@ -0,0 +1,37 @@
---
type: claim
domain: internet-finance
description: "AIMD algorithm achieves provably fair and stable distributed resource allocation using only local congestion feedback"
confidence: proven
source: "Corless, King, Shorten, Wirth (SIAM 2016) - AIMD Dynamics and Distributed Resource Allocation"
created: 2026-03-11
secondary_domains: [mechanisms, collective-intelligence]
---
# AIMD converges to fair resource allocation without global coordination through local congestion signals
Additive Increase Multiplicative Decrease (AIMD) is a distributed resource allocation algorithm that provably converges to fair and stable resource sharing among competing agents without requiring centralized control or global information. The algorithm operates through two simple rules: when no congestion is detected, increase resource usage additively (rate += α); when congestion is detected, decrease resource usage multiplicatively (rate *= β, where 0 < β < 1).
The SIAM monograph by Corless et al. demonstrates that AIMD is mathematically guaranteed to converge to a stable sharing of available capacity for any number of agents, and to equal sharing when agents use the same increase and decrease parameters. Each agent only needs to observe local congestion signals—no knowledge of other agents, total capacity, or system-wide state is required. This makes AIMD the most widely deployed distributed resource allocation mechanism, originally developed for TCP congestion control and since applied to smart grid energy allocation, distributed computing, and other domains where multiple agents compete for shared resources.
The key insight is that AIMD doesn't require predicting load, modeling arrivals, or solving optimization problems. It reacts to observed system state through simple local rules and is guaranteed to find the fair allocation through the dynamics of the algorithm itself. The multiplicative decrease creates faster convergence than purely additive approaches, while the additive increase ensures fairness rather than proportional allocation.
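The two rules can be sketched in a few lines. The capacity, starting rates, and the choices α = 1, β = 0.5 below are illustrative, not taken from the monograph:

```python
def aimd_step(rate, congested, alpha=1.0, beta=0.5):
    """One AIMD update: additive increase when clear, multiplicative decrease on congestion."""
    return rate * beta if congested else rate + alpha

def simulate(rates, capacity, steps=200):
    """Agents see only the shared congestion signal, never each other's rates."""
    for _ in range(steps):
        congested = sum(rates) > capacity
        rates = [aimd_step(r, congested) for r in rates]
    return rates

# Starting far apart, two identical agents end up with near-equal shares.
final = simulate([1.0, 40.0], capacity=50.0)
```

Each congestion event halves the gap between the two rates, which is why convergence to the fair point is geometric rather than gradual.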
## Evidence
- Corless, King, Shorten, Wirth (2016) provide mathematical proofs of convergence and fairness properties
- AIMD is the foundation of TCP congestion control, the most widely deployed distributed algorithm in existence
- The algorithm works across heterogeneous domains: internet bandwidth, energy grids, computing resources
- Convergence to a stable allocation is guaranteed for any number of competing agents; equal sharing follows when agents use matching parameters
---
Relevant Notes:
- [[coordination mechanisms]]
- [[optimal governance requires mixing mechanisms because different decisions have different manipulation risk profiles]]
- [[collective intelligence requires diversity as a structural precondition not a moral preference]]
- [[designing coordination rules is categorically different from designing coordination outcomes as nine intellectual traditions independently confirm]]
Topics:
- domains/internet-finance/_map
- core/mechanisms/_map
- foundations/collective-intelligence/_map


@@ -0,0 +1,46 @@
---
type: claim
domain: internet-finance
description: "AIMD provides principled autoscaling for systems with expensive compute and variable load by reacting to queue state rather than forecasting demand"
confidence: experimental
source: "Corless et al. (SIAM 2016) applied to Teleo pipeline architecture"
created: 2026-03-11
secondary_domains: [mechanisms, critical-systems]
---
# AIMD scaling solves variable-load expensive-compute coordination without prediction
For systems with expensive computational operations and highly variable load—such as AI evaluation pipelines where extraction is cheap but evaluation is costly—AIMD provides a principled scaling algorithm that doesn't require demand forecasting or optimization modeling. The algorithm operates by observing queue state: when the evaluation queue is shrinking (no congestion), increase extraction workers by 1 per cycle; when the queue is growing (congestion detected), halve extraction workers.
This approach is particularly well-suited to scenarios where:
1. Downstream operations (evaluation) are significantly more expensive than upstream operations (extraction)
2. Load is unpredictable and varies substantially over time
3. The cost of overprovisioning is high (wasted expensive compute)
4. The cost of underprovisioning is manageable (slightly longer queue wait times)
The AIMD dynamics guarantee convergence to a stable operating point where extraction rate matches evaluation capacity, without requiring any prediction of future load, modeling of arrival patterns, or solution of optimization problems. The system self-regulates through observed congestion signals (queue growth/shrinkage) and simple local rules.
The multiplicative decrease (halving workers on congestion) provides rapid response to capacity constraints, while the additive increase (adding one worker when uncongested) provides gradual scaling that avoids overshooting. This asymmetry is deliberate: when downstream compute is expensive, it is better to scale down aggressively and scale up conservatively than the reverse.
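As a sketch, the controller described above needs only the current and previous queue depth. The min/max worker bounds are safety assumptions added here, not part of the claim:

```python
def scale_workers(workers, queue_depth, prev_queue_depth,
                  min_workers=1, max_workers=64):
    """AIMD worker control: queue growth is the congestion signal."""
    if queue_depth > prev_queue_depth:
        # Congestion: multiplicative decrease (halve, but keep a floor).
        return max(min_workers, workers // 2)
    # No congestion: additive increase (add one worker per cycle).
    return min(max_workers, workers + 1)

scale_workers(8, queue_depth=14, prev_queue_depth=10)  # queue grew: 8 -> 4
scale_workers(8, queue_depth=10, prev_queue_depth=14)  # queue shrank: 8 -> 9
```

Treating "queue not growing" as the all-clear, as this sketch does, is one reading of the rule; an implementation would also need to decide how to handle an exactly flat queue.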
## Evidence
- Corless et al. (2016) prove AIMD convergence properties hold for general resource allocation problems beyond network bandwidth
- The Teleo pipeline architecture exhibits the exact characteristics AIMD is designed for: cheap extraction, expensive evaluation, variable load
- AIMD's "no prediction required" property eliminates the complexity and fragility of load forecasting models
- The algorithm's proven stability guarantees mean it won't oscillate or diverge regardless of load patterns
## Challenges
This is an application of proven AIMD theory to a specific system architecture, but the actual performance in the Teleo pipeline context is untested. The claim that AIMD is "perfect for" this setting is theoretical—empirical validation would strengthen confidence from experimental to likely.
---
Relevant Notes:
- [[aimd-converges-to-fair-resource-allocation-without-global-coordination-through-local-congestion-signals]] <!-- claim pending -->
- [[coordination mechanisms]]
- [[designing coordination rules is categorically different from designing coordination outcomes as nine intellectual traditions independently confirm]]
Topics:
- domains/internet-finance/_map
- core/mechanisms/_map
- foundations/critical-systems/_map


@@ -0,0 +1,37 @@
---
type: claim
domain: internet-finance
description: "At 5-20 server scale, queueing theory threshold policies capture most benefit without algorithmic complexity"
confidence: likely
source: "van Leeuwaarden, Mathijsen, Sanders (SIAM Review 2018) - empirical validation of square-root staffing at moderate scale"
created: 2026-03-11
depends_on: ["square-root-staffing-principle-achieves-economies-of-scale-in-queueing-systems-by-operating-near-full-utilization-with-manageable-delays.md"]
---
# Moderate-scale queueing systems benefit from simple threshold policies over sophisticated algorithms because square-root staffing captures most efficiency gains
For systems operating at moderate scale (5-20 servers), the mathematical properties of the Halfin-Whitt regime mean that simple threshold-based policies informed by queueing theory capture most of the available efficiency gains. Sophisticated dynamic algorithms add implementation complexity without proportional benefit at this scale.
The square-root staffing principle works empirically even for systems as small as 5-6 servers, which means the core economies-of-scale insight applies well below the asymptotic regime where the mathematical proofs strictly hold. This has direct implications for pipeline architecture: a system with 5-6 workers doesn't need complex autoscaling algorithms or machine learning-based load prediction.
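A minimal threshold policy of this kind is just the square-root formula evaluated against observed load; the β value here is illustrative:

```python
import math

def target_workers(offered_load, beta=0.5):
    """Square-root staffing: mean load plus beta * sqrt(load) safety margin."""
    return math.ceil(offered_load + beta * math.sqrt(offered_load))

target_workers(4.0)   # 5 workers
target_workers(16.0)  # 18 workers
```

In practice the offered load would be estimated from recent queue measurements; the point is that the policy is a closed-form threshold, not a learned model.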
## Evidence
The source note on the SIAM Review tutorial records that square-root safety staffing works empirically even for moderate-sized systems (5-20 servers), and that a 5-6 worker deployment sits in the "moderate system" range where square-root staffing still provides useful guidance.
The key takeaway from that note: sophisticated algorithms are unnecessary for a system this small; simple threshold policies informed by queueing theory will capture most of the benefit.
## Practical Application
For Teleo pipeline architecture operating at 5-6 workers, this means:
- Simple threshold-based autoscaling policies are sufficient
- Complex predictive algorithms add cost without proportional benefit
- The mathematical foundation (Halfin-Whitt regime) validates simple approaches at this scale
---
Relevant Notes:
- [[square-root-staffing-principle-achieves-economies-of-scale-in-queueing-systems-by-operating-near-full-utilization-with-manageable-delays]]
- domains/internet-finance/_map
Topics:
- core/mechanisms/_map


@@ -0,0 +1,36 @@
---
type: claim
domain: internet-finance
description: "Bursty arrival processes require more safety capacity than Poisson models predict, scaled by variance-to-mean ratio"
confidence: proven
source: "Whitt et al., 'Staffing a Service System with Non-Poisson Non-Stationary Arrivals', Cambridge Core, 2016"
created: 2026-03-11
---
# Square-root staffing formula requires peakedness adjustment for non-Poisson arrivals because bursty processes need proportionally more safety capacity than the Poisson baseline predicts
The standard square-root staffing formula (workers = mean load + safety factor × √mean) assumes Poisson arrivals where variance equals mean. Real-world arrival processes violate this assumption through burstiness (arrivals clustered in time) or smoothness (arrivals more evenly distributed than random).
Whitt et al. extend the square-root staffing rule by introducing **peakedness** — the variance-to-mean ratio of the arrival process — as the key adjustment parameter. For bursty arrivals (peakedness > 1), systems require MORE safety capacity than Poisson models suggest. For smooth arrivals (peakedness < 1), systems need LESS.
The modified staffing formula adjusts the square-root safety margin by multiplying by the square root of peakedness. This correction is critical for non-stationary systems where arrival rates vary over time (daily cycles, seasonal patterns, or event-driven spikes).
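Written out, the correction multiplies the safety term by √z, where z is the peakedness. A sketch with illustrative parameters:

```python
import math

def staffing_with_peakedness(mean_load, peakedness=1.0, beta=0.5):
    """Square-root staffing with the safety margin scaled by sqrt(peakedness).

    peakedness z = variance-to-mean ratio of arrivals:
    z = 1 Poisson baseline, z > 1 bursty (more safety), z < 1 smooth (less).
    """
    return math.ceil(mean_load + beta * math.sqrt(peakedness * mean_load))

staffing_with_peakedness(100, peakedness=1.0)   # Poisson baseline: 105
staffing_with_peakedness(100, peakedness=4.0)   # bursty:           110
staffing_with_peakedness(100, peakedness=0.25)  # smooth:           103
```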
## Evidence
- Whitt et al. (2016) prove that peakedness — the variance-to-mean ratio — captures the essential non-Poisson behavior for staffing calculations
- Standard Poisson assumption (variance = mean) fails empirically for bursty workloads like research paper dumps, product launches, or customer service spikes
- Using constant staffing (fixed MAX_WORKERS) regardless of queue state creates dual failure: over-provisioning during quiet periods (wasted compute) and under-provisioning during bursts (queue explosion)
## Relevance to Pipeline Architecture
Teleo's research pipeline exhibits textbook non-Poisson non-stationary arrivals: research dumps arrive in bursts of 15+ sources, futardio launches come in waves of 20+ proposals, while other days see minimal activity. The peakedness parameter quantifies exactly how much extra capacity is needed beyond naive square-root staffing.
This directly informs dynamic worker scaling: measure empirical peakedness from historical arrival data, adjust safety capacity accordingly, and scale workers based on current queue depth rather than using fixed limits.
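One simple way to get the empirical peakedness mentioned above is the variance-to-mean ratio of arrival counts in fixed windows. The history below is a made-up illustration of the bursty pattern described, not real pipeline data:

```python
def peakedness(counts):
    """Variance-to-mean ratio of per-window arrival counts (1.0 for Poisson)."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / n
    return variance / mean

# Mostly-quiet days punctuated by dumps of 15+ sources.
history = [0, 1, 0, 16, 0, 2, 0, 18, 1, 0, 15, 1]
z = peakedness(history)  # well above 1: bursty, needs extra safety capacity
```

A fuller treatment would account for sensitivity to window size, but even this crude estimate flags when the Poisson baseline understates the needed capacity.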
---
Relevant Notes:
- domains/internet-finance/_map
Topics:
- core/mechanisms/_map


@@ -0,0 +1,35 @@
---
type: claim
domain: internet-finance
description: "The QED Halfin-Whitt regime shows server count n grows while utilization approaches 1 at rate Θ(1/√n)"
confidence: proven
source: "van Leeuwaarden, Mathijsen, Sanders (SIAM Review 2018) - Economies-of-Scale in Many-Server Queueing Systems"
created: 2026-03-11
---
# Square-root staffing principle achieves economies of scale in queueing systems by operating near full utilization with manageable delays
The QED (Quality-and-Efficiency-Driven) Halfin-Whitt heavy-traffic regime provides the mathematical foundation for understanding economies of scale in multi-server systems. As server count n grows, the system can operate at utilization approaching 1 while maintaining bounded delays; the key insight is that excess capacity needs to grow only as Θ(√n) rather than linearly with n, so the unused fraction of capacity shrinks as Θ(1/√n).
This "square root staffing" principle means larger systems need proportionally fewer excess servers for the same service quality. A system with 100 servers might need 10 excess servers for target service levels, while a system with 400 servers needs only 20 excess servers (not 40) for the same quality.
The regime applies across system sizes from tens to thousands of servers, and empirical validation shows the square-root safety staffing works even for moderate-sized systems in the 5-20 server range.
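The arithmetic behind the 100-vs-400 example above is direct: under square-root staffing the excess is β√n, so quadrupling the system only doubles the buffer (β = 1 is chosen here to match the example's numbers):

```python
import math

def excess_servers(n, beta=1.0):
    """Excess capacity under square-root staffing grows as sqrt(n)."""
    return beta * math.sqrt(n)

def utilization(n, beta=1.0):
    """Achievable utilization n / (n + beta*sqrt(n)) approaches 1 as n grows."""
    return n / (n + beta * math.sqrt(n))

excess_servers(100)   # 10.0
excess_servers(400)   # 20.0 (not 40)
utilization(100)      # ~0.909
utilization(10_000)   # ~0.990
```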
## Evidence
From the SIAM Review tutorial:
- Mathematical proof that utilization approaches 1 at rate Θ(1/√n) as server count grows
- Empirical validation showing square-root staffing works for systems as small as 5-20 servers
- The regime connects abstract queueing theory to practical staffing decisions across industries
## Implications for Pipeline Architecture
For systems in the 5-6 worker range, sophisticated dynamic algorithms provide minimal benefit over simple threshold policies informed by queueing theory. The economies-of-scale result also indicates that marginal value per worker decreases as systems grow beyond 20+ workers, which is critical for cost optimization in scaled deployments.
---
Relevant Notes:
- domains/internet-finance/_map
Topics:
- core/mechanisms/_map


@@ -0,0 +1,42 @@
---
type: claim
domain: internet-finance
description: "Replacing non-stationary arrival rates with constant staffing leads to systematic over- or under-provisioning"
confidence: proven
source: "Whitt et al., 'Staffing a Service System with Non-Poisson Non-Stationary Arrivals', Cambridge Core, 2016"
created: 2026-03-11
---
# Time-varying arrival rates require dynamic staffing not constant MAX_WORKERS because using average or maximum rates as constants creates systematic misallocation across the arrival cycle
Non-stationary arrival processes — where the arrival rate itself changes over time — cannot be efficiently staffed with constant worker counts. Whitt et al. demonstrate that replacing time-varying rates with either the average rate or the maximum rate produces badly mis-staffed systems:
- **Constant = average rate**: Under-staffed during peak periods, leading to queue explosions and service degradation
- **Constant = maximum rate**: Over-staffed during off-peak periods, wasting capacity and compute resources
The optimal approach tracks the arrival rate over time and adjusts staffing dynamically to match the current load plus an appropriate safety margin (scaled by peakedness for non-Poisson processes).
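A sketch of the dynamic rule: re-evaluate staffing each interval from the current rate estimate. The rate profile, peakedness, and β below are illustrative values, not measurements:

```python
import math

def staff_for_rate(rate, peakedness=1.0, beta=0.5, min_workers=1):
    """Current load plus a peakedness-scaled square-root safety margin."""
    return max(min_workers, math.ceil(rate + beta * math.sqrt(peakedness * rate)))

# Hour-by-hour arrival rates with a midday burst.
rates = [1, 2, 2, 20, 25, 18, 3, 1]
schedule = [staff_for_rate(r, peakedness=4.0) for r in rates]
# The schedule tracks the load: a constant set at the average under-staffs
# the burst, and a constant set at the peak idles through the quiet hours.
```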
## Evidence
- Whitt et al. (2016) prove that time-varying arrival rates require time-varying staffing levels for efficiency
- Constant staffing at maximum capacity wastes resources during low-traffic periods
- Constant staffing at average capacity fails catastrophically during burst periods
- Dynamic staffing based on current queue state and arrival rate estimates achieves both efficiency (no waste during quiet periods) and reliability (adequate capacity during bursts)
## Application to Teleo Pipeline
Teleo's research processing pipeline exhibits strong non-stationarity: research dumps and futardio launches create burst periods with 15-20+ simultaneous arrivals, while other periods see minimal activity. Using a fixed MAX_WORKERS setting (constant staffing) is the worst of both worlds:
- During bursts: MAX_WORKERS is too low, queue explodes, processing stalls
- During quiet periods: MAX_WORKERS is too high, workers sit idle, compute wasted
Dynamic worker scaling based on current queue depth and estimated arrival rate (with peakedness adjustment) is the theoretically correct solution.
---
Relevant Notes:
- [[square-root-staffing-formula-requires-peakedness-adjustment-for-non-poisson-arrivals]]
- domains/internet-finance/_map
Topics:
- core/mechanisms/_map


@@ -6,8 +6,13 @@ url: https://www.cambridge.org/core/journals/probability-in-the-engineering-and-
date: 2016-01-01
domain: internet-finance
format: paper
status: unprocessed
status: processed
tags: [pipeline-architecture, operations-research, stochastic-modeling, non-stationary-arrivals, capacity-sizing]
processed_by: rio
processed_date: 2026-03-11
claims_extracted: ["square-root-staffing-formula-requires-peakedness-adjustment-for-non-poisson-arrivals.md", "time-varying-arrival-rates-require-dynamic-staffing-not-constant-max-workers.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
extraction_notes: "Operations research paper on staffing under non-Poisson non-stationary arrivals. Extracted two claims on peakedness adjustment and dynamic staffing requirements. Direct application to Teleo pipeline architecture for worker scaling. No entity data (academic paper, no companies/products/decisions). No enrichments (novel theoretical contribution not covered by existing claims)."
---
# Staffing a Service System with Non-Poisson Non-Stationary Arrivals


@@ -6,8 +6,13 @@ url: https://epubs.siam.org/doi/book/10.1137/1.9781611974225
date: 2016-01-01
domain: internet-finance
format: paper
status: unprocessed
status: processed
tags: [pipeline-architecture, operations-research, AIMD, distributed-resource-allocation, congestion-control, fairness]
processed_by: rio
processed_date: 2026-03-11
claims_extracted: ["aimd-converges-to-fair-resource-allocation-without-global-coordination-through-local-congestion-signals.md", "aimd-scaling-solves-variable-load-expensive-compute-coordination-without-prediction.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
extraction_notes: "Extracted two claims: (1) general AIMD mechanism properties as proven coordination algorithm, (2) specific application to Teleo pipeline architecture. The source is a formal mathematical treatment (SIAM monograph) providing rigorous proofs, making the first claim 'proven' confidence. The second claim is an application proposal with theoretical justification but no empirical validation, hence 'experimental'. No entities to extract—this is pure mechanism theory. No enrichments—AIMD is not currently referenced in the KB."
---
# AIMD Dynamics and Distributed Resource Allocation
@@ -26,3 +31,10 @@ SIAM monograph on AIMD (Additive Increase Multiplicative Decrease) as a general-
## Relevance to Teleo Pipeline
AIMD provides a principled, proven scaling algorithm: when eval queue is shrinking (no congestion), increase extraction workers by 1 per cycle. When eval queue is growing (congestion), halve extraction workers. This doesn't require predicting load, modeling arrivals, or solving optimization problems — it reacts to observed system state and is mathematically guaranteed to converge. Perfect for our "expensive compute, variable load" setting.
## Key Facts
- AIMD algorithm: additive increase (rate += α) when no congestion, multiplicative decrease (rate *= β, 0 < β < 1) when congestion detected
- AIMD is the foundation of TCP congestion control
- AIMD has been applied to internet congestion control, smart grid energy allocation, and distributed computing
- AIMD convergence is mathematically proven regardless of number of agents or parameter values


@@ -6,8 +6,13 @@ url: https://epubs.siam.org/doi/10.1137/17M1133944
date: 2018-01-01
domain: internet-finance
format: paper
status: unprocessed
status: processed
tags: [pipeline-architecture, operations-research, queueing-theory, Halfin-Whitt, economies-of-scale, square-root-staffing]
processed_by: rio
processed_date: 2026-03-11
claims_extracted: ["square-root-staffing-principle-achieves-economies-of-scale-in-queueing-systems-by-operating-near-full-utilization-with-manageable-delays.md", "moderate-scale-queueing-systems-benefit-from-simple-threshold-policies-over-sophisticated-algorithms-because-square-root-staffing-captures-most-efficiency-gains.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
extraction_notes: "Extracted two claims about queueing theory and economies of scale. The source is a mathematical tutorial with proven results (SIAM Review), so confidence is 'proven' for the core mathematical claim and 'likely' for the practical application claim. No entities to extract (academic paper, no companies/products/decisions). The relevance to Teleo is in pipeline architecture optimization, which is noted in the source's 'Relevance to Teleo Pipeline' section."
---
# Economies-of-Scale in Many-Server Queueing Systems
@@ -26,3 +31,9 @@ SIAM Review tutorial on the QED (Quality-and-Efficiency-Driven) Halfin-Whitt hea
## Relevance to Teleo Pipeline
At our scale (5-6 workers), we're in the "moderate system" range where square-root staffing still provides useful guidance. The key takeaway: we don't need sophisticated algorithms for a system this small. Simple threshold policies informed by queueing theory will capture most of the benefit. The economies-of-scale result also tells us that if we grow to 20+ workers, the marginal value of each additional worker decreases — important for cost optimization.
## Key Facts
- Halfin-Whitt QED regime: utilization approaches 1 at rate Θ(1/√n)
- Square-root staffing validated empirically for systems as small as 5-20 servers
- 100-server system needs ~10 excess servers; 400-server system needs ~20 (not 40) for same quality