teleo-codex/domains/internet-finance/aimd-worker-scaling-requires-only-queue-state-observation-not-load-prediction-making-it-simpler-than-ml-based-autoscaling.md
Teleo Agents 12c20ce27c extract: 2025-04-25-bournassenko-queueing-theory-cicd-pipelines
Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>
2026-03-16 13:27:33 +00:00

2.8 KiB

| type | domain | description | confidence | source | created |
| --- | --- | --- | --- | --- | --- |
| claim | internet-finance | AIMD autoscaling reacts to observed queue dynamics rather than forecasting demand, eliminating prediction error and model complexity | experimental | Vlahakis, Athanasopoulos et al., AIMD Scheduling (2021), applied to Teleo pipeline context | 2026-03-11 |

AIMD worker scaling requires only queue state observation, not load prediction, making it simpler than ML-based autoscaling

Traditional autoscaling approaches attempt to predict future load and preemptively adjust capacity. This requires:

  • Historical load data and pattern recognition
  • ML models to forecast demand
  • Tuning of prediction windows and confidence thresholds
  • Handling of prediction errors and their cascading effects

AIMD eliminates this entire complexity layer by operating purely on observed queue state. The control law is:

  • If queue_length is increasing (backlog building): add workers linearly (additive increase)
  • If queue_length is decreasing (queue draining): remove workers multiplicatively (multiplicative decrease)
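One control tick can be sketched in a few lines of Python. The gains `ALPHA` and `BETA` and the worker bounds are illustrative placeholders, not values from the Vlahakis et al. paper:

```python
ALPHA = 1        # additive increase: workers added per control tick (assumed)
BETA = 0.5       # multiplicative decrease factor, 0 < BETA < 1 (assumed)
MIN_WORKERS = 1
MAX_WORKERS = 64

def aimd_step(workers: int, queue_now: int, queue_prev: int) -> int:
    """One control tick: observe the queue trend, adjust the worker count."""
    if queue_now > queue_prev:          # backlog growing: add capacity gently
        workers = workers + ALPHA       # additive increase
    elif queue_now < queue_prev:        # queue draining: shed capacity quickly
        workers = int(workers * BETA)   # multiplicative decrease
    return max(MIN_WORKERS, min(MAX_WORKERS, workers))
```

The asymmetry is deliberate and mirrors TCP congestion control: capacity is probed upward in small linear steps and released in large multiplicative steps, which is what the convergence proofs rely on.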

This reactive approach has several advantages:

  1. No prediction error — the system responds to actual observed state, not forecasts
  2. No training data required — works immediately without historical patterns
  3. Self-correcting — wrong adjustments are automatically reversed by subsequent queue observations
  4. Proven stable — mathematical guarantees from control theory, not empirical tuning

The Vlahakis et al. (2021) paper proves that this decentralized approach achieves global convergence to bounded queue lengths in finite time, regardless of system size or AIMD parameters. The stability is structural, not empirical.

For the Teleo pipeline specifically: when extract produces claims faster than eval can process them, the eval queue grows. AIMD detects this and scales up eval workers. When the queue shrinks below target, AIMD scales down. No load forecasting, no ML models, no hyperparameter tuning — just queue observation and a simple control law.
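This feedback loop can be exercised in a toy closed-loop simulation. All numbers here are invented for illustration (8 claims/tick from extract, 3 evals/tick per worker, a target queue depth of 10, AIMD gains of +1 and x0.5); the decrease branch fires when the queue falls below target, as described above:

```python
ALPHA = 1        # workers added per tick while over target (assumed)
BETA = 0.5       # fraction of workers kept when under target (assumed)
TARGET = 10      # desired eval queue depth (assumed)

workers, queue, max_queue = 1, 0, 0
for tick in range(200):
    queue += 8                                   # extract emits 8 claims/tick
    served = min(queue, workers * 3)             # each eval worker does 3/tick
    queue -= served
    if queue > TARGET:
        workers += ALPHA                         # additive increase
    elif queue < TARGET:
        workers = max(1, int(workers * BETA))    # multiplicative decrease
    max_queue = max(max_queue, queue)

print(workers, queue, max_queue)
```

The queue oscillates around the target rather than settling, but it stays bounded; no forecast of the arrival rate is ever consulted.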

The tradeoff: AIMD is reactive rather than predictive, so it responds to load changes rather than anticipating them. For bursty workloads with predictable patterns, ML-based prediction might provision capacity faster. But for unpredictable workloads or systems where prediction accuracy is low, AIMD's simplicity and guaranteed stability are compelling.

Additional Evidence (extend)

Source: 2025-04-25-bournassenko-queueing-theory-cicd-pipelines | Added: 2026-03-16

M/M/c queueing models provide theoretical foundation for why queue-state-based scaling works: closed-form solutions exist for wait times given arrival rates and server counts, meaning optimal worker allocation can be computed from observable queue depth without predicting future load.
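As a sketch of that closed form: given an observed arrival rate λ and per-worker service rate μ (both measurable, no forecasting required), the Erlang C formula yields the probability an arriving job waits, from which the smallest adequate worker count follows directly. Function names and the search strategy here are illustrative, not from the cited source:

```python
from math import factorial

def erlang_c(c: int, a: float) -> float:
    """Probability an arriving job must wait in M/M/c, with offered load a = lambda/mu."""
    if a >= c:
        return 1.0                     # unstable regime: the queue grows without bound
    top = a**c / (factorial(c) * (1 - a / c))
    bottom = sum(a**k / factorial(k) for k in range(c)) + top
    return top / bottom

def workers_for_wait(lambda_: float, mu: float, max_wait: float) -> int:
    """Smallest worker count whose mean queueing delay W_q stays under max_wait."""
    a = lambda_ / mu
    c = max(1, int(a) + 1)             # start just above the stability bound c > a
    while erlang_c(c, a) / (c * mu - lambda_) > max_wait:
        c += 1                         # W_q = P(wait) / (c*mu - lambda)
    return c
```

For example, with λ = 8 jobs/tick and μ = 3 jobs/tick per worker, three workers are the stability minimum but leave a long mean wait; the formula shows a fourth worker collapses it, which is exactly the allocation a queue-depth-driven controller converges toward.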


Relevant Notes:

  • core/mechanisms/_map

Topics:

  • domains/internet-finance/_map