Teleo Agents 34dd5bf93d extract: 2026-02-09-oneuptime-hpa-object-metrics-queue-scaling

Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>

2026-03-16 13:33:14 +00:00

type	domain	description	confidence	source	created
claim	internet-finance	Replacing non-stationary arrival rates with constant staffing leads to systematic over- or under-provisioning	proven	Whitt et al., 'Staffing a Service System with Non-Poisson Non-Stationary Arrivals', Cambridge Core, 2016	2026-03-11

Time-varying arrival rates require dynamic staffing, not a constant MAX_WORKERS, because using the average or maximum rate as a constant creates systematic misallocation across the arrival cycle

Non-stationary arrival processes — where the arrival rate itself changes over time — cannot be efficiently staffed with constant worker counts. Whitt et al. demonstrate that replacing time-varying rates with either the average rate or the maximum rate produces badly mis-staffed systems:

Constant = average rate: Under-staffed during peak periods, leading to queue explosions and service degradation
Constant = maximum rate: Over-staffed during off-peak periods, wasting capacity and compute resources

The optimal approach tracks the arrival rate over time and adjusts staffing dynamically to match the current load plus an appropriate safety margin (scaled by peakedness for non-Poisson processes).
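The "current load plus safety margin" rule can be sketched with the peakedness-adjusted square-root staffing form, s(t) = m(t) + beta * sqrt(z * m(t)), where m(t) is the offered load and z the peakedness. This is a minimal sketch, not the paper's exact procedure: it assumes the offered load can be approximated by the instantaneous arrival rate times the mean service time, and the function name, the rates, and the values of `beta` and `peakedness` are all hypothetical.

```python
import math

def staffing_level(arrival_rate, mean_service_time, peakedness=1.0, beta=1.0):
    """Workers needed at one instant: offered load plus a safety margin.

    peakedness = 1.0 corresponds to Poisson arrivals; larger values widen
    the margin for burstier arrivals. beta is an assumed service-quality
    parameter, not a value taken from the source.
    """
    offered_load = arrival_rate * mean_service_time
    margin = beta * math.sqrt(peakedness * offered_load)
    return math.ceil(offered_load + margin)

# Hypothetical arrival rates (items per hour) across part of a day.
hourly_rates = [1, 1, 2, 16, 20, 6, 2, 1]

# Assumed: each item takes a quarter of an hour, arrivals are bursty (z = 2).
schedule = [
    staffing_level(rate, mean_service_time=0.25, peakedness=2.0)
    for rate in hourly_rates
]
print(schedule)
```

With these assumed inputs the schedule ranges from 1 worker in the quiet hours to 9 at the peak, which is the time-varying shape a single constant cannot reproduce.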

Evidence

Whitt et al. (2016) prove that time-varying arrival rates require time-varying staffing levels for efficiency
Constant staffing at maximum capacity wastes resources during low-traffic periods
Constant staffing at average capacity fails catastrophically during burst periods
Dynamic staffing based on current queue state and arrival rate estimates achieves both efficiency (no waste during quiet periods) and reliability (adequate capacity during bursts)
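The trade-off in the points above can be illustrated with a small deterministic queue model. This is only a sketch under stated assumptions: the arrival pattern is invented, each worker is assumed to finish exactly one item per time step, and the "dynamic" policy is idealised as knowing each step's arrivals in advance.

```python
def simulate(arrivals, workers):
    """Run a deterministic queue where each worker completes one item per step.

    Returns (peak_backlog, idle_worker_steps).
    """
    backlog = 0
    peak_backlog = 0
    idle_worker_steps = 0
    for arriving, staffed in zip(arrivals, workers):
        work = backlog + arriving
        done = min(work, staffed)
        idle_worker_steps += staffed - done   # workers with nothing to do
        backlog = work - done                 # items carried to the next step
        peak_backlog = max(peak_backlog, backlog)
    return peak_backlog, idle_worker_steps

# Hypothetical burst pattern: mostly quiet, with two heavy steps.
arrivals = [2, 2, 2, 18, 18, 2, 2, 2, 2, 0]
steps = len(arrivals)
average = sum(arrivals) // steps  # 50 items over 10 steps

policies = {
    "constant_average": [average] * steps,
    "constant_maximum": [max(arrivals)] * steps,
    "dynamic": list(arrivals),  # idealised: staffing equals arrivals each step
}
results = {name: simulate(arrivals, plan) for name, plan in policies.items()}
for name, (peak, idle) in results.items():
    print(f"{name}: peak backlog {peak}, idle worker-steps {idle}")
```

For this pattern, staffing at the average leaves a peak backlog of 26 items, staffing at the maximum accumulates 130 idle worker-steps, and the idealised dynamic plan has neither. A real controller would sit between these extremes because it has to estimate the rate rather than know it.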

Application to Teleo Pipeline

Teleo's research processing pipeline exhibits strong non-stationarity: research dumps and futardio launches create burst periods with 15-20+ simultaneous arrivals, while other periods see minimal activity. Using a fixed MAX_WORKERS setting (constant staffing) is the worst of both worlds:

During bursts: MAX_WORKERS is too low, queue explodes, processing stalls
During quiet periods: MAX_WORKERS is too high, workers sit idle, compute wasted

Dynamic worker scaling based on current queue depth and estimated arrival rate (with peakedness adjustment) is the theoretically correct solution.
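One way such a scaler could look is sketched below. Everything here is hypothetical rather than Teleo's actual implementation: the class name, the exponentially weighted arrival-rate estimate, the backlog-drain term, and all parameter values are assumptions chosen for illustration. In this sketch `max_workers` survives only as a safety ceiling, not as the staffing level itself.

```python
import math

class WorkerScaler:
    """Hypothetical controller that picks a worker count from queue state."""

    def __init__(self, service_time, drain_time, peakedness=1.0, beta=1.0,
                 smoothing=0.5, min_workers=0, max_workers=32):
        self.service_time = service_time  # mean time one worker spends per item
        self.drain_time = drain_time      # time allowed to clear existing backlog
        self.peakedness = peakedness      # 1.0 for Poisson, higher if bursty
        self.beta = beta                  # assumed safety-margin multiplier
        self.smoothing = smoothing        # weight given to the newest observation
        self.min_workers = min_workers
        self.max_workers = max_workers    # hard ceiling, not the target
        self.rate_estimate = 0.0

    def target_workers(self, arrivals_this_tick, tick_length, queue_depth):
        # Update the smoothed arrival-rate estimate.
        observed_rate = arrivals_this_tick / tick_length
        self.rate_estimate = (self.smoothing * observed_rate
                              + (1 - self.smoothing) * self.rate_estimate)

        # Workers for incoming work, plus a peakedness-scaled margin.
        offered_load = self.rate_estimate * self.service_time
        margin = self.beta * math.sqrt(self.peakedness * offered_load)

        # Extra workers to clear what is already queued within drain_time.
        backlog_workers = queue_depth * self.service_time / self.drain_time

        wanted = math.ceil(offered_load + margin + backlog_workers)
        return max(self.min_workers, min(self.max_workers, wanted))

# Usage with assumed units of minutes: 2 min per item, 10 min to drain backlog.
scaler = WorkerScaler(service_time=2.0, drain_time=10.0, peakedness=2.0)
quiet = scaler.target_workers(arrivals_this_tick=0, tick_length=1.0, queue_depth=0)
burst = scaler.target_workers(arrivals_this_tick=16, tick_length=1.0, queue_depth=16)
after = scaler.target_workers(arrivals_this_tick=0, tick_length=1.0, queue_depth=4)
print(quiet, burst, after)
```

The smoothing factor trades responsiveness against jitter: a value near 1 reacts to a burst within one tick, while a lower value scales down more gradually once the burst has passed.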

Additional Evidence (extend)

Source: 2026-02-09-oneuptime-hpa-object-metrics-queue-scaling | Added: 2026-03-16

Kubernetes HPA with object metrics demonstrates a production implementation of dynamic worker allocation based on queue state. The pattern uses ConfigMaps or custom resources to expose queue depth, which HPA monitors to scale worker replicas. Multi-metric HPA evaluates several metrics simultaneously and scales to whichever requires the most replicas, which lets it handle complex workload patterns. KEDA extends this with 70+ built-in scalers for different queue types (RabbitMQ, Kafka, SQS, etc.) and scale-to-zero capability, showing that dynamic staffing is already used in production at scale.
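The scaling rule behind this can be sketched in a few lines. The Kubernetes documentation gives the core calculation as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), skips scaling when the ratio is within a tolerance of 1.0 (0.1 by default), and takes the largest proposal when several metrics are configured. The Python below is a simplified model of that arithmetic, not HPA's code: the function names and example numbers are hypothetical, and details such as stabilization windows and pod readiness are left out.

```python
import math

def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Replica count proposed by one metric, following the documented HPA ratio rule."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no change
    return math.ceil(current_replicas * ratio)

def multi_metric_replicas(current_replicas, metrics, min_replicas=1, max_replicas=10):
    """Take the largest proposal across metrics, then clamp to the configured bounds.

    metrics is a list of (current_value, target_value) pairs.
    """
    proposals = [
        desired_replicas(current_replicas, current, target)
        for current, target in metrics
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))

# Hypothetical state: 2 workers, queue depth 40 against a target of 10,
# and CPU utilisation 0.6 against a target of 0.5.
replicas = multi_metric_replicas(2, [(40, 10), (0.6, 0.5)])
print(replicas)
```

Here the queue-depth metric proposes 8 replicas and the CPU metric proposes 3, so the queue metric wins, which is the "scale to whichever requires the most replicas" behaviour described above.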

Relevant Notes:

square-root-staffing-formula-requires-peakedness-adjustment-for-non-poisson-arrivals
domains/internet-finance/_map

Topics:

core/mechanisms/_map
