---
type: source
title: "How to Implement HPA with Object Metrics for Queue-Based Scaling"
author: "OneUptime"
url: https://oneuptime.com/blog/post/2026-02-09-hpa-object-metrics-queue/view
date: 2026-02-09
domain: internet-finance
format: essay
status: enrichment
tags: [pipeline-architecture, kubernetes, autoscaling, queue-based-scaling, KEDA, HPA]
processed_by: rio
processed_date: 2026-03-16
enrichments_applied: ["time-varying-arrival-rates-require-dynamic-staffing-not-constant-max-workers.md", "aimd-scaling-solves-variable-load-expensive-compute-coordination-without-prediction.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
---

# How to Implement HPA with Object Metrics for Queue-Based Scaling

Practical guide to implementing Kubernetes HPA scaling based on queue depth rather than CPU/memory metrics. Covers object metrics, custom metrics, and integration patterns.

## Key Content

- Queue depth is a better scaling signal than CPU for worker-style workloads
- Object metrics in HPA allow scaling on a metric tied to a Kubernetes object (ConfigMaps, custom resources) rather than to the pods themselves
- Pattern: monitor pending messages in queue → scale workers to process them
- Multi-metric HPA: evaluate several metrics simultaneously, scale to whichever requires the most replicas
- KEDA (Kubernetes Event-driven Autoscaling): scale-to-zero capability, 70+ built-in scalers
- KEDA pattern: 0 → 1 via event trigger, 1 → N via HPA metrics feed
- Key insight: scale proactively based on how much work is waiting, not reactively based on how busy workers are

## Relevance to Teleo Pipeline

We don't run Kubernetes, but the patterns are directly transferable to our cron-based system:

1. Replace fixed MAX_WORKERS with queue-depth-based scaling: workers = f(queue_depth)
2. Implement scale-to-zero: if there are no unprocessed sources, don't spawn workers at all (we already do this)
3. Multi-metric scaling: consider both extract queue depth AND eval queue depth when deciding extraction worker count
4. The proactive scaling insight is key: our dispatcher should look at queue depth, not just worker availability

## Key Facts

- KEDA (Kubernetes Event-driven Autoscaling) supports 70+ built-in scalers for different event sources
- KEDA implements scale-to-zero capability: 0 → 1 replicas via event trigger, 1 → N replicas via HPA metrics
- HPA object metrics allow scaling on a metric tied to a Kubernetes object such as a ConfigMap or custom resource
- Multi-metric HPA evaluates several metrics simultaneously and scales to whichever requires the most replicas
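
Items 1 to 3 under Relevance to Teleo Pipeline can be sketched as a small sizing function. This is a minimal illustration, not the pipeline's actual dispatcher: `QueueMetric`, `workers_for`, `desired_workers`, and the `target_per_worker` knob are hypothetical names, and the choice of `ceil(depth / target)` as `f(queue_depth)` is an assumption. Taking the maximum across metrics mirrors the multi-metric HPA rule noted above; whether eval queue depth should raise or throttle extraction workers in practice is left open by the source.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueMetric:
    """One scaling signal: items waiting, and how many one worker should own."""
    name: str
    depth: int
    target_per_worker: int  # hypothetical tuning knob, not from the source

    def __post_init__(self) -> None:
        if self.target_per_worker <= 0:
            raise ValueError("target_per_worker must be positive")


def workers_for(metric: QueueMetric) -> int:
    """Workers proposed by one metric: ceil(depth / target), 0 when empty."""
    if metric.depth <= 0:
        return 0  # scale-to-zero: nothing waiting, spawn nothing
    return math.ceil(metric.depth / metric.target_per_worker)


def desired_workers(metrics: list[QueueMetric], max_workers: int) -> int:
    """Take the largest per-metric proposal, then cap it at max_workers."""
    proposal = max((workers_for(m) for m in metrics), default=0)
    return min(proposal, max_workers)


if __name__ == "__main__":
    extract = QueueMetric("extract", depth=45, target_per_worker=10)
    evaluate = QueueMetric("eval", depth=12, target_per_worker=10)
    print(desired_workers([extract, evaluate], max_workers=8))
```

Here `MAX_WORKERS` survives only as a ceiling (`max_workers`) rather than as the fixed worker count, so an empty queue yields zero workers and a deep queue saturates at the cap.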