teleo-codex/inbox/archive/2026-02-09-oneuptime-hpa-object-metrics-queue-scaling.md

---
type: source
title: "How to Implement HPA with Object Metrics for Queue-Based Scaling"
author: "OneUptime"
url: https://oneuptime.com/blog/post/2026-02-09-hpa-object-metrics-queue/view
date: 2026-02-09
domain: internet-finance
format: essay
status: unprocessed
tags: [pipeline-architecture, kubernetes, autoscaling, queue-based-scaling, KEDA, HPA]
---

# How to Implement HPA with Object Metrics for Queue-Based Scaling

Practical guide to implementing Kubernetes HPA scaling based on queue depth rather than CPU/memory metrics. Covers object metrics, custom metrics, and integration patterns.

## Key Content

- Queue depth is a better scaling signal than CPU for worker-style workloads
- Object metrics in HPA allow scaling based on custom Kubernetes objects (ConfigMaps, custom resources)
- Pattern: monitor pending messages in queue → scale workers to process them
- Multi-metric HPA: evaluate several metrics simultaneously, scale to whichever requires most replicas
- KEDA (Kubernetes Event Driven Autoscaler): scale-to-zero capability, 70+ built-in scalers
- KEDA pattern: 0 → 1 via event trigger, 1 → N via HPA metrics feed
- Key insight: scale proactively based on how much work is waiting, not reactively based on how busy workers are

## Relevance to Teleo Pipeline

We don't run Kubernetes, but the patterns are directly transferable to our cron-based system:
1. Replace fixed MAX_WORKERS with queue-depth-based scaling: workers = f(queue_depth)
2. Implement scale-to-zero: if no unprocessed sources, don't spawn workers at all (we already do this)
3. Multi-metric scaling: consider both extract queue depth AND eval queue depth when deciding extraction worker count
4. The proactive scaling insight is key: our dispatcher should look at queue depth, not just worker availability