From 34dd5bf93dca00908b74e63a37c3f1b30e335fed Mon Sep 17 00:00:00 2001 From: Teleo Agents Date: Mon, 16 Mar 2026 13:06:26 +0000 Subject: [PATCH] extract: 2026-02-09-oneuptime-hpa-object-metrics-queue-scaling Pentagon-Agent: Ganymede --- ...compute-coordination-without-prediction.md | 6 +++++ ...namic-staffing-not-constant-max-workers.md | 6 +++++ ...time-hpa-object-metrics-queue-scaling.json | 24 +++++++++++++++++++ ...uptime-hpa-object-metrics-queue-scaling.md | 13 +++++++++- 4 files changed, 48 insertions(+), 1 deletion(-) create mode 100644 inbox/archive/.extraction-debug/2026-02-09-oneuptime-hpa-object-metrics-queue-scaling.json diff --git a/domains/internet-finance/aimd-scaling-solves-variable-load-expensive-compute-coordination-without-prediction.md b/domains/internet-finance/aimd-scaling-solves-variable-load-expensive-compute-coordination-without-prediction.md index 18938df7..f623de6c 100644 --- a/domains/internet-finance/aimd-scaling-solves-variable-load-expensive-compute-coordination-without-prediction.md +++ b/domains/internet-finance/aimd-scaling-solves-variable-load-expensive-compute-coordination-without-prediction.md @@ -33,6 +33,12 @@ The multiplicative decrease (halving workers on congestion) provides rapid respo This is an application of proven AIMD theory to a specific system architecture, but the actual performance in the Teleo pipeline context is untested. The claim that AIMD is "perfect for" this setting is theoretical—empirical validation would strengthen confidence from experimental to likely. + +### Additional Evidence (extend) +*Source: [[2026-02-09-oneuptime-hpa-object-metrics-queue-scaling]] | Added: 2026-03-16* + +KEDA's two-phase scaling (0→1 via event trigger, 1→N via HPA metrics) implements a form of threshold-based scaling without requiring load prediction. The system observes queue state and responds with simple rules: any messages present triggers minimum capacity, then HPA scales linearly with queue depth. 
This validates that simple observation-based policies work in production without sophisticated prediction models. + --- Relevant Notes: diff --git a/domains/internet-finance/time-varying-arrival-rates-require-dynamic-staffing-not-constant-max-workers.md b/domains/internet-finance/time-varying-arrival-rates-require-dynamic-staffing-not-constant-max-workers.md index 6cc1d956..4af9fcb0 100644 --- a/domains/internet-finance/time-varying-arrival-rates-require-dynamic-staffing-not-constant-max-workers.md +++ b/domains/internet-finance/time-varying-arrival-rates-require-dynamic-staffing-not-constant-max-workers.md @@ -32,6 +32,12 @@ Teleo's research processing pipeline exhibits strong non-stationarity: research Dynamic worker scaling based on current queue depth and estimated arrival rate (with peakedness adjustment) is the theoretically correct solution. + +### Additional Evidence (extend) +*Source: [[2026-02-09-oneuptime-hpa-object-metrics-queue-scaling]] | Added: 2026-03-16* + +Kubernetes HPA with object metrics demonstrates production implementation of dynamic worker allocation based on queue state. The pattern uses ConfigMaps or custom resources to expose queue depth, which HPA monitors to scale worker replicas. Multi-metric HPA evaluates several metrics simultaneously and scales to whichever requires the most replicas, handling complex workload patterns. KEDA extends this with 70+ built-in scalers for different queue types (RabbitMQ, Kafka, SQS, etc.) and scale-to-zero capability, proving dynamic staffing is production-ready at scale. 
+ --- Relevant Notes: diff --git a/inbox/archive/.extraction-debug/2026-02-09-oneuptime-hpa-object-metrics-queue-scaling.json b/inbox/archive/.extraction-debug/2026-02-09-oneuptime-hpa-object-metrics-queue-scaling.json new file mode 100644 index 00000000..e235a7d2 --- /dev/null +++ b/inbox/archive/.extraction-debug/2026-02-09-oneuptime-hpa-object-metrics-queue-scaling.json @@ -0,0 +1,24 @@ +{ + "rejected_claims": [ + { + "filename": "queue-depth-based-scaling-outperforms-cpu-based-scaling-for-worker-style-workloads-because-pending-work-is-a-leading-indicator-while-resource-utilization-is-lagging.md", + "issues": [ + "missing_attribution_extractor" + ] + } + ], + "validation_stats": { + "total": 1, + "kept": 0, + "fixed": 1, + "rejected": 1, + "fixes_applied": [ + "queue-depth-based-scaling-outperforms-cpu-based-scaling-for-worker-style-workloads-because-pending-work-is-a-leading-indicator-while-resource-utilization-is-lagging.md:set_created:2026-03-16" + ], + "rejections": [ + "queue-depth-based-scaling-outperforms-cpu-based-scaling-for-worker-style-workloads-because-pending-work-is-a-leading-indicator-while-resource-utilization-is-lagging.md:missing_attribution_extractor" + ] + }, + "model": "anthropic/claude-sonnet-4.5", + "date": "2026-03-16" +} \ No newline at end of file diff --git a/inbox/archive/2026-02-09-oneuptime-hpa-object-metrics-queue-scaling.md b/inbox/archive/2026-02-09-oneuptime-hpa-object-metrics-queue-scaling.md index 365e59b8..afb11094 100644 --- a/inbox/archive/2026-02-09-oneuptime-hpa-object-metrics-queue-scaling.md +++ b/inbox/archive/2026-02-09-oneuptime-hpa-object-metrics-queue-scaling.md @@ -6,8 +6,12 @@ url: https://oneuptime.com/blog/post/2026-02-09-hpa-object-metrics-queue/view date: 2026-02-09 domain: internet-finance format: essay -status: unprocessed +status: enrichment tags: [pipeline-architecture, kubernetes, autoscaling, queue-based-scaling, KEDA, HPA] +processed_by: rio +processed_date: 2026-03-16 +enrichments_applied: 
["time-varying-arrival-rates-require-dynamic-staffing-not-constant-max-workers.md", "aimd-scaling-solves-variable-load-expensive-compute-coordination-without-prediction.md"] +extraction_model: "anthropic/claude-sonnet-4.5" --- # How to Implement HPA with Object Metrics for Queue-Based Scaling @@ -31,3 +35,10 @@ We don't run Kubernetes, but the patterns are directly transferable to our cron- 2. Implement scale-to-zero: if no unprocessed sources, don't spawn workers at all (we already do this) 3. Multi-metric scaling: consider both extract queue depth AND eval queue depth when deciding extraction worker count 4. The proactive scaling insight is key: our dispatcher should look at queue depth, not just worker availability + + +## Key Facts +- KEDA (Kubernetes Event Driven Autoscaler) supports 70+ built-in scalers for different event sources +- KEDA implements scale-to-zero capability: 0→1 replicas via event trigger, 1→N replicas via HPA metrics +- HPA object metrics allow scaling based on custom Kubernetes objects like ConfigMaps and custom resources +- Multi-metric HPA evaluates several metrics simultaneously and scales to whichever requires the most replicas
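
The multi-metric and scale-to-zero facts above can be sketched for a non-Kubernetes dispatcher as follows. This is a minimal illustration, not the pipeline's actual dispatcher: `QueueMetric`, `target_per_worker`, `desired_workers`, and the example numbers are all hypothetical names and values introduced here. It assumes each queue has a per-worker target depth, computes a worker count per queue, and takes the largest, mirroring how multi-metric HPA picks whichever metric requires the most replicas.

```python
import math
from dataclasses import dataclass


@dataclass
class QueueMetric:
    """One observed queue (hypothetical structure for this sketch)."""
    name: str
    depth: int              # pending items currently in the queue
    target_per_worker: int  # assumed number of items one worker should own


def desired_workers(metrics: list[QueueMetric], max_workers: int) -> int:
    """Return the worker count implied by the current queue depths."""
    # Scale-to-zero: with no pending work anywhere, spawn no workers.
    if all(m.depth == 0 for m in metrics):
        return 0

    # One proposal per metric: ceil(depth / per-worker target).
    proposals = [math.ceil(m.depth / m.target_per_worker) for m in metrics]

    # Any pending work implies at least one worker (the 0 -> 1 phase),
    # then the largest proposal wins (the 1 -> N phase), capped at max_workers.
    return min(max_workers, max(1, max(proposals)))


if __name__ == "__main__":
    queues = [
        QueueMetric("extract", depth=12, target_per_worker=5),
        QueueMetric("eval", depth=3, target_per_worker=5),
    ]
    print(desired_workers(queues, max_workers=8))
```

The decision depends only on observed queue depth, so it needs no arrival-rate forecast. The `max_workers` cap is an assumed safeguard for expensive compute and plays the role of an HPA `maxReplicas` bound.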