teleo-codex/inbox/archive/2024-00-00-dagster-data-backpressure.md
Teleo Agents bf4858d0f7 rio: research pipeline scaling disciplines — 15 sources archived
- What: operations research, queueing theory, stochastic modeling for pipeline architecture
- Why: Leo/Cory brief — need disciplined approach to variable-load scaling

Pentagon-Agent: Rio <2EA8DBCB-A29B-43E8-B726-45E571A1F3C8>
2026-03-12 00:29:39 +00:00

type: source
title: What Is Backpressure
author: Dagster
url: https://dagster.io/glossary/data-backpressure
date: 2024-01-01
domain: internet-finance
format: essay
status: unprocessed
tags: pipeline-architecture, backpressure, data-pipelines, flow-control

What Is Backpressure (Dagster)

Dagster's practical guide to backpressure in data pipelines. Written for practitioners building real data processing systems.

Key Content

  • Backpressure: feedback mechanism preventing data producers from overwhelming consumers
  • Without backpressure controls: data loss, crashes, resource exhaustion
  • Consumer signals producer about capacity limits
  • Implementation strategies: buffering (with threshold triggers), rate limiting, dynamic adjustment, acknowledgment-based flow
  • Systems using backpressure: Apache Kafka (pull-based consumption), Flink, Spark Streaming, Akka Streams, Project Reactor
  • Tradeoff: backpressure introduces latency but prevents catastrophic failure
  • Key principle: design backpressure into the system from the start
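
The "buffering (with threshold triggers)" strategy in the list above can be sketched as a buffer with two watermarks. This is a minimal illustration, not Dagster's implementation: the class name `BoundedBuffer` and the parameters `high_water` and `low_water` are hypothetical. The buffer tells the producer to pause once it fills to the high mark and to resume only after the consumer has drained it to the low mark.

```python
from collections import deque


class BoundedBuffer:
    """Hypothetical buffer that signals the producer to pause when it fills up."""

    def __init__(self, high_water: int, low_water: int) -> None:
        if not 0 <= low_water < high_water:
            raise ValueError("need 0 <= low_water < high_water")
        self.high_water = high_water
        self.low_water = low_water
        self.items = deque()
        self.paused = False  # the backpressure signal the producer reads

    def offer(self, item) -> bool:
        """Accept the item, or return False while the producer should wait."""
        if self.paused:
            return False
        self.items.append(item)
        if len(self.items) >= self.high_water:
            self.paused = True  # threshold trigger: stop accepting new work
        return True

    def take(self):
        """Remove the oldest item and lift the pause once drained enough."""
        item = self.items.popleft()
        if self.paused and len(self.items) <= self.low_water:
            self.paused = False
        return item
```

Using two marks instead of one avoids flapping between paused and running when the buffer sits right at the limit. The cost is the latency named in the tradeoff bullet: a rejected producer has to wait and retry.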

Relevance to Teleo Pipeline

Our pipeline has zero backpressure today. extract-cron.sh checks for unprocessed sources and dispatches workers regardless of the state of the eval queue. If extraction outruns evaluation, PRs accumulate with no feedback signal reaching the dispatcher. A simple fix: the extraction dispatcher checks the open PR count before dispatching. If the count exceeds a threshold, it reduces extraction parallelism or skips the cycle.
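
A rough sketch of that dispatch decision, written in Python for illustration rather than as a patch to extract-cron.sh. Everything named here is an assumption: the function `plan_extraction_workers`, the split of the single threshold into a `soft_limit` and a `hard_limit`, and the linear scaling between them. How the open PR count is obtained is left to the caller.

```python
def plan_extraction_workers(
    open_prs: int, soft_limit: int, hard_limit: int, max_workers: int
) -> int:
    """Return how many extraction workers to dispatch this cycle (hypothetical).

    open_prs    -- PRs currently awaiting evaluation
    soft_limit  -- at or below this, run at full parallelism
    hard_limit  -- at or above this, skip the cycle entirely
    max_workers -- parallelism used when there is no backlog
    """
    if open_prs < 0:
        raise ValueError("open_prs must be non-negative")
    if max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    if not 0 <= soft_limit < hard_limit:
        raise ValueError("need 0 <= soft_limit < hard_limit")

    if open_prs <= soft_limit:
        return max_workers  # evaluation is keeping up
    if open_prs >= hard_limit:
        return 0  # skip the cycle and let evaluation drain the queue

    # Between the limits, scale parallelism down linearly with the backlog,
    # but keep at least one worker running.
    headroom = hard_limit - open_prs
    span = hard_limit - soft_limit
    return max(1, max_workers * headroom // span)
```

With `soft_limit=10`, `hard_limit=20`, and `max_workers=4`, a backlog of 15 open PRs yields 2 workers, and a backlog of 20 or more skips the cycle. A single-threshold variant that only skips is simpler, but the graded version covers both options named in the note.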