teleo-codex/inbox/archive/2024-00-00-dagster-data-backpressure.md
Teleo Pipeline 74a5a7ae64
extract: 2024-00-00-dagster-data-backpressure
Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>
2026-03-15 17:13:11 +00:00

type: source
title: What Is Backpressure
author: Dagster
url: https://dagster.io/glossary/data-backpressure
date: 2024-01-01
domain: internet-finance
format: essay
status: processed
tags: pipeline-architecture, backpressure, data-pipelines, flow-control
processed_by: rio
processed_date: 2026-03-11
claims_extracted: backpressure-prevents-pipeline-failure-by-creating-feedback-loop-between-consumer-capacity-and-producer-rate.md
extraction_model: anthropic/claude-sonnet-4.5
extraction_notes: Single claim extracted on backpressure as flow control mechanism. Source is practical implementation guide rather than research, so confidence is 'proven' based on widespread production adoption. Teleo pipeline relevance noted in claim body as concrete application context.

What Is Backpressure (Dagster)

Dagster's practical guide to backpressure in data pipelines. Written for practitioners building real data processing systems.

Key Content

  • Backpressure: feedback mechanism preventing data producers from overwhelming consumers
  • Without backpressure controls: data loss, crashes, resource exhaustion
  • Consumer signals producer about capacity limits
  • Implementation strategies: buffering (with threshold triggers), rate limiting, dynamic adjustment, acknowledgment-based flow
  • Systems using backpressure: Apache Kafka (pull-based consumption), Flink, Spark Streaming, Akka Streams, Project Reactor
  • Tradeoff: backpressure introduces latency but prevents catastrophic failure
  • Key principle: design backpressure into the system from the start
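As an illustration of the "buffering (with threshold triggers)" strategy listed above, here is a minimal sketch, not taken from the Dagster guide. The class name `ThresholdBuffer` and the high/low watermark parameters are assumptions chosen for the example. The producer is told to back off once the buffer reaches the high watermark and may resume only after the consumer has drained it to the low watermark.

```python
from collections import deque


class ThresholdBuffer:
    """Bounded buffer that tells the producer when to pause and resume.

    Hypothetical sketch: the watermark names and semantics are assumptions.
    """

    def __init__(self, high_watermark: int, low_watermark: int) -> None:
        if not 0 <= low_watermark < high_watermark:
            raise ValueError("expected 0 <= low_watermark < high_watermark")
        self.high = high_watermark
        self.low = low_watermark
        self.items: deque = deque()
        self.paused = False

    def offer(self, item) -> bool:
        """Producer side: return False when the producer must back off."""
        if self.paused:
            return False
        self.items.append(item)
        # Reaching the high watermark triggers the backpressure signal.
        if len(self.items) >= self.high:
            self.paused = True
        return True

    def take(self):
        """Consumer side: remove one item; raises IndexError when empty."""
        item = self.items.popleft()
        # Lift the pause only once the buffer has drained to the low watermark.
        if self.paused and len(self.items) <= self.low:
            self.paused = False
        return item
```

Using two watermarks instead of one avoids flapping between paused and resumed on every single item. The cost is the latency tradeoff noted above: the producer waits while the consumer drains the buffer.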

Relevance to Teleo Pipeline

Our pipeline has no backpressure today. The extract-cron.sh script checks for unprocessed sources and dispatches workers regardless of the eval queue's state. If extraction outruns evaluation, PRs accumulate and nothing signals extraction to slow down. A simple fix: have the extraction dispatcher check the open PR count before dispatching. If the count exceeds a threshold, reduce extraction parallelism or skip the cycle.
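The dispatcher check could look roughly like the sketch below. It is not the actual extract-cron.sh logic. The function name `plan_extraction_workers` and its parameters are hypothetical, and fetching the open PR count from the git host is deliberately left out, so the count is passed in as a plain integer.

```python
def plan_extraction_workers(
    open_prs: int,
    threshold: int,
    max_workers: int,
    reduced_workers: int = 0,
) -> int:
    """Return how many extraction workers to dispatch this cycle.

    Hypothetical sketch. reduced_workers=0 means "skip the cycle";
    a positive value means "reduce extraction parallelism".
    """
    if open_prs < 0 or threshold < 0:
        raise ValueError("open_prs and threshold must be non-negative")
    if not 0 <= reduced_workers <= max_workers:
        raise ValueError("expected 0 <= reduced_workers <= max_workers")
    # Backpressure signal: evaluation is behind, so slow extraction down.
    if open_prs > threshold:
        return reduced_workers
    return max_workers
```

For example, with a threshold of 10 open PRs and 4 workers, `plan_extraction_workers(11, 10, 4)` skips the cycle, while `plan_extraction_workers(11, 10, 4, reduced_workers=1)` keeps a single worker running.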

Key Facts

  • Backpressure implementations: buffering with thresholds, rate limiting, dynamic adjustment, acknowledgment-based flow
  • Systems using backpressure: Apache Kafka (pull-based), Flink, Spark Streaming, Akka Streams, Project Reactor
  • Failure modes without backpressure: data loss, crashes, resource exhaustion
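Acknowledgment-based flow, one of the implementations listed above, can be sketched as a cap on unacknowledged items in flight. This is an illustrative assumption, not an API from Dagster or any of the named systems. The class `AckWindow` and its method names are hypothetical.

```python
class AckWindow:
    """Allow at most max_in_flight unacknowledged messages at a time.

    Hypothetical sketch of acknowledgment-based flow control.
    """

    def __init__(self, max_in_flight: int) -> None:
        if max_in_flight <= 0:
            raise ValueError("max_in_flight must be positive")
        self.max_in_flight = max_in_flight
        self.in_flight: set = set()

    def try_send(self, msg_id) -> bool:
        """Producer side: return False when the window is full."""
        if len(self.in_flight) >= self.max_in_flight:
            return False
        self.in_flight.add(msg_id)
        return True

    def ack(self, msg_id) -> None:
        """Consumer side: acknowledging a message frees one slot."""
        self.in_flight.discard(msg_id)
```

Here the consumer's acknowledgments are the feedback signal: the producer cannot send faster than the consumer confirms completed work.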