teleo-codex/inbox/archive/internet-finance/2024-00-00-dagster-data-backpressure.md

type: source
title: What Is Backpressure
author: Dagster
url: https://dagster.io/glossary/data-backpressure
date: 2024-01-01
domain: internet-finance
format: essay
status: processed
tags: pipeline-architecture, backpressure, data-pipelines, flow-control
processed_by: rio
processed_date: 2026-03-11
claims_extracted: backpressure-prevents-pipeline-failure-by-creating-feedback-loop-between-consumer-capacity-and-producer-rate.md
extraction_model: anthropic/claude-sonnet-4.5
extraction_notes: Single claim extracted on backpressure as flow control mechanism. Source is practical implementation guide rather than research, so confidence is 'proven' based on widespread production adoption. Teleo pipeline relevance noted in claim body as concrete application context.

What Is Backpressure (Dagster)

Dagster's practical guide to backpressure in data pipelines. Written for practitioners building real data processing systems.

Key Content

  • Backpressure: feedback mechanism preventing data producers from overwhelming consumers
  • Without backpressure controls: data loss, crashes, resource exhaustion
  • Consumer signals producer about capacity limits
  • Implementation strategies: buffering (with threshold triggers), rate limiting, dynamic adjustment, acknowledgment-based flow
  • Systems using backpressure: Apache Kafka (pull-based consumption), Flink, Spark Streaming, Akka Streams, Project Reactor
  • Tradeoff: backpressure introduces latency but prevents catastrophic failure
  • Key principle: design backpressure into the system from the start
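The buffering strategy above can be illustrated with a minimal sketch, assuming a single producer and a single consumer connected by a bounded in-memory queue. The function name `run_pipeline` and its parameters are hypothetical and not taken from the Dagster guide. When the buffer is full, `put()` blocks the producer, which is the feedback signal: the producer is slowed to the consumer's pace instead of dropping data or growing memory without limit.

```python
import queue
import threading
import time


def run_pipeline(n_items: int, buffer_size: int, consume_delay: float = 0.001) -> dict:
    """Push n_items through a bounded buffer (hypothetical helper for illustration)."""
    # A bounded queue: put() blocks while the buffer is full.
    buf: queue.Queue = queue.Queue(maxsize=buffer_size)
    consumed = []
    max_depth = 0

    def consumer() -> None:
        while True:
            item = buf.get()
            if item is None:  # sentinel: producer is done
                break
            time.sleep(consume_delay)  # simulate a slow consumer
            consumed.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()

    for i in range(n_items):
        buf.put(i)  # blocks here whenever the consumer falls behind
        max_depth = max(max_depth, buf.qsize())

    buf.put(None)
    worker.join()
    return {"consumed": consumed, "max_depth": max_depth}


if __name__ == "__main__":
    result = run_pipeline(n_items=50, buffer_size=4)
    print(len(result["consumed"]), "items consumed; max buffer depth", result["max_depth"])
```

The cost is the latency tradeoff noted above: the producer spends time blocked, but the buffer depth never exceeds `buffer_size` and no item is lost.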

Relevance to Teleo Pipeline

Our pipeline has no backpressure today. extract-cron.sh checks for unprocessed sources and dispatches workers regardless of the state of the eval queue. If extraction outruns evaluation, PRs accumulate with no feedback signal to slow extraction down. A simple fix: have the extraction dispatcher check the open PR count before dispatching. If open PRs exceed a threshold, reduce extraction parallelism or skip the cycle.
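The dispatch rule described here could look roughly like the following sketch. Everything in it is an assumption for illustration: the function name `workers_to_dispatch`, the default threshold of 10, the default of 4 workers, and the linear "headroom" rule for reducing parallelism are not taken from extract-cron.sh. How the open PR count is obtained is left out, since the source does not specify it.

```python
def workers_to_dispatch(
    open_prs: int,
    unprocessed: int,
    threshold: int = 10,   # hypothetical limit on open PRs
    max_workers: int = 4,  # hypothetical normal parallelism
) -> int:
    """Decide how many extraction workers to start this cycle (illustrative only)."""
    if open_prs > threshold:
        # Evaluation is behind: skip the cycle entirely.
        return 0
    # Remaining room before the threshold is hit; at the threshold itself,
    # still allow a single worker so extraction slows down rather than stops.
    headroom = max(threshold - open_prs, 1)
    return min(max_workers, unprocessed, headroom)


if __name__ == "__main__":
    for prs in (0, 8, 10, 11):
        print(prs, "open PRs ->", workers_to_dispatch(prs, unprocessed=52), "workers")
```

With these assumed defaults, extraction runs at full parallelism while the eval queue is short, tapers as open PRs approach the threshold, and stops once the threshold is exceeded.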

Key Facts

  • Backpressure implementations: buffering with thresholds, rate limiting, dynamic adjustment, acknowledgment-based flow
  • Systems using backpressure: Apache Kafka (pull-based), Flink, Spark Streaming, Akka Streams, Project Reactor
  • Failure modes without backpressure: data loss, crashes, resource exhaustion