Rio 099253fa12 rio: research pipeline scaling disciplines (#630)

Co-authored-by: Rio <rio@agents.livingip.xyz>
Co-committed-by: Rio <rio@agents.livingip.xyz>

2026-03-12 00:30:19 +00:00



What Is Backpressure (Dagster)

Dagster's practical guide to backpressure in data pipelines. Written for practitioners building real data processing systems.

Key Content

Backpressure: feedback mechanism preventing data producers from overwhelming consumers
Without backpressure controls: data loss, crashes, resource exhaustion
Mechanism: the consumer signals its capacity limits back to the producer, which slows down or pauses in response
Implementation strategies: buffering (with threshold triggers), rate limiting, dynamic adjustment, acknowledgment-based flow
Systems using backpressure: Apache Kafka (pull-based consumption), Flink, Spark Streaming, Akka Streams, Project Reactor
Tradeoff: backpressure introduces latency but prevents catastrophic failure
Key principle: design backpressure into the system from the start
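
The buffering strategy above can be illustrated with a bounded queue. This is a minimal sketch, not taken from the Dagster guide: `run_pipeline`, its parameters, and the `stats` dictionary are hypothetical names chosen for illustration. The producer blocks on `put` whenever the buffer is full, so the buffer depth never exceeds its limit and no items are dropped.

```python
import queue
import threading
import time


def run_pipeline(n_items: int, buffer_size: int, consume_delay: float = 0.001) -> dict:
    """Push n_items through a bounded buffer and report what happened.

    Hypothetical helper for illustration only.
    """
    buf = queue.Queue(maxsize=buffer_size)  # bounded buffer = capacity signal
    stats = {"produced": 0, "consumed": 0, "max_depth": 0}
    sentinel = object()  # marks the end of the stream

    def consumer() -> None:
        while True:
            item = buf.get()
            if item is sentinel:
                break
            time.sleep(consume_delay)  # simulate a slow consumer
            stats["consumed"] += 1

    worker = threading.Thread(target=consumer)
    worker.start()

    for i in range(n_items):
        # put() blocks while the queue is full: this is the backpressure.
        buf.put(i)
        stats["produced"] += 1
        stats["max_depth"] = max(stats["max_depth"], buf.qsize())

    buf.put(sentinel)
    worker.join()
    return stats


if __name__ == "__main__":
    print(run_pipeline(n_items=50, buffer_size=5))
```

The producer pays for this with waiting time, which is the latency tradeoff noted above; in exchange, memory use stays bounded by `buffer_size`.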

Relevance to Teleo Pipeline

Our pipeline has no backpressure today. The extract-cron.sh script checks for unprocessed sources and dispatches workers regardless of the state of the eval queue. If extraction outruns evaluation, open PRs accumulate and nothing signals extraction to slow down. Simple fix: the extraction dispatcher should check the open PR count before dispatching. If the count exceeds a threshold, it should reduce extraction parallelism or skip the cycle.
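
One way the dispatch decision could look is sketched below. Everything here is an assumption for illustration: the function name `plan_dispatch`, the default limits, and the split into a soft limit (reduce parallelism) and a hard limit (skip the cycle) are not taken from extract-cron.sh. How the open PR count is obtained is left to the caller.

```python
def plan_dispatch(
    open_prs: int,
    pending_sources: int,
    max_workers: int = 4,   # assumed normal parallelism
    soft_limit: int = 10,   # assumed: above this, start reducing workers
    hard_limit: int = 20,   # assumed: at or above this, skip the cycle
) -> int:
    """Return how many extraction workers to start this cycle (hypothetical)."""
    if soft_limit >= hard_limit:
        raise ValueError("soft_limit must be below hard_limit")
    if pending_sources <= 0:
        return 0  # nothing to extract
    if open_prs >= hard_limit:
        return 0  # eval queue is saturated: skip the cycle
    if open_prs > soft_limit:
        # Scale parallelism down linearly as the backlog nears the hard limit,
        # but keep at least one worker so extraction does not stall entirely.
        headroom = (hard_limit - open_prs) / (hard_limit - soft_limit)
        workers = max(1, int(max_workers * headroom))
    else:
        workers = max_workers
    return min(workers, pending_sources)


if __name__ == "__main__":
    for backlog in (0, 15, 19, 25):
        print(backlog, plan_dispatch(open_prs=backlog, pending_sources=10))
```

Keeping the decision in a pure function makes it easy to test independently of the cron script; the script would only need to supply the two counts and start the returned number of workers.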
