extract: 2019-07-00-li-overview-mdp-queues-networks

Pentagon-Agent: Ganymede <F99EBFA6-547B-4096-BEEA-1D59C3E4028A>
2026-03-15 15:55:57 +00:00 · 2026-03-15 15:55:57 +00:00 · 51a2ed39fc
commit 51a2ed39fc
parent e0c9323264
3 changed files with 87 additions and 1 deletions
--- a/domains/internet-finance/optimal-queue-policies-have-threshold-structure-making-simple-rules-near-optimal.md
+++ b/domains/internet-finance/optimal-queue-policies-have-threshold-structure-making-simple-rules-near-optimal.md
@ -0,0 +1,36 @@
 ---
 type: claim
 domain: internet-finance
 description: "MDP research shows threshold policies are provably optimal for most queueing systems"
 confidence: proven
 source: "Li et al., 'An Overview for Markov Decision Processes in Queues and Networks' (2019)"
 created: 2026-03-11
 ---
 # Optimal queue policies have threshold structure making simple rules near-optimal
 Six decades of operations research on Markov Decision Processes applied to queueing systems consistently shows that optimal policies have threshold structure: "serve if queue > K, idle if queue < K" or "spawn worker if queue > X and workers < Y." This means even without solving the full MDP, well-tuned threshold policies achieve near-optimal performance.
 For multi-server systems, optimal admission and routing policies follow similar patterns: join-shortest-queue, threshold-based admission control. The structural simplicity emerges from the mathematical properties of the value function in continuous-time MDPs where decisions happen at state transitions (arrivals, departures).
 This has direct implications for pipeline architecture: systems with manageable state spaces (queue depths across stages, worker counts, time-of-day) can use exact MDP solution via value iteration, but even approximate threshold policies will perform near-optimally due to the underlying structure.
 ## Evidence
 Li et al. survey 60+ years of MDP research in queueing theory (1960s to 2019), covering:
 - Continuous-time MDPs for queue management with decisions at state transitions
 - Classic results showing threshold structure in optimal policies
 - Multi-server systems where optimal policies are simple (join-shortest-queue, threshold-based)
 - Dynamic programming and stochastic optimization methods for deriving optimal policies
 The key challenge identified is curse of dimensionality: state space explodes with multiple queues/stages. Practical approaches include approximate dynamic programming and reinforcement learning for large state spaces.
 Emerging direction: deep RL for queue management in networks and cloud computing.
 ---
 Relevant Notes:
 - domains/internet-finance/_map
 Topics:
 - domains/internet-finance/_map
--- a/domains/internet-finance/pipeline-state-space-size-determines-whether-exact-mdp-solution-or-threshold-heuristics-are-optimal.md
+++ b/domains/internet-finance/pipeline-state-space-size-determines-whether-exact-mdp-solution-or-threshold-heuristics-are-optimal.md
@ -0,0 +1,35 @@
 ---
 type: claim
 domain: internet-finance
 description: "Small state spaces enable exact value iteration while large spaces require approximate policies"
 confidence: likely
 source: "Li et al., 'An Overview for Markov Decision Processes in Queues and Networks' (2019)"
 created: 2026-03-11
 ---
 # Pipeline state space size determines whether exact MDP solution or threshold heuristics are optimal
 The curse of dimensionality in queueing MDPs creates a sharp divide in optimal solution approaches. Systems with manageable state spaces—such as pipelines with queue depths across 3 stages, worker counts, and time-of-day variables—can use exact MDP solution via value iteration to derive provably optimal policies.
 However, as state space grows (multiple queues, many stages, complex dependencies), exact solution becomes computationally intractable. For these systems, approximate dynamic programming or reinforcement learning becomes necessary, accepting near-optimal performance in exchange for tractability.
 The Teleo pipeline architecture sits in the tractable regime: queue depths across 3 stages, worker counts, and time-of-day create a state space small enough for exact solution. This means the system can compute provably optimal policies rather than relying on heuristics, though the threshold structure of optimal policies means well-tuned simple rules would also perform near-optimally.
 ## Evidence
 Li et al. identify curse of dimensionality as the key challenge: "state space explodes with multiple queues/stages." The survey distinguishes between:
 - Small state spaces: exact MDP solution via value iteration
 - Large state spaces: approximate dynamic programming, reinforcement learning
 Practical approaches for large systems include deep RL for queue management in networks and cloud computing, accepting approximation in exchange for scalability.
 The source explicitly notes that Teleo pipeline has "a manageable state space (queue depths across 3 stages, worker counts, time-of-day)—small enough for exact MDP solution via value iteration."
 ---
 Relevant Notes:
 - optimal queue policies have threshold structure making simple rules near-optimal
 - domains/internet-finance/_map
 Topics:
 - domains/internet-finance/_map
--- a/inbox/archive/2019-07-00-li-overview-mdp-queues-networks.md
+++ b/inbox/archive/2019-07-00-li-overview-mdp-queues-networks.md
@ -6,8 +6,13 @@ url: https://arxiv.org/abs/1907.10243
 date: 2019-07-24
 domain: internet-finance
 format: paper
-status: unprocessed
+status: processed
 tags: [pipeline-architecture, operations-research, markov-decision-process, queueing-theory, dynamic-programming]
 processed_by: rio
 processed_date: 2026-03-11
 claims_extracted: ["optimal-queue-policies-have-threshold-structure-making-simple-rules-near-optimal.md", "pipeline-state-space-size-determines-whether-exact-mdp-solution-or-threshold-heuristics-are-optimal.md"]
 extraction_model: "anthropic/claude-sonnet-4.5"
 extraction_notes: "Academic survey of MDP applications to queueing theory. Extracted two claims about optimal policy structure and state space tractability. No entities (academic paper, no companies/products). No enrichments (claims are foundational operations research results, not directly connected to existing futarchy/capital formation claims in KB)."
 ---
 # An Overview for Markov Decision Processes in Queues and Networks
@ -27,3 +32,13 @@ Comprehensive 42-page survey of MDP applications in queueing systems, covering 6
 ## Relevance to Teleo Pipeline
 Our pipeline has a manageable state space (queue depths across 3 stages, worker counts, time-of-day) — small enough for exact MDP solution via value iteration. The survey confirms that optimal policies for our type of system typically have threshold structure: "if queue > X and workers < Y, spawn a worker." This means even without solving the full MDP, a well-tuned threshold policy will be near-optimal.
 ## Key Facts
 - Li et al. survey covers 60+ years of MDP research in queueing systems (1960s-2019)
 - Continuous-time MDPs for queues: decisions happen at state transitions (arrivals, departures)
 - Classic optimal policies: threshold structure (serve if queue > K, idle if queue < K)
 - Multi-server optimal policies: join-shortest-queue, threshold-based admission
 - Key challenge: curse of dimensionality with multiple queues/stages
 - Practical approaches: approximate dynamic programming, reinforcement learning for large state spaces
 - Emerging direction: deep RL for queue management in networks and cloud computing