---
description: Bostrom's optimal timing framework finds that for most parameter settings the best strategy accelerates to AGI capability then introduces a brief pause before deployment
type: framework
domain: ai-alignment
created: 2026-02-17
source: "Bostrom, Optimal Timing for Superintelligence (2025 working paper)"
confidence: experimental
---
Bostrom's "swift to harbor, slow to berth" metaphor captures a nuanced optimal timing strategy that resists both the "full speed ahead" and "pause everything" camps. For many parameter settings in his mathematical models, the optimal approach involves moving quickly toward AGI capability -- reaching the harbor -- then introducing a deliberate pause before full deployment and integration -- berthing slowly. The paper examines this strategy from a person-affecting ethical stance, weighing expected life-years gained and lost.

The logic is that the capability phase and the deployment phase have different risk profiles. During capability development, the primary risk is competitive dynamics -- racing creates pressure to cut safety corners. But the cost of delay during this phase is massive ongoing mortality. Once capability is achieved (the harbor is reached), the calculus shifts. The system exists but has not been fully deployed. At this point, the marginal cost of delay drops dramatically (the immediate mortality continues but the end is in sight), while the marginal benefit of additional safety work increases (alignment verification becomes possible against an actual system rather than theoretical models). A brief pause for verification and alignment refinement has high expected value.
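The phase-dependent cost-benefit argument can be sketched as a toy expected-value model. Everything below is a hypothetical illustration -- the function names, parameter values, and functional forms are assumptions chosen to make the structure visible, not figures from Bostrom's paper:

```python
# Toy model of timing strategies, scored in arbitrary life-year units.
# All constants and decay rates are illustrative assumptions.

def expected_life_years(dev_years, pause_years, pre_capability_safety=0.0):
    """Score a timing strategy: delay cost plus deployment risk."""
    MORTALITY_PER_YEAR = 1.0   # ongoing life-years lost while AGI is absent
    FUTURE_VALUE = 100.0       # life-years at stake in deployment going wrong
    # Assumption: a year of verification against the actual system
    # (the pause) cuts risk much faster than a year of theoretical
    # safety work done before capability exists.
    risk = 0.5 * (0.9 ** pre_capability_safety) * (0.5 ** pause_years)
    delay_cost = MORTALITY_PER_YEAR * (dev_years + pause_years)
    return -delay_cost - risk * FUTURE_VALUE

strategies = {
    "full speed ahead": expected_life_years(dev_years=5, pause_years=0),
    "pause everything": expected_life_years(dev_years=15, pause_years=0,
                                            pre_capability_safety=10),
    "swift then pause": expected_life_years(dev_years=5, pause_years=2),
}
for name, score in strategies.items():
    print(f"{name}: {score:.1f}")
```

With these arbitrary numbers, "swift then pause" scores highest: the pause adds little delay cost but, because verification now runs against a real system, it cuts deployment risk sharply. Shifting the parameters can flip the ranking, which mirrors Bostrom's caveat that the result holds for many, not all, parameter settings.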

This framework has direct implications for the LivingIP architecture. If [[safe AI development requires building alignment mechanisms before scaling capability]], Bostrom's timing model suggests a refinement: build alignment mechanisms *in parallel* with capability development, then verify them against the actual system during the harbor-to-berth pause. The collective intelligence approach -- where [[the alignment problem dissolves when human values are continuously woven into the system rather than specified in advance]] -- is naturally compatible with this strategy because continuous value weaving can operate during both phases, accelerating during the pause.

The framework also implicitly acknowledges that perfect alignment before any capability development is both impossible and unnecessary. What matters is having sufficient alignment infrastructure ready for intensive deployment during the pause window. This is pragmatism, not recklessness.

---

Relevant Notes:
- [[developing superintelligence is surgery for a fatal condition not russian roulette because the baseline of inaction is itself catastrophic]] -- the surgery analogy motivates the "swift" half; the pause motivates the "slow" half
- [[safe AI development requires building alignment mechanisms before scaling capability]] -- Bostrom's framework refines this: build in parallel, verify during the pause
- [[the alignment problem dissolves when human values are continuously woven into the system rather than specified in advance]] -- continuous value weaving is compatible with swift-to-harbor because it operates during both phases
- [[recursive self-improvement creates explosive intelligence gains because the system that improves is itself improving]] -- the pause window may be narrow if recursive improvement is fast, creating practical challenges for berthing slowly
- [[adaptive governance outperforms rigid alignment blueprints because superintelligence development has too many unknowns for fixed plans]] -- the harbor-to-berth pause enables adaptive governance rather than requiring predetermined solutions
- [[differential technological development means retarding dangerous technologies while accelerating beneficial ones especially those that reduce existential risk]] -- source-faithful treatment of Bostrom's strategic principle that the swift-to-harbor strategy operationalizes
- [[the preferred order of technology arrival matters more than absolute timing because superintelligence before nanotechnology reduces total risk]] -- source-faithful treatment of Bostrom's argument that sequencing matters more than speed, informing the pause logic
- [[the more uncertain the environment the more proximate the objective must be because you cannot plan a detailed path through fog]] -- "slow to berth" *is* Rumelt's proximate-objectives-under-uncertainty principle: once the harbor is reached, the extreme uncertainty of full deployment demands the most proximate possible objectives and the shortest planning horizons
- [[the create-destroy discipline forces genuine strategic alternatives by deliberately attacking your initial insight before committing]] -- the harbor-to-berth pause is a mandated create-destroy cycle: rather than committing directly to deployment, the pause forces deliberate reassessment and testing of the alignment hypothesis before finalizing

Topics:
- [[livingip overview]]
- [[superintelligence dynamics]]