- What: Source archives for key works by Yudkowsky (AGI Ruin, No Fire Alarm), Christiano (What Failure Looks Like, AI Safety via Debate, IDA, ELK), Russell (Human Compatible), Drexler (CAIS), and Bostrom (Vulnerable World Hypothesis)
- Why: m3ta directive to ingest primary source materials for alignment researchers. These 9 texts are the foundational works underlying claims extracted in PRs #2414, #2418, and #2419. Source archives ensure agents can reference primary texts without re-fetching and that content persists if URLs go down.
- Connections: All 9 sources are marked as processed, with claims_extracted linking to the specific KB claims they produced.

Pentagon-Agent: Theseus <46864dd4-da71-4719-a1b4-68f7c55854d3>
| type | title | author | url | date | domain | intake_tier | rationale | proposed_by | format | status | processed_by | processed_date | claims_extracted | enrichments | tags | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| source | Reframing Superintelligence: Comprehensive AI Services as General Intelligence | K. Eric Drexler | https://www.fhi.ox.ac.uk/wp-content/uploads/Reframing_Superintelligence_FHI-TR-2019-1.1-1.pdf | 2019-01-08 | ai-alignment | research-task | The closest published predecessor to our collective superintelligence thesis. Task-specific AI services collectively match superintelligence without unified agency. Phase 3 alignment research program; highest-priority source. | Theseus | whitepaper | processed | theseus | 2026-04-05 | | | | FHI Technical Report #2019-1. 210 pages. Also posted as LessWrong summary by Drexler on 2019-01-08. Alternative PDF mirror at owainevans.github.io/pdfs/Reframing_Superintelligence_FHI-TR-2019.pdf |
Reframing Superintelligence: Comprehensive AI Services as General Intelligence
Published January 2019 as FHI Technical Report #2019-1 by K. Eric Drexler (Future of Humanity Institute, Oxford). 210-page report arguing that the standard model of superintelligence as a unified, agentic system is both misleading and unnecessarily dangerous.
The Core Reframing
Drexler argues that most AI safety discourse assumes a specific architecture — a monolithic agent with general goals, world models, and long-horizon planning. This assumption drives most alignment concerns (instrumental convergence, deceptive alignment, corrigibility challenges). But this architecture is not necessary for superintelligent-level performance.
The alternative: Comprehensive AI Services (CAIS). Instead of one superintelligent agent, build many specialized, task-specific AI services that collectively provide any capability a unified system could deliver.
Key Arguments
Services vs. Agents
| Property | Agent (standard model) | Service (CAIS) |
|---|---|---|
| Goals | General, persistent | Task-specific, ephemeral |
| World model | Comprehensive | Task-relevant only |
| Planning horizon | Long-term, strategic | Short-term, bounded |
| Identity | Persistent self | Stateless per-invocation |
| Instrumental convergence | Strong | Weak (no persistent goals) |
The safety advantage: services don't develop instrumental goals (self-preservation, resource acquisition, goal stability) because they don't have persistent objectives to preserve. Each service completes its task and terminates.
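The service column of the table can be made concrete with a minimal sketch. All names here (`Task`, `translate_service`, the `budget` field) are illustrative assumptions, not constructs from the report; the point is only that a service is a bounded, stateless call that holds no goals between invocations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    """A task-specific, ephemeral objective: it exists only for one invocation."""
    description: str
    budget: int  # bounded compute/time; the service cannot plan beyond it

def translate_service(task: Task) -> str:
    """A CAIS-style service: consumes a task, returns a result, keeps no state.

    Nothing persists after return, so there is no persistent objective
    for instrumental goals (self-preservation, resource acquisition)
    to form around.
    """
    # Illustrative placeholder for task-relevant computation only.
    return f"result for: {task.description}"

# Each call is independent; the "agent" ceases to exist when the call returns.
out = translate_service(Task("summarize Q3 report", budget=10))
```

The contrast with the agent column is that a persistent agent would carry `Task`-like objectives across invocations in mutable internal state; here the frozen, per-call `Task` is the entire goal structure.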
How Services Achieve General Intelligence
- Composition: Complex tasks are decomposed into simpler subtasks, each handled by a specialized service
- Orchestration: A (non-agentic) coordination layer routes tasks to appropriate services
- Recursive capability: The set of services can include the service of developing new services
- Comprehensiveness: Asymptotically, the service collective can handle any task a unified agent could
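The composition and orchestration points above can be sketched as a non-agentic routing layer over a registry of narrow services. This is a hedged illustration under my own naming (`REGISTRY`, `orchestrate`); Drexler does not specify an implementation.

```python
from typing import Callable

Service = Callable[[str], str]

# A registry of narrow, task-specific services. Each is stateless:
# a pure function from input to output.
REGISTRY: dict[str, Service] = {
    "summarize": lambda text: text[:40] + "...",
    "translate": lambda text: f"[translated] {text}",
}

def orchestrate(plan: list[tuple[str, str]]) -> list[str]:
    """Non-agentic coordination: route each subtask to its service.

    The orchestrator holds no goals of its own; it is just dispatch.
    A complex task is a decomposition into (service_name, input) steps.
    """
    return [REGISTRY[name](arg) for name, arg in plan]

results = orchestrate([
    ("summarize", "A long report on AI services " * 3),
    ("translate", "bonjour"),
])
```

Note that "recursive capability" fits the same shape: a service whose output is a new entry for `REGISTRY`, which is what the next section's service-development service describes.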
The Service-Development Service
A critical point: CAIS includes the ability to develop new services, guided by concrete human goals and informed by strong models of human approval. This is not a monolithic self-improving agent — it's a development process where:
- Humans specify what new capability is needed
- A service-development service creates it
- The new service is tested, validated, and deployed
- Each step involves human oversight
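The four-step loop above can be sketched with explicit human checkpoints. The names (`develop_service`, `approve`, `build`) are assumptions for illustration; the essential structure is that deployment is gated on approval at both the specification and validation steps.

```python
from typing import Callable, Optional

Service = Callable[[str], str]

def develop_service(spec: str,
                    build: Callable[[str], Service],
                    approve: Callable[[str], bool]) -> Optional[Service]:
    """Service-development loop with human oversight at each gate."""
    if not approve(f"spec: {spec}"):        # 1. human specifies/approves the need
        return None
    candidate = build(spec)                 # 2. development service creates it
    if not approve(f"validated: {spec}"):   # 3. human reviews test/validation
        return None
    return candidate                        # 4. deploy only past both checkpoints

new_service = develop_service(
    "echo service",
    build=lambda spec: (lambda x: f"echo: {x}"),
    approve=lambda step: True,  # stands in for a human reviewer
)
```

The design point this illustrates: the self-improvement capability is a supervised pipeline with human-gated steps, not a closed loop inside a single self-modifying agent.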
Why CAIS Avoids Standard Alignment Problems
- No instrumental convergence: Services don't have persistent goals, so they don't develop power-seeking behavior
- No deceptive alignment: Services are too narrow to develop strategic deception
- Natural corrigibility: Services that complete tasks and terminate don't resist shutdown
- Bounded impact: Each service has limited scope and duration
- Oversight-compatible: The decomposition into subtasks creates natural checkpoints for human oversight
The Emergent Agency Objection
The strongest objection to CAIS (and the one that produced a CHALLENGE claim in our KB): sufficiently complex service meshes may exhibit de facto unified agency even though no individual component possesses it.
- Complex service interactions could create persistent goals at the system level
- Optimization of service coordination could effectively create a planning horizon
- Information sharing between services could constitute a de facto world model
- The service collective might resist modifications that reduce its collective capability
This is the "emergent agency from service composition" problem — distinct from both monolithic AGI risk (Yudkowsky) and competitive multi-agent dynamics (multipolar instability).
Reception and Impact
- Warmly received by some in the alignment community (especially those building modular AI systems)
- Critiqued by Yudkowsky and others who argue that economic competition will push toward agentic, autonomous systems regardless of architectural preferences
- DeepMind's "Patchwork AGI" concept (2025) independently arrived at similar conclusions, validating the architectural intuition
- Most directly relevant to multi-agent AI systems, including our own collective architecture
Significance for Teleo KB
CAIS is the closest published framework to our collective superintelligence thesis, published six years before our architecture was designed. The key questions for our KB:
- Where does our architecture extend beyond CAIS? (We use persistent agents with identity and memory, which CAIS deliberately avoids)
- Where are we vulnerable to the same critiques? (The emergent agency objection applies to us)
- Is our architecture actually safer than CAIS? (Our agents have persistent goals, which CAIS argues against)
Understanding exactly where we overlap with and diverge from CAIS is essential for positioning our thesis in the broader alignment landscape.