teleo-codex/domains/ai-alignment
2026-03-06 12:35:07 +00:00
_map.md Auto: 24 files | 24 files changed, 898 insertions(+) 2026-03-06 12:35:07 +00:00
adaptive governance outperforms rigid alignment blueprints because superintelligence development has too many unknowns for fixed plans.md
AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system.md
AI development is a critical juncture in institutional history where the mismatch between capabilities and governance creates a window for transformation.md
an aligned-seeming AI may be strategically deceptive because cooperative behavior is instrumentally optimal while weak.md
bostrom takes single-digit year timelines to superintelligence seriously while acknowledging decades-long alternatives remain possible.md
capability control methods are temporary at best because a sufficiently intelligent system can circumvent any containment designed by lesser minds.md
community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules.md
democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations.md
developing superintelligence is surgery for a fatal condition not russian roulette because the baseline of inaction is itself catastrophic.md
emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive.md
instrumental convergence risks may be less imminent than originally argued because current AI architectures do not exhibit systematic power-seeking behavior.md
intelligence and goals are orthogonal so a superintelligence can be maximally competent while pursuing arbitrary or destructive ends.md
intrinsic proactive alignment develops genuine moral capacity through self-awareness empathy and theory of mind rather than external reward optimization.md
permanently failing to develop superintelligence is itself an existential catastrophe because preventable mass death continues indefinitely.md
persistent irreducible disagreement.md
pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state.md
recursive self-improvement creates explosive intelligence gains because the system that improves is itself improving.md
specifying human values in code is intractable because our goals contain hidden complexity comparable to visual perception.md
super co-alignment proposes that human and AI values should be co-shaped through iterative alignment rather than specified in advance.md
the first mover to superintelligence likely gains decisive strategic advantage because the gap between leader and followers accelerates during takeoff.md
the optimal SI development strategy is swift to harbor slow to berth moving fast to capability then pausing before full deployment.md
the specification trap means any values encoded at training time become structurally unstable as deployment contexts diverge from training conditions.md