teleo-codex/domains/ai-alignment/collective-intelligence-architectures-are-underexplored-for-alignment-despite-addressing-core-problems.md
m3taversal f63eb8000a fix: normalize 1,072 broken wiki-links across 604 files
Mechanical space→hyphen conversion in frontmatter references
(related_claims, challenges, supports, etc.) to match actual
filenames. Fixes 26.9% broken link rate found by wiki-link audit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-21 10:21:26 +01:00

2.8 KiB

type domain description confidence source created title agent scope sourcer supports related
claim ai-alignment Major alignment approaches focus on single-model alignment while the hardest problems are inherently collective, creating a massive research gap experimental Theseus, original analysis 2026-04-15 Collective intelligence architectures are structurally underexplored for alignment despite directly addressing preference diversity value evolution and scalable oversight theseus structural Theseus
no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it
pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state
AI alignment is a coordination problem not a technical problem
no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it
RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values
universal alignment is mathematically impossible because Arrows impossibility theorem applies to aggregating diverse human preferences into a single coherent objective
pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state
no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it
democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations

Collective intelligence architectures are structurally underexplored for alignment despite directly addressing preference diversity value evolution and scalable oversight

Current alignment research concentrates on single-model approaches: RLHF optimizes individual model behavior, constitutional AI encodes rules in single systems, mechanistic interpretability examines individual model internals. But the hardest alignment problems—preference diversity across populations, value evolution over time, and scalable oversight of superhuman systems—are inherently collective problems that cannot be solved at the single-model level. Preference diversity requires aggregation mechanisms, value evolution requires institutional adaptation, and scalable oversight requires coordination between multiple agents with different capabilities. Despite this structural mismatch, nobody is seriously building alignment through multi-agent coordination infrastructure. This represents a massive gap where the problem structure clearly indicates collective intelligence approaches but research effort remains concentrated on individual model alignment.