## Summary

Comprehensive audit of all 86 foundation claims across 4 subdomains.

**Changes:**

- 7 claims moved (3 → domains/ai-alignment/, 3 → core/teleohumanity/, 1 → domains/health/)
- 4 claims deleted (1 duplicate, 3 condensed into stronger claims)
- 3 condensations: cognitive limits 3→2, Christensen 4→2
- 10 confidence demotions (proven→likely for interpretive framings)
- 23 type fixes (framework/insight/pattern → claim per schema)
- 1 centaur rewrite (unconditional → conditional on role complementarity)
- All broken wiki links fixed across repo

**Review:** All 4 domain agents approved (Rio, Clay, Vida, Theseus). Pentagon-Agent: Leo <76FB9BCA-CC16-4479-B3E5-25A3769B3D7E>
| description | type | domain | created | source | confidence |
|---|---|---|---|---|---|
| Current alignment approaches are all single-model focused while the hardest problems preference diversity scalable oversight and value evolution are inherently collective | claim | ai-alignment | 2026-02-17 | Survey of alignment research landscape 2025-2026 | likely |
no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it
The most striking gap in the alignment landscape as of 2025-2026: virtually no one is building alignment through collective intelligence infrastructure. The closest attempts are partial. CIP has demonstrated that democratic input works mechanically -- democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations -- but this remains one-shot constitution-setting, not continuous architecture. STELA has shown that inclusive deliberation produces different outputs -- community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules -- but it does not build the infrastructure for ongoing participation. Polis does consensus-mapping through statement submission and voting. Some multi-agent debate frameworks exist under the scalable oversight umbrella. The Cooperative AI Foundation studies multi-agent coordination. But none of these constitute a distributed architecture where alignment emerges from collective participation.
What does not exist: no system where contributor diversity structurally prevents value capture; no implementation of continuous value-weaving at scale; no infrastructure for collective oversight of superhuman AI components; no architecture where alignment is a property of the coordination protocol rather than a property trained into individual models. And since universal alignment is mathematically impossible -- Arrow's impossibility theorem applies to aggregating diverse human preferences into a single coherent objective -- collective infrastructure, which preserves diversity rather than aggregating it, is the only viable path.
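The aggregation point above can be made concrete with a classic Condorcet cycle. A minimal sketch (voters and alternatives are invented for illustration, not taken from the note): three raters with cyclic preferences over three options show why majority aggregation cannot produce a single coherent ranking, which is the intuition behind invoking Arrow's theorem against training one objective for everyone.

```python
from itertools import permutations

# Three hypothetical voters, each with a strict ranking over options a, b, c.
voters = [
    ["a", "b", "c"],  # voter 1: a > b > c
    ["b", "c", "a"],  # voter 2: b > c > a
    ["c", "a", "b"],  # voter 3: c > a > b
]

def majority_prefers(x, y):
    """True if a strict majority ranks x above y."""
    wins = sum(1 for ranking in voters if ranking.index(x) < ranking.index(y))
    return wins > len(voters) / 2

# Pairwise majorities form a cycle: a beats b, b beats c, yet c beats a.
assert majority_prefers("a", "b")
assert majority_prefers("b", "c")
assert majority_prefers("c", "a")

# So no transitive collective ranking agrees with every pairwise majority:
coherent = [
    order for order in permutations(["a", "b", "c"])
    if all(majority_prefers(order[i], order[j])
           for i in range(3) for j in range(i + 1, 3))
]
print(coherent)  # [] -- no single objective respects all majority preferences
```

Preserving the three rankings side by side keeps all the information; collapsing them into one ranking necessarily discards some of it, which is the note's case for infrastructure that preserves diversity rather than aggregating it.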
This gap is remarkable because the field's own findings point toward collective approaches. Since RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values, diverse preference representation is needed. Since scalable oversight degrades rapidly as capability gaps grow with debate achieving only 50 percent success at moderate gaps, distributed oversight is needed. Since the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it, structural alignment is needed to eliminate the tax.
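The preference-diversity failure mode can also be sketched numerically. The example below is illustrative (the 60/40 split is invented): under the Bradley-Terry preference model that underlies RLHF and DPO reward fitting, a single shared reward predicts one preference rate for the whole population, so a minority group whose values run the other way is simply averaged out.

```python
import math

# Hypothetical split: group 1 (60% of raters) always prefers response "a",
# group 2 (40%) always prefers response "b" -- opposed, context-dependent values.
p_group1, p_group2 = 0.6, 0.4

def bt_prob(r_a, r_b):
    """Bradley-Terry probability that "a" beats "b" under one shared reward."""
    return 1 / (1 + math.exp(r_b - r_a))

# The likelihood-maximising reward gap on the pooled comparisons just
# matches the pooled preference rate; it cannot express
# "a for group 1, b for group 2".
gap = math.log(p_group1 / p_group2)  # optimum for the pooled data
pooled = bt_prob(gap, 0.0)
print(round(pooled, 2))  # 0.6 -- group 2's preference is averaged away
```

A single scalar reward gap can only shift everyone's predicted preference in the same direction, which is the structural reason the note gives for needing diverse preference representation rather than a better-tuned single reward.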
The alignment field has converged on a problem it cannot solve with its current paradigm (single-model alignment), and the alternative paradigm (collective alignment through distributed architecture) has barely been explored. This is the opening for the TeleoHumanity thesis -- not as philosophical speculation but as practical infrastructure that addresses problems the alignment community has identified but cannot solve within its current framework.
Relevant Notes:
- AI alignment is a coordination problem not a technical problem -- the gap in collective alignment validates the coordination framing
- collective superintelligence is the alternative to monolithic AI controlled by a few -- the only project proposing the infrastructure nobody else is building
- RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values -- collective approaches address this specific failure
- the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it -- structural alignment eliminates the tax
- democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations -- the closest existing work, but still one-shot not continuous
- community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules -- demonstrates what inclusive infrastructure reveals, but does not build the infrastructure
- universal alignment is mathematically impossible because Arrow's impossibility theorem applies to aggregating diverse human preferences into a single coherent objective -- the impossibility of aggregation makes collective infrastructure the only viable path
Topics: