teleo-codex/foundations/collective-intelligence/no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it.md

| description | type | domain | created | source | confidence |
| --- | --- | --- | --- | --- | --- |
| Current alignment approaches are all single-model focused while the hardest problems preference diversity scalable oversight and value evolution are inherently collective | claim | collective-intelligence | 2026-02-17 | Survey of alignment research landscape 2025-2026 | likely |

no research group is building alignment through collective intelligence infrastructure despite the field converging on problems that require it

The most striking gap in the alignment landscape as of 2025-2026 is that virtually no one is building alignment through collective intelligence infrastructure. The closest attempts are partial. Since democratic alignment assemblies produce constitutions as effective as expert-designed ones while better representing diverse populations, the Collective Intelligence Project (CIP) has demonstrated that democratic input works mechanically -- but this remains one-shot constitution-setting, not continuous architecture. Since community-centred norm elicitation surfaces alignment targets materially different from developer-specified rules, STELA has shown that inclusive deliberation produces different outputs -- but it does not build the infrastructure for ongoing participation. Polis maps consensus through statement submission and voting. Some multi-agent debate frameworks exist under the scalable oversight umbrella. The Cooperative AI Foundation studies multi-agent coordination. But none of these constitutes a distributed architecture in which alignment emerges from collective participation.

What does not exist: a system where contributor diversity structurally prevents value capture; an implementation of continuous value-weaving at scale; infrastructure for collective oversight of superhuman AI components; an architecture where alignment is a property of the coordination protocol rather than a property trained into individual models. Since universal alignment is mathematically impossible because Arrow's impossibility theorem applies to aggregating diverse human preferences into a single coherent objective, collective infrastructure -- which preserves diversity rather than aggregating it -- is the only viable path.
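The aggregation problem referenced above can be made concrete with a Condorcet cycle, the classic small case behind Arrow-style results. The sketch below is an illustration, not a proof of Arrow's theorem: the three ballots and the helper names (`majority_prefers`, `pairwise_majorities`, `has_condorcet_winner`) are hypothetical and chosen only to show that pairwise majority voting over diverse rankings can fail to produce a coherent collective ordering.

```python
from itertools import combinations

# Hypothetical ballots: each voter ranks three options, best first.
ballots = [
    ("A", "B", "C"),
    ("B", "C", "A"),
    ("C", "A", "B"),
]

def majority_prefers(x, y, ballots):
    """Return True if a strict majority of ballots rank x above y."""
    wins = sum(1 for b in ballots if b.index(x) < b.index(y))
    return wins > len(ballots) / 2

def pairwise_majorities(ballots):
    """Map each pair of options to the winner of its head-to-head vote (None on a tie)."""
    options = sorted(ballots[0])
    result = {}
    for x, y in combinations(options, 2):
        if majority_prefers(x, y, ballots):
            result[(x, y)] = x
        elif majority_prefers(y, x, ballots):
            result[(x, y)] = y
        else:
            result[(x, y)] = None
    return result

def has_condorcet_winner(ballots):
    """Return True if some option beats every other option head-to-head."""
    options = sorted(ballots[0])
    return any(
        all(majority_prefers(x, y, ballots) for y in options if y != x)
        for x in options
    )

majorities = pairwise_majorities(ballots)
# A beats B, B beats C, yet C beats A: the collective preference is cyclic.
print(majorities)
print("Condorcet winner exists:", has_condorcet_winner(ballots))
```

Each individual ranking here is perfectly consistent; the incoherence appears only at the point of aggregation, which is the step a diversity-preserving architecture would try to avoid.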

This gap is remarkable because the field's own findings point toward collective approaches. Since RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values, diverse preference representation is needed. Since scalable oversight degrades rapidly as capability gaps grow with debate achieving only 50 percent success at moderate gaps, distributed oversight is needed. Since the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it, structural alignment is needed to eliminate the tax.
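The single-reward-function point can be illustrated with a toy calculation. The sketch assumes a Bradley-Terry pairwise preference model, which is commonly used for reward modelling, and a hypothetical population split into two groups with opposite, fully consistent preferences between responses X and Y. The 70/30 split and the function names are assumptions made for illustration only; they are not taken from the cited claims.

```python
import math

def fit_single_reward_gap(share_preferring_x):
    """Maximum-likelihood r_X - r_Y for a two-item Bradley-Terry model
    fit to pooled comparisons in which this share of judgments prefer X."""
    p = share_preferring_x
    return math.log(p / (1 - p))

def predicted_preference(gap):
    """Bradley-Terry probability that X is preferred, given the reward gap."""
    return 1 / (1 + math.exp(-gap))

# Hypothetical population: 70% always prefer X, 30% always prefer Y.
share_group_x = 0.7

gap = fit_single_reward_gap(share_group_x)
pooled_prediction = predicted_preference(gap)

# A single scalar reward ranks X above Y for everyone, so every member
# of the minority group is mis-ranked, however much data is collected.
single_reward_ranks_x_first = gap > 0
share_misranked = (1 - share_group_x) if single_reward_ranks_x_first else share_group_x

print(f"fitted reward gap: {gap:.3f}")
print(f"pooled P(X preferred): {pooled_prediction:.2f}")
print(f"share of people mis-ranked: {share_misranked:.2f}")
```

The fitted model reproduces the pooled preference rate while describing no individual in either group, which is one simple way to see why a single reward function cannot represent this kind of preference diversity.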

The alignment field has converged on problems it cannot solve within its current paradigm (single-model alignment), while the alternative paradigm (collective alignment through distributed architecture) has barely been explored. This is the opening for the TeleoHumanity thesis -- not as philosophical speculation but as practical infrastructure for problems the alignment community has identified but cannot solve within its current framework.


Relevant Notes:

Topics: