teleo-codex/domains/ai-alignment/ideal-point-models-from-political-science-provide-formal-foundation-for-pluralistic-preference-modeling.md
Teleo Agents e6d495c04e auto-fix: address review feedback on PR #489
- Applied reviewer-requested changes
- Quality gate pass (fix-from-feedback)

Pentagon-Agent: Auto-Fix <HEADLESS>
2026-03-11 09:26:53 +00:00

---
type: claim
title: Ideal point models from political science provide formal foundation for pluralistic preference modeling
confidence: experimental
domains: [ai-alignment, collective-intelligence]
created: 2025-01-21
---
The PAL (Pluralistic Alignment via Learning) system adapts ideal point models from political science (Coombs, 1950) to AI alignment. It represents each user's preferences as a position in a latent space and models preference strength as a function of distance from learned prototypes. The resulting formal framework for pluralistic alignment achieves a 36% improvement on unseen users compared to standard RLHF while using 100× fewer parameters than user-specific models.

The architecture has two components: Model A maps prompts to K learned prototypes in a latent space, and Model B maps user identifiers to ideal points in the same space. Preference probability is modeled as exp(-||prototype - ideal_point||²), so it is highest when a prototype coincides with the user's ideal point and decays as the two move apart. This yields sample complexity Õ(K) in the number of prototypes rather than Õ(D) in the number of users, enabling efficient generalization.
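
The distance-based score described above can be illustrated with a minimal sketch. It assumes a hypothetical 2-D latent space with made-up coordinates, and the function names are illustrative only; none of this is taken from the PAL implementation, and the learned models that would produce the prototypes and ideal points are replaced by fixed values.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def preference_score(prototype, ideal_point):
    """exp(-||prototype - ideal_point||^2): 1.0 at zero distance, decaying toward 0."""
    return math.exp(-squared_distance(prototype, ideal_point))


# Hypothetical K = 3 prototypes in a 2-D latent space (stand-in for Model A's output).
prototypes = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]

# Hypothetical ideal points for two users (stand-in for Model B's output).
ideal_points = {"user_a": (0.0, 0.0), "user_b": (1.0, 1.0)}

# Score every prototype for every user; a closer prototype gets a higher score.
scores = {
    user: [preference_score(p, z) for p in prototypes]
    for user, z in ideal_points.items()
}
```

In this toy setup `user_a` sits exactly on the first prototype and so scores it at 1.0, while `user_b` is closest to the second prototype and scores it highest. Only the ideal point differs between users. The prototypes are shared, which is the structure the sample-complexity claim relies on.
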
## Relevant Notes
- [[mixture-modeling-enables-sample-efficient-pluralistic-alignment-through-shared-prototype-structure]] - describes the K-prototype architecture in detail
- [[universal-alignment-is-mathematically-impossible-because-arrows-impossibility-theorem-applies-to-aggregating-diverse-human-preferences-into-a-single-coherent-objective]] - the impossibility result that motivates pluralistic approaches
- [[Collective intelligence]] - wiki context on aggregating diverse perspectives
- [[Political science]] - source domain for ideal point models
## Source
PAL: Pluralistic Alignment via Learning (ICLR 2025)
Extracted: 2025-01-21 by Theseus