--- type: claim domain: ai-alignment secondary_domains: [collective-intelligence] description: "ML's core function of generalizing over diversity creates structural bias against dataset outliers where vulnerable populations concentrate" confidence: experimental source: "UK AI4CI Research Network national strategy (2024)" created: 2024-11-01 --- # Machine learning pattern extraction systematically erases outliers where vulnerable populations concentrate Machine learning fundamentally "extracts patterns that generalise over diversity in a data set" in ways that "fail to capture, respect or represent features of dataset outliers." This is not a bug or training artifact—it is the core function of ML systems. The UK AI4CI national research strategy identifies this as a structural barrier to reaching "intersectionally disadvantaged" populations, who by definition concentrate in the statistical tails that pattern-extraction optimizes away. This creates a fundamental tension for AI-enhanced collective intelligence: the same systems designed to aggregate distributed knowledge actively homogenize that knowledge by design. ML's optimization target (generalization) is structurally opposed to diversity preservation. ## Evidence The UK AI for Collective Intelligence Research Network's national strategy explicitly frames this as a core challenge: "AI must reach intersectionally disadvantaged populations, but the technical foundation (ML pattern extraction) systematically fails at the margins where those populations exist." The strategy identifies this not as a training problem but as a structural property of how ML generalizes—the algorithm's success metric (fitting a model that generalizes across the dataset) is mechanically opposed to preserving the variation that characterizes outlier populations. ## Implications This suggests that AI-enhanced collective intelligence cannot simply apply standard ML architectures to human knowledge aggregation. The infrastructure must actively counteract ML's homogenizing tendency through: - Federated learning that preserves local variation - Explicit outlier protection in training objectives - Governance mechanisms that weight minority perspectives The AI4CI strategy proposes these as requirements, not optimizations. ## Tensions This claim assumes that pattern-extraction and outlier-preservation are fundamentally opposed. Alternative architectures (e.g., mixture-of-experts models, adaptive weighting schemes) might partially decouple these objectives, though the strategy does not claim they fully resolve the tension. --- Relevant Notes: - [[collective intelligence requires diversity as a structural precondition not a moral preference]] - [[RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values]] - [[partial connectivity produces better collective intelligence than full connectivity on complex problems because it preserves diversity]] Topics: - [[domains/ai-alignment/_map]] - [[foundations/collective-intelligence/_map]]