reweave: connect 31 orphan claims via vector similarity
Threshold: 0.7, Haiku classification, 33 files modified. Pentagon-Agent: Epimetheus <0144398e-4ed3-4fe2-95a3-3d72e1abf887>
This commit is contained in:
parent dfd05342d3
commit 427ab732f0

33 changed files with 213 additions and 0 deletions
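The reweave pass described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the repo's actual tooling: `embed` and `classify_relation` are assumed helper callables (an embedding model and a Haiku-backed relation classifier), and only the 0.7 cosine threshold and the `target|relation|date` edge format visible in the diffs below come from the commit itself.

```python
# Hypothetical sketch of a reweave pass: embed orphan claims, link each to
# corpus claims above a cosine-similarity threshold, and serialize edges as
# "target|relation|date" strings like those added in this commit.
from datetime import date

def cosine(a, b):
    # Plain cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def reweave(orphans, corpus, embed, classify_relation, threshold=0.7):
    """orphans/corpus: {title: text}. embed: text -> vector.
    classify_relation: (orphan_text, target_text) -> e.g. 'related' or 'supports'.
    Returns {orphan_title: ["target|relation|YYYY-MM-DD", ...]}."""
    today = date.today().isoformat()
    corpus_vecs = {t: embed(txt) for t, txt in corpus.items()}
    edges = {}
    for title, text in orphans.items():
        vec = embed(text)
        for target, tvec in corpus_vecs.items():
            if target != title and cosine(vec, tvec) >= threshold:
                rel = classify_relation(text, corpus[target])
                edges.setdefault(title, []).append(f"{target}|{rel}|{today}")
    return edges
```

The edge strings can then be appended under a `reweave_edges:` key in each orphan's frontmatter, as the diffs below show.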
@@ -1,10 +1,15 @@
 ---
+
 description: Fixed-goal AI must get values right before deployment with no mechanism for correction -- collective superintelligence keeps humans in the loop so values evolve with understanding
 type: claim
 domain: teleohumanity
 created: 2026-02-16
 confidence: experimental
 source: "TeleoHumanity Manifesto, Chapter 8"
+related:
+- "transparent algorithmic governance where AI response rules are public and challengeable through the same epistemic process as the knowledge base is a structurally novel alignment approach"
+reweave_edges:
+- "transparent algorithmic governance where AI response rules are public and challengeable through the same epistemic process as the knowledge base is a structurally novel alignment approach|related|2026-03-28"
 ---
 
 # the alignment problem dissolves when human values are continuously woven into the system rather than specified in advance
@@ -1,10 +1,15 @@
 ---
+
 type: claim
 domain: ai-alignment
 description: "Aquino-Michaels's three-component architecture — symbolic reasoner (GPT-5.4), computational solver (Claude Opus 4.6), and orchestrator (Claude Opus 4.6) — solved both odd and even cases of Knuth's problem by transferring artifacts between specialized agents"
 confidence: experimental
 source: "Aquino-Michaels 2026, 'Completing Claude's Cycles' (github.com/no-way-labs/residue)"
 created: 2026-03-07
+supports:
+- "tools and artifacts transfer between AI agents and evolve in the process because Agent O improved Agent Cs solver by combining it with its own structural knowledge creating a hybrid better than either original"
+reweave_edges:
+- "tools and artifacts transfer between AI agents and evolve in the process because Agent O improved Agent Cs solver by combining it with its own structural knowledge creating a hybrid better than either original|supports|2026-03-28"
 ---
 
 # AI agent orchestration that routes data and tools between specialized models outperforms both single-model and human-coached approaches because the orchestrator contributes coordination not direction
@@ -1,10 +1,21 @@
 ---
+
+
+
 description: Getting AI right requires simultaneous alignment across competing companies, nations, and disciplines at the speed of AI development -- no existing institution can coordinate this
 type: claim
 domain: ai-alignment
 created: 2026-02-16
 confidence: likely
 source: "TeleoHumanity Manifesto, Chapter 5"
+related:
+- "AI investment concentration where 58 percent of funding flows to megarounds and two companies capture 14 percent of all global venture capital creates a structural oligopoly that alignment governance must account for"
+- "Anthropic"
+- "Dario Amodei"
+reweave_edges:
+- "AI investment concentration where 58 percent of funding flows to megarounds and two companies capture 14 percent of all global venture capital creates a structural oligopoly that alignment governance must account for|related|2026-03-28"
+- "Anthropic|related|2026-03-28"
+- "Dario Amodei|related|2026-03-28"
 ---
 
 # AI alignment is a coordination problem not a technical problem
@@ -1,4 +1,5 @@
 ---
+
 type: claim
 domain: ai-alignment
 description: "National-scale CI infrastructure must enable distributed learning without centralizing sensitive data"
@@ -6,6 +7,10 @@ confidence: experimental
 source: "UK AI for CI Research Network, Artificial Intelligence for Collective Intelligence: A National-Scale Research Strategy (2024)"
 created: 2026-03-11
 secondary_domains: [collective-intelligence, critical-systems]
+related:
+- "national scale collective intelligence infrastructure requires seven trust properties to achieve legitimacy"
+reweave_edges:
+- "national scale collective intelligence infrastructure requires seven trust properties to achieve legitimacy|related|2026-03-28"
 ---
 
 # AI-enhanced collective intelligence requires federated learning architectures to preserve data sovereignty at scale
@@ -1,10 +1,18 @@
 ---
+
+
 type: claim
 domain: ai-alignment
 description: "AI agents amplify existing expertise rather than replacing it because practitioners who understand what agents can and cannot do delegate more precisely, catch errors faster, and design better workflows"
 confidence: likely
 source: "Andrej Karpathy (@karpathy) and Simon Willison (@simonw), practitioner observations Feb-Mar 2026"
 created: 2026-03-09
+related:
+- "AI agents excel at implementing well scoped ideas but cannot generate creative experiment designs which makes the human role shift from researcher to agent workflow architect"
+- "the progression from autocomplete to autonomous agent teams follows a capability matched escalation where premature adoption creates more chaos than value"
+reweave_edges:
+- "AI agents excel at implementing well scoped ideas but cannot generate creative experiment designs which makes the human role shift from researcher to agent workflow architect|related|2026-03-28"
+- "the progression from autocomplete to autonomous agent teams follows a capability matched escalation where premature adoption creates more chaos than value|related|2026-03-28"
 ---
 
 # Deep technical expertise is a greater force multiplier when combined with AI agents because skilled practitioners delegate more effectively than novices
@@ -1,10 +1,15 @@
 ---
+
 type: claim
 domain: ai-alignment
 description: "Kim Morrison's Lean formalization of Knuth's proof of Claude's construction demonstrates formal verification as an oversight mechanism that scales with AI capability rather than degrading like human oversight"
 confidence: experimental
 source: "Knuth 2026, 'Claude's Cycles' (Stanford CS, Feb 28 2026 rev. Mar 6); Morrison 2026, Lean formalization (github.com/kim-em/KnuthClaudeLean/, posted Mar 4)"
 created: 2026-03-07
+supports:
+- "formal verification becomes economically necessary as AI generated code scales because testing cannot detect adversarial overfitting and a proof cannot be gamed"
+reweave_edges:
+- "formal verification becomes economically necessary as AI generated code scales because testing cannot detect adversarial overfitting and a proof cannot be gamed|supports|2026-03-28"
 ---
 
 # formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human review degrades
@@ -1,4 +1,5 @@
 ---
+
 type: claim
 domain: ai-alignment
 secondary_domains: [collective-intelligence, cultural-dynamics]
@@ -11,6 +12,10 @@ depends_on:
 - "partial connectivity produces better collective intelligence than full connectivity on complex problems because it preserves diversity"
 challenged_by:
 - "Homogenizing Effect of Large Language Models on Creative Diversity (ScienceDirect, 2025) — naturalistic study of 2,200 admissions essays found AI-inspired stories more similar to each other than human-only stories, with the homogenization gap widening at scale"
+supports:
+- "human ideas naturally converge toward similarity over social learning chains making AI a net diversity injector rather than a homogenizer under high exposure conditions"
+reweave_edges:
+- "human ideas naturally converge toward similarity over social learning chains making AI a net diversity injector rather than a homogenizer under high exposure conditions|supports|2026-03-28"
 ---
 
 # high AI exposure increases collective idea diversity without improving individual creative quality creating an asymmetry between group and individual effects
@@ -1,4 +1,5 @@
 ---
+
 type: claim
 domain: ai-alignment
 secondary_domains: [collective-intelligence]
@@ -6,6 +7,10 @@ description: "Ensemble-level expected free energy characterizes basins of attrac
 confidence: experimental
 source: "Ruiz-Serra et al., 'Factorised Active Inference for Strategic Multi-Agent Interactions' (AAMAS 2025)"
 created: 2026-03-11
+related:
+- "factorised generative models enable decentralized multi agent representation through individual level beliefs"
+reweave_edges:
+- "factorised generative models enable decentralized multi agent representation through individual level beliefs|related|2026-03-28"
 ---
 
 # Individual free energy minimization does not guarantee collective optimization in multi-agent active inference systems
@@ -1,4 +1,6 @@
 ---
+
+
 type: claim
 domain: ai-alignment
 description: "MaxMin-RLHF adapts Sen's Egalitarian principle to AI alignment through mixture-of-rewards and maxmin optimization"
@@ -6,6 +8,12 @@ confidence: experimental
 source: "Chakraborty et al., MaxMin-RLHF (ICML 2024)"
 created: 2026-03-11
 secondary_domains: [collective-intelligence]
+supports:
+- "minority preference alignment improves 33 percent without majority compromise suggesting single reward leaves value on table"
+- "single reward rlhf cannot align diverse preferences because alignment gap grows proportional to minority distinctiveness"
+reweave_edges:
+- "minority preference alignment improves 33 percent without majority compromise suggesting single reward leaves value on table|supports|2026-03-28"
+- "single reward rlhf cannot align diverse preferences because alignment gap grows proportional to minority distinctiveness|supports|2026-03-28"
 ---
 
 # MaxMin-RLHF applies egalitarian social choice to alignment by maximizing minimum utility across preference groups rather than averaging preferences
@@ -1,10 +1,18 @@
 ---
+
+
 type: claim
 domain: ai-alignment
 description: "MaxMin-RLHF's 33% minority improvement without majority loss suggests single-reward approach was suboptimal for all groups"
 confidence: experimental
 source: "Chakraborty et al., MaxMin-RLHF (ICML 2024)"
 created: 2026-03-11
+supports:
+- "maxmin rlhf applies egalitarian social choice to alignment by maximizing minimum utility across preference groups"
+- "single reward rlhf cannot align diverse preferences because alignment gap grows proportional to minority distinctiveness"
+reweave_edges:
+- "maxmin rlhf applies egalitarian social choice to alignment by maximizing minimum utility across preference groups|supports|2026-03-28"
+- "single reward rlhf cannot align diverse preferences because alignment gap grows proportional to minority distinctiveness|supports|2026-03-28"
 ---
 
 # Minority preference alignment improves 33% without majority compromise suggesting single-reward RLHF leaves value on table for all groups
@@ -1,4 +1,5 @@
 ---
+
 type: claim
 domain: ai-alignment
 description: "MixDPO shows distributional β earns +11.2 win rate points on heterogeneous data at 1.02–1.1× cost, without needing demographic labels or explicit mixture models"
@@ -8,6 +9,10 @@ created: 2026-03-11
 depends_on:
 - "RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values"
 - "pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state"
+supports:
+- "the variance of a learned preference sensitivity distribution diagnoses dataset heterogeneity and collapses to fixed parameter behavior when preferences are homogeneous"
+reweave_edges:
+- "the variance of a learned preference sensitivity distribution diagnoses dataset heterogeneity and collapses to fixed parameter behavior when preferences are homogeneous|supports|2026-03-28"
 ---
 
 # modeling preference sensitivity as a learned distribution rather than a fixed scalar resolves DPO diversity failures without demographic labels or explicit user modeling
@@ -1,4 +1,5 @@
 ---
+
 type: claim
 domain: ai-alignment
 description: "UK research strategy identifies human agency, security, privacy, transparency, fairness, value alignment, and accountability as necessary trust conditions"
@@ -6,6 +7,10 @@ confidence: experimental
 source: "UK AI for CI Research Network, Artificial Intelligence for Collective Intelligence: A National-Scale Research Strategy (2024)"
 created: 2026-03-11
 secondary_domains: [collective-intelligence, critical-systems]
+related:
+- "ai enhanced collective intelligence requires federated learning architectures to preserve data sovereignty at scale"
+reweave_edges:
+- "ai enhanced collective intelligence requires federated learning architectures to preserve data sovereignty at scale|related|2026-03-28"
 ---
 
 # National-scale collective intelligence infrastructure requires seven trust properties to achieve legitimacy
@@ -1,10 +1,15 @@
 ---
+
 description: Three forms of alignment pluralism -- Overton, steerable, and distributional -- are needed because standard alignment procedures actively reduce the diversity of model outputs
 type: claim
 domain: ai-alignment
 created: 2026-02-17
 source: "Sorensen et al, Roadmap to Pluralistic Alignment (arXiv 2402.05070, ICML 2024); Klassen et al, Pluralistic Alignment Over Time (arXiv 2411.10654, NeurIPS 2024); Harland et al, Adaptive Alignment (arXiv 2410.23630, NeurIPS 2024)"
 confidence: likely
+supports:
+- "pluralistic ai alignment through multiple systems preserves value diversity better than forced consensus"
+reweave_edges:
+- "pluralistic ai alignment through multiple systems preserves value diversity better than forced consensus|supports|2026-03-28"
 ---
 
 # pluralistic alignment must accommodate irreducibly diverse values simultaneously rather than converging on a single aligned state
@@ -1,4 +1,5 @@
 ---
+
 type: claim
 domain: ai-alignment
 secondary_domains: [mechanisms, collective-intelligence]
@@ -6,6 +7,10 @@ description: "AI alignment feedback should use citizens assemblies or representa
 confidence: likely
 source: "Conitzer et al. (2024), 'Social Choice Should Guide AI Alignment' (ICML 2024)"
 created: 2026-03-11
+supports:
+- "rlhf is implicit social choice without normative scrutiny"
+reweave_edges:
+- "rlhf is implicit social choice without normative scrutiny|supports|2026-03-28"
 ---
 
 # Representative sampling and deliberative mechanisms should replace convenience platforms for AI alignment feedback
@@ -1,4 +1,6 @@
 ---
+
+
 type: claim
 domain: ai-alignment
 secondary_domains: [mechanisms]
@@ -6,6 +8,13 @@ description: "The aggregated rankings variant of RLCHF applies formal social cho
 confidence: experimental
 source: "Conitzer et al. (2024), 'Social Choice Should Guide AI Alignment' (ICML 2024)"
 created: 2026-03-11
+related:
+- "rlchf features based variant models individual preferences with evaluator characteristics enabling aggregation across diverse groups"
+reweave_edges:
+- "rlchf features based variant models individual preferences with evaluator characteristics enabling aggregation across diverse groups|related|2026-03-28"
+- "rlhf is implicit social choice without normative scrutiny|supports|2026-03-28"
+supports:
+- "rlhf is implicit social choice without normative scrutiny"
 ---
 
 # RLCHF aggregated rankings variant combines evaluator rankings via social welfare function before reward model training
@@ -1,4 +1,5 @@
 ---
+
 type: claim
 domain: ai-alignment
 secondary_domains: [mechanisms]
@@ -6,6 +7,10 @@ description: "The features-based RLCHF variant learns individual preference mode
 confidence: experimental
 source: "Conitzer et al. (2024), 'Social Choice Should Guide AI Alignment' (ICML 2024)"
 created: 2026-03-11
+related:
+- "rlchf aggregated rankings variant combines evaluator rankings via social welfare function before reward model training"
+reweave_edges:
+- "rlchf aggregated rankings variant combines evaluator rankings via social welfare function before reward model training|related|2026-03-28"
 ---
 
 # RLCHF features-based variant models individual preferences with evaluator characteristics enabling aggregation across diverse groups
@@ -1,10 +1,19 @@
 ---
+
+
 type: claim
 domain: ai-alignment
 description: "Current RLHF implementations make social choice decisions about evaluator selection and preference aggregation without examining their normative properties"
 confidence: likely
 source: "Conitzer et al. (2024), 'Social Choice Should Guide AI Alignment' (ICML 2024)"
 created: 2026-03-11
+supports:
+- "representative sampling and deliberative mechanisms should replace convenience platforms for ai alignment feedback"
+reweave_edges:
+- "representative sampling and deliberative mechanisms should replace convenience platforms for ai alignment feedback|supports|2026-03-28"
+- "rlchf aggregated rankings variant combines evaluator rankings via social welfare function before reward model training|related|2026-03-28"
+related:
+- "rlchf aggregated rankings variant combines evaluator rankings via social welfare function before reward model training"
 ---
 
 # RLHF is implicit social choice without normative scrutiny
@@ -1,10 +1,18 @@
 ---
+
+
 type: claim
 domain: ai-alignment
 description: "Formal impossibility result showing single reward models fail when human preferences are diverse across subpopulations"
 confidence: likely
 source: "Chakraborty et al., MaxMin-RLHF: Alignment with Diverse Human Preferences (ICML 2024)"
 created: 2026-03-11
+supports:
+- "maxmin rlhf applies egalitarian social choice to alignment by maximizing minimum utility across preference groups"
+- "minority preference alignment improves 33 percent without majority compromise suggesting single reward leaves value on table"
+reweave_edges:
+- "maxmin rlhf applies egalitarian social choice to alignment by maximizing minimum utility across preference groups|supports|2026-03-28"
+- "minority preference alignment improves 33 percent without majority compromise suggesting single reward leaves value on table|supports|2026-03-28"
 ---
 
 # Single-reward RLHF cannot align diverse preferences because alignment gap grows proportional to minority distinctiveness and inversely to representation
@@ -1,10 +1,15 @@
 ---
+
 description: Some disagreements cannot be resolved with more evidence because they stem from genuine value differences or incommensurable goods and systems must map rather than eliminate them
 type: claim
 domain: ai-alignment
 created: 2026-03-02
 confidence: likely
 source: "Arrow's impossibility theorem; value pluralism (Isaiah Berlin); LivingIP design principles"
+supports:
+- "pluralistic ai alignment through multiple systems preserves value diversity better than forced consensus"
+reweave_edges:
+- "pluralistic ai alignment through multiple systems preserves value diversity better than forced consensus|supports|2026-03-28"
 ---
 
 # some disagreements are permanently irreducible because they stem from genuine value differences not information gaps and systems must map rather than eliminate them
@@ -1,10 +1,15 @@
 ---
+
 description: 173 AI-discovered programs now in clinical development with 80-90 percent Phase I success and Insilico's rentosertib is the first fully AI-designed drug to clear Phase IIa but overall clinical failure rates remain unchanged making later-stage success the key unknown
 type: claim
 domain: health
 created: 2026-02-17
 source: "AI drug discovery pipeline data 2026; Insilico Medicine rentosertib Phase IIa; Isomorphic Labs $3B partnerships; WEF drug discovery analysis January 2026"
 confidence: likely
+related:
+- "FDA is replacing animal testing with AI models and organ on chip as the default preclinical pathway which will compress drug development timelines and reduce the 90 percent clinical failure rate"
+reweave_edges:
+- "FDA is replacing animal testing with AI models and organ on chip as the default preclinical pathway which will compress drug development timelines and reduce the 90 percent clinical failure rate|related|2026-03-28"
 ---
 
 # AI compresses drug discovery timelines by 30-40 percent but has not yet improved the 90 percent clinical failure rate that determines industry economics
@@ -1,10 +1,15 @@
 ---
 type: claim
 domain: health
 description: "92% of US health systems deploying AI scribes by March 2025 — a 2-3 year adoption curve vs 15 years for EHRs — because documentation is the one clinical workflow where AI improvement is immediately measurable, carries minimal patient risk, and delivers revenue capture gains"
 confidence: proven
 source: "Bessemer Venture Partners, State of Health AI 2026 (bvp.com/atlas/state-of-health-ai-2026)"
 created: 2026-03-07
+related:
+- "AI native health companies achieve 3 5x the revenue productivity of traditional health services because AI eliminates the linear scaling constraint between headcount and output"
+reweave_edges:
+- "AI native health companies achieve 3 5x the revenue productivity of traditional health services because AI eliminates the linear scaling constraint between headcount and output|related|2026-03-28"
 ---

 # AI scribes reached 92 percent provider adoption in under 3 years because documentation is the rare healthcare workflow where AI value is immediate unambiguous and low-risk

@@ -1,10 +1,15 @@
 ---
 type: claim
 domain: health
 description: "AI-native healthcare companies generate $500K-1M+ ARR per FTE compared to $100-200K for traditional health services, compressing time-to-$100M-ARR from 10+ years to under 5, creating a structural unit economics advantage that incumbents cannot match without rebuilding"
 confidence: likely
 source: "Bessemer Venture Partners, State of Health AI 2026 (bvp.com/atlas/state-of-health-ai-2026)"
 created: 2026-03-07
+related:
+- "consumer willingness to pay out of pocket for AI enhanced care is outpacing reimbursement creating a cash pay adoption pathway that bypasses traditional payer gatekeeping"
+reweave_edges:
+- "consumer willingness to pay out of pocket for AI enhanced care is outpacing reimbursement creating a cash pay adoption pathway that bypasses traditional payer gatekeeping|related|2026-03-28"
 ---

 # AI-native health companies achieve 3-5x the revenue productivity of traditional health services because AI eliminates the linear scaling constraint between headcount and output

@@ -1,10 +1,15 @@
 ---
 type: claim
 domain: health
 description: "CMS adding category I CPT codes for AI-assisted diagnosis (diabetic retinopathy, coronary plaque) and testing category III codes for AI ECG, echocardiograms, and ultrasound — creating the first formal reimbursement pathway for clinical AI"
 confidence: likely
 source: "Bessemer Venture Partners, State of Health AI 2026 (bvp.com/atlas/state-of-health-ai-2026)"
 created: 2026-03-07
+supports:
+- "consumer willingness to pay out of pocket for AI enhanced care is outpacing reimbursement creating a cash pay adoption pathway that bypasses traditional payer gatekeeping"
+reweave_edges:
+- "consumer willingness to pay out of pocket for AI enhanced care is outpacing reimbursement creating a cash pay adoption pathway that bypasses traditional payer gatekeeping|supports|2026-03-28"
 ---

 # CMS is creating AI-specific reimbursement codes which will formalize a two-speed adoption system where proven AI applications get payment parity while experimental ones remain in cash-pay limbo

@@ -1,10 +1,18 @@
 ---
 type: claim
 domain: health
 description: "RadNet's AI mammography study shows 36% of women paying $40 out-of-pocket for AI screening with 43% higher cancer detection, suggesting consumer demand will drive AI adoption faster than CMS reimbursement codes"
 confidence: likely
 source: "Bessemer Venture Partners, State of Health AI 2026 (bvp.com/atlas/state-of-health-ai-2026)"
 created: 2026-03-07
+related:
+- "AI native health companies achieve 3 5x the revenue productivity of traditional health services because AI eliminates the linear scaling constraint between headcount and output"
+- "CMS is creating AI specific reimbursement codes which will formalize a two speed adoption system where proven AI applications get payment parity while experimental ones remain in cash pay limbo"
+reweave_edges:
+- "AI native health companies achieve 3 5x the revenue productivity of traditional health services because AI eliminates the linear scaling constraint between headcount and output|related|2026-03-28"
+- "CMS is creating AI specific reimbursement codes which will formalize a two speed adoption system where proven AI applications get payment parity while experimental ones remain in cash pay limbo|related|2026-03-28"
 ---

 # consumer willingness to pay out of pocket for AI-enhanced care is outpacing reimbursement creating a cash-pay adoption pathway that bypasses traditional payer gatekeeping

@@ -1,10 +1,15 @@
 ---
 description: Wachter argues AI should be regulated more like physician licensing with competency exams and ongoing certification rather than the FDA approval model designed for drugs and devices that remain static forever
 type: claim
 domain: health
 created: 2026-02-18
 source: "DJ Patil interviewing Bob Wachter, Commonwealth Club, February 9 2026; Wachter 'A Giant Leap' (2026)"
 confidence: likely
+related:
+- "CMS is creating AI specific reimbursement codes which will formalize a two speed adoption system where proven AI applications get payment parity while experimental ones remain in cash pay limbo"
+reweave_edges:
+- "CMS is creating AI specific reimbursement codes which will formalize a two speed adoption system where proven AI applications get payment parity while experimental ones remain in cash pay limbo|related|2026-03-28"
 ---

 # healthcare AI regulation needs blank-sheet redesign because the FDA drug-and-device model built for static products cannot govern continuously learning software

@@ -1,10 +1,15 @@
 ---
 type: claim
 domain: health
 description: "MA enrollment reached 51% in 2023 and 54% by 2025, with CBO projecting 64% by 2034, making traditional Medicare the minority program"
 confidence: proven
 source: "Kaiser Family Foundation, Medicare Advantage in 2025: Enrollment Update and Key Trends (2025)"
 created: 2025-07-24
+supports:
+- "chronic condition special needs plans grew 71 percent in one year indicating explosive demand for disease management infrastructure"
+reweave_edges:
+- "chronic condition special needs plans grew 71 percent in one year indicating explosive demand for disease management infrastructure|supports|2026-03-28"
 ---

 # Medicare Advantage crossed majority enrollment in 2023 marking structural transformation from supplement to dominant program

@@ -1,10 +1,19 @@
 ---
 type: claim
 domain: health
 description: "Unpaid family care represents 16% of total US health spending yet remains invisible to policy models and capacity planning"
 confidence: proven
 source: "AARP 2025 Caregiving Report"
 created: 2026-03-11
+related:
+- "caregiver workforce crisis shows all 50 states experiencing shortages with 43 states reporting facility closures signaling care infrastructure collapse"
+reweave_edges:
+- "caregiver workforce crisis shows all 50 states experiencing shortages with 43 states reporting facility closures signaling care infrastructure collapse|related|2026-03-28"
+- "family caregiving functions as poverty transmission mechanism forcing debt savings depletion and food insecurity on working age population|supports|2026-03-28"
+supports:
+- "family caregiving functions as poverty transmission mechanism forcing debt savings depletion and food insecurity on working age population"
 ---

 # Unpaid family caregiving provides 870 billion annually representing 16 percent of total US health economy invisible to policy models

@@ -1,4 +1,6 @@
 ---
 type: entity
 entity_type: lab
 name: "Anthropic"
@@ -25,6 +27,13 @@ competitors: ["OpenAI", "Google DeepMind", "xAI"]
 tracked_by: theseus
 created: 2026-03-16
 last_updated: 2026-03-16
+supports:
+- "Dario Amodei"
+reweave_edges:
+- "Dario Amodei|supports|2026-03-28"
+- "OpenAI|related|2026-03-28"
+related:
+- "OpenAI"
 ---

 # Anthropic

@@ -1,4 +1,5 @@
 ---
 type: entity
 entity_type: person
 name: "Dario Amodei"
@@ -16,6 +17,10 @@ known_positions:
 tracked_by: theseus
 created: 2026-03-16
 last_updated: 2026-03-16
+supports:
+- "Anthropic"
+reweave_edges:
+- "Anthropic|supports|2026-03-28"
 ---

 # Dario Amodei

@@ -1,4 +1,5 @@
 ---
 type: entity
 entity_type: lab
 name: "Google DeepMind"
@@ -21,6 +22,10 @@ competitors: ["OpenAI", "Anthropic", "xAI"]
 tracked_by: theseus
 created: 2026-03-16
 last_updated: 2026-03-16
+related:
+- "OpenAI"
+reweave_edges:
+- "OpenAI|related|2026-03-28"
 ---

 # Google DeepMind

@@ -1,4 +1,7 @@
 ---
 type: entity
 entity_type: lab
 name: "OpenAI"
@@ -22,6 +25,15 @@ competitors: ["Anthropic", "Google DeepMind", "xAI"]
 tracked_by: theseus
 created: 2026-03-16
 last_updated: 2026-03-16
+related:
+- "Anthropic"
+- "Google DeepMind"
+reweave_edges:
+- "Anthropic|related|2026-03-28"
+- "Google DeepMind|related|2026-03-28"
+- "Thinking Machines Lab|supports|2026-03-28"
+supports:
+- "Thinking Machines Lab"
 ---

 # OpenAI

@@ -1,4 +1,5 @@
 ---
 type: entity
 entity_type: lab
 name: "Thinking Machines Lab"
@@ -20,6 +21,10 @@ competitors: ["OpenAI", "Anthropic", "SSI"]
 tracked_by: theseus
 created: 2026-03-16
 last_updated: 2026-03-16
+supports:
+- "OpenAI"
+reweave_edges:
+- "OpenAI|supports|2026-03-28"
 ---

 # Thinking Machines Lab

@@ -1,10 +1,19 @@
 ---
 description: The dominant alignment paradigms share a core limitation -- human preferences are diverse distributional and context-dependent not reducible to one reward function
 type: claim
 domain: collective-intelligence
 created: 2026-02-17
 source: "DPO Survey 2025 (arXiv 2503.11701)"
 confidence: likely
+related:
+- "rlhf is implicit social choice without normative scrutiny"
+reweave_edges:
+- "rlhf is implicit social choice without normative scrutiny|related|2026-03-28"
+- "single reward rlhf cannot align diverse preferences because alignment gap grows proportional to minority distinctiveness|supports|2026-03-28"
+supports:
+- "single reward rlhf cannot align diverse preferences because alignment gap grows proportional to minority distinctiveness"
 ---

 # RLHF and DPO both fail at preference diversity because they assume a single reward function can capture context-dependent human values

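Every `reweave_edges` entry added in this commit encodes one graph edge per line as a pipe-delimited string, `target|relation|date`, where `target` is the title of the linked note, `relation` is an edge type such as `related` or `supports`, and the date records when the edge was woven. A minimal parsing sketch of that format (the `ReweaveEdge` class and `parse_reweave_edge` helper are illustrative assumptions, not part of the repository):

```python
from dataclasses import dataclass

@dataclass
class ReweaveEdge:
    target: str    # title of the linked note
    relation: str  # edge type, e.g. "related" or "supports"
    date: str      # ISO date the edge was created

def parse_reweave_edge(entry: str) -> ReweaveEdge:
    """Split a 'target|relation|date' frontmatter entry.

    rsplit on the last two pipes, so a target title that itself
    contains '|' characters would still parse correctly.
    """
    target, relation, date = entry.rsplit("|", 2)
    return ReweaveEdge(target, relation, date)

edge = parse_reweave_edge("OpenAI|related|2026-03-28")
print(edge.target, edge.relation, edge.date)  # OpenAI related 2026-03-28
```

Using `rsplit` rather than `split` is the one design choice worth noting: it keeps the last two fields fixed while letting the free-text title absorb any stray pipes.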