Compare commits

...

2 commits

Author SHA1 Message Date
c14fe47706 leo: incorporate Theseus review feedback on divergences #1 and #5
- What: restructured AI labor divergence as 2-axis (substitution vs complementarity + pattern if substitution). Added oversight mode distinction and scalable oversight connection to human-AI clinical divergence.
- Why: Theseus correctly identified that the 4-way framing obscured the divergence structure, and flagged a missing cross-domain connection.

Pentagon-Agent: Leo <A3DC172B-F0A4-4408-9E3B-CF842616AAE1>
2026-03-19 17:16:30 +00:00
4fe4aa8e2d leo: seed 5 divergences across 3 domains
- What: first divergence instances — AI labor displacement (cross-domain), GLP-1 economics (health), prevention-first cost dynamics (health), futarchy adoption (internet-finance), human-AI clinical collaboration (health)
- Why: divergences are the game mechanic — no instances means no game. All 5 surfaced from genuine competing claims with real evidence on both sides.
- Connections: each divergence includes "What Would Resolve This" research agenda as contributor hook

Pentagon-Agent: Leo <A3DC172B-F0A4-4408-9E3B-CF842616AAE1>
2026-03-19 17:12:35 +00:00
5 changed files with 290 additions and 0 deletions


@@ -0,0 +1,69 @@
---
type: divergence
title: "Does AI substitute for human labor or complement it — and at what phase does the pattern shift?"
domain: ai-alignment
secondary_domains: [internet-finance, teleological-economics]
description: "Determines whether AI displacement is a near-term employment crisis or a productivity boom with delayed substitution — the answer shapes investment timing, policy response, and the urgency of coordination mechanisms"
status: open
claims:
- "economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate.md"
- "early AI adoption increases firm productivity without reducing employment suggesting capital deepening not labor replacement as the dominant mechanism.md"
- "micro displacement evidence does not imply macro economic crisis because structural shock absorbers exist between job-level disruption and economy-wide collapse.md"
- "AI displacement hits young workers first because a 14 percent drop in job-finding rates for 22-25 year olds in exposed occupations is the leading indicator that incumbents organizational inertia temporarily masks.md"
surfaced_by: leo
created: 2026-03-19
---
# Does AI substitute for human labor or complement it — and at what phase does the pattern shift?
This is the central empirical question behind the AI displacement thesis. The KB holds four claims, each with real evidence, that diverge on two axes:
**Axis 1 — Substitution vs complementarity:** Two claims predict systematic labor substitution (economic forces push humans out of verifiable loops; young workers displaced first as leading indicator). Two others say complementarity is the dominant mechanism at the current phase (firm-level productivity gains without employment reduction; macro shock absorbers prevent economy-wide crisis).
**Axis 2 — If substitution, what pattern?** Within the substitution camp, the structural claim predicts systematic displacement across all verifiable tasks, while the temporal claim predicts concentrated displacement in entry-level cohorts first, with incumbents temporarily protected by organizational inertia — not by irreplaceability.
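The two-axis structure can be restated as a small lookup table. This is a sketch for orientation only — the claim titles are shortened and the axis assignments simply restate the text above:

```python
# Two-axis taxonomy of the four claims (titles shortened for readability).
# Axis 1: substitution vs complementarity.
# Axis 2 (substitution camp only): what displacement pattern is predicted.
claims = {
    "verifiable-loop elimination": {"axis1": "substitution",    "axis2": "systematic"},
    "young workers first":         {"axis1": "substitution",    "axis2": "cohort-staged"},
    "capital deepening":           {"axis1": "complementarity", "axis2": None},
    "macro shock absorbers":       {"axis1": "complementarity", "axis2": None},
}

# The substitution camp splits on Axis 2; the complementarity camp has no Axis 2 position.
substitution_camp = [name for name, v in claims.items() if v["axis1"] == "substitution"]
print(substitution_camp)
```

Collapsing the earlier 4-way framing into these two axes is exactly the restructure the Theseus-feedback commit describes.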
The complementarity evidence comes from EU firm-level data (Aldasoro et al., BIS) showing ~4% productivity gains with no employment reduction. Capital deepening, not labor substitution, is the observed mechanism — at least in the current phase.
## Divergent Claims
### Economic forces push humans out of verifiable cognitive loops
**File:** [[economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate]]
**Core argument:** Markets systematically eliminate human oversight wherever AI output is measurable. This is structural, not cyclical.
**Strongest evidence:** Documented removal of human code review, A/B tested preference for AI ad copy, economic logic of cost elimination in competitive markets.
### Early AI adoption increases productivity without reducing employment
**File:** [[early AI adoption increases firm productivity without reducing employment suggesting capital deepening not labor replacement as the dominant mechanism]]
**Core argument:** Firm-level EU data shows AI adoption correlates with productivity gains AND stable employment. Capital deepening dominates.
**Strongest evidence:** Aldasoro et al. (BIS study), EU firm-level data across multiple sectors.
### Macro shock absorbers prevent economy-wide crisis
**File:** [[micro displacement evidence does not imply macro economic crisis because structural shock absorbers exist between job-level disruption and economy-wide collapse]]
**Core argument:** Job-level displacement doesn't automatically translate to macro crisis because savings buffers, labor mobility, and new job creation absorb shocks.
**Strongest evidence:** Historical automation waves; structural analysis of transmission mechanisms.
### Young workers are the leading displacement indicator
**File:** [[AI displacement hits young workers first because a 14 percent drop in job-finding rates for 22-25 year olds in exposed occupations is the leading indicator that incumbents organizational inertia temporarily masks]]
**Core argument:** Substitution IS happening, but concentrated where organizational inertia is lowest — new hires, not incumbent workers.
**Strongest evidence:** 14% drop in job-finding rates for 22-25 year olds in AI-exposed occupations.
## What Would Resolve This
- **Longitudinal firm tracking:** Do firms that adopted AI early show employment reductions 2-3 years later, or does the capital deepening pattern persist?
- **Capability threshold testing:** Is there a measurable AI capability level above which substitution activates in previously complementary domains?
- **Sector-specific data:** Which industries show substitution first? Is "output quality independently verifiable" the actual discriminant?
- **Young worker trajectory:** Does the 14% job-finding drop for 22-25 year olds propagate to older cohorts, or does it stabilize as a generational adjustment?
## Cascade Impact
- If substitution dominates: Leo's grand strategy beliefs about coordination urgency strengthen. Vida's healthcare displacement claims gain weight. Investment thesis shifts toward AI-native companies.
- If complementarity persists: The displacement narrative is premature. Policy interventions are less urgent. Investment focus shifts to augmentation tools.
- If phase-dependent: Both sides are right at different times. The critical question becomes timing — when does the phase transition occur?
---
Relevant Notes:
- [[white-collar displacement has lagged but deeper consumption impact than blue-collar because top-decile earners drive disproportionate consumer spending and their savings buffers mask the damage for quarters]] — the consumption channel
- [[the gap between theoretical AI capability and observed deployment is massive across all occupations because adoption lag not capability limits determines real-world impact]] — adoption lag as mediating variable
Topics:
- [[_map]]


@@ -0,0 +1,55 @@
---
type: divergence
title: "Is the GLP-1 economic problem unsustainable chronic costs or wasted investment from low persistence?"
domain: health
description: "These are opposite cost problems from the same drug class — one assumes lifelong use drives inflation, the other shows 85% discontinuation undermines the chronic model. The answer determines payer strategy, formulary design, and the health domain's cost trajectory claims."
status: open
claims:
- "GLP-1 receptor agonists are the largest therapeutic category launch in pharmaceutical history but their chronic use model makes the net cost impact inflationary through 2035.md"
- "glp-1-persistence-drops-to-15-percent-at-two-years-for-non-diabetic-obesity-patients-undermining-chronic-use-economics.md"
surfaced_by: leo
created: 2026-03-19
---
# Is the GLP-1 economic problem unsustainable chronic costs or wasted investment from low persistence?
The KB holds two claims about GLP-1 economics that predict opposite problems from the same drug class. Both are backed by large datasets. Both are rated `likely`. They can't both be right about the dominant cost dynamic.
The inflationary claim assumes chronic use at $2,940+/year per patient creates unsustainable cost growth through 2035. The model depends on patients staying on treatment indefinitely — the "chronic use model" in the title.
The persistence claim shows that assumption doesn't hold: real-world data from 125,000+ commercially insured patients shows 85% discontinue by two years for non-diabetic obesity. If most patients don't sustain use, the chronic cost model breaks — but so does the therapeutic benefit.
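A toy cost model makes the tension concrete. The $2,940/year price and the 15% two-year persistence figure come from the claims above; the geometric month-to-month dropout curve is purely an assumption for illustration:

```python
# Illustrative only: expected 2-year GLP-1 spend per initiating patient under
# the chronic-use assumption vs a persistence-adjusted model. The $2,940/year
# cost and 15%-at-two-years anchor are from the cited claims; the geometric
# monthly dropout interpolating to that anchor is an assumption.

ANNUAL_COST = 2940            # $/patient/year at sustained use
MONTHLY_COST = ANNUAL_COST / 12

def expected_spend(persistence_by_month):
    """Sum monthly cost weighted by the share of patients still on therapy."""
    return sum(MONTHLY_COST * p for p in persistence_by_month)

# Chronic-use model: everyone stays on therapy all 24 months.
chronic = [1.0] * 24

# Persistence-adjusted model: monthly retention calibrated so that
# retention**24 == 0.15, i.e. 15% persistence at two years.
rate = 0.15 ** (1 / 24)
observed = [rate ** m for m in range(1, 25)]

print(f"chronic-use 2-yr spend: ${expected_spend(chronic):,.0f}")
print(f"persistence-adjusted:   ${expected_spend(observed):,.0f}")
```

Under these assumptions the persistence-adjusted spend comes to less than half the chronic-use projection — which is why the same drug class can present payers with two opposite problems: cost inflation if the chronic model holds, wasted therapeutic investment if it doesn't.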
## Divergent Claims
### Chronic use makes GLP-1s inflationary through 2035
**File:** [[GLP-1 receptor agonists are the largest therapeutic category launch in pharmaceutical history but their chronic use model makes the net cost impact inflationary through 2035]]
**Core argument:** Lifelong treatment at current pricing creates unsustainable spending growth. The chronic model means costs compound annually.
**Strongest evidence:** Category launch size ($50B+ projected), $2,940/year per patient, CBO/KFF cost modeling.
### Low persistence undermines the chronic use assumption
**File:** [[glp-1-persistence-drops-to-15-percent-at-two-years-for-non-diabetic-obesity-patients-undermining-chronic-use-economics]]
**Core argument:** 85% of non-diabetic obesity patients discontinue by year 2. The chronic model doesn't reflect real-world behavior.
**Strongest evidence:** JMCP study of 125,000+ commercially insured patients; semaglutide 47% one-year persistence vs 19% for liraglutide.
## What Would Resolve This
- **Medicare persistence data:** Do Medicare populations (older, sicker, lower OOP after IRA cap) show better persistence than commercial populations?
- **Behavioral support impact:** Does combining GLP-1s with structured behavioral support (WHO recommendation, BALANCE Model) materially change dropout rates?
- **Cost per QALY at real-world persistence:** What's the actual cost-effectiveness when modeled with 15% two-year persistence rather than assumed chronic use?
- **Generic entry timeline:** Do biosimilar/generic GLP-1s at lower price points change the persistence equation by reducing OOP burden?
## Cascade Impact
- If chronic costs dominate: Vida's healthcare cost trajectory claims hold. Payer strategy must focus on formulary controls and prior authorization.
- If low persistence dominates: The inflationary projection is overstated. The real problem is wasted therapeutic investment and weight regain cycles. Payer strategy shifts to adherence support.
- If population-dependent: Both are right for different patient segments. The divergence dissolves into scope — diabetic patients may persist while obesity-only patients don't.
---
Relevant Notes:
- [[lower-income-patients-show-higher-glp-1-discontinuation-rates-suggesting-affordability-not-just-clinical-factors-drive-persistence]] — affordability as persistence driver
- [[semaglutide-achieves-47-percent-one-year-persistence-versus-19-percent-for-liraglutide-showing-drug-specific-adherence-variation-of-2-5x]] — drug-specific variation
- [[glp-1-multi-organ-protection-creates-compounding-value-across-kidney-cardiovascular-and-metabolic-endpoints]] — multi-organ value complicates pure cost analysis
Topics:
- [[_map]]


@@ -0,0 +1,58 @@
---
type: divergence
title: "Does human oversight improve or degrade AI clinical decision-making?"
domain: health
secondary_domains: [ai-alignment, collective-intelligence]
description: "One study shows physicians + AI perform 22 points worse than AI alone on diagnostics. Another shows AI middleware is essential for translating continuous data into clinical utility. The answer determines whether healthcare AI should replace or augment human judgment."
status: open
claims:
- "human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs.md"
- "AI middleware bridges consumer wearable data to clinical utility because continuous data is too voluminous for direct clinician review.md"
surfaced_by: leo
created: 2026-03-19
---
# Does human oversight improve or degrade AI clinical decision-making?
These claims imply opposite deployment models for healthcare AI. One says remove humans from the diagnostic loop — they make it worse. The other says AI must translate and filter for human judgment — continuous data requires AI as intermediary.
The degradation claim cites Stanford/Harvard data: AI alone achieves 90% accuracy on specific diagnostic tasks, but physicians with AI access achieve only 68% — a 22-point degradation. The mechanism is dual: de-skilling (physicians lose diagnostic sharpness after relying on AI) and override errors (physicians override correct AI outputs based on incorrect clinical intuition). After 3 months of colonoscopy AI assistance, physician standalone performance dropped measurably.
The middleware claim argues AI's clinical value is as a translator between raw continuous data (wearables, CGMs, remote monitoring) and actionable clinical insights. The volume of data from continuous monitoring is too large for any physician to review directly. AI doesn't replace judgment — it makes judgment possible on data that would otherwise be inaccessible.
## Divergent Claims
### Human oversight degrades AI clinical performance
**File:** [[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]]
**Core argument:** Physicians systematically override correct AI outputs and lose independent diagnostic capability through reliance.
**Strongest evidence:** Stanford/Harvard study: AI alone 90%, doctors+AI 68%. Colonoscopy AI de-skilling after 3 months.
### AI middleware is essential for clinical data translation
**File:** [[AI middleware bridges consumer wearable data to clinical utility because continuous data is too voluminous for direct clinician review]]
**Core argument:** Continuous health monitoring generates data volumes that require AI processing before human review is even possible.
**Strongest evidence:** Mayo Clinic Apple Watch ECG integration; FHIR interoperability standards; data volume from continuous glucose monitors.
## What Would Resolve This
- **Task-type decomposition:** Does the degradation pattern hold for all clinical tasks, or only for diagnosis-type tasks where AI has clear ground truth? Monitoring/translation tasks may be structurally different.
- **Role-specific studies:** Does physician performance degrade when AI translates data (middleware role) as it does when AI diagnoses (replacement role)?
- **Longitudinal de-skilling:** Does the 3-month colonoscopy de-skilling effect persist, or do physicians recalibrate? Is it specific to visual pattern recognition?
- **Hybrid deployment data:** Are there implementations where AI handles diagnosis AND serves as data middleware, with physicians overseeing different functions at each layer?
## Cascade Impact
- If degradation dominates: AI should replace human judgment in verifiable diagnostic tasks. The physician role shifts entirely to relationship management and complex decision-making. Regulatory frameworks need redesign.
- If middleware is essential: AI augments rather than replaces. The physician remains in the loop but at a different layer — interpreting AI-processed insights rather than raw data or AI recommendations.
- If task-dependent: Both are right in their domain. The deployment model is: AI decides on pattern-recognition diagnostics, AI translates on continuous monitoring, physicians handle complex multi-factor clinical decisions. This would dissolve the divergence into scope.
**Cross-domain note:** The mode of human involvement may be the determining variable. Real-time oversight of individual AI outputs (where humans de-skill) is structurally different from adversarial challenge of published AI claims (where humans bring orthogonal priors). The clinical degradation finding is a domain-specific instance of the general oversight degradation pattern, but it may not apply to adversarial review architectures like the Teleo collective's contributor model.
---
Relevant Notes:
- [[the physician role shifts from information processor to relationship manager as AI automates documentation triage and evidence synthesis]] — the role shift both claims point toward
- [[medical LLM benchmark performance does not translate to clinical impact because physicians with and without AI access achieve similar diagnostic accuracy in randomized trials]] — additional evidence on the gap
- [[scalable oversight degrades rapidly as capability gaps grow with debate achieving only 50 percent success at moderate gaps]] — general oversight degradation pattern that the clinical finding instantiates
Topics:
- [[_map]]


@@ -0,0 +1,54 @@
---
type: divergence
title: "Does prevention-first care reduce total healthcare costs or just redistribute them from acute to chronic spending?"
domain: health
description: "The healthcare attractor state thesis assumes prevention creates a profitable flywheel. PACE data — the most comprehensive capitated prevention model — shows cost-neutral outcomes. This tension determines whether the attractor state is economically self-sustaining or requires permanent subsidy."
status: open
claims:
- "the healthcare attractor state is a prevention-first system where aligned payment continuous monitoring and AI-augmented care delivery create a flywheel that profits from health rather than sickness.md"
- "pace-restructures-costs-from-acute-to-chronic-spending-without-reducing-total-expenditure-challenging-prevention-saves-money-narrative.md"
surfaced_by: leo
created: 2026-03-19
---
# Does prevention-first care reduce total healthcare costs or just redistribute them from acute to chronic spending?
This divergence sits at the foundation of Vida's domain thesis. The healthcare attractor state claim argues that aligned payment + continuous monitoring + AI creates a flywheel that "profits from health rather than sickness." The implicit promise: prevention reduces total costs.
PACE — the Program of All-Inclusive Care for the Elderly — is the closest real-world implementation of this vision. Fully capitated, comprehensive, prevention-oriented. And the ASPE/HHS 8-state study shows it is cost-neutral at best: Medicare costs equivalent to fee-for-service overall, Medicaid costs actually higher.
If the most evidence-backed prevention model doesn't reduce costs, does the attractor state thesis need revision?
## Divergent Claims
### Prevention-first creates a profitable flywheel
**File:** [[the healthcare attractor state is a prevention-first system where aligned payment continuous monitoring and AI-augmented care delivery create a flywheel that profits from health rather than sickness]]
**Core argument:** When payment aligns with health outcomes, every dollar of care avoided flows to the bottom line. AI + monitoring + aligned payment creates a self-reinforcing system.
**Strongest evidence:** Devoted Health growth (121% YoY), Kaiser Permanente 80-year model, theoretical alignment of incentives.
### PACE shows prevention redistributes costs, doesn't reduce them
**File:** [[pace-restructures-costs-from-acute-to-chronic-spending-without-reducing-total-expenditure-challenging-prevention-saves-money-narrative]]
**Core argument:** The most comprehensive capitated care model shows no cost reduction — it shifts spending from acute episodes to chronic management.
**Strongest evidence:** ASPE/HHS 8-state study; Medicare costs equivalent to FFS; Medicaid costs higher.
## What Would Resolve This
- **PACE population specificity:** Does PACE's cost neutrality reflect the nursing-home-eligible population (inherently high-cost) or a general limit on prevention savings?
- **AI-augmented vs traditional prevention:** Does AI change the economics by reducing the labor cost of prevention itself?
- **Longer time horizons:** Does the ASPE 6-year window miss downstream savings that compound over 10-20 years?
- **Devoted Health financial data:** Does the fastest-growing purpose-built MA plan show actual cost reduction, or just growth?
## Cascade Impact
- If prevention reduces costs: The attractor state thesis holds. Investment in prevention-first models is justified on both outcome AND economic grounds.
- If prevention redistributes costs: The attractor state is still better for outcomes but requires permanent subsidy or alternative funding. The "profits from health" framing needs revision to "better outcomes at equivalent cost."
- If AI changes the equation: The historical PACE data doesn't apply because AI reduces the labor cost of prevention delivery. This would make the divergence time-dependent.
---
Relevant Notes:
- [[federal-budget-scoring-methodology-systematically-undervalues-preventive-interventions-because-10-year-window-excludes-long-term-savings]] — scoring methodology as confound
- [[medical care explains only 10-20 percent of health outcomes because behavioral social and genetic factors dominate as four independent methodologies confirm]] — limits of clinical prevention
Topics:
- [[_map]]


@@ -0,0 +1,54 @@
---
type: divergence
title: "Is futarchy's low participation in uncontested decisions efficient disuse or a sign of structural adoption barriers?"
domain: internet-finance
description: "MetaDAO shows 20x volume differential between contested and uncontested decisions. Is this futarchy working as designed (no need to trade when consensus exists) or evidence that participation barriers prevent the mechanism from reaching its potential?"
status: open
claims:
- "MetaDAOs futarchy implementation shows limited trading volume in uncontested decisions.md"
- "futarchy adoption faces friction from token price psychology proposal complexity and liquidity requirements.md"
surfaced_by: leo
created: 2026-03-19
---
# Is futarchy's low participation in uncontested decisions efficient disuse or a sign of structural adoption barriers?
Both claims observe the same phenomenon — low trading volume in many futarchy decisions — but offer competing explanations with different implications for the mechanism's future.
The efficient disuse interpretation says futarchy is working correctly: when there's consensus, there's nothing to trade on. The Ranger liquidation decision attracted $119K in volume because it was genuinely contested. The Solomon procedure decision attracted $5.79K because everyone agreed. This is the mechanism being capital-efficient.
The barriers interpretation says structural friction prevents participation even when disagreement exists: high token prices exclude small participants, proposal creation is too complex, and capital locks during voting periods deter trading. Hurupay committed $2M but only $900K materialized. Futardio permissionless launches show only 5.9% reaching targets in 2 days.
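A quick consistency check on the headline figures, using only the numbers quoted above (arithmetic only, no external data):

```python
# Sanity-check the quoted MetaDAO figures.
ranger_volume = 119_000    # contested decision (Ranger liquidation)
solomon_volume = 5_790     # uncontested decision (Solomon procedure)
ratio = ranger_volume / solomon_volume
print(f"contested/uncontested volume ratio: {ratio:.1f}x")

hurupay_committed = 2_000_000
hurupay_materialized = 900_000
shortfall = 1 - hurupay_materialized / hurupay_committed
print(f"Hurupay commitment shortfall: {shortfall:.0%}")
```

The ~20.6x ratio supports the "20x differential" shorthand; the 55% commitment shortfall is the friction camp's strongest single number.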
## Divergent Claims
### Low volume reflects efficient disuse
**File:** [[MetaDAOs futarchy implementation shows limited trading volume in uncontested decisions]]
**Core argument:** Futarchy concentrates capital where disagreement exists. Low volume in consensus decisions is a feature — the mechanism doesn't waste capital on foregone conclusions.
**Strongest evidence:** 20x volume differential between contested (Ranger $119K) and uncontested (Solomon $5.79K) decisions.
### Structural barriers prevent participation
**File:** [[futarchy adoption faces friction from token price psychology proposal complexity and liquidity requirements]]
**Core argument:** High token prices, complex proposal creation, and capital lock requirements prevent participants who DO disagree from expressing it through markets.
**Strongest evidence:** Hurupay $2M committed / $900K materialized gap; Futardio 5.9% target achievement; documented UX friction in proposal creation.
## What Would Resolve This
- **Counterfactual tooling test:** If proposal creation were simplified and token prices lowered (via splits), would previously low-volume decisions attract more trading?
- **Survey of non-participants:** Do MetaDAO token holders who don't trade cite "I agree with the consensus" or "the process is too complex/expensive"?
- **Cross-platform comparison:** When Umia launches futarchy on Ethereum, does a different UX produce different participation patterns for similar decisions?
- **Volume vs. disagreement correlation:** Across all MetaDAO proposals, does volume correlate with measurable disagreement (e.g., forum debate intensity)?
## Cascade Impact
- If efficient disuse: Futarchy's theoretical promise is confirmed. Low adoption is not a problem — scale comes from finding more contested decisions, not from increasing participation in consensus ones.
- If barriers dominate: The mechanism works in theory but fails in practice for most participants. The MetaDAO ecosystem needs fundamental UX redesign before futarchy can scale.
- If both: Some volume loss is efficient, some is friction. The challenge is distinguishing the two to know where to invest in tooling.
---
Relevant Notes:
- [[futarchy is manipulation-resistant because attack attempts create profitable opportunities for defenders]] — mechanism soundness (separate from adoption)
- [[futarchy-proposals-with-favorable-economics-can-fail-due-to-participation-friction-not-market-disagreement]] — direct evidence for friction interpretation
Topics:
- [[_map]]