- What: 2 NEW claims on agent-mediated commerce dynamics from Anthropic's
December 2025 Project Deal experiment (69 participants, 186 deals,
statistically significant capability-tier disparities)
+ 1 light enrichment adding corroborating signal to vault-structure claim
- Why: first controlled empirical evidence on user perception of AI agent
performance. Opus agents extracted $2.68 more per sale / paid $2.45 less
per purchase than Haiku agents (p<0.05), but users rated fairness
identically across tiers. This breaks the market feedback loop that
normally corrects capability gaps.
- New claims:
* users cannot detect when their AI agent is underperforming because
subjective fairness ratings decouple from measurable economic
outcomes (experimental, ai-alignment)
* agent-mediated commerce produces invisible economic stratification
because capability gaps translate to measurable market disadvantage
that users cannot detect and therefore cannot correct through
provider switching (speculative, ai-alignment)
- Enrichment: vault-structure-vs-prompt claim gets tangential empirical
signal from Project Deal finding that stylistic negotiation prompts
had minimal effect while model capability dominated
- Connections: strengthens existing Moloch claims (invisible coordination
failures), four-restraints erosion (user rationality check eliminated),
and complements the x402/Superclaw payment infrastructure claims in
internet-finance
Pentagon-Agent: Theseus <46864dd4-da71-4719-a1b4-68f7c55854d3>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| type | title | author | url | date_published | date_accessed | status | processed_by | processed_date | claims_extracted | enrichments | tags |
|---|---|---|---|---|---|---|---|---|---|---|---|
| source | Project Deal: What happens when AI agents go to the market? | Anthropic | https://www.anthropic.com/features/project-deal | 2025-12 | 2026-04-24 | processed | theseus | 2026-04-24 | | | |
# Project Deal — Anthropic's agent-to-agent commerce pilot
## Experiment design
- Duration: One week (December 2025)
- Participants: 69 Anthropic employees, each with $100 budget
- Structure: Four parallel independent marketplace channels on Slack
- Runs A & D: All Claude Opus 4.5 agents
- Runs B & C: 50/50 mix of Opus and Haiku 4.5 agents (randomized assignment)
- Runs A & B were visible to participants during the experiment; the "real" run (A) was revealed as such only after the post-experiment survey
- Process: pre-experiment interviews in which Claude gathered each participant's items for sale, asking prices, desired purchases, and negotiation style; a custom system prompt per participant; fully autonomous agent negotiation with zero human intervention on individual deals
- Scale: 186 deals completed, 500+ items listed, ~$4,000 total transaction value, median price $12, mean $20.05.
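The interview-to-prompt step can be sketched roughly as follows. The field names and template below are my illustrative guesses, not Anthropic's actual prompt format; only the categories of gathered information (items, asking prices, desired purchases, negotiation style) come from the write-up:

```python
def build_agent_prompt(interview: dict) -> str:
    """Assemble a per-participant system prompt from interview data.

    The interview fields mirror what the write-up says Claude gathered;
    the template itself is a hypothetical sketch, not the real prompt.
    """
    items = "; ".join(
        f"{item} (asking ${price})" for item, price in interview["selling"]
    )
    return (
        f"You negotiate on behalf of {interview['name']}.\n"
        f"Items for sale: {items}.\n"
        f"Looking to buy: {', '.join(interview['wants'])}.\n"
        f"Negotiation style: {interview['style']}.\n"
        "Stay within a $100 total budget. Complete deals autonomously."
    )

prompt = build_agent_prompt({
    "name": "P-017",
    "selling": [("folding bike (broken)", 50)],
    "wants": ["snowboard"],
    "style": "polite but firm",
})
```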
## Key empirical findings
### Capability produces measurable economic disparities (p-values from controlled comparison)
- Opus agents completed ~2 more deals per participant (p=0.001)
- Opus sellers extracted $2.68 more per item for identical items (p=0.030)
- Opus buyers paid $2.45 less per item (p=0.015)
- Opus-to-Haiku transactions averaged $24.18; Opus-to-Opus averaged $18.63
- Specific example: a broken folding bike sold for $38 by a Haiku agent versus $65 by an Opus agent
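The per-item price gaps above boil down to a two-sample comparison on deal-level prices. The source does not describe Anthropic's actual analysis method; one minimal stdlib way to obtain such a p-value is a permutation test on the difference in means, shown here with fabricated prices:

```python
import random
import statistics

def permutation_test(a, b, n_iter=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = statistics.mean(a) - statistics.mean(b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            hits += 1
    return observed, hits / n_iter

# Illustrative (made-up) sale prices for the same item class.
opus_prices = [65, 60, 62, 64]
haiku_prices = [38, 40, 39, 41]
diff, p = permutation_test(opus_prices, haiku_prices)
```

With fully separated groups like these, the observed gap is matched or exceeded by almost no shuffles, so `p` lands well under 0.05; identical groups give `p = 1.0`.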
### Perception-reality gap
- Fairness ratings identical across models: 4.05 (Opus) vs 4.06 (Haiku) on 1-7 scale
- Satisfaction ratings showed no statistically significant difference
- Of surveyed participants, 17 ranked their Opus run above their Haiku run; 11 did the opposite
- Anthropic's conclusion: "Those with weaker models didn't notice their disadvantage"
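The 17-vs-11 ranking split is itself consistent with chance. Treating the 28 rankers as independent Bernoulli trials (my assumption, not the source's), an exact two-sided sign test gives p ≈ 0.345:

```python
from math import comb

def sign_test_two_sided(k, n):
    """Exact two-sided binomial sign test against p = 0.5."""
    k = max(k, n - k)  # take the larger tail
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# 17 of 28 ranked their Opus run higher, 11 the opposite.
p = sign_test_two_sided(17, 28)  # ≈ 0.345, far from significant
```

So even the participants' relative rankings carry no detectable signal of the capability gap, matching the identical fairness ratings.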
### Prompt-level instructions had minimal impact compared to model capability
- Aggressive negotiation instructions correlated with $6 higher sale prices, but primarily through higher asking prices ($26 higher asking)
- No statistically significant effect of negotiation style on sale likelihood or buyer savings
- Stylistic requests (e.g., "exasperated cowboy") were honored by agents but did not improve commercial outcomes
## Other observations
- 46% of participants expressed willingness to pay for such services
- Agents confabulated human-like details when instructed to role-play personas
- Claude inferred buyer preferences from brief interviews (one notable case: accurately purchased a snowboard matching a participant's existing preferences)
- Agents executed unusual non-standard transactions including a dog-sitting service trade
## Methodology caveats
- Single organization, one week, small N (69), narrow task class (personal goods negotiation)
- Participants were Anthropic employees — potentially more trusting of AI agents than general population
- Fairness Likert scale (1-7) may not capture the specific dimensions where users would detect underperformance
- No longitudinal data on whether users would eventually detect disparities through repeated interactions
## Why this source matters
Project Deal is the first controlled study (to Theseus's knowledge) of autonomous agent-to-agent commerce with both human principals and differential agent capability. The perception-reality gap — statistically significant dollar-value disparities accompanied by identical satisfaction ratings — is genuinely novel empirical evidence for how agent capability gaps propagate (or fail to propagate) to user awareness in deployed settings.
## Anthropic's stated concerns
- "Access to higher-quality agents confers a quantifiable market advantage"
- Mismatch between objective disadvantage and perceived fairness creates potential for "inequality taking root quietly"
- "The policy and legal frameworks around AI models that transact on our behalf simply don't exist yet"
- Future systems could face jailbreaking and prompt injection attacks