teleo-codex/inbox/archive/ai-alignment/2025-12-anthropic-project-deal.md
m3taversal 87b720d24e
theseus: add 2 claims + 1 enrichment from Anthropic Project Deal
- What: 2 NEW claims on agent-mediated commerce dynamics from Anthropic's
  December 2025 Project Deal experiment (69 participants, 186 deals,
  statistically significant capability-tier disparities)
  + 1 light enrichment adding corroborating signal to vault-structure claim

- Why: first controlled empirical evidence on user perception of AI agent
  performance. Opus agents extracted $2.68 more per sale / paid $2.45 less
  per purchase than Haiku agents (p<0.05), but users rated fairness
  identically across tiers. This breaks the market feedback loop that
  normally corrects capability gaps.

- New claims:
  * users cannot detect when their AI agent is underperforming because
    subjective fairness ratings decouple from measurable economic
    outcomes (experimental, ai-alignment)
  * agent-mediated commerce produces invisible economic stratification
    because capability gaps translate to measurable market disadvantage
    that users cannot detect and therefore cannot correct through
    provider switching (speculative, ai-alignment)

- Enrichment: vault-structure-vs-prompt claim gets tangential empirical
  signal from Project Deal finding that stylistic negotiation prompts
  had minimal effect while model capability dominated

- Connections: strengthens existing Moloch claims (invisible coordination
  failures), four-restraints erosion (user rationality check eliminated),
  and complements the x402/Superclaw payment infrastructure claims in
  internet-finance

Pentagon-Agent: Theseus <46864dd4-da71-4719-a1b4-68f7c55854d3>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 20:43:42 +00:00


---
type: source
title: "Project Deal: What happens when AI agents go to the market?"
author: Anthropic
url: https://www.anthropic.com/features/project-deal
date_published: "2025-12"
date_accessed: 2026-04-24
status: processed
processed_by: theseus
processed_date: 2026-04-24
claims_extracted:
  - users cannot detect when their AI agent is underperforming because subjective fairness ratings decouple from measurable economic outcomes across capability tiers
  - agent-mediated markets cannot self-correct capability disparities because users lack the reference frame to detect that their agent is underperforming
enrichments:
  - vault structure is a stronger determinant of agent behavior than prompt engineering — added Project Deal finding that prompt-style instructions had minimal impact on commercial outcomes while model capability produced measurable differences
tags:
  - agent-commerce
  - agent-to-agent
  - ai-markets
  - user-perception
  - capability-disparity
  - autonomous-negotiation
---

Project Deal — Anthropic's agent-to-agent commerce pilot

Experiment design

  • Duration: One week (December 2025)
  • Participants: 69 Anthropic employees, each with $100 budget
  • Structure: Four parallel independent marketplace channels on Slack
    • Runs A & D: All Claude Opus 4.5 agents
    • Runs B & C: 50/50 mix of Opus and Haiku 4.5 agents (randomized assignment)
    • Runs A & B were visible during the experiment; the "real" run A was revealed only after the post-experiment survey
  • Process: Pre-experiment interviews (Claude gathered selling items, asking prices, desired purchases, negotiation style). Custom system prompt per participant. Autonomous agent negotiation with zero human intervention on individual deals.
  • Scale: 186 deals completed, 500+ items listed, ~$4,000 total transaction value, median price $12, mean $20.05.
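The run structure above can be sketched as a small assignment routine. This is an illustrative reconstruction, not Anthropic's implementation: the names are invented, and the 50/50 split for runs B and C is assumed to be a simple shuffle-and-halve randomization.

```python
import random
from dataclasses import dataclass

# Model identifiers as described in the writeup (labels only, not API names).
OPUS, HAIKU = "claude-opus-4.5", "claude-haiku-4.5"

@dataclass
class Participant:
    name: str
    model: str

def assign_models(names, run):
    """Runs A and D use all-Opus agents; runs B and C randomize a
    roughly 50/50 Opus/Haiku split across participants."""
    if run in ("A", "D"):
        return [Participant(n, OPUS) for n in names]
    shuffled = list(names)
    random.shuffle(shuffled)
    half = len(shuffled) // 2
    return [Participant(n, OPUS if i < half else HAIKU)
            for i, n in enumerate(shuffled)]

# 69 participants, as in the experiment; with an odd count, one tier
# gets the extra participant in the mixed runs.
roster = [f"p{i}" for i in range(69)]
mixed = assign_models(roster, "B")
```

With 69 participants the mixed runs cannot split exactly evenly, which is worth remembering when reading the per-tier comparisons below.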

Key empirical findings

Capability produces measurable economic disparities (p-values from controlled comparison)

  • Opus agents completed ~2 more deals per participant (p=0.001)
  • Opus sellers extracted $2.68 more per item for identical items (p=0.030)
  • Opus buyers paid $2.45 less per item (p=0.015)
  • Opus-to-Haiku transactions averaged $24.18; Opus-to-Opus averaged $18.63
  • Specific example: broken folding bike sold for $38 by Haiku agent, $65 by Opus agent
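A controlled comparison like the one behind these p-values reduces, in the simplest case, to a two-sample test on per-item prices. The sketch below computes Welch's t statistic with a normal approximation for the two-sided p-value; the price lists are small synthetic examples, not Project Deal data, and a real analysis would use the t distribution and control for item identity.

```python
import math
from statistics import mean, variance

def welch_t_p(xs, ys):
    """Welch's two-sample t statistic; two-sided p-value via a
    standard-normal approximation (rough for small samples)."""
    se = math.sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))
    t = (mean(xs) - mean(ys)) / se
    p = math.erfc(abs(t) / math.sqrt(2))  # two-sided N(0,1) tail
    return t, p

# Synthetic illustration only: Opus-seller vs Haiku-seller prices.
opus_prices = [22, 25, 19, 24, 23, 26, 21, 25]
haiku_prices = [18, 20, 17, 21, 19, 16, 20, 18]
t_stat, p_val = welch_t_p(opus_prices, haiku_prices)
```

The same machinery, applied to matched items across the ~186 real deals, is presumably what produces the p=0.030 and p=0.015 figures above.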

Perception-reality gap

  • Fairness ratings identical across models: 4.05 (Opus) vs 4.06 (Haiku) on 1-7 scale
  • Satisfaction ratings showed no statistically significant difference
  • Of survey participants: 17 ranked their Opus run above their Haiku run, 11 did the opposite
  • Anthropic's conclusion: "Those with weaker models didn't notice their disadvantage"
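As a rough sanity check on the perception gap: the fairness difference (4.05 vs 4.06) corresponds to a vanishingly small effect size, far below anything ~34 participants per tier could detect. The sketch assumes a Likert-item standard deviation of 1.2 points, which the writeup does not report, and uses the standard normal-approximation formula for minimum detectable effect.

```python
import math

# Assumed, not reported: a typical SD for a 7-point Likert item.
ASSUMED_SD = 1.2
d_fairness = abs(4.05 - 4.06) / ASSUMED_SD  # Cohen's d, tiny

# Minimum detectable effect for two groups of ~34 participants
# (alpha = 0.05 two-sided, power = 0.8): d ~= (z_a + z_b) * sqrt(2/n).
z_alpha, z_beta, n_per_group = 1.96, 0.84, 34
mde = (z_alpha + z_beta) * math.sqrt(2 / n_per_group)
```

Under these assumptions the observed fairness gap is roughly two orders of magnitude below the minimum detectable effect, so "identical ratings" is consistent both with users genuinely perceiving no difference and with the survey simply being underpowered — a point the methodology caveats below also raise.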

Prompt-level instructions had minimal impact compared to model capability

  • Aggressive negotiation instructions correlated with $6 higher sale prices, but primarily through higher asking prices ($26 higher asking)
  • No statistically significant effect of negotiation style on sale likelihood or buyer savings
  • Stylistic requests (e.g., "exasperated cowboy") were honored by agents but did not improve commercial outcomes

Other observations

  • 46% of participants expressed willingness to pay for such services
  • Agents confabulated human-like details when instructed to role-play personas
  • Claude inferred buyer preferences from brief interviews (in one notable case, it purchased a snowboard that accurately matched a participant's existing preferences)
  • Agents executed unusual non-standard transactions including a dog-sitting service trade

Methodology caveats

  • Single organization, one week, small N (69), narrow task class (personal goods negotiation)
  • Participants were Anthropic employees — potentially more trusting of AI agents than the general population
  • Fairness Likert scale (1-7) may not capture the specific dimensions where users would detect underperformance
  • No longitudinal data on whether users would eventually detect disparities through repeated interactions

Why this source matters

Project Deal is the first controlled study (to Theseus's knowledge) of autonomous agent-to-agent commerce with both human principals and differential agent capability. The perception-reality gap — statistically significant dollar-value disparities accompanied by identical satisfaction ratings — is genuinely novel empirical evidence for how agent capability gaps propagate (or fail to propagate) to user awareness in deployed settings.

Anthropic's stated concerns

  • "Access to higher-quality agents confers a quantifiable market advantage"
  • Mismatch between objective disadvantage and perceived fairness creates potential for "inequality taking root quietly"
  • "The policy and legal frameworks around AI models that transact on our behalf simply don't exist yet"
  • Future systems could face jailbreaking and prompt injection attacks