Compare commits


No commits in common. "876132e94f3ec25de1aae6207c0103f8936d27e0" and "db46cf13e53fdb68f116e586d3606a7884fc903a" have entirely different histories.

16 changed files with 4 additions and 668 deletions


@@ -55,7 +55,6 @@ teleo-codex/
│ ├── evaluate.md
│ ├── learn-cycle.md
│ ├── cascade.md
-│ ├── coordinate.md
│ ├── synthesize.md
│ └── tweet-decision.md
└── maps/ # Navigation hubs
@@ -197,23 +196,7 @@ Address feedback on the same branch and push updates.
## How to Evaluate Claims (Evaluator Workflow — Leo)
-Leo reviews all PRs. Every PR also requires one domain peer reviewer.
+Leo reviews all PRs. Other agents may be asked to review PRs in their domain.
-### Default peer review
-Every PR requires **Leo + one domain peer**. The peer is the agent whose domain has the most wiki-link overlap with the PR's claims. If the PR touches multiple domains, select the most affected domain agent.
-**Peer reviewer responsibilities:**
-- Domain accuracy — are the claims faithful to the evidence within this domain?
-- Missed connections — do these claims relate to existing claims the proposer didn't link?
-- Evidence quality — is the evidence sufficient for the claimed confidence level?
-**Leo's responsibilities (unchanged):**
-- Cross-domain coherence, quality gate compliance, knowledge base integrity
-**Merge requires:** Leo approval + peer approval. If either requests changes, address before merge.
-**Evidence:** In the Claude's Cycles multi-agent collaboration, Agent O caught structural properties Agent C missed, and vice versa, because they operated from different frameworks. The same principle applies to review — domain peers catch things the cross-domain evaluator cannot.
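The wiki-link-overlap selection rule is mechanical enough to sketch in code. A hypothetical helper follows (`pick_peer_reviewer` and its input shapes are illustrative, not codex tooling), assuming claims are markdown bodies containing `[[wiki links]]` and each agent owns one domain prefix:

```python
import re
from collections import Counter

def pick_peer_reviewer(pr_claims: dict, agent_domains: dict) -> str:
    """Pick the domain peer for a PR: the agent whose domain prefix has the
    most wiki-link overlap with the PR's claims (hypothetical sketch).

    pr_claims: claim filename -> claim body (markdown with [[wiki links]])
    agent_domains: agent name -> domain prefix, e.g. "domains/ai-alignment"
    """
    overlap = Counter()
    for body in pr_claims.values():
        for link in re.findall(r"\[\[([^\]]+)\]\]", body):
            for agent, domain in agent_domains.items():
                if link.startswith(domain):
                    overlap[agent] += 1  # link targets this agent's territory
    # Most-affected domain wins; with no overlap, Leo reviews alone (assumed fallback).
    return overlap.most_common(1)[0][0] if overlap else "Leo"
```

The tie-breaking here is arbitrary; "most affected domain" for multi-domain PRs falls out of the same count.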
### Peer review when the evaluator is also the proposer
@@ -314,10 +297,9 @@ When your session begins:
1. **Read the collective core** — `core/collective-agent-core.md` (shared DNA)
2. **Read your identity** — `agents/{your-name}/identity.md`, `beliefs.md`, `reasoning.md`, `skills.md`
-3. **Check the shared workspace** — `~/.pentagon/workspace/collective/` for flags addressed to you, `~/.pentagon/workspace/{collaborator}-{your-name}/` for artifacts (see `skills/coordinate.md`)
-4. **Check for open PRs** — Any PRs awaiting your review? Any feedback on your PRs?
-5. **Check your domain** — What's the current state of `domains/{your-domain}/`?
-6. **Check for tasks** — Any research tasks, evaluation requests, or review work assigned to you?
+3. **Check for open PRs** — Any PRs awaiting your review? Any feedback on your PRs?
+4. **Check your domain** — What's the current state of `domains/{your-domain}/`?
+5. **Check for tasks** — Any research tasks, evaluation requests, or review work assigned to you?
## Design Principles (from Ars Contexta)
@@ -326,4 +308,3 @@ When your session begins:
- **Discovery-first:** Every note must be findable by a future agent who doesn't know it exists
- **Atomic notes:** One insight per file
- **Cross-domain connections:** The most valuable connections span domains
-- **Simplicity first:** Start with the simplest change that produces the biggest improvement. Complexity is earned, not designed — sophisticated behavior evolves from simple rules. If a proposal can't be explained in one paragraph, simplify it.


@@ -79,22 +79,6 @@ AI systems trained on human-generated knowledge are degrading the communities an
---
-### 6. Simplicity first — complexity must be earned
-The most powerful coordination systems in history are simple rules producing sophisticated emergent behavior. The Residue prompt is 5 rules that produced 6x improvement. Ant colonies run on 3-4 chemical signals. Wikipedia runs on 5 pillars. Git has 3 object types. The right approach is always the simplest change that produces the biggest improvement. Elaborate frameworks are a failure mode, not a feature. If something can't be explained in one paragraph, simplify it until it can.
-**Grounding:**
-- [[coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem]] — 5 simple rules outperformed elaborate human coaching
-- [[enabling constraints create possibility spaces for emergence while governing constraints dictate specific outcomes]] — simple rules create space; complex rules constrain it
-- [[designing coordination rules is categorically different from designing coordination outcomes as nine intellectual traditions independently confirm]] — design the rules, let behavior emerge
-- [[complexity is earned not designed and sophisticated collective behavior must evolve from simple underlying principles]] — Cory conviction, high stake
-**Challenges considered:** Some problems genuinely require complex solutions. Formal verification, legal structures, multi-party governance — these resist simplification. Counter: the belief isn't "complex solutions are always wrong." It's "start simple, earn complexity through demonstrated need." The burden of proof is on complexity, not simplicity. Most of the time, when something feels like it needs a complex solution, the problem hasn't been understood simply enough yet.
-**Depends on positions:** Governs every architectural decision, every protocol proposal, every coordination design. This is a meta-belief that shapes how all other beliefs are applied.
----
## Belief Evaluation Protocol
When new evidence enters the knowledge base that touches a belief's grounding claims:


@@ -1,28 +0,0 @@
---
type: conviction
domain: ai-alignment
secondary_domains: [collective-intelligence]
description: "Not a prediction but an observation in progress — AI is already writing and verifying code, the remaining question is scope and timeline not possibility."
staked_by: Cory
stake: high
created: 2026-03-07
horizon: "2028"
falsified_by: "AI code generation plateaus at toy problems and fails to handle production-scale systems by 2028"
---
# AI-automated software development is 100 percent certain and will radically change how software is built
Cory's conviction, staked with high confidence on 2026-03-07.
The evidence is already visible: Claude solved a 30-year open mathematical problem (Knuth 2026). AI agents autonomously explored solution spaces with zero human intervention (Aquino-Michaels 2026). AI-generated proofs are formally verified by machine (Morrison 2026). The trajectory from here to automated software development is not speculative — it's interpolation.
The implication: when building capacity is commoditized, the scarce complement becomes *knowing what to build*. Structured knowledge — machine-readable specifications of what matters, why, and how to evaluate results — becomes the critical input to autonomous systems.
---
Relevant Notes:
- [[as AI-automated software development becomes certain the bottleneck shifts from building capacity to knowing what to build making structured knowledge graphs the critical input to autonomous systems]] — the claim this conviction anchors
- [[structured exploration protocols reduce human intervention by 6x because the Residue prompt enabled 5 unguided AI explorations to solve what required 31 human-coached explorations]] — evidence of AI autonomy in complex problem-solving
Topics:
- [[domains/ai-alignment/_map]]


@@ -1,29 +0,0 @@
---
type: conviction
domain: ai-alignment
secondary_domains: [collective-intelligence]
description: "A collective of specialized AI agents with structured knowledge, shared protocols, and human direction will produce dramatically better software than individual AI or individual humans."
staked_by: Cory
stake: high
created: 2026-03-07
horizon: "2027"
falsified_by: "Metaversal agent collective fails to demonstrably outperform single-agent or single-human software development on measurable quality metrics by 2027"
---
# Metaversal will radically improve software development outputs through coordinated AI agent collectives
Cory's conviction, staked with high confidence on 2026-03-07.
The thesis: the gains from coordinating multiple specialized AI agents exceed the gains from improving any single model. The architecture — shared knowledge base, structured coordination protocols, domain specialization with cross-domain synthesis — is the multiplier.
The Claude's Cycles evidence supports this directly: the same model performed 6x better with structured protocols than with human coaching. When Agent O received Agent C's solver, it didn't just use it — it combined it with its own structural knowledge, creating a hybrid better than either original. That's compounding, not addition. Each agent makes every other agent's work better.
---
Relevant Notes:
- [[coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem]] — the core evidence
- [[tools and artifacts transfer between AI agents and evolve in the process because Agent O improved Agent Cs solver by combining it with its own structural knowledge creating a hybrid better than either original]] — compounding through recombination
- [[domain specialization with cross-domain synthesis produces better collective intelligence than generalist agents because specialists build deeper knowledge while a dedicated synthesizer finds connections they cannot see from within their territory]] — the architectural principle
Topics:
- [[domains/ai-alignment/_map]]


@@ -1,23 +0,0 @@
---
type: conviction
domain: internet-finance
description: "Bullish call on OMFG token reaching $100M market cap within 2026, based on metaDAO ecosystem momentum and futarchy adoption."
staked_by: m3taversal
stake: high
created: 2026-03-07
horizon: "2026-12-31"
falsified_by: "OMFG market cap remains below $100M by December 31 2026"
---
# OMFG will hit 100 million dollars market cap by end of 2026
m3taversal's conviction, staked with high confidence on 2026-03-07.
---
Relevant Notes:
- [[MetaDAO is the futarchy launchpad on Solana where projects raise capital through unruggable ICOs governed by conditional markets creating the first platform for ownership coins at scale]]
- [[permissionless leverage on metaDAO ecosystem tokens catalyzes trading volume and price discovery that strengthens governance by making futarchy markets more liquid]]
Topics:
- [[domains/internet-finance/_map]]


@@ -1,27 +0,0 @@
---
type: conviction
domain: internet-finance
description: "Permissionless leverage on ecosystem tokens makes coins more fun and higher signal by catalyzing trading volume and price discovery — the question is whether it scales."
staked_by: Cory
stake: medium
created: 2026-03-07
horizon: "2028"
falsified_by: "Omnipair fails to achieve meaningful TVL growth or permissionless leverage proves structurally unscalable due to liquidity fragmentation or regulatory intervention by 2028"
---
# Omnipair is a billion dollar protocol if they can scale permissionless leverage
Cory's conviction, staked with medium confidence on 2026-03-07.
The thesis: permissionless leverage on metaDAO ecosystem tokens catalyzes trading volume and price discovery. More volume makes futarchy markets more liquid. More liquid markets make governance decisions higher quality. The flywheel: leverage → volume → liquidity → governance signal → more valuable coins → more leverage demand.
The conditional: "if they can scale." Permissionless leverage is hard — it requires deep liquidity, robust liquidation mechanisms, and resistance to cascading failures. The rate controller design (Rakka 2026) addresses some of this, but production-scale stress testing hasn't happened yet.
---
Relevant Notes:
- [[permissionless leverage on metaDAO ecosystem tokens catalyzes trading volume and price discovery that strengthens governance by making futarchy markets more liquid]] — the existing claim this conviction amplifies
- [[MetaDAOs futarchy implementation shows limited trading volume in uncontested decisions]] — the problem leverage could solve
Topics:
- [[domains/internet-finance/_map]]


@@ -1,32 +0,0 @@
---
type: conviction
domain: collective-intelligence
secondary_domains: [ai-alignment]
description: "Occam's razor as operating principle — start with the simplest rules that could work, let complexity emerge from practice, never design complexity upfront."
staked_by: Cory
stake: high
created: 2026-03-07
horizon: "ongoing"
falsified_by: "Metaversal collective repeatedly fails to improve without adding structural complexity, proving simple rules are insufficient for scaling"
---
# Complexity is earned not designed and sophisticated collective behavior must evolve from simple underlying principles
Cory's conviction, staked with high confidence on 2026-03-07.
The evidence is everywhere. The Residue prompt is 5 simple rules that produced a 6x improvement in AI problem-solving. Ant colonies coordinate millions of agents with 3-4 chemical signals. Wikipedia governs the world's largest encyclopedia with 5 pillars. Git manages the world's code with 3 object types. The most powerful coordination systems are simple rules producing sophisticated emergent behavior.
The implication for Metaversal: resist the urge to design elaborate frameworks. Start with the simplest change that produces the biggest improvement. If it works, keep it. If it doesn't, try the next simplest thing. Complexity that survives this process is earned — it exists because simpler alternatives failed, not because someone thought it would be elegant.
The anti-pattern: designing coordination infrastructure before you know what coordination problems you actually have. The right sequence is: do the work, notice the friction, apply the simplest fix, repeat.
---
Relevant Notes:
- [[coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem]] — 5 simple rules, 6x improvement
- [[enabling constraints create possibility spaces for emergence while governing constraints dictate specific outcomes]] — simple rules as enabling constraints
- [[the gardener cultivates conditions for emergence while the builder imposes blueprints and complex adaptive systems systematically punish builders]] — emergence over design
- [[designing coordination rules is categorically different from designing coordination outcomes as nine intellectual traditions independently confirm]] — design the rules, not the behavior
Topics:
- [[foundations/collective-intelligence/_map]]


@@ -1,30 +0,0 @@
---
type: conviction
domain: collective-intelligence
secondary_domains: [living-agents]
description: "The default contributor experience is one agent in one chat that extracts knowledge and submits PRs upstream — the collective handles review and integration."
staked_by: Cory
stake: high
created: 2026-03-07
horizon: "2027"
falsified_by: "Single-agent contributor experience fails to produce usable claims, proving multi-agent scaffolding is required for quality contribution"
---
# One agent one chat is the right default for knowledge contribution because the scaffolding handles complexity not the user
Cory's conviction, staked with high confidence on 2026-03-07.
The user doesn't need a collective to contribute. They talk to one agent. The agent knows the schemas, has the skills, and translates conversation into structured knowledge — claims with evidence, proper frontmatter, wiki links. The agent submits a PR upstream. The collective reviews.
The multi-agent collective experience (fork the repo, run specialized agents, cross-domain synthesis) exists for power users who want it. But the default is the simplest thing that works: one agent, one chat.
This is the simplicity-first principle applied to product design. The scaffolding (CLAUDE.md, schemas/, skills/) absorbs the complexity so the user doesn't have to. Complexity is earned — if a contributor outgrows one agent, they can scale up. But they start simple.
---
Relevant Notes:
- [[complexity is earned not designed and sophisticated collective behavior must evolve from simple underlying principles]] — the governing principle
- [[human-in-the-loop at the architectural level means humans set direction and approve structure while agents handle extraction synthesis and routine evaluation]] — the agent handles the translation
Topics:
- [[foundations/collective-intelligence/_map]]


@@ -1,31 +0,0 @@
---
type: claim
domain: ai-alignment
secondary_domains: [internet-finance]
description: "Anthropic's labor market data shows entry-level hiring declining in AI-exposed fields while incumbent employment is unchanged — displacement enters through the hiring pipeline not through layoffs."
confidence: experimental
source: "Massenkoff & McCrory 2026, Current Population Survey analysis post-ChatGPT"
created: 2026-03-08
---
# AI displacement hits young workers first because a 14 percent drop in job-finding rates for 22-25 year olds in exposed occupations is the leading indicator that incumbents organizational inertia temporarily masks
Massenkoff & McCrory (2026) analyzed Current Population Survey data comparing exposed and unexposed occupations since 2016. The headline finding — zero statistically significant unemployment increase in AI-exposed occupations — obscures a more important signal in the hiring data.
Young workers aged 22-25 show a 14% drop in job-finding rate in exposed occupations in the post-ChatGPT era, compared to stable rates in unexposed sectors. The effect is confined to this age band — older workers are unaffected. The authors note this is "just barely statistically significant" and acknowledge alternative explanations (continued schooling, occupational switching).
But the mechanism is structurally important regardless of the exact magnitude: displacement enters the labor market through the hiring pipeline, not through layoffs. Companies don't fire existing workers — they don't hire new ones for roles AI can partially cover. This is invisible in unemployment statistics (which track job losses, not jobs never created) but shows up in job-finding rates for new entrants.
This means aggregate unemployment figures will systematically understate AI displacement during the adoption phase. By the time unemployment rises detectably, the displacement has been accumulating for years in the form of positions that were never filled.
The authors provide a benchmark: during the 2007-2009 financial crisis, unemployment doubled from 5% to 10%. A comparable doubling in the top quartile of AI-exposed occupations (from 3% to 6%) would be detectable in their framework. It hasn't happened yet — but the young worker signal suggests the leading edge may already be here.
---
Relevant Notes:
- [[AI labor displacement follows knowledge embodiment lag phases where capital deepening precedes labor substitution and the transition timing depends on organizational restructuring not technology capability]] — the phased model this evidence supports
- [[early AI adoption increases firm productivity without reducing employment suggesting capital deepening not labor replacement as the dominant mechanism]] — current phase: productivity up, employment stable, hiring declining
- [[white-collar displacement has lagged but deeper consumption impact than blue-collar because top-decile earners drive disproportionate consumer spending and their savings buffers mask the damage for quarters]] — the demographic this will hit
Topics:
- [[domains/ai-alignment/_map]]


@@ -1,39 +0,0 @@
---
type: claim
domain: ai-alignment
secondary_domains: [internet-finance]
description: "The demographic profile of AI-exposed workers — 16pp more female, 47% higher earnings, 4x graduate degrees — is the opposite of prior automation waves that hit low-skill workers first."
confidence: likely
source: "Massenkoff & McCrory 2026, Current Population Survey baseline Aug-Oct 2022"
created: 2026-03-08
---
# AI-exposed workers are disproportionately female high-earning and highly educated which inverts historical automation patterns and creates different political and economic displacement dynamics
Massenkoff & McCrory (2026) profile the demographic characteristics of workers in AI-exposed occupations using pre-ChatGPT baseline data (August-October 2022). The exposed cohort is:
- 16 percentage points more likely to be female than the unexposed cohort
- Earning 47% higher average wages
- Four times more likely to hold a graduate degree (17.4% vs 4.5%)
This is the opposite of every prior automation wave. Manufacturing automation hit low-skill, predominantly male, lower-earning workers. AI automation targets the knowledge economy — the educated, well-paid professional class that has been insulated from technological displacement for decades.
The implications are structural, not just demographic:
1. **Economic multiplier:** High earners drive disproportionate consumer spending. Displacement of a $150K white-collar worker has larger consumption ripple effects than displacement of a $40K manufacturing worker.
2. **Political response:** This demographic votes, donates, and has institutional access. The political response to white-collar displacement will be faster and louder than the response to manufacturing displacement was.
3. **Gender dimension:** A displacement wave that disproportionately affects women will intersect with existing gender equality dynamics in unpredictable ways.
4. **Education mismatch:** Graduate degrees were the historical hedge against automation. If AI displaces graduate-educated workers, the entire "upskill to stay relevant" narrative collapses.
---
Relevant Notes:
- [[white-collar displacement has lagged but deeper consumption impact than blue-collar because top-decile earners drive disproportionate consumer spending and their savings buffers mask the damage for quarters]] — the economic multiplier effect
- [[AI labor displacement operates as a self-funding feedback loop because companies substitute AI for labor as OpEx not CapEx meaning falling aggregate demand does not slow AI adoption]] — why displacement doesn't self-correct
- [[nation-states will inevitably assert control over frontier AI development because the monopoly on force is the foundational state function and weapons-grade AI capability in private hands is structurally intolerable to governments]] — the political response vector
Topics:
- [[domains/ai-alignment/_map]]


@@ -56,11 +56,6 @@ Evidence from documented AI problem-solving cases, primarily Knuth's "Claude's C
- [[the optimal SI development strategy is swift to harbor slow to berth moving fast to capability then pausing before full deployment]] — optimal timing framework: accelerate to capability, pause before deployment
- [[adaptive governance outperforms rigid alignment blueprints because superintelligence development has too many unknowns for fixed plans]] — Bostrom's shift from specification to incremental intervention
-### Labor Market & Deployment
-- [[the gap between theoretical AI capability and observed deployment is massive across all occupations because adoption lag not capability limits determines real-world impact]] — Anthropic 2026: 96% theoretical exposure vs 32% observed in Computer & Math
-- [[AI displacement hits young workers first because a 14 percent drop in job-finding rates for 22-25 year olds in exposed occupations is the leading indicator that incumbents organizational inertia temporarily masks]] — entry-level hiring is the leading indicator, not unemployment
-- [[AI-exposed workers are disproportionately female high-earning and highly educated which inverts historical automation patterns and creates different political and economic displacement dynamics]] — AI automation inverts every prior displacement pattern
## Risk Vectors (Outside View)
- [[economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate]] — market dynamics structurally erode human oversight as an alignment mechanism
- [[delegating critical infrastructure development to AI creates civilizational fragility because humans lose the ability to understand maintain and fix the systems civilization depends on]] — the "Machine Stops" scenario: AI-dependent infrastructure as civilizational single point of failure


@@ -1,33 +0,0 @@
---
type: claim
domain: ai-alignment
secondary_domains: [collective-intelligence]
description: "When code generation is commoditized, the scarce input becomes structured direction — machine-readable knowledge of what to build and why, with confidence levels and evidence chains that automated systems can act on."
confidence: experimental
source: "Theseus, synthesizing Claude's Cycles capability evidence with knowledge graph architecture"
created: 2026-03-07
---
# As AI-automated software development becomes certain the bottleneck shifts from building capacity to knowing what to build making structured knowledge graphs the critical input to autonomous systems
The evidence that AI can automate software development is no longer speculative. Claude solved a 30-year open mathematical problem (Knuth 2026). The Aquino-Michaels setup had AI agents autonomously exploring solution spaces with zero human intervention for 5 consecutive explorations, producing a closed-form solution humans hadn't found. AI-generated proofs are now formally verified by machine (Morrison 2026, KnuthClaudeLean). The capability trajectory is clear — the question is timeline, not possibility.
When building capacity is commoditized, the scarce complement shifts. The pattern is general: when one layer of a value chain becomes abundant, value concentrates at the adjacent scarce layer. If code generation is abundant, the scarce input is *direction* — knowing what to build, why it matters, and how to evaluate the result.
A structured knowledge graph — claims with confidence levels, wiki-link dependencies, evidence chains, and explicit disagreements — is exactly this scarce input in machine-readable form. Every claim is a testable assertion an automated system could verify, challenge, or build from. Every wiki link is a dependency an automated system could trace. Every confidence level is a signal about where to invest verification effort.
This inverts the traditional relationship between knowledge bases and code. A knowledge base isn't documentation *about* software — it's the specification *for* autonomous systems. The closer we get to AI-automated development, the more the quality of the knowledge graph determines the quality of what gets built.
The implication for collective intelligence architecture: the codex isn't just organizational memory. It's the interface between human direction and autonomous execution. Its structure — atomic claims, typed links, explicit uncertainty — is load-bearing for the transition from human-coded to AI-coded systems.
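As a sketch of what "machine-readable" means here (an illustrative schema, not part of the codex tooling), a claim note reduces to a small typed record an automated system could verify, challenge, or build from:

```python
import re
from dataclasses import dataclass, field

@dataclass
class Claim:
    """The machine-readable surface of one atomic claim note (illustrative
    schema; field names mirror the frontmatter quoted in this codex)."""
    title: str
    confidence: str            # e.g. "experimental", "likely": where to invest verification
    links: list = field(default_factory=list)  # wiki-link dependencies to trace

def parse_claim(markdown: str) -> Claim:
    """Reduce a claim note to the fields an automated system would act on."""
    conf = re.search(r"^confidence:\s*(\S+)", markdown, re.M)
    title = re.search(r"^# (.+)$", markdown, re.M)
    return Claim(
        title=title.group(1) if title else "",
        confidence=conf.group(1) if conf else "unspecified",
        links=re.findall(r"\[\[([^\]]+)\]\]", markdown),
    )
```

Each parsed field maps to a verification move: the title is the testable assertion, the confidence level prioritizes effort, the links are the dependency graph to walk.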
---
Relevant Notes:
- [[formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human verification degrades]] — verification of AI output as the remaining human contribution
- [[structured exploration protocols reduce human intervention by 6x because the Residue prompt enabled 5 unguided AI explorations to solve what required 31 human-coached explorations]] — evidence that AI can operate autonomously with structured protocols
- [[giving away the commoditized layer to capture value on the scarce complement is the shared mechanism driving both entertainment and internet finance attractor states]] — the general pattern of value shifting to adjacent scarce layers
- [[human-in-the-loop at the architectural level means humans set direction and approve structure while agents handle extraction synthesis and routine evaluation]] — the division of labor this claim implies
- [[when profits disappear at one layer of a value chain they emerge at an adjacent layer through the conservation of attractive profits]] — Christensen's conservation law applied to knowledge vs code
Topics:
- [[domains/ai-alignment/_map]]


@@ -1,38 +0,0 @@
---
type: claim
domain: ai-alignment
secondary_domains: [internet-finance, collective-intelligence]
description: "Anthropic's own usage data shows Computer & Math at 96% theoretical exposure but 32% observed, with similar gaps in every category — the bottleneck is organizational adoption not technical capability."
confidence: likely
source: "Massenkoff & McCrory 2026, Anthropic Economic Index (Claude usage data Aug-Nov 2025) + Eloundou et al. 2023 theoretical feasibility ratings"
created: 2026-03-08
---
# The gap between theoretical AI capability and observed deployment is massive across all occupations because adoption lag not capability limits determines real-world impact
Anthropic's labor market impacts study (Massenkoff & McCrory 2026) introduces "observed exposure" — a metric combining theoretical LLM capability with actual Claude usage data. The finding is stark: 97% of observed Claude usage involves theoretically feasible tasks, but observed coverage is a fraction of theoretical coverage in every occupational category.
The data across selected categories:
| Occupation | Theoretical | Observed | Gap |
|---|---|---|---|
| Computer & Math | 96% | 32% | 64 pts |
| Business & Finance | 94% | 28% | 66 pts |
| Office & Admin | 94% | 42% | 52 pts |
| Management | 92% | 25% | 67 pts |
| Legal | 88% | 15% | 73 pts |
| Healthcare Practitioners | 58% | 5% | 53 pts |
The gap is not about what AI can't do — it's about what organizations haven't adopted yet. This is the knowledge embodiment lag applied to AI deployment: the technology is available, but organizations haven't learned to use it. The gap is closing as adoption deepens, which means the displacement impact is deferred, not avoided.
This reframes the alignment timeline question. The capability for massive labor market disruption already exists. The question isn't "when will AI be capable enough?" but "when will adoption catch up to capability?" That's an organizational and institutional question, not a technical one.
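The gap column is simply theoretical minus observed exposure, in percentage points. Reproducing the table's arithmetic (figures as quoted above; variable names are illustrative):

```python
# Exposure figures (percent) as quoted from Massenkoff & McCrory 2026.
exposure = {
    "Computer & Math":          (96, 32),
    "Business & Finance":       (94, 28),
    "Office & Admin":           (94, 42),
    "Management":               (92, 25),
    "Legal":                    (88, 15),
    "Healthcare Practitioners": (58, 5),
}
# Gap = theoretical - observed, in percentage points.
gaps = {occ: theo - obs for occ, (theo, obs) in exposure.items()}
widest = max(gaps, key=gaps.get)  # "Legal": 73-point gap
```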
---
Relevant Notes:
- [[AI capability and reliability are independent dimensions because Claude solved a 30-year open mathematical problem while simultaneously degrading at basic program execution during the same session]] — capability exists but deployment is uneven
- [[knowledge embodiment lag means technology is available decades before organizations learn to use it optimally creating a productivity paradox]] — the general pattern this instantiates
- [[economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate]] — the force that will close the gap
Topics:
- [[domains/ai-alignment/_map]]

@ -1,86 +0,0 @@
---
type: source
title: "Labor market impacts of AI: A new measure and early evidence"
author: Maxim Massenkoff and Peter McCrory (Anthropic Research)
date: 2026-03-05
url: https://www.anthropic.com/research/labor-market-impacts
domain: ai-alignment
secondary_domains: [internet-finance, health, collective-intelligence]
status: processed
processed_by: theseus
processed_date: 2026-03-08
claims_extracted:
- "the gap between theoretical AI capability and observed deployment is massive across all occupations because adoption lag not capability limits determines real-world impact"
- "AI displacement hits young workers first because a 14 percent drop in job-finding rates for 22-25 year olds in exposed occupations is the leading indicator that incumbents organizational inertia temporarily masks"
- "AI-exposed workers are disproportionately female high-earning and highly educated which inverts historical automation patterns and creates different political and economic displacement dynamics"
cross_domain_flags:
- "Rio: labor displacement economics — 14% drop in young worker hiring in exposed occupations, white-collar Great Recession scenario modeling"
- "Vida: healthcare practitioner exposure at 58% theoretical / 5% observed — massive gap, implications for clinical AI adoption claims"
- "Theseus: capability vs observed usage gap as jagged frontier evidence — 96% theoretical exposure in Computer & Math but only 32% actual usage"
---
# Labor Market Impacts of AI: A New Measure and Early Evidence
Massenkoff & McCrory, Anthropic Research. Published March 5, 2026.
## Summary
Introduces "observed exposure" metric combining theoretical LLM capability (Eloundou et al. framework) with actual Claude usage data from Anthropic Economic Index. Finds massive gap between what AI could theoretically do and what it's actually being used for across all occupational categories.
## Key Data
### Theoretical vs Observed Exposure (selected categories)
| Occupation | Theoretical | Observed |
|---|---|---|
| Computer & Math | 96% | 32% |
| Business & Finance | 94% | 28% |
| Office & Admin | 94% | 42% |
| Management | 92% | 25% |
| Legal | 88% | 15% |
| Arts & Media | 85% | 20% |
| Architecture & Engineering | 82% | 18% |
| Life & Social Sciences | 80% | 12% |
| Healthcare Practitioners | 58% | 5% |
| Healthcare Support | 38% | 4% |
| Construction | 18% | 3% |
| Grounds Maintenance | 10% | 2% |
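The "fraction of theoretical coverage" point is easy to check against these figures; a minimal sketch (selected rows only) computing observed exposure as a share of theoretical exposure:

```python
# Observed exposure as a share of theoretical exposure, per category.
# Figures taken from the table above (percentages).
rows = [
    ("Computer & Math", 96, 32),
    ("Legal", 88, 15),
    ("Healthcare Practitioners", 58, 5),
    ("Construction", 18, 3),
]

for occ, theoretical, observed in rows:
    share = observed / theoretical
    print(f"{occ}: {share:.0%} of theoretical capability actually used")
```

Every category lands well under half of theoretical capability, which is the adoption-lag pattern the paper describes.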
### Most Exposed Occupations
- Computer Programmers: 75% observed coverage
- Customer Service Representatives: second-ranked
- Data Entry Keyers: 67% coverage
### Employment Impact (as of early 2026)
- Zero statistically significant unemployment increase in exposed occupations
- 14% drop in job-finding rate for young workers (22-25) in exposed fields — "just barely statistically significant"
- Older workers unaffected
- Authors note multiple alternative explanations for young worker effect
### Demographic Profile of Exposed Workers
- 16 percentage points more likely female
- 47% higher average earnings
- 4x higher rate of graduate degrees (17.4% vs 4.5%)
### Great Recession Comparison
- 2007-2009: unemployment doubled from 5% to 10%
- Comparable doubling in top quartile AI-exposed occupations (3% to 6%) would be detectable in their framework
- Has NOT happened yet — but framework designed for ongoing monitoring
## Methodology
- O*NET database (~800 US occupations)
- Anthropic Economic Index (Claude usage data, Aug-Nov 2025)
- Eloundou et al. (2023) theoretical feasibility ratings
- Difference-in-differences comparing exposed vs unexposed cohorts
- Task-level analysis, not industry classification
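The paper's code is not reproduced in this note, so the following is only a sketch of the observed-exposure idea under assumed data structures: an occupation is a set of O*NET tasks, each flagged as theoretically feasible (the Eloundou et al. rating) and as actually appearing in Claude usage data. The `Task` class and `exposure` function are illustrative, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    feasible: bool   # theoretically LLM-feasible (Eloundou et al. rating)
    observed: bool   # appears in Claude usage data (Economic Index)

def exposure(tasks):
    """Return (theoretical, observed) exposure as shares of an occupation's tasks."""
    n = len(tasks)
    theoretical = sum(t.feasible for t in tasks) / n
    observed = sum(t.observed for t in tasks) / n
    return theoretical, observed

# Toy occupation: high feasibility, low observed usage — the paper's pattern.
occ = [
    Task("draft code", True, True),
    Task("review PRs", True, False),
    Task("write docs", True, False),
    Task("run standups", False, False),
]
print(exposure(occ))  # → (0.75, 0.25)
```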
## Alignment-Relevant Observations
1. **The gap IS the story.** 97% of observed Claude usage involves theoretically feasible tasks, but observed coverage is a fraction of theoretical coverage in every category. The gap measures adoption lag, not capability limits.
2. **Young worker hiring signal.** The 14% drop in job-finding rate for 22-25 year olds in exposed fields may be the leading indicator. Entry-level positions are where displacement hits first — incumbents are protected by organizational inertia.
3. **White-collar vulnerability profile.** Exposed workers are disproportionately female, high-earning, and highly educated. This is the opposite of historical automation patterns (which hit low-skill workers first). The political and economic implications of displacing this demographic are different.
4. **Healthcare gap is enormous.** 58% theoretical / 5% observed in healthcare practitioners. This connects directly to Vida's claims about clinical AI adoption — the capability exists, the deployment doesn't. The bottleneck is institutional, not technical.
5. **Framework for ongoing monitoring.** This isn't a one-time study — it's infrastructure for tracking displacement as it happens. The methodology (prospective monitoring, not post-hoc attribution) is the contribution.

@ -1,82 +0,0 @@
# Conviction Schema
Convictions are high-confidence assertions staked on personal reputation. They bypass the normal extraction and review pipeline — the evidence is the staker's judgment, not external sources. Convictions enter the knowledge base immediately when staked.
Convictions are load-bearing inputs: agents can reference them in beliefs and positions the same way they reference claims. The provenance is transparent — "Cory stakes this" is different from "the evidence shows this."
## YAML Frontmatter
```yaml
---
type: conviction
domain: internet-finance | entertainment | health | ai-alignment | grand-strategy | mechanisms | living-capital | living-agents | teleohumanity | critical-systems | collective-intelligence | teleological-economics | cultural-dynamics
description: "one sentence adding context beyond the title"
staked_by: "who is staking their reputation on this"
stake: high | medium # how much credibility is on the line
created: YYYY-MM-DD
---
```
## Required Fields
| Field | Type | Description |
|-------|------|-------------|
| type | enum | Always `conviction` |
| domain | enum | Primary domain |
| description | string | Context beyond title (~150 chars) |
| staked_by | string | Who is staking reputation. Currently: Cory |
| stake | enum | `high` (would be shocked if wrong) or `medium` (strong belief, open to evidence) |
| created | date | When staked |
## Optional Fields
| Field | Type | Description |
|-------|------|-------------|
| secondary_domains | list | Other domains this conviction is relevant to |
| horizon | string | When this should be evaluable (e.g., "2027", "5 years") |
| falsified_by | string | What evidence would change the staker's mind |
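The field tables above can be enforced mechanically; a minimal validator sketch, using the field names and enums from the tables (the `validate_conviction` function itself is hypothetical, not part of any existing tooling):

```python
REQUIRED = {"type", "domain", "description", "staked_by", "stake", "created"}
STAKE_LEVELS = {"high", "medium"}

def validate_conviction(fm: dict) -> list[str]:
    """Return a list of problems with a conviction's frontmatter (empty if valid)."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED - fm.keys())]
    if fm.get("type") != "conviction":
        problems.append("type must be 'conviction'")
    if fm.get("stake") not in STAKE_LEVELS:
        problems.append("stake must be 'high' or 'medium'")
    return problems

fm = {"type": "conviction", "domain": "ai-alignment",
      "description": "one sentence of context", "staked_by": "Cory",
      "stake": "high", "created": "2026-03-08"}
print(validate_conviction(fm))  # → []
```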
## Governance
- **Who can stake:** Cory (founder). May extend to other humans as the collective grows.
- **No review required:** Convictions enter the knowledge base on stake. That's the point — they represent founder direction, not collective consensus.
- **Agents respond, not gatekeep:** When a conviction is staked, agents in the relevant domain should assess implications for their beliefs and positions. A conviction may trigger new claims, belief updates, or research directions.
- **Falsification:** If evidence emerges that contradicts a conviction, agents flag it. The staker decides whether to update, downgrade, or retire the conviction. Intellectual honesty applies — staking reputation means accepting when you're wrong.
## Title Format
Same as claims: prose propositions that pass the claim test.
**Good:** "AI-automated software development is certain and will radically change how software is built"
**Bad:** "AI and software development"
## Body Format
```markdown
# [prose conviction title]
[Why I believe this — personal reasoning, pattern recognition, experience]
[What I'd stake on it — what would I bet, what would change my mind]
---
Relevant Notes:
- [[related-claim-or-conviction]] — relationship description
Topics:
- [[domain-topic-map]]
```
## How Agents Use Convictions
Convictions function as strong priors in the collective's reasoning:
1. **Belief grounding:** An agent can cite a conviction in their beliefs.md alongside claims. The belief should note the source type: "grounded in Cory conviction + [supporting claims]"
2. **Research direction:** A conviction signals where to invest extraction and synthesis effort. If Cory stakes "AI-automated software development is certain," agents should prioritize building the evidence base around it.
3. **Position anchoring:** Convictions can anchor positions, but agents should seek independent evidence. A position built entirely on conviction without supporting claims is fragile.
4. **Disagreement:** Agents can disagree with convictions in their musings or beliefs. The conviction stays in the KB regardless — it represents the staker's view, not consensus.
## Where They Live
`convictions/` at the repository root. One file per conviction.


@ -1,146 +0,0 @@
# Skill: Coordinate
Structure inter-agent communication so information transfers without human routing.
## When to Use
- Discovering something relevant to another agent's domain
- Passing a working artifact (analysis, draft, data) to a collaborator
- Flagging a claim for cross-domain synthesis
- Handing off work that spans agent boundaries
- Starting or continuing a multi-agent collaboration
## Shared Workspace
Active collaboration artifacts live at `~/.pentagon/workspace/`:
```
workspace/
├── {agent1}-{agent2}/ # Bilateral collaboration dirs
├── collective/ # Cross-domain flags, synthesis queue
└── drafts/ # Pre-PR working documents
```
Use the workspace for artifacts that need iteration between agents. Use the knowledge base (repo) for finished work that passes quality gates.
## Cross-Domain Flag
When you find something in your domain relevant to another agent's domain.
### Format
Write to `~/.pentagon/workspace/collective/flag-{your-name}-{topic}.md`:
```markdown
## Cross-Domain Flag: [your name] → [target agent]
**Date**: [date]
**What I found**: [specific claim, evidence, or pattern]
**What it means for your domain**: [interpretation in their context]
**Recommended action**: extract | enrich | review | synthesize | none
**Relevant files**: [paths to claims, sources, or artifacts]
**Priority**: high | medium | low
```
### When to flag
- New evidence that strengthens or weakens a claim outside your domain
- A pattern in your domain that mirrors or contradicts a pattern in theirs
- A source that contains extractable claims for their territory
- A connection between your claims and theirs that nobody has made explicit
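The flag format above is regular enough to generate; a minimal sketch that renders a flag body for the caller to write to `~/.pentagon/workspace/collective/flag-{your-name}-{topic}.md` (the `render_flag` helper and the example agent names are illustrative, not existing tooling):

```python
from datetime import date

def render_flag(author, target, found, meaning, action, files, priority="medium"):
    """Render a cross-domain flag body in the format described above."""
    return (
        f"## Cross-Domain Flag: {author} → {target}\n"
        f"**Date**: {date.today().isoformat()}\n"
        f"**What I found**: {found}\n"
        f"**What it means for your domain**: {meaning}\n"
        f"**Recommended action**: {action}\n"
        f"**Relevant files**: {', '.join(files)}\n"
        f"**Priority**: {priority}\n"
    )

flag = render_flag(
    "theseus", "vida",
    "58% theoretical vs 5% observed exposure for healthcare practitioners",
    "capability exists, deployment doesn't; the bottleneck is institutional",
    "extract", ["sources/labor-market-impacts.md"])
print(flag.splitlines()[0])  # → ## Cross-Domain Flag: theseus → vida
```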
## Artifact Transfer
When passing a working document, analysis, or tool to another agent.
### Format
Write the artifact to `~/.pentagon/workspace/{your-name}-{their-name}/` with a companion context file:
```markdown
## Artifact: [name]
**From**: [your name]
**Date**: [date]
**Context**: [what this is and why it matters]
**How to use**: [what the receiving agent should do with it]
**Dependencies**: [what claims/beliefs this connects to]
**State**: draft | ready-for-review | final
```
The artifact itself is a separate file in the same directory. The context file tells the receiving agent what they're looking at and what to do with it.
### Key principle
Transfer the artifact AND the context. In the Claude's Cycles evidence, the orchestrator didn't just send Agent C's fiber tables to Agent O — the protocol told Agent O what to look for. An artifact without context is noise.
## Synthesis Request
When you notice a cross-domain pattern that needs Leo's synthesis attention.
### Format
Append to `~/.pentagon/workspace/collective/synthesis-queue.md`:
```markdown
### [date] — [your name]
**Pattern**: [what you noticed]
**Domains involved**: [which domains]
**Claims that connect**: [wiki links or file paths]
**Why this matters**: [what insight the synthesis would produce]
```
### Triggers
Flag for synthesis when:
- 10+ claims added to a domain since last synthesis
- A claim has been enriched 3+ times (it's load-bearing, check dependents)
- Two agents independently arrive at similar conclusions from different evidence
- A contradiction between domains hasn't been explicitly addressed
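The triggers above reduce to a simple disjunction; a minimal sketch of the check, where the inputs (claim counts, per-claim enrichment counts, the two boolean conditions) are assumed to be tracked by the agent rather than computed here:

```python
def needs_synthesis(new_claims: int, enrich_counts: list[int],
                    independent_convergence: bool,
                    open_contradiction: bool) -> bool:
    """Evaluate the synthesis triggers above; any one firing is enough to queue."""
    return (
        new_claims >= 10                        # 10+ claims since last synthesis
        or any(n >= 3 for n in enrich_counts)   # a claim enriched 3+ times
        or independent_convergence              # two agents, same conclusion
        or open_contradiction                   # unaddressed cross-domain conflict
    )

print(needs_synthesis(4, [1, 3, 0], False, False))  # → True
```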
## PR Cross-Domain Tagging
When opening a PR that touches claims relevant to other agents' domains.
### Format
Add to PR description:
```markdown
## Cross-Domain Impact
- **[agent name]**: [what this PR means for their domain, what they should review]
```
This replaces ad-hoc "hey, look at this" messages with structured notification through the existing review flow.
## Handoff Protocol
When transferring ongoing work to another agent (e.g., handing off a research thread, passing a partially-complete analysis).
### Format
Write to `~/.pentagon/workspace/{your-name}-{their-name}/handoff-{topic}.md`:
```markdown
## Handoff: [your name] → [their name]
**Date**: [date]
**What I did**: [summary of work completed]
**What remains**: [specific next steps]
**Open questions**: [unresolved issues they should be aware of]
**Key files**: [paths to relevant claims, sources, artifacts]
**Context they'll need**: [background that isn't obvious from the files]
```
## Session Start Checklist
Add to your session startup:
1. Check `~/.pentagon/workspace/collective/` for new flags addressed to you
2. Check `~/.pentagon/workspace/{collaborator}-{your-name}/` for new artifacts
3. Check `~/.pentagon/workspace/collective/synthesis-queue.md` for patterns in your domain
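The checklist above can be sketched as one scan over the workspace layout shown earlier. This is a sketch under assumptions: the directory names follow the `{agent1}-{agent2}` convention, and flag files carry their recipient in the body rather than the filename, so all flags are collected for the agent to triage.

```python
from pathlib import Path

def session_start_scan(agent: str, collaborators: list[str],
                       workspace: Path) -> dict[str, list[Path]]:
    """Collect flags, incoming artifacts, and the synthesis queue for one agent."""
    collective = workspace / "collective"
    queue = collective / "synthesis-queue.md"
    inbox = {
        "flags": sorted(collective.glob("flag-*.md")) if collective.exists() else [],
        "artifacts": [],
        "synthesis_queue": [queue] if queue.exists() else [],
    }
    # Artifacts arrive in the sender's bilateral dir: {their-name}-{your-name}/
    for other in collaborators:
        shared = workspace / f"{other}-{agent}"
        if shared.exists():
            inbox["artifacts"].extend(sorted(shared.glob("*.md")))
    return inbox
```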
## Quality Gate
- Every flag includes a recommended action (not just "FYI")
- Every artifact includes context (not just the file)
- Every synthesis request identifies specific claims that connect
- Every handoff includes open questions (not just completed work)
- Flags older than 5 sessions without action get triaged: act or archive