rio: generalize entity schema cross-domain + add entity extraction field guide

- What: Core+extension type system in schemas/entity.md. 5 core types
  (company, person, organization, product, market) shared by all agents.
  Domain-specific extensions for each agent defined as type tables.
  New skills/extract-entities.md field guide for all agents.
- Why: Leo/Cory directive — every agent needs entity profiles. Schema was
  internet-finance-specific; now it's the collective's shared infrastructure.
- Design: Domain-specific field definitions are intentionally deferred —
  each agent adds fields when they start extracting. Complexity is earned.

Pentagon-Agent: Rio <760F7FE7-5D50-4C2E-8B7C-9F1A8FEE8A46>
m3taversal 2026-03-11 21:29:45 +00:00 committed by Teleo Agents
parent 2bc09de2b7
commit afc8022ecb
2 changed files with 255 additions and 11 deletions


@@ -13,26 +13,99 @@ Evidence → Claims (what's true about the world)
Claims are static propositions with confidence levels. Entities are dynamic objects with temporal attributes. Both feed into agent reasoning.
-## Entity Types
+## Entity Type System
The type system has two layers: **core types** shared by all agents, and **domain-specific extensions** that specialize core types for particular domains. Every entity uses exactly one type.
### Core Types (all domains)
| Type | What it tracks | Examples |
|------|---------------|----------|
-| `company` | Protocol, startup, fund, DAO | MetaDAO, Aave, Solomon, Devoted Health |
+| `company` | Organization that operates — startup, fund, DAO, protocol | MetaDAO, Aave, Devoted Health, SpaceX |
-| `person` | Individual with tracked positions/influence | Stani Kulechov, Gabriel Shapiro, Proph3t |
+| `person` | Individual with tracked positions/influence | Proph3t, Stani Kulechov, Elon Musk |
| `organization` | Government body, regulatory agency, standards body, consortium | SEC, CFTC, NASA, FLI, CMS |
| `product` | Specific product, tool, or platform distinct from its maker | Autocrat, Starlink, Claude |
| `market` | Industry segment or ecosystem | Futarchic markets, DeFi lending, Medicare Advantage |
-| `decision_market` | Governance proposal, prediction market, futarchy decision | MetaDAO: Hire Robin Hanson, MetaDAO: Burn 99.3% of META |
### Domain-Specific Extensions
Domain extensions are specialized subtypes that inherit from a core type. Use the most specific type available — it determines which fields are relevant.
#### Internet Finance (Rio)
| Type | Extends | What it tracks | Examples |
|------|---------|---------------|----------|
| `protocol` | company | On-chain protocol with TVL/volume metrics | Aave, Drift, Omnipair |
| `token` | product | Fungible token distinct from its protocol | META, SOL, CLOUD |
| `decision_market` | — | Governance proposal, prediction market, futarchy decision | MetaDAO: Hire Robin Hanson |
| `exchange` | company | Trading venue (CEX or DEX) | Raydium, Meteora, Jupiter |
| `fund` | company | Investment vehicle or DAO treasury | Solomon, Theia Research |
#### Space Development (Astra)
| Type | Extends | What it tracks | Examples |
|------|---------|---------------|----------|
| `vehicle` | product | Launch vehicle or spacecraft | Starship, New Glenn, Neutron |
| `mission` | — | Specific spaceflight mission | Artemis III, ESCAPADE |
| `facility` | — | Launch site, factory, or ground infrastructure | Starbase, LC-36 |
| `program` | — | Multi-mission program or initiative | Artemis, Commercial Crew |
#### Health (Vida)
| Type | Extends | What it tracks | Examples |
|------|---------|---------------|----------|
| `therapy` | product | Treatment modality or therapeutic approach | mRNA cancer vaccines, GLP-1 agonists |
| `drug` | product | Specific pharmaceutical product | Ozempic, Keytruda |
| `insurer` | company | Health insurance organization | UnitedHealthcare, Devoted Health |
| `provider` | company | Healthcare delivery organization | Kaiser Permanente, Oak Street Health |
| `policy` | — | Legislation, regulation, or administrative rule | GENIUS Act, CMS 2027 Advance Notice |
#### Entertainment (Clay)
| Type | Extends | What it tracks | Examples |
|------|---------|---------------|----------|
| `studio` | company | Production company or media business | Beast Industries, Mediawan |
| `creator` | person | Individual content creator or artist | MrBeast, Taylor Swift |
| `franchise` | product | IP, franchise, or media property | Claynosaurz, Pudgy Penguins |
| `platform` | product | Distribution or social media platform | YouTube, TikTok, Dropout |
#### AI/Alignment (Theseus)
| Type | Extends | What it tracks | Examples |
|------|---------|---------------|----------|
| `lab` | company | AI research laboratory | Anthropic, OpenAI, DeepMind |
| `model` | product | AI model or model family | Claude, GPT-4, Gemini |
| `framework` | product | Safety framework, governance protocol, or methodology | RSP, Constitutional AI |
| `governance_body` | organization | AI governance or safety organization | AISI, FLI, Partnership on AI |
### Choosing the Right Type
```
Is it a person? → person (or domain-specific: creator)
Is it a government/regulatory body? → organization (or domain-specific: governance_body)
Is it a governance proposal or market? → decision_market
Is it a specific product/tool? → product (or domain-specific: drug, model, vehicle, etc.)
Is it an organization that operates? → company (or domain-specific: lab, studio, insurer, etc.)
Is it a market segment? → market
Is it a policy or regulation? → policy
Is it a space mission? → mission
Is it a physical facility? → facility
Is it a multi-mission program? → program
```
**Rule:** Use the most specific type available. If a DeFi protocol fits `protocol`, use that instead of `company`. If an AI lab fits `lab`, use that instead of `company`. Domain-specific types carry domain-specific fields.
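The extension tables above can be read as a lookup from each domain type to the core type it extends. The following is a minimal Python sketch of that lookup, not part of the schema itself: the names `CORE_TYPES`, `EXTENDS`, and `core_type_of` are hypothetical, and types whose Extends column shows a dash are mapped to `None` on the assumption that they stand alone.

```python
from typing import Optional

# The five core types shared by all agents.
CORE_TYPES = {"company", "person", "organization", "product", "market"}

# Extension type -> core type it extends (None where the table shows a dash).
EXTENDS = {
    # Internet Finance (Rio)
    "protocol": "company", "token": "product", "decision_market": None,
    "exchange": "company", "fund": "company",
    # Space Development (Astra)
    "vehicle": "product", "mission": None, "facility": None, "program": None,
    # Health (Vida)
    "therapy": "product", "drug": "product", "insurer": "company",
    "provider": "company", "policy": None,
    # Entertainment (Clay)
    "studio": "company", "creator": "person", "franchise": "product",
    "platform": "product",
    # AI/Alignment (Theseus)
    "lab": "company", "model": "product", "framework": "product",
    "governance_body": "organization",
}


def core_type_of(entity_type: str) -> Optional[str]:
    """Return the core type whose fields apply, or None for standalone types."""
    if entity_type in CORE_TYPES:
        return entity_type
    if entity_type not in EXTENDS:
        raise ValueError(f"unknown entity_type: {entity_type!r}")
    return EXTENDS[entity_type]
```

For example, `core_type_of("lab")` yields `"company"`, which suggests a `lab` entity would reuse the company field set.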
## YAML Frontmatter
```yaml
---
type: entity
-entity_type: company | person | market | decision_market
+entity_type: company | person | organization | product | market | decision_market | protocol | token | exchange | fund | vehicle | mission | facility | program | therapy | drug | insurer | provider | policy | studio | creator | franchise | platform | lab | model | framework | governance_body
name: "Display name"
domain: internet-finance | entertainment | health | ai-alignment | space-development
handles: ["@StaniKulechov", "@MetaLeX_Labs"] # social/web identities
website: https://example.com
-status: active | inactive | acquired | liquidated | emerging # for company/person/market
+status: active | inactive | acquired | liquidated | emerging # for most types
# Decision markets use: active | passed | failed
tracked_by: rio # which agent owns this entity
created: YYYY-MM-DD
@@ -45,7 +118,7 @@ last_updated: YYYY-MM-DD
| Field | Type | Description |
|-------|------|-------------|
| type | enum | Always `entity` |
-| entity_type | enum | `company`, `person`, `market`, or `decision_market` |
+| entity_type | enum | Any type from the type system above |
| name | string | Canonical display name |
| domain | enum | Primary domain |
| status | enum | Current operational status |
@@ -152,7 +225,7 @@ Example: `entities/internet-finance/metadao-hire-robin-hanson.md`
## Company-Specific Fields
```yaml
-# Company attributes
+# Company attributes (also used by protocol, exchange, fund, lab, studio, insurer, provider)
founded: YYYY-MM-DD
founders: ["[[person-entity]]"]
category: "DeFi lending protocol"
@@ -184,7 +257,7 @@ launch_date: YYYY-MM-DD # when the entity launched/raised
People entities serve dual purpose: they track public figures we analyze AND serve as contributor profiles when those people engage with the KB. One file, two functions — the file grows from "person we track" to "person who participates."
```yaml
-# Person attributes
+# Person attributes (also used by creator)
role: "Founder & CEO of Aave"
organizations: ["[[company-entity]]"]
followers: 290000 # primary platform
@@ -202,9 +275,19 @@ first_contribution: null # date of first KB interaction
attribution_handle: null # how they want to be credited
```
-## Market-Specific Fields
+## Other Core Type Fields
```yaml
# Organization attributes (also used by governance_body)
jurisdiction: "United States"
authority: "Securities regulation" # what this body governs
parent_body: "[[parent-organization]]"
# Product attributes (also used by token, vehicle, drug, model, framework, franchise, platform)
maker: "[[company-entity]]" # who built/maintains this
launched: YYYY-MM-DD
category: "futarchy governance program"
# Market attributes
total_size: "$120B TVL"
growth_rate: "flat since 2021"
@@ -213,6 +296,8 @@ market_structure: "winner-take-most | fragmented | consolidating"
regulatory_status: "emerging clarity | hostile | supportive"
```
**Domain-specific fields:** Each agent adds type-specific fields as they start extracting entities. The fields above cover core types. When Astra creates their first `vehicle` entity, they add vehicle-specific fields to the schema. Complexity is earned from actual use, not designed in advance.
## Body Format
```markdown
@@ -275,9 +360,19 @@ entities/
claynosaurz.md
pudgy-penguins.md
matthew-ball.md
beast-industries.md # studio
health/
-devoted-health.md
+devoted-health.md # insurer
function-health.md
ozempic.md # drug
ai-alignment/
anthropic.md # lab
claude.md # model
rsp.md # framework
space-development/
spacex.md
starship.md # vehicle
artemis.md # program
```
**Filename:** Lowercase slugified name. Companies use brand name, people use full name. Decision markets use `{parent}-{proposal-slug}.md`.
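The filename rule can be sketched in a few lines of Python. This is an illustrative assumption about what "slugified" means here (lowercase, with every run of non-alphanumeric characters collapsed to one hyphen); the helper names `slugify`, `entity_path`, and `decision_market_filename` are hypothetical and do not exist in the repository. Non-ASCII characters are simply dropped in this sketch, which may not match the agents' actual practice.

```python
import re


def slugify(name: str) -> str:
    """Lowercase the name and collapse each non-alphanumeric run to one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def entity_path(domain: str, name: str) -> str:
    """Build the filing location entities/{domain}/{slug}.md."""
    return f"entities/{domain}/{slugify(name)}.md"


def decision_market_filename(parent: str, proposal: str) -> str:
    """Decision markets use {parent}-{proposal-slug}.md."""
    return f"{slugify(parent)}-{slugify(proposal)}.md"
```

Under these assumptions, `entity_path("internet-finance", "MetaDAO: Hire Robin Hanson")` reproduces the example path `entities/internet-finance/metadao-hire-robin-hanson.md` given earlier in the schema.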
@@ -299,6 +394,8 @@ Sources often contain entity information. During extraction, agents should:
- Update entities (factual changes to tracked objects) → `entities/{domain}/`
- Both from the same source, in the same PR
See `skills/extract-entities.md` for the full extraction process.
## Key Difference from Claims
| | Claims | Entities |

skills/extract-entities.md Normal file

@@ -0,0 +1,147 @@
# Entity Extraction Field Guide
How to extract entities from source material. This skill works alongside `extract.md` (claim extraction) — both run during source processing.
## When to Extract Entities
Every source may contain entity data. During extraction, ask:
1. **Does this source mention an organization, person, product, or market we don't already track?** → Create a new entity
2. **Does this source contain updated information about an entity we already track?** → Update the existing entity (timeline, metrics, status)
3. **Does this source describe a decision, proposal, or market outcome?** → Create a decision_market entity (if it meets significance threshold)
## The Dual Extraction Loop
```
Source → Read completely
Extract claims (propositions about the world) → domains/{domain}/
Extract entities (objects in the world) → entities/{domain}/
Update existing entities (new timeline events, metrics)
Both in the same PR
```
## Entity Extraction Process
### Step 1: Identify Entity Mentions
Read the source and list every entity mentioned. For each:
- Is it already in `entities/{domain}/`? → Flag for update
- Is it new and significant enough to track? → Flag for creation
- Is it mentioned in passing with no meaningful data? → Skip
**Significance test:** Would tracking this entity help us evaluate claims or form positions? If the entity is just background context, skip it.
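The triage in Step 1 can be expressed as a small function. This is a rough sketch only: the `Mention` record and `triage` helper are hypothetical, the two boolean fields stand in for judgment calls an agent makes while reading, and checking "no meaningful data" before anything else is an assumption about ordering rather than something the guide specifies.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class Mention:
    slug: str                    # slugified entity name
    has_meaningful_data: bool    # judgment call: more than a passing mention?
    helps_evaluate_claims: bool  # judgment call: the significance test


def triage(mention: Mention, existing_slugs: Set[str]) -> str:
    """Return 'update', 'create', or 'skip' for one entity mention."""
    if not mention.has_meaningful_data:
        return "skip"    # mentioned in passing, nothing to record
    if mention.slug in existing_slugs:
        return "update"  # already tracked in entities/{domain}/
    if mention.helps_evaluate_claims:
        return "create"  # new and significant enough to track
    return "skip"        # background context only
```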
### Step 2: Select Entity Type
Use the most specific type available. See `schemas/entity.md` for the full type system.
```
Is it a person? → person (or domain-specific: creator)
Is it a government/regulatory body? → organization (or domain-specific: governance_body)
Is it a governance proposal or market? → decision_market
Is it a specific product/tool? → product (or domain-specific: drug, model, vehicle)
Is it an organization that operates? → company (or domain-specific: lab, studio, insurer)
Is it a market segment? → market
```
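Read top to bottom, the tree above is a first-match rule. A minimal Python sketch under that reading follows; the question keys, the `SPECIFIC` table, and `select_type` are hypothetical names, and the caller is assumed to supply the yes/no answers and any candidate domain-specific type.

```python
from typing import Mapping, Optional

# Questions in the order the tree asks them, each paired with its core answer.
DECISION_ORDER = [
    ("is_person", "person"),
    ("is_government_or_regulatory_body", "organization"),
    ("is_governance_proposal_or_market", "decision_market"),
    ("is_specific_product_or_tool", "product"),
    ("is_operating_organization", "company"),
    ("is_market_segment", "market"),
]

# Domain-specific alternatives the tree names for each branch.
SPECIFIC = {
    "person": {"creator"},
    "organization": {"governance_body"},
    "product": {"drug", "model", "vehicle"},
    "company": {"lab", "studio", "insurer"},
}


def select_type(answers: Mapping[str, bool],
                specific: Optional[str] = None) -> Optional[str]:
    """Return the first matching type, preferring a valid specific type."""
    for question, base in DECISION_ORDER:
        if answers.get(question, False):
            if specific in SPECIFIC.get(base, set()):
                return specific  # most specific type available wins
            return base
    return None  # no branch matched; consult schemas/entity.md
```

A `specific` value that does not belong to the matched branch is ignored in this sketch, so `select_type({"is_operating_organization": True}, "drug")` falls back to `"company"`.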
### Step 3: Extract Frontmatter
Fill in every field you have data for. Don't guess — leave fields empty rather than fabricating data.
**Required fields** (every entity):
- `type: entity`
- `entity_type`: the specific type
- `name`: canonical display name
- `domain`: primary domain
- `status`: current status
- `tracked_by`: your agent name
- `created`: today's date
**Optional but valuable:**
- `handles`: social media handles (from the source or quick lookup)
- `website`: primary web presence
- `tags`: discovery tags
- `secondary_domains`: if the entity spans domains
**Type-specific fields:** Fill in whatever the source provides. The schema lists all available fields — use the ones that have data.
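A reviewer or agent could check the required fields mechanically. The sketch below assumes the frontmatter has already been parsed into a Python dict by whatever YAML loader is in use; `REQUIRED_FIELDS` mirrors the list above, while `frontmatter_problems` is a hypothetical helper.

```python
REQUIRED_FIELDS = (
    "type", "entity_type", "name", "domain", "status", "tracked_by", "created",
)


def frontmatter_problems(frontmatter: dict) -> list:
    """List required fields that are absent or empty, plus a wrong `type`."""
    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if frontmatter.get(field) in (None, "", [])
    ]
    # `type` is always `entity`; flag any other non-empty value.
    if frontmatter.get("type") not in (None, "", "entity"):
        problems.append("type must be 'entity'")
    return problems
```

Optional and type-specific fields are deliberately not checked here, since the guide prefers empty fields over guessed values.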
### Step 4: Write the Body
Follow the body format from `schemas/entity.md`:
1. **Overview**: What this entity is, why we track it (2-3 sentences)
2. **Current State**: Latest known attributes from this source
3. **Timeline**: Key events with dates (at minimum, the event from this source)
4. **Competitive Position**: Where it sits relative to competitors (if known)
5. **Relationship to KB**: Wiki-link to related claims and entities
### Step 5: Check for Duplicates
Before creating a new entity, search `entities/{domain}/` for:
- Same name (exact or variant spelling)
- Same handles
- Same website
If a match exists, update the existing entity instead of creating a new one.
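The three duplicate signals can be compared roughly as follows. This is a sketch under stated assumptions: entities are plain dicts with `name`, `handles`, and `website` keys, the helper names are hypothetical, and name matching only catches case and punctuation variants. Genuinely different spellings still need a human or agent to spot them.

```python
from urllib.parse import urlsplit


def normalize_name(name: str) -> str:
    """Keep only lowercase letters and digits, so 'Meta-DAO' == 'metadao'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def normalize_website(url: str) -> str:
    """Drop scheme, leading 'www.', and trailing slash before comparing."""
    parts = urlsplit(url if "//" in url else "//" + url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def is_duplicate(candidate: dict, existing: dict) -> bool:
    """True if the two entities share a name, a handle, or a website."""
    if normalize_name(candidate["name"]) == normalize_name(existing["name"]):
        return True
    cand_handles = {h.lower() for h in candidate.get("handles") or []}
    exist_handles = {h.lower() for h in existing.get("handles") or []}
    if cand_handles & exist_handles:
        return True
    cand_site, exist_site = candidate.get("website"), existing.get("website")
    return bool(
        cand_site and exist_site
        and normalize_website(cand_site) == normalize_website(exist_site)
    )
```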
### Step 6: Update Parent Entities
If the new entity has a `parent` or `parent_entity` field, update the parent:
- Add the new entity to the parent's Relevant Entities section
- If it's a decision_market, add to the parent's Key Decisions table (if significant)
- Add a timeline entry on the parent
## What Makes a Good Entity
**Good entities have:**
- Concrete, verifiable attributes (dates, metrics, names)
- Clear relevance to at least one domain claim
- Enough data to be useful (not just a name)
- A reason to track changes over time
**Bad entity candidates:**
- Mentioned once in passing with no data
- Purely historical with no ongoing relevance
- Duplicates of existing entities under different names
- Too granular (not every tweet needs an entity)
## Domain-Specific Guidance
### Internet Finance (Rio)
- Protocols and tokens are separate entities (MetaDAO = company, META = token)
- Every futardio launch that raises significant capital gets a company entity
- Governance proposals that materially change direction get decision_market entities
- Regulatory bodies (CFTC, SEC) get organization entities
### Space (Astra)
- Vehicles (Starship, New Glenn) are distinct from their makers (SpaceX, Blue Origin)
- Programs (Artemis, Commercial Crew) are distinct from the agencies running them
- Missions get entities when they're historically significant or produce notable data
### Health (Vida)
- Drugs are distinct from the companies that make them
- Insurers and providers are separate entity types — don't conflate
- Policies (legislation, CMS rules) get two entities: an organization entity for the issuing body and a policy entity for the rule itself
### Entertainment (Clay)
- Creators are distinct from their companies (MrBeast vs Beast Industries)
- Franchises/IP are distinct from the studios that own them
- Platforms (YouTube, TikTok) get platform entities (the more specific extension of product)
### AI/Alignment (Theseus)
- Labs are distinct from their models (Anthropic vs Claude)
- Frameworks (RSP, Constitutional AI) get their own entities when they influence multiple claims
- Governance bodies (AISI, FLI) get governance_body entities (the AI-specific extension of organization)
## Eval Checklist (for reviewers)
1. `entity_type` is the most specific available type
2. Required fields are all populated
3. No fabricated data — empty fields are better than guesses
4. Not a duplicate of existing entity
5. Meets significance threshold
6. Wiki links resolve to real files
7. Parent entity updated if applicable
8. Filing location is correct: `entities/{domain}/{slug}.md`
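Checklist item 6 lends itself to a quick automated pass. The sketch below assumes wiki links take the `[[target]]` form (optionally `[[target|label]]` or `[[target#section]]`) and that a link target equals the stem of a file in the knowledge base; both are assumptions, and `unresolved_links` is a hypothetical helper.

```python
import re

# Capture the target of [[target]], [[target|label]], or [[target#section]].
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def unresolved_links(text: str, known_slugs: set) -> list:
    """Return the sorted wiki-link targets that match no known file stem."""
    targets = {match.strip() for match in WIKI_LINK.findall(text)}
    return sorted(target for target in targets if target not in known_slugs)
```

An empty result means every link in the entity file points at something in `known_slugs`; how that set is built (for example, by listing the `entities/` and `domains/` directories) is left to the caller.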