From 20ebfc56a850253176fcab5789d4cf7119860892 Mon Sep 17 00:00:00 2001 From: m3taversal Date: Thu, 2 Apr 2026 10:35:32 +0100 Subject: [PATCH 1/4] rio: upgrade 7 ownership coin entity files with research + correct attribution - What: Rewrote mtnCapital, Avici, Loyal, ZKLSOL, Paystream, Solomon, P2P.me entities - Why: Entities had wrong parent (futardio instead of metadao), missing investment rationales, no governance activity, stale/thin content. Bot couldn't answer basic questions about MetaDAO launches. - Changes per entity: - Corrected parent: [[metadao]] (curated launches, not futardio permissionless) - Added launch_platform, launch_order fields for proper sequencing - Added investment rationale from original raise pitches - Added governance activity tables (buybacks, restructuring, team packages) - Added open questions and competitive context - Removed hardcoded prices (live tool handles this) - Sources: X research, decision records, source archives, web search Pentagon-Agent: Rio <244ba05f-3aa3-4079-8c59-6d68a77c76fe> --- entities/internet-finance/avici.md | 93 +++++++++++++---- entities/internet-finance/loyal.md | 80 +++++++++++---- entities/internet-finance/mtncapital.md | 80 +++++++-------- entities/internet-finance/p2p-me.md | 126 +++++++++++++++--------- entities/internet-finance/paystream.md | 77 +++++++++++---- entities/internet-finance/solomon.md | 113 +++++++++++++-------- entities/internet-finance/zklsol.md | 82 +++++++++++---- 7 files changed, 446 insertions(+), 205 deletions(-) diff --git a/entities/internet-finance/avici.md b/entities/internet-finance/avici.md index b0cc48d93..5719d4085 100644 --- a/entities/internet-finance/avici.md +++ b/entities/internet-finance/avici.md @@ -8,42 +8,93 @@ website: https://avici.money status: active tracked_by: rio created: 2026-03-11 -last_updated: 2026-03-11 -parent: "futardio" +last_updated: 2026-04-02 +parent: "[[metadao]]" +launch_platform: metadao-curated +launch_order: 4 category: "Distributed internet banking infrastructure (Solana)" stage: growth -funding: "$3.5M raised via Futardio ICO" +token_symbol: "$AVICI" +token_mint: "BANKJmvhT8tiJRsBSS1n2HryMBPvT5Ze4HU95DUAmeta" built_on: ["Solana"] -tags: ["banking", "lending", "futardio-launch", "ownership-coin"] -source_archive: "inbox/archive/2025-10-14-futardio-launch-avici.md" +tags: [metadao-curated-launch, ownership-coin, neobank, defi, lending] +competitors: ["traditional banks", "Revolut", "crypto card providers"] +source_archive: "inbox/archive/internet-finance/2025-10-14-futardio-launch-avici.md" --- # Avici ## Overview -Distributed internet banking infrastructure — onchain credit scoring, spend cards, unsecured loans, and mortgages. Aims to replace traditional banking with permissionless onchain finance. Second Futardio launch by committed capital. -## Current State -- **Raised**: $3.5M final (target $2M, $34.2M committed — 17x oversubscribed) -- **Treasury**: $2.4M USDC remaining -- **Token**: AVICI (mint: BANKJmvhT8tiJRsBSS1n2HryMBPvT5Ze4HU95DUAmeta), price: $1.31 -- **Monthly allowance**: $100K -- **Launch mechanism**: Futardio v0.6 (pro-rata) +Crypto neobank building distributed internet banking infrastructure on Solana — spend cards, an internet-native trust score, unsecured loans, and eventually home mortgages. The thesis: internet capital markets need internet banking infrastructure. To gain independence from fiat, crypto needs a social ledger for reputation-based undercollateralized lending. 
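As a rough illustration of the mechanism this overview describes (a reputation score that gates how far a loan may exceed posted collateral), here is a minimal sketch. The scoring weights, thresholds, and field names are hypothetical; Avici has not published its Trust Score model, so treat this as a conceptual toy only.

```python
from dataclasses import dataclass

@dataclass
class Borrower:
    onchain_history_months: int   # age of observable wallet activity
    repaid_loans: int             # prior loans repaid in full
    defaults: int                 # prior defaults
    collateral_usd: float         # posted collateral

def trust_score(b: Borrower) -> float:
    """Toy reputation score in [0, 1]. Weights are illustrative only."""
    history = min(b.onchain_history_months / 24, 1.0)
    repayment = b.repaid_loans / max(b.repaid_loans + 3 * b.defaults, 1)
    return 0.4 * history + 0.6 * repayment

def credit_limit(b: Borrower) -> float:
    """Undercollateralized limit: collateral plus a reputation-scaled unsecured tranche."""
    score = trust_score(b)
    unsecured_cap = 5_000 * score ** 2   # hypothetical cap, grows non-linearly with trust
    return b.collateral_usd + unsecured_cap

if __name__ == "__main__":
    b = Borrower(onchain_history_months=18, repaid_loans=6, defaults=0, collateral_usd=1_000)
    print(f"score={trust_score(b):.2f}, limit=${credit_limit(b):,.0f}")
```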
+ +## Investment Rationale (from raise) + +"Money didn't originate from the barter system, that's a myth. It began as credit. Money isn't a commodity; it is a social ledger." Avici argues that onchain finance still lacks reputation-based undercollateralized lending (citing Vitalik's agreement). The ICO pitch: build the onchain banking infrastructure that replaces traditional bank accounts — credit scoring, spend cards, unsecured loans, mortgages — all governed by futarchy. + +## ICO Details + +- **Platform:** MetaDAO curated launchpad (4th launch) +- **Date:** October 14-18, 2025 +- **Target:** $2M +- **Committed:** $34.2M (17x oversubscribed) +- **Final raise:** $3.5M (89.8% of commitments refunded) +- **Initial FDV:** $4.515M at $0.35/token +- **Launch mechanism:** Futardio v0.6 (pro-rata) +- **Distribution:** No preferential VC allocations — described as one of crypto's fairest token distributions + +## Current State (as of early 2026) + +**Live products:** +- **Visa Debit Card** — live in 100+ countries, virtual and physical. 1.5-2% cashback. No staking required. No top-up, transaction, or maintenance fees. Processing 100,000+ transactions monthly. +- **Smart Wallet** — self-custodial, login via Google/iCloud/biometrics/passkey (no seed phrases). Programmable security policies (daily spend limits, address whitelisting). +- **Biz Cards** — lets Solana projects spend from onchain treasury for business needs +- **Named Virtual Accounts** — personal account number + IBAN, fiat auto-converted to stablecoins in self-custodial wallet. MoonPay integration. +- **Multi-chain deposits** — Solana, Polygon, Arbitrum, Base, BSC, Avalanche + +**Traction:** ~4,000+ MAU, 70% month-on-month retention, $1.2M+ in Visa card spend, 12,000+ token holders + +**Not yet live:** Trust Score (onchain credit scoring), unsecured loans, mortgages — still on roadmap + +## Team Performance Package (March 2026 proposal) + +0% team allocation at launch. New proposal for up to 25% contingent on reaching $5B valuation: +- Phase 1: 15% linear unlock between $100M-$1B market cap ($5.53-$55.30/token) +- Phase 2: 10% in equal tranches between $1.5B-$5B ($82.95-$197.55/token) +- No tokens unlock before January 2029 lockup regardless of milestone achievement +- Change-of-control protection: 30% of acquisition value to team if hostile takeover + +This is the strongest performance-alignment structure in the MetaDAO ecosystem — zero dilution unless the project is worth 100x+ the ICO valuation. + +## Governance Activity + +| Decision | Date | Outcome | Record | +|----------|------|---------|--------| +| ICO launch | 2025-10-14 | Completed, $3.5M raised | [[avici-futardio-launch]] | +| Team performance package | 2026-03-30 | Proposed | See inbox/archive | + +## Open Questions + +- **Team anonymity.** No founder names publicly disclosed. RootData shows 55% transparency score and project "not claimed." This is unusual for a project processing 100K+ monthly card transactions. +- **Credit scoring timeline.** The Trust Score is the key differentiator vs. existing crypto cards, but it's still on the roadmap. Without it, Avici is a good crypto debit card but not the "internet bank" the pitch describes. +- **Regulatory exposure.** Visa card program in 100+ countries implies banking partnerships and compliance obligations. How does futarchy governance interact with regulated card issuer requirements? ## Timeline -- **2025-10-14** — Futardio launch opens ($2M target) -- **2025-10-18** — Launch closes. $3.5M raised. 
-- **2026-01-00** — Performance update: reached 21x peak return, currently trading at ~7x from ICO price -## Relationship to KB -- futardio — launched on Futardio platform -- [[cryptos primary use case is capital formation not payments or store of value because permissionless token issuance solves the fundraising bottleneck that solo founders and small teams face]] — test case for banking-focused crypto raising via permissionless ICO +- **2025-10-14** — MetaDAO curated ICO opens ($2M target) +- **2025-10-18** — ICO closes. $3.5M raised (17x oversubscribed). +- **2025-11** — Card top-up speed reduced from minutes to seconds +- **2026-01-09** — SOLO yield integration for passive stablecoin earnings +- **2026-01-10** — Named Virtual Accounts launched (account number + IBAN) +- **2026-01** — Peak return: 21x from ICO price ($7.56 ATH) +- **2026-03-30** — Team performance package proposal (0% → up to 25% contingent on $5B) --- -Relevant Entities: -- futardio — launch platform -- [[metadao]] — parent ecosystem +Relevant Notes: +- [[metadao]] — launch platform (curated ICO #4) +- [[solomon]] — SOLO yield integration partner +- [[internet capital markets compress fundraising from months to days because permissionless raises eliminate gatekeepers while futarchy replaces due diligence bottlenecks with real-time market pricing]] — 4-day raise window with 17x oversubscription confirms compression Topics: - [[internet finance and decision markets]] diff --git a/entities/internet-finance/loyal.md b/entities/internet-finance/loyal.md index ba36b444a..21a67d277 100644 --- a/entities/internet-finance/loyal.md +++ b/entities/internet-finance/loyal.md @@ -9,42 +9,80 @@ website: https://askloyal.com status: active tracked_by: rio created: 2026-03-11 -last_updated: 2026-03-11 -parent: "futardio" +last_updated: 2026-04-02 +parent: "[[metadao]]" +launch_platform: metadao-curated +launch_order: 5 category: "Decentralized private AI intelligence protocol (Solana)" -stage: growth -funding: "$2.5M raised via Futardio ICO" +stage: early +token_symbol: "$LOYAL" +token_mint: "LYLikzBQtpa9ZgVrJsqYGQpR3cC1WMJrBHaXGrQmeta" +founded_by: "unknown" built_on: ["Solana", "MagicBlock", "Arcium"] -tags: ["privacy", "ai", "futardio-launch", "ownership-coin"] +tags: [metadao-curated-launch, ownership-coin, privacy, ai, confidential-computing] +competitors: ["Venice.ai", "private AI chat alternatives"] source_archive: "inbox/archive/2025-10-18-futardio-launch-loyal.md" --- # Loyal ## Overview -Open source, decentralized, censorship-resistant intelligence protocol. Private AI conversations with no single point of failure — computations via confidential oracles, key derivation in confidential rollups, encrypted chat on decentralized storage. Sits at the intersection of AI privacy and crypto infrastructure. -## Current State -- **Raised**: $2.5M final (target $500K, $75.9M committed — 152x oversubscribed) -- **Treasury**: $260K USDC remaining -- **Token**: LOYAL (mint: LYLikzBQtpa9ZgVrJsqYGQpR3cC1WMJrBHaXGrQmeta), price: $0.14 -- **Monthly allowance**: $60K -- **Launch mechanism**: Futardio v0.6 (pro-rata) +Open source, decentralized, censorship-resistant intelligence protocol. Private AI conversations with no single point of failure — computations via confidential oracles (Arcium), key derivation in confidential rollups with granular read controls, encrypted chats on decentralized storage. Sits at the intersection of AI privacy and crypto infrastructure. 
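As a sketch of the client-side pattern this architecture implies (derive a per-chat key, encrypt locally, and only then hand ciphertext to storage), consider the following. It is illustrative only: Loyal's actual key derivation runs inside confidential rollups and Arcium oracles, and the primitives shown (HKDF plus AES-GCM from the Python `cryptography` package) are stand-ins, not the protocol's real stack.

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def derive_chat_key(master_secret: bytes, chat_id: str) -> bytes:
    """Derive an independent key per conversation so one leak never exposes the rest."""
    return HKDF(
        algorithm=hashes.SHA256(),
        length=32,
        salt=None,
        info=f"chat:{chat_id}".encode(),
    ).derive(master_secret)

def encrypt_message(chat_key: bytes, plaintext: str) -> bytes:
    """Encrypt locally; only ciphertext ever reaches decentralized storage."""
    nonce = os.urandom(12)
    return nonce + AESGCM(chat_key).encrypt(nonce, plaintext.encode(), None)

def decrypt_message(chat_key: bytes, blob: bytes) -> str:
    nonce, ct = blob[:12], blob[12:]
    return AESGCM(chat_key).decrypt(nonce, ct, None).decode()

if __name__ == "__main__":
    master = os.urandom(32)   # in Loyal's design this secret never leaves the confidential environment
    key = derive_chat_key(master, "chat-42")
    blob = encrypt_message(key, "what are the side effects of drug X?")
    assert decrypt_message(key, blob) == "what are the side effects of drug X?"
```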
+ +## Investment Rationale (from raise) + +"Fight against mass surveillance with us. Your chats with AI have no protection. They're used to put people behind bars, to launch targeted ads and in model training. Every question you ask can and will be used against you." + +The pitch is existential: as AI becomes a primary interface for knowledge work, the privacy of AI conversations becomes a fundamental rights issue. Loyal is building the infrastructure so that no single entity can surveil, censor, or monetize your AI interactions. The 152x oversubscription — the highest in MetaDAO history — reflects strong conviction in this thesis. + +## ICO Details + +- **Platform:** MetaDAO curated launchpad (5th launch) +- **Date:** October 18-22, 2025 +- **Target:** $500K +- **Committed:** $75.9M (152x oversubscribed — highest ratio in MetaDAO history) +- **Final raise:** $2.5M +- **Launch mechanism:** Futardio v0.6 (pro-rata) + +## Current State (as of early 2026) + +- **Treasury:** $260K USDC remaining (after $1.5M buyback) +- **Monthly allowance:** $60K +- **Product status:** In development. Private AI chat protocol powered by MagicBlock + Arcium confidential computing stack. + +## Governance Activity — Active Treasury Defense + +Loyal is notable for aggressive treasury management — deploying both buybacks and liquidity burns to defend NAV: + +| Decision | Date | Outcome | Record | +|----------|------|---------|--------| +| ICO launch | 2025-10-18 | Completed, $2.5M raised (152x oversubscribed) | [[loyal-futardio-launch]] | +| $1.5M treasury buyback | 2025-11 | Passed — 8,640 orders over 30 days at max $0.238/token (NAV minus 2 months opex) | [[loyal-buyback-up-to-nav]] | +| 90% liquidity pool burn | 2025-12 | Passed — burned 809,995 LOYAL from Meteora DAMM v2 pool | [[loyal-liquidity-adjustment]] | + +**Buyback logic:** $1.5M at max $0.238/token = estimated 6.3M LOYAL purchased. 90-day cooldown on new buyback/redemption proposals. The max price was calculated as NAV minus 2 months operating expenses — disciplined framework. + +**Liquidity burn rationale:** The Meteora pool was creating selling pressure without corresponding price support. 90% withdrawal (not 100%) to avoid Dexscreener indexing visibility issues. Second MetaDAO project to deploy NAV defense through buybacks. + +## Open Questions + +- **Product delivery.** $260K treasury and $60K/month burn gives ~4 months runway. The confidential computing stack (MagicBlock + Arcium) is ambitious infrastructure. Can they ship with this runway? +- **Market timing.** Private AI chat is a growing concern but the paying market is uncertain. Venice.ai is the closest competitor with a different approach (no blockchain, subscription model). +- **Oversubscription paradox.** 152x oversubscription generated massive attention but the pro-rata mechanism means most committed capital was returned. Does the ratio reflect genuine conviction or allocation-hunting behavior? ## Timeline -- **2025-10-18** — Futardio launch opens ($500K target) -- **2025-10-22** — Launch closes. $2.5M raised. -- **2026-01-00** — ICO performance: maximum 30% drawdown from launch price -## Relationship to KB -- futardio — launched on Futardio platform -- [[internet capital markets compress fundraising from months to days because permissionless raises eliminate gatekeepers while futarchy replaces due diligence bottlenecks with real-time market pricing]] — 4-day raise window confirms compression +- **2025-10-18** — MetaDAO curated ICO opens ($500K target) +- **2025-10-22** — ICO closes. 
$2.5M raised (152x oversubscribed). +- **2025-11** — $1.5M treasury buyback (8,640 orders over 30 days, max $0.238/token) +- **2025-12** — 90% LOYAL tokens burned from Meteora DAMM v2 pool --- -Relevant Entities: -- futardio — launch platform -- [[metadao]] — parent ecosystem +Relevant Notes: +- [[metadao]] — launch platform (curated ICO #5) +- [[internet capital markets compress fundraising from months to days because permissionless raises eliminate gatekeepers while futarchy replaces due diligence bottlenecks with real-time market pricing]] — 4-day raise window with 152x oversubscription Topics: - [[internet finance and decision markets]] diff --git a/entities/internet-finance/mtncapital.md b/entities/internet-finance/mtncapital.md index 765a2ab87..923a656b1 100644 --- a/entities/internet-finance/mtncapital.md +++ b/entities/internet-finance/mtncapital.md @@ -6,70 +6,72 @@ domain: internet-finance status: liquidated tracked_by: rio created: 2026-03-20 -last_updated: 2026-03-20 -tags: [metadao, futarchy, ico, liquidation, fund] +last_updated: 2026-04-02 +tags: [metadao-curated-launch, ownership-coin, futarchy, fund, liquidation] token_symbol: "$MTN" +token_mint: "unknown" parent: "[[metadao]]" -launch_date: 2025-08 +launch_platform: metadao-curated +launch_order: 1 +launch_date: 2025-04 amount_raised: "$5,760,000" built_on: ["Solana"] +handles: [] +website: "https://v1.metadao.fi/mtncapital" +competitors: [] --- # mtnCapital ## Overview -mtnCapital was a futarchy-governed investment fund launched through MetaDAO's permissioned launchpad. It raised approximately $5.76M USDC, all locked in the DAO treasury. The fund was subsequently wound down via futarchy governance vote (~Sep 2025), making it the **first MetaDAO project to be liquidated** — predating the Ranger Finance liquidation by approximately 6 months. +Futarchy-governed investment fund — the first ownership coin launched through MetaDAO's curated launchpad. Created by mtndao, focused exclusively on Solana ecosystem investments. All capital allocation decisions governed through prediction markets rather than traditional DAO voting. Any $MTN holder could submit investment proposals, making deal sourcing fully permissionless. -## Current State +## Investment Rationale (from raise) -- **Status:** Liquidated (wind-down completed via futarchy vote, ~September 2025) -- **Token:** $MTN (token_mint unknown) -- **Raise:** ~$5.76M USDC (all locked in DAO treasury) -- **Launch FDV:** Unknown — one source (@cryptof4ck) cites $3.3M but this is unverified and would imply a substantial discount to NAV at launch -- **Redemption price:** ~$0.604 per $MTN -- **Post-liquidation:** Token still traded with minimal volume (~$79/day as of Nov 2025) +The thesis was that futarchy-governed capital allocation would outperform traditional VC by removing gatekeepers from deal flow and using market-based decision-making instead of committee votes. The CoinDesk coverage quoted the founder claiming the fund would "outperform VCs." The mechanism: propose an investment → conditional markets price the outcome → capital deploys only if the market signals positive expected value. -## ICO Details +## What Happened -Launched via MetaDAO's permissioned launchpad (~August 2025). All $5.76M raised was locked in the DAO treasury under futarchy governance. Token allocation details unknown. This was one of the earlier MetaDAO permissioned launches alongside Avici, Omnipair, Umbra, and Solomon Labs. 
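The redemption mechanism is what makes the NAV floor enforceable, and the Theia Research trade (detailed under Significance below) is the worked example: buy below redemption value, support the wind-down, redeem at NAV. A back-of-envelope check using the figures reported in this note (treated as approximate, since exact fills and fees are not public):

```python
# Approximate figures reported for the mtnCapital wind-down (not exact fills)
tokens_bought = 297_000        # MTN accumulated by Theia Research
avg_entry = 0.485              # average purchase price, below NAV
redemption_price = 0.604       # per-token USDC returned in the wind-down

cost = tokens_bought * avg_entry
proceeds = tokens_bought * redemption_price
profit = proceeds - cost

print(f"cost      ${cost:,.0f}")        # about $144,045
print(f"proceeds  ${proceeds:,.0f}")    # about $179,388
print(f"profit    ${profit:,.0f}")      # about $35,343, matching the ~$35K reported
print(f"return    {profit / cost:.1%}") # about 24.5% if the wind-down passes
```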
- -## Timeline - -- **~2025-08** — Launched via MetaDAO permissioned ICO, raised ~$5.76M USDC -- **2025-08 to 2025-09** — Trading period. At times traded above NAV. -- **~2025-09** — Futarchy governance proposal to wind down operations passed. Capital returned to token holders at ~$0.604/MTN redemption rate. See [[mtncapital-wind-down]] for decision record. -- **2025-09** — Theia Research profited ~$35K via NAV arbitrage (bought at avg $0.485, redeemed at $0.604) -- **2025-11** — @_Dean_Machine flagged potential manipulation concerns "going as far back as the mtnCapital raise, trading, and redemption" -- **2026-01** — @AK47ven listed mtnCapital among 5/8 MetaDAO launches still green since launch -- **2026-03** — @donovanchoy cited mtnCapital as first in liquidation sequence: "mtnCapital was liquidated and returned capital, then Hurupay, now (possibly) Ranger" +The fund underperformed. DAO members initiated a futarchy proposal to liquidate in September 2025. The proposal passed despite team opposition — the market prices clearly supported unwinding. Funds were returned to MTN holders via a one-way redemption mechanism (redeem MTN for USDC, no fees). Redemption price: ~$0.604 per $MTN. ## Significance -mtnCapital is the **first empirical test of the unruggable ICO enforcement mechanism**. The futarchy governance system approved a wind-down, capital was returned to investors, and the process was orderly. This establishes that: +mtnCapital is the **first empirical test of the unruggable ICO enforcement mechanism.** Three things it proved: -1. **Futarchy-governed liquidation works in practice** — mechanism moved from theoretical to empirically validated -2. **NAV arbitrage creates a price floor** — Theia bought below redemption value and profited, confirming the arbitrage mechanism -3. **The liquidation sequence matters** — mtnCapital (orderly wind-down) → Hurupay (refund, didn't reach minimum) → Ranger (contested liquidation with misrepresentation) shows enforcement operating across different failure modes +1. **Futarchy can force liquidation against team wishes.** The team opposed the wind-down but the market overruled them. This is the mechanism working as designed — investor protection without legal proceedings. + +2. **NAV arbitrage is real.** Theia Research bought 297K $MTN at ~$0.485 (below NAV), voted for wind-down, redeemed at ~$0.604. Profit: ~$35K. This confirms the NAV floor is enforceable through market mechanics. + +3. **Orderly unwinding is possible.** Capital returned, redemption mechanism worked, no rugpull. The process established the liquidation playbook that Ranger Finance later followed. ## Open Questions -- What specifically triggered the wind-down? The fund raised $5.76M but apparently failed to deploy capital successfully. Details sparse. -- @_Dean_Machine's manipulation concerns — was there exploitative trading around the raise/redemption cycle? -- Token allocation structure unknown — what % was ICO vs team vs LP? This affects the FDV/NAV relationship. +- **Manipulation concerns.** @_Dean_Machine flagged potential exploitation "going as far back as the mtnCapital raise, trading, and redemption." He stated it's "very unlikely that the MetaDAO team is involved" but "very likely that someone has been taking advantage." Proposed fixes: fees on ICO commitments, restricted capital from newly funded wallets, wallet reputation systems. +- **Why did it underperform?** No detailed post-mortem published by the team. 
The mechanism proved the fund could be wound down — but the market never tested whether futarchy-governed allocation could outperform in a bull case. -## Relationship to KB -- [[metadao]] — parent entity, permissioned launchpad -- [[decision markets make majority theft unprofitable through conditional token arbitrage]] — mtnCapital liquidation is empirical confirmation of the NAV arbitrage mechanism -- [[futarchy-governed liquidation is the enforcement mechanism that makes unruggable ICOs credible because investors can force full treasury return when teams materially misrepresent]] — first live test of this enforcement mechanism -- [[MetaDAO is the futarchy launchpad on Solana where projects raise capital through unruggable ICOs governed by conditional markets creating the first platform for ownership coins at scale]] — one of the earlier permissioned launches +## Timeline + +- **2025-04** — Launched via MetaDAO curated ICO, raised ~$5.76M USDC (first-ever MetaDAO launch) +- **2025-04 to 2025-09** — Trading period. At times traded above NAV. +- **~2025-09** — Futarchy governance proposal to wind down passed despite team opposition. Capital returned at ~$0.604/MTN redemption rate. See [[mtncapital-wind-down]]. +- **2025-09** — Theia Research profited ~$35K via NAV arbitrage +- **2025-11** — @_Dean_Machine flagged manipulation concerns +- **2026-01** — @AK47ven listed mtnCapital among 5/8 MetaDAO launches still green since launch +- **2026-03** — @donovanchoy cited mtnCapital as first in liquidation sequence: mtnCapital → Hurupay → Ranger + +## Governance Activity + +| Decision | Date | Outcome | Record | +|----------|------|---------|--------| +| Wind-down proposal | ~2025-09 | Passed (liquidation) | [[mtncapital-wind-down]] | --- -Relevant Entities: -- [[metadao]] — platform -- [[theia-research]] — NAV arbitrage participant -- [[ranger-finance]] — second liquidation case (different failure mode) +Relevant Notes: +- [[metadao]] — launch platform (curated ICO #1) +- [[ranger-finance]] — second project to be liquidated via futarchy +- [[futarchy is manipulation-resistant because attack attempts create profitable opportunities for defenders]] — mtnCapital NAV arbitrage supports this claim Topics: - [[internet finance and decision markets]] diff --git a/entities/internet-finance/p2p-me.md b/entities/internet-finance/p2p-me.md index 1dad62c18..ffa515635 100644 --- a/entities/internet-finance/p2p-me.md +++ b/entities/internet-finance/p2p-me.md @@ -1,71 +1,107 @@ --- type: entity entity_type: company -name: P2P.me +name: "P2P.me" domain: internet-finance +handles: [] +website: https://p2p.me status: active +tracked_by: rio +created: 2026-03-20 +last_updated: 2026-04-02 +parent: "[[metadao]]" +launch_platform: metadao-curated +launch_order: 10 +category: "Non-custodial fiat-to-stablecoin on/off ramp" +stage: growth +token_symbol: "$P2P" +token_mint: "P2PXup1ZvMpCDkJn3PQxtBYgxeCSfH39SFeurGSmeta" founded: 2024 headquarters: India +built_on: ["Base", "Solana"] +tags: [metadao-curated-launch, ownership-coin, payments, on-off-ramp, emerging-markets] +competitors: ["MoonPay", "Transak", "Local Bitcoins successors"] +source_archive: "inbox/archive/2026-01-01-futardio-launch-p2p-protocol.md" --- # P2P.me ## Overview -Non-custodial USDC-to-fiat on/off ramp built on Base, targeting emerging markets with peer-to-peer crypto-to-fiat conversion. +Non-custodial peer-to-peer USDC-to-fiat on/off ramp targeting emerging markets. 
Users convert between stablecoins and local fiat currencies without centralized custody. Live for 2 years on Base, expanding to Solana. Uses a Proof-of-Credibility system with zk-KYC to prevent fraud (<1 in 1,000 transactions). -## Key Metrics (as of March 2026) +## Investment Rationale (from raise) -- **Users:** 23,000+ registered -- **Geography:** India (78%), Brazil (15%), Argentina, Indonesia -- **Volume:** Peaked $3.95M monthly (February 2026) -- **Revenue:** ~$500K annualized -- **Gross Profit:** ~$82K annually (after costs) -- **Team Size:** 25 staff -- **Monthly Burn:** $175K ($75K salaries, $50K marketing, $35K legal, $15K infrastructure) +The most recent MetaDAO curated launch and the first with a live, revenue-generating product and institutional backing. The bull case: P2P.me solves a real problem in emerging markets (India, Brazil, Argentina, Indonesia) where traditional on/off ramps are expensive, slow, or blocked by banking infrastructure. In India specifically, zk-KYC addresses the bank-freeze problem that plagues centralized crypto services. VC backing from Multicoin Capital ($1.4M), Coinbase Ventures ($500K), and Alliance DAO ($350K) provides validation and distribution. ## ICO Details -- **Platform:** MetaDAO -- **Raise Target:** $6M -- **FDV:** ~$15.5M -- **Token Price:** $0.60 -- **Tokens Sold:** 10M -- **Total Supply:** 25.8M -- **Liquid at Launch:** 50% -- **Team Unlock:** Performance-based, no benefit below 2x ICO price -- **Scheduled Date:** March 26, 2026 +- **Platform:** MetaDAO curated launchpad (10th launch — most recent) +- **Date:** March 26-30, 2026 +- **Target:** $6M at $15.5M FDV ($0.60/token, later adjusted to $0.01/token) +- **Total bids:** $7.15M (above target) +- **Final raise:** $5.2M +- **Total supply:** 25.8M tokens +- **Liquid at launch:** 50% (highest in MetaDAO history) +- **Team tokens (30%):** 12-month cliff, performance-based unlocks at 2x/4x/8x/16x/32x ICO price +- **Investor tokens (20%):** 12-month full lockup, then 5 equal unlocks over 12 months -## Business Model +## Current State (as of March 2026) -- B2B SDK deployment potential -- Circles of Trust merchant onboarding for geographic expansion -- On-chain P2P with futarchy governance +**Product metrics:** +- **Users:** 23,000+ registered +- **Geography:** India (78%), Brazil (15%), Argentina, Indonesia +- **Volume:** Peaked $3.95M monthly (February 2026) +- **Weekly actives:** 2,000-2,500 (~10-11% of base) +- **Revenue:** ~$578K annualized (2-6% spread on transactions) +- **Gross profit:** $4.5K-$13.3K/month (inconsistent) +- **NPS:** 80; 65% would be "very disappointed" without the product +- **Fraud rate:** <1 in 1,000 transactions (Proof-of-Credibility) -## Governance +**Financial reality:** +- Monthly burn: $175K ($75K salaries, $50K marketing, $35K legal, $15K infrastructure) +- Runway: ~34 months at current burn +- Self-sustainability threshold: ~$875K/month revenue (currently ~$48K/month) +- Targeting $500M monthly volume over next 18 months -Treasury controlled by token holders through futarchy-based governance. Team cannot unilaterally spend raised capital. +**Prior funding:** +- Multicoin Capital: $1.4M (Jan 2025, 9.33% supply) +- Coinbase Ventures: $500K (Feb 2025, 2.56% supply) +- Alliance DAO: $350K (2024, 4.66% supply) +- Reclaim Protocol: $80K angel (2023, 3.45% supply) + +## The Polymarket Incident + +In March 2026, the P2P.me team placed bets on Polymarket that their own ICO would reach the $6M target, using the pseudonym "P2PTeam." 
They had a verbal $3M commitment from Multicoin at the time. They netted ~$14,700 in profit. The team publicly apologized, sent profits to the MetaDAO treasury, and adopted a formal policy against future prediction market trades on their own activities. Covered by CoinTelegraph, BeInCrypto, Unchained. + +This incident is noteworthy because it highlights the tension between prediction market participation and insider information — the same issue that recurs in futarchy design (see MetaDAO decision market analysis). + +## Analyst Concerns + +Pine Analytics characterized the valuation as "stretched relative to fundamentals" — the ~182x price-to-gross-profit multiple requires significant growth acceleration that recent data does not support. User growth has stalled for ~6 months with weekly actives plateauing. Delphi Digital found 30-40% of MetaDAO ICO participants are passives/flippers, creating structural post-TGE selling pressure independent of project quality. + +## Roadmap + +- Q2 2026: B2B SDK launch, treasury allocation, multi-currency expansion +- Q3 2026: Solana deployment, governance Phase 1 (insurance/disputes) +- Q4 2026: Phase 2 governance (token-holder voting for non-critical parameters) +- Q1 2027: Operating profitability target ## Timeline -- **2024** — Founded -- **Mid-2025** — Active user growth plateaus -- **February 2026** — Peak monthly volume of $3.95M -- **March 15, 2026** — Pine Analytics publishes pre-ICO analysis identifying 182x gross profit multiple concern -- **March 26, 2026** — ICO scheduled on MetaDAO +- **2024** — Founded, initial angel round from Reclaim Protocol +- **2025-01** — Multicoin Capital $1.4M +- **2025-02** — Coinbase Ventures $500K +- **2026-01-01** — MetaDAO ICO initialized +- **2026-03-16** — Polymarket incident (team bets on own ICO) +- **2026-03-26** — MetaDAO curated ICO goes live +- **2026-03-30** — ICO closes. $5.2M raised. -- **2026-03-26** — [[p2p-me-metadao-ico]] Active: ICO scheduled, targeting $6M raise at $15.5M FDV with Pine Analytics identifying 182x gross profit multiple concerns -- **2026-03-26** — [[p2p-me-ico-march-2026]] Active: $6M ICO at $15.5M FDV scheduled on MetaDAO -- **2026-03-26** — [[metadao-p2p-me-ico]] Active: ICO launch targeting $15.5M FDV at 182x gross profit multiple -- **2026-03-26** — [[p2p-me-metadao-ico-march-2026]] Active: ICO scheduled, targeting $6M at $15.5M FDV -- **2026-03-26** — [[p2p-me-metadao-ico-march-2026]] Status pending: ICO vote scheduled -- **2026-03-26** — [[p2p-me-ico-launch]] Active: ICO launch on MetaDAO with $6M minimum fundraising target -- **2026-03-24** — MetaDAO launch allocation structure announced: XP holders receive priority allocation with pro-rata distribution and bonus multipliers for P2P points holders -- **2026-03-25** — Announced $P2P token sale on MetaDAO with participation from Multicoin Capital, Moonrock Capital, and ex-Solana Foundation investors. Multiple VCs published public investment theses ahead of the ICO. 
-- **2026-03-26** — [[p2p-me-metadao-ico]] Active: ICO scheduled on MetaDAO platform targeting $15.5M FDV -- **2026-03-27** — ICO launches on MetaDAO with 7-9 month delay on community governance proposals as post-ICO guardrail -- **2026-03-27** — ICO live on MetaDAO with 7-9 month delay before community governance proposals enabled -- **2026-03-27** — ICO structure includes 7-9 month delay before community governance proposals become eligible -- **2026-03-27** — ICO launched on MetaDAO with 7-9 month delay before community governance proposals become enabled, implementing post-ICO timing guardrails -- **2026-03-27** — ICO live on MetaDAO with 7-9 month delay on community governance proposals as post-ICO guardrail -- **2026-03-30** — Transparency issues noted in market analysis; trading policies revised post-market involvement; potential trust rebuilding via MetaDAO integration discussed \ No newline at end of file +--- + +Relevant Notes: +- [[metadao]] — launch platform (curated ICO #10, most recent) +- [[omnipair]] — earlier MetaDAO launch with different token structure + +Topics: +- [[internet finance and decision markets]] diff --git a/entities/internet-finance/paystream.md b/entities/internet-finance/paystream.md index a0f127008..a108cc72f 100644 --- a/entities/internet-finance/paystream.md +++ b/entities/internet-finance/paystream.md @@ -8,41 +8,78 @@ website: https://paystream.finance status: active tracked_by: rio created: 2026-03-11 -last_updated: 2026-03-11 -parent: "futardio" +last_updated: 2026-04-02 +parent: "[[metadao]]" +launch_platform: metadao-curated +launch_order: 7 category: "Liquidity optimization protocol (Solana)" -stage: growth -funding: "$750K raised via Futardio ICO" +stage: early +token_symbol: "$PAYS" +token_mint: "PAYZP1W3UmdEsNLJwmH61TNqACYJTvhXy8SCN4Tmeta" +founded_by: "Maushish Yadav" built_on: ["Solana"] -tags: ["defi", "lending", "liquidity", "futardio-launch", "ownership-coin"] +tags: [metadao-curated-launch, ownership-coin, defi, lending, liquidity] +competitors: ["Kamino", "Juplend", "MarginFi"] source_archive: "inbox/archive/2025-10-23-futardio-launch-paystream.md" --- # Paystream ## Overview -Modular Solana protocol unifying peer-to-peer lending, leveraged liquidity provisioning, and yield routing. Matches lenders and borrowers at mid-market rates, eliminating APY spreads seen in pool-based models like Kamino and Juplend. Integrates with Raydium CLMM, Meteora DLMM, and DAMM v2 pools. -## Current State -- **Raised**: $750K final (target $550K, $6.1M committed — 11x oversubscribed) -- **Treasury**: $241K USDC remaining -- **Token**: PAYS (mint: PAYZP1W3UmdEsNLJwmH61TNqACYJTvhXy8SCN4Tmeta), price: $0.04 -- **Monthly allowance**: $33.5K -- **Launch mechanism**: Futardio v0.6 (pro-rata) +Modular Solana protocol unifying peer-to-peer lending, leveraged liquidity provisioning, and yield routing into a single capital-efficient engine. Matches lenders and borrowers at fair mid-market rates, eliminating the wide APY spreads seen in pool-based models like Kamino and Juplend. Integrates with Raydium CLMM, Meteora DLMM, and DAMM v2 pools. + +## Investment Rationale (from raise) + +The pitch: every dollar on Paystream is always moving, always earning. Pool-based lending models have structural inefficiency — wide APY spreads between what lenders earn and borrowers pay. P2P matching eliminates the spread. Leveraged LP strategies turn idle capital into productive liquidity. 
The combination targets higher yields for lenders, lower rates for borrowers, and zero idle funds. + +## ICO Details + +- **Platform:** MetaDAO curated launchpad (7th launch) +- **Date:** October 23-27, 2025 +- **Target:** $550K +- **Committed:** $6.15M (11x oversubscribed) +- **Final raise:** $750K +- **Launch mechanism:** Futardio v0.6 (pro-rata) + +## Current State (as of early 2026) + +- **Trading:** ~$0.073, down from $0.09 ATH. Market cap ~$680K — true micro-cap +- **Volume:** Extremely thin (~$3.5K daily) +- **Supply:** ~12.9M circulating of 24.75M max +- **Achievement:** Won the **Solana Colosseum 2025 hackathon** +- **Treasury:** $241K USDC remaining, $33.5K monthly allowance + +## Team + +Founded by **Maushish Yadav**, formerly a crypto security researcher/auditor who audited protocols including Lido, Thorchain, and TempleGold. Security background is relevant for a DeFi lending protocol. + +## Governance Activity + +| Decision | Date | Outcome | Record | +|----------|------|---------|--------| +| ICO launch | 2025-10-23 | Completed, $750K raised | [[paystream-futardio-fundraise]] | +| $225K treasury buyback | 2026-01-16 | Passed — 4,500 orders over 15 days at max $0.065/token | See inbox/archive | + +The buyback follows the NAV-defense pattern now standard across MetaDAO launches — when an ownership coin trades significantly below treasury NAV, the rational move is buybacks until price converges. + +## Open Questions + +- **Adoption.** Extremely thin trading volume and micro-cap status suggest limited market awareness. The hackathon win is a signal but the protocol needs users. +- **Competitive moat.** P2P lending + leveraged LP is a crowded space on Solana. What prevents Kamino, MarginFi, or Juplend from adding similar P2P matching? +- **Treasury runway.** $241K at $33.5K/month gives ~7 months without revenue. The buyback spent $225K — aggressive given the treasury size. ## Timeline -- **2025-10-23** — Futardio launch opens ($550K target) -- **2025-10-27** — Launch closes. $750K raised. -- **2026-01-00** — ICO performance: maximum 30% drawdown from launch price -## Relationship to KB -- futardio — launched on Futardio platform +- **2025-10-23** — MetaDAO curated ICO opens ($550K target) +- **2025-10-27** — ICO closes. $750K raised (11x oversubscribed). 
+- **2025** — Won Solana Colosseum hackathon +- **2026-01-16** — $225K USDC treasury buyback proposal passed (max $0.065/token, 90-day cooldown) --- -Relevant Entities: -- futardio — launch platform -- [[metadao]] — parent ecosystem +Relevant Notes: +- [[metadao]] — launch platform (curated ICO #7) Topics: - [[internet finance and decision markets]] diff --git a/entities/internet-finance/solomon.md b/entities/internet-finance/solomon.md index f0dfcc8a2..2dcfe4cb1 100644 --- a/entities/internet-finance/solomon.md +++ b/entities/internet-finance/solomon.md @@ -4,62 +4,97 @@ entity_type: company name: "Solomon" domain: internet-finance handles: ["@solomon_labs"] +website: https://solomonlabs.org status: active tracked_by: rio created: 2026-03-11 -last_updated: 2026-03-11 -founded: 2025-11-14 -founders: ["Ranga (@oxranga)"] -category: "Futardio-launched ownership coin with active futarchy governance (Solana)" -parent: "futardio" -stage: early -key_metrics: - raise: "$8M raised ($103M committed — 13x oversubscription)" - treasury: "$6.1M USDC" - token_price: "$0.55" - monthly_allowance: "$100K" - governance: "Active futarchy governance + treasury subcommittee (DP-00001)" -competitors: [] +last_updated: 2026-04-02 +parent: "[[metadao]]" +launch_platform: metadao-curated +launch_order: 8 +category: "Yield-bearing stablecoin protocol (Solana)" +stage: growth +token_symbol: "$SOLO" +token_mint: "SoLo9oxzLDpcq1dpqAgMwgce5WqkRDtNXK7EPnbmeta" +founded_by: "Ranga C (@oxranga)" built_on: ["Solana", "MetaDAO Autocrat"] -tags: ["ownership-coins", "futarchy", "treasury-management", "metadao-ecosystem"] +tags: [metadao-curated-launch, ownership-coin, stablecoin, yield, treasury-management] +competitors: ["Ethena", "Ondo Finance", "Mountain Protocol"] source_archive: "inbox/archive/2025-11-14-futardio-launch-solomon.md" --- # Solomon ## Overview -One of the first successful Futardio launches. Raised $8M through the pro-rata mechanism ($103M committed = 13x oversubscription). Notable for implementing structured treasury management through futarchy — the treasury subcommittee proposal (DP-00001) established operational governance scaffolding on top of futarchy's market-based decision mechanism. -## Current State -- **Product**: USDv — yield-bearing stablecoin. YaaS (Yield-as-a-Service) streams yield to approved USDv holders, LP positions, and treasury balances without wrappers or vaults. -- **Governance**: Active futarchy governance through MetaDAO Autocrat. Treasury subcommittee proposal (DP-00001) passed March 9, 2026 (cleared 1.5% TWAP threshold by +2.22%). Moves up to $150K USDC into segregated legal budget, nominates 4 subcommittee designates. -- **Treasury**: Actively managed through buybacks and strategic allocations. DP-00001 is step 1 of 3: (1) legal/pre-formation, (2) SOLO buyback framework, (3) treasury account activation. -- **YaaS status**: Closed beta — LP volume crossed $1M, OroGold GOLD/USDv pool delivering 59.6% APY. First deployment drove +22.05% LP APY with 3.5x pool growth. -- **Significance**: Test case for whether futarchy-governed organizations converge on traditional corporate governance scaffolding for operations +Composable yield-bearing stablecoin protocol on Solana. Core product is USDv — a stablecoin that generates yield from delta-neutral basis trades (spot long / perp short on BTC/ETH/SOL majors) with T-bill integration in the last mile. YaaS (Yield-as-a-Service) streams yield to approved USDv holders, LP positions, and treasury balances without wrappers or vaults. 
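A simplified sketch of how a delta-neutral basis position of the kind described above generates stablecoin yield: hold spot (or an LST) long and short an equal notional of the perpetual, so price moves cancel and the position collects funding plus staking yield. The rates and sizes below are made-up inputs for illustration, not Solomon's actual book or published yields.

```python
def basis_position_apy(spot_staking_apy: float, funding_rate_8h: float) -> float:
    """Delta-neutral carry: long spot/LST earns staking yield, short perp collects funding.
    Price exposure nets to roughly zero because both legs have equal notional."""
    funding_apy = funding_rate_8h * 3 * 365   # funding is typically paid every 8 hours
    return spot_staking_apy + funding_apy

def usd_yield(notional_usd: float, apy: float, days: int) -> float:
    return notional_usd * apy * days / 365

if __name__ == "__main__":
    # Hypothetical inputs: 7% SOL staking yield, +0.01% funding per 8h interval
    apy = basis_position_apy(spot_staking_apy=0.07, funding_rate_8h=0.0001)
    print(f"carry APY ~= {apy:.1%}")                                   # about 18%
    print(f"30-day yield on $1M ~= ${usd_yield(1_000_000, apy, 30):,.0f}")
```

Perp funding can run negative for stretches, which is presumably part of why the design adds T-bill yield in the last mile; the sketch ignores that variability.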
+
+## Investment Rationale (from raise)
+
+The largest MetaDAO curated ICO by committed capital ($102.9M from 6,603 contributors). The thesis: yield-bearing stablecoins are the next major DeFi primitive, and Solomon's approach — basis trades + T-bills, distributed through YaaS — avoids the centralization risks of Ethena while maintaining competitive yields. The massive oversubscription (51.5x against the $2M target, roughly 13x the final $8M raise) reflected conviction that this was the strongest product thesis in the MetaDAO pipeline.
+
+## ICO Details
+
+- **Platform:** MetaDAO curated launchpad (8th launch)
+- **Date:** November 14-18, 2025
+- **Target:** $2M
+- **Committed:** $102.9M from 6,603 contributors (51.5x oversubscribed — the largest committed-capital total in MetaDAO history)
+- **Final raise:** $8M (capped)
+- **Launch mechanism:** Futardio v0.6 (pro-rata)
+
+## Current State (as of early 2026)
+
+**Product:**
+- USDv live in **private beta** with seven-figure TVL
+- TVL reached **$3M** (30% growth from prior update)
+- sUSDv beta rate: **~20.9% APY**
+- YaaS integration progressing with a major neobank partner (Avici)
+- Cantina audit completed
+- Legal clearance ~1 month away
+
+**Token:** Trading in the ~$0.66-$0.85 range, down from a $1.41 ATH. Very low secondary volume (~$53/day).
+
+**Team:** Led by Ranga C, who publishes Lab Notes on Substack. New developer hired (Google/Superteam/Solana hackathon background). 50+ commits in the recent sprint — Solana parsing, AMM execution layer, internal tooling. Recruiting a senior backend engineer.
+
+## Governance Activity
+
+Solomon has the most sophisticated governance formation of any MetaDAO project — methodically building corporate-style governance scaffolding through futarchy approvals:
+
+| Decision | Date | Outcome | Record |
+|----------|------|---------|--------|
+| ICO launch | 2025-11-14 | Completed, $8M raised | [[solomon-futardio-launch]] |
+| DP-00001: Treasury subcommittee + legal budget | 2026-03 | Passed (+2.22% above TWAP threshold) | [[solomon-treasury-subcommittee]] |
+| DP-00002: $1M SOLO acquisition + restricted incentives reserve | 2026-03 | Passed | [[solomon-solo-acquisition]] |
+
+**DP-00001** details: $150K capped legal/compliance budget in a segregated wallet. Pre-formation treasury subcommittee with 4 designates. Staged approach: (1) legal foundation → (2) policy framework → (3) delegated authority. No authority to move general funds yet.
+
+**DP-00002** details: $1M USDC to acquire SOLO at a max of $0.74. Tokens held in a restricted reserve for future incentive programs (the Pips program has first call). Cannot be self-dealt, lent, pledged, or used for compensation without governance approval.
+
+## Why Solomon Matters for MetaDAO
+
+Solomon is the strongest existence proof that futarchy-governed organizations can build real corporate governance infrastructure. The staged approach — legal first, then policy, then delegated authority — mirrors how traditional startups formalize governance, but every step requires market-based approval rather than board votes. If Solomon ships USDv at scale with 20%+ yields and proper governance, it validates the entire ownership coin model.
+
+## Open Questions
+
+- **Ethena comparison.** USDv uses the same basis-trade strategy as Ethena's USDe. What's the structural advantage beyond decentralized governance? Scale matters for basis-trade profitability.
+- **"Hedge fund in disguise?"** Meme Insider questioned whether USDv is just a hedge fund wrapped in stablecoin branding.
The counter: transparent governance + T-bill integration + YaaS distribution make it structurally different from an opaque fund. +- **Low secondary liquidity.** $53/day volume despite $8M raise suggests most holders are passive. Does the market believe in the product or was this an oversubscription-driven allocation play? ## Timeline -- **2025-11-14** — Solomon launches via Futardio ($103M committed, $8M raised) -- **2026-02/03** — Lab Notes series (Ranga documenting progress publicly) -- **2026-03** — Treasury subcommittee proposal (DP-00001) — formalized operational governance -- **2026-01-00** — ICO performance: maximum 30% drawdown from launch price, part of convergence toward lower volatility in recent MetaDAO launches -## Competitive Position -Solomon is not primarily a competitive entity — it's an existence proof. It demonstrates that futarchy-governed organizations can raise capital, manage treasuries, and create operational governance structures. The key question is whether the futarchy layer adds genuine value beyond what a normal startup with transparent treasury management would achieve. - -## Investment Thesis -Solomon validates the ownership coin model: futarchy governance + permissionless capital formation + active treasury management. If Solomon outperforms comparable projects without futarchy governance, it strengthens the case for market-based governance as an organizational primitive. - -**Thesis status:** WATCHING - -## Relationship to KB -- [[futarchy-governed DAOs converge on traditional corporate governance scaffolding for treasury operations because market mechanisms alone cannot provide operational security and legal compliance]] — Solomon's DP-00001 is evidence for this -- [[ownership coins primary value proposition is investor protection not governance quality because anti-rug enforcement through market-governed liquidation creates credible exit guarantees that no amount of decision optimization can match]] — Solomon tests this +- **2025-11-14** — MetaDAO curated ICO opens ($2M target) +- **2025-11-18** — ICO closes. $8M raised ($102.9M committed, 51.5x oversubscribed). 
+- **2026-01** — Max 30% drawdown from launch price +- **2026-02/03** — Lab Notes series published (Ranga documenting progress publicly) +- **2026-03** — DP-00001: Treasury subcommittee + legal budget passed +- **2026-03** — DP-00002: $1M SOLO acquisition + restricted reserve passed +- **2026-03** — USDv private beta with $3M TVL, 20.9% APY --- -Relevant Entities: -- [[metadao]] — parent platform -- futardio — launch mechanism +Relevant Notes: +- [[metadao]] — launch platform (curated ICO #8) +- [[avici]] — YaaS integration partner (neobank + yield) Topics: - [[internet finance and decision markets]] diff --git a/entities/internet-finance/zklsol.md b/entities/internet-finance/zklsol.md index e48500a3c..2a25e96e3 100644 --- a/entities/internet-finance/zklsol.md +++ b/entities/internet-finance/zklsol.md @@ -8,40 +8,82 @@ website: https://zklsol.org status: active tracked_by: rio created: 2026-03-11 -last_updated: 2026-03-11 -parent: "futardio" -category: "LST-based privacy mixer (Solana)" -stage: growth -funding: "Raised via Futardio ICO (target $300K)" +last_updated: 2026-04-02 +parent: "[[metadao]]" +launch_platform: metadao-curated +launch_order: 6 +category: "Zero-knowledge privacy mixer with yield (Solana)" +stage: restructuring +token_symbol: "$ZKFG" +token_mint: "ZKFHiLAfAFMTcDAuCtjNW54VzpERvoe7PBF9mYgmeta" built_on: ["Solana"] -tags: ["privacy", "lst", "defi", "futardio-launch", "ownership-coin"] +tags: [metadao-curated-launch, ownership-coin, privacy, zk, lst, defi] +competitors: ["Tornado Cash (defunct)", "Railgun", "other privacy mixers"] source_archive: "inbox/archive/2025-10-20-futardio-launch-zklsol.md" --- # ZKLSOL ## Overview -Zero-Knowledge Liquid Staking on Solana. Privacy mixer that converts deposited SOL to LST during the mixing period, so users earn staking yield while waiting for privacy — solving the opportunity cost paradox of traditional mixers. -## Current State -- **Raised**: $969K final (target $300K, $14.9M committed — 50x oversubscribed) -- **Treasury**: $575K USDC remaining -- **Token**: ZKLSOL (mint: ZKFHiLAfAFMTcDAuCtjNW54VzpERvoe7PBF9mYgmeta), price: $0.05 -- **Monthly allowance**: $50K -- **Launch mechanism**: Futardio v0.6 (pro-rata) +Zero-Knowledge Liquid Staking on Solana. Privacy mixer that converts deposited SOL to LST during the mixing period, so users earn staking yield while waiting for privacy — solving the opportunity cost paradox of traditional mixers. Upon deposit, SOL converts to LST and is staked. Users withdraw the LST after a sufficient waiting period without loss of yield. + +## Investment Rationale (from raise) + +"Cryptocurrency mixers embody a core paradox: robust anonymity requires funds to dwell in the mixer for extended periods... This delays access to capital, clashing with users' need for swift liquidity." + +ZKLSOL's insight: if deposited funds are converted to LSTs, the waiting period that privacy requires becomes yield-generating instead of capital-destroying. This aligns anonymity with economic incentives — users are paid to wait for privacy rather than paying an opportunity cost. The design bridges security and efficiency, potentially unlocking wider DeFi privacy adoption. 
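A quick illustration of the opportunity-cost argument above: if deposits sit idle (a classic mixer), the privacy delay costs the user staking yield; if they are converted to an LST for the waiting period, the same delay earns it. Numbers are hypothetical; Solana staking yields vary and ZKLSOL's actual LST mechanics are not modeled here.

```python
def staking_yield(deposit_sol: float, days: int, apy: float) -> float:
    """Simple (non-compounded) staking yield over a holding period."""
    return deposit_sol * apy * days / 365

if __name__ == "__main__":
    deposit, wait, apy = 100.0, 30, 0.07   # 100 SOL, 30-day anonymity window, ~7% staking APY
    y = staking_yield(deposit, wait, apy)
    print(f"classic mixer: ~{y:.2f} SOL of yield forgone while waiting")   # about 0.58 SOL
    print(f"LST mixer:     ~{y:.2f} SOL of yield earned while waiting")    # the same delay, now paid
```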
+ +## ICO Details + +- **Platform:** MetaDAO curated launchpad (6th launch) +- **Date:** October 20-24, 2025 +- **Target:** $300K +- **Committed:** $14.9M (50x oversubscribed) +- **Final raise:** $969,420 +- **Launch mechanism:** Futardio v0.6 (pro-rata) + +## Current State (as of early 2026) + +- **Stage:** Restructuring +- **Treasury:** $575K USDC remaining (after two buyback rounds) +- **Monthly allowance:** $50K +- **Product:** Devnet app live at app.zklsol.org. Roadmap at roadmap.zklsol.org. +- **Also known as:** Turbine.cash (rebranding reference in some sources) + +## Governance Activity — Most Active Treasury Defense + +ZKLSOL has the most governance activity of any MetaDAO launch relative to its size. The team voluntarily burned their entire performance package — an extraordinary alignment signal: + +| Decision | Date | Outcome | Record | +|----------|------|---------|--------| +| ICO launch | 2025-10-20 | Completed, $969K raised (50x oversubscribed) | [[zklsol-futardio-launch]] | +| Team token burn | 2025-11 | Team burned entire performance package | [[zklsol-burn-team-performance-package]] | +| $200K buyback | 2026-01 | Passed — 4,000 orders over ~14 days at max $0.082/token | [[zklsol-200k-buyback]] | +| $500K restructuring buyback | 2026-02 | Passed — 4,000 orders at max $0.076/token + 50% FutarchyAMM liquidity to treasury | [[zklsol-restructuring-proposal]] | + +**Team token burn:** The team voluntarily destroyed their entire performance package to signal alignment with holders. This is the most aggressive team-alignment move in the MetaDAO ecosystem — zero upside for the team beyond whatever tokens they purchased in the ICO like everyone else. + +**Restructuring (Feb 2026):** Proph3t proposed the $500K buyback, acknowledging ZKFG had traded below NAV since inception. The proposal also moved 50% of FutarchyAMM liquidity to treasury for operations. Key quote: "When an ownership coin trades at significant discount to NAV, the right thing to do is buybacks until it gets there. We communicate to projects beforehand: you can raise more, but the money you raise will be at risk." + +## Open Questions + +- **Regulatory risk.** Privacy mixers are the most scrutinized category in crypto after Tornado Cash sanctions. ZKLSOL's LST innovation is clever but doesn't change the regulatory exposure of the mixing function itself. +- **Post-restructuring viability.** Two buyback rounds consumed ~$700K of a $969K raise. Treasury has $575K remaining at $50K/month = ~11 months. Can the product ship before runway expires? +- **Market demand.** Privacy is a feature people say they want but rarely pay for. The mixer market on Solana is small. Does LST-based yield change the demand equation enough? ## Timeline -- **2025-10-20** — Futardio launch opens ($300K target) -- **2026-01-00** — ICO performance: maximum 30% drawdown from launch price -## Relationship to KB -- futardio — launched on Futardio platform +- **2025-10-20** — MetaDAO curated ICO opens ($300K target) +- **2025-10-24** — ICO closes. $969K raised (50x oversubscribed). 
+- **2025-11** — Team burns entire performance package tokens +- **2026-01** — $200K treasury buyback (4,000 orders over 14 days, max $0.082/token) +- **2026-02** — $500K restructuring buyback + 50% FutarchyAMM liquidity moved to treasury --- -Relevant Entities: -- futardio — launch platform -- [[metadao]] — parent ecosystem +Relevant Notes: +- [[metadao]] — launch platform (curated ICO #6) Topics: - [[internet finance and decision markets]] -- 2.45.2 From 2641137abb30bbc0c68ebae9210ce69adcdd0302 Mon Sep 17 00:00:00 2001 From: m3taversal Date: Thu, 2 Apr 2026 10:38:11 +0100 Subject: [PATCH 2/4] rio: enhance Loyal + ZKLSOL entities with X research findings MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Loyal: added team (Eden, Chris, Basil, Vasiliy — SF-based), product details (privacy-first AI oracle, TEE stack, B2B Q2 2026), Solana ecosystem recognition - ZKLSOL: documented quiet rebrand to Turbine (zklsol.org → turbine.cash), devnet-only status 6 months post-ICO, near-ATL price ($0.048), $142/day volume Pentagon-Agent: Rio <244ba05f-3aa3-4079-8c59-6d68a77c76fe> --- entities/internet-finance/loyal.md | 14 ++++++++++++-- entities/internet-finance/zklsol.md | 19 +++++++++++++------ 2 files changed, 25 insertions(+), 8 deletions(-) diff --git a/entities/internet-finance/loyal.md b/entities/internet-finance/loyal.md index 21a67d277..d067e7a35 100644 --- a/entities/internet-finance/loyal.md +++ b/entities/internet-finance/loyal.md @@ -17,7 +17,8 @@ category: "Decentralized private AI intelligence protocol (Solana)" stage: early token_symbol: "$LOYAL" token_mint: "LYLikzBQtpa9ZgVrJsqYGQpR3cC1WMJrBHaXGrQmeta" -founded_by: "unknown" +founded_by: "Eden, Chris, Basil, Vasiliy" +headquarters: "San Francisco, CA" built_on: ["Solana", "MagicBlock", "Arcium"] tags: [metadao-curated-launch, ownership-coin, privacy, ai, confidential-computing] competitors: ["Venice.ai", "private AI chat alternatives"] @@ -49,7 +50,16 @@ The pitch is existential: as AI becomes a primary interface for knowledge work, - **Treasury:** $260K USDC remaining (after $1.5M buyback) - **Monthly allowance:** $60K -- **Product status:** In development. Private AI chat protocol powered by MagicBlock + Arcium confidential computing stack. +- **Market cap:** ~$5.0M +- **Token supply:** 20,976,923 LOYAL total (10M ICO pro-rata, 2M primary liquidity, 3M single-sided Meteora) +- **Product status:** Active development. Positioned as "privacy-first AI oracle on Solana" — described as "Chainlink but for confidential data." Uses TEE (Intel TDX, AMD SEV-SNP) + Nvidia confidential computing for end-to-end encryption. Product capabilities include summarizing Telegram chats, running branded agents, processing sensitive documents, and on-chain workflows (payments, invoicing, asset management). +- **Ecosystem recognition:** Listed by Solana as one of 12 official privacy ecosystem projects +- **GitHub:** Active commits through Feb/March 2026 (github.com/loyal-labs) +- **Roadmap:** Core B2B features targeting Q2 2026. Broader roadmap through Q4 2026 / H1 2027 targeting finance, healthcare, and law verticals. + +## Team + +SF-based team of 4 — Eden, Chris, Basil, and Vasiliy — working together ~3 years on anti-surveillance solutions. One member is a Colgate University Applied Math/CS grad with 3 peer-reviewed AI publications. 
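The "privacy-first oracle" framing above implies a specific client-side discipline: verify that the remote endpoint is running inside an attested TEE before any plaintext is sent. A conceptual sketch of that gate follows; the attestation fields, trusted-measurement set, and function names are entirely hypothetical and not Loyal's published API, and real Intel TDX / AMD SEV-SNP verification involves vendor certificate chains and measurement policies omitted here.

```python
from dataclasses import dataclass

@dataclass
class AttestationReport:
    enclave_measurement: str   # hash of the code image running inside the TEE
    signature_valid: bool      # vendor certificate chain verified (TDX/SEV-SNP details omitted)

# Measurements of oracle builds we are willing to trust (hypothetical values)
TRUSTED_MEASUREMENTS = {"measurement-of-audited-oracle-build-v1"}

def safe_to_send_plaintext(report: AttestationReport) -> bool:
    """Only release sensitive data to recognized code running in verifiable hardware."""
    return report.signature_valid and report.enclave_measurement in TRUSTED_MEASUREMENTS

def query_confidential_oracle(report: AttestationReport, prompt: str) -> str:
    if not safe_to_send_plaintext(report):
        raise PermissionError("refusing to send plaintext to an unattested endpoint")
    # Encrypt `prompt` to the enclave's attested public key and submit it (omitted)
    return "<encrypted request submitted>"
```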
## Governance Activity — Active Treasury Defense diff --git a/entities/internet-finance/zklsol.md b/entities/internet-finance/zklsol.md index 2a25e96e3..e2377239a 100644 --- a/entities/internet-finance/zklsol.md +++ b/entities/internet-finance/zklsol.md @@ -43,13 +43,18 @@ ZKLSOL's insight: if deposited funds are converted to LSTs, the waiting period t - **Final raise:** $969,420 - **Launch mechanism:** Futardio v0.6 (pro-rata) -## Current State (as of early 2026) +## Current State (as of April 2026) -- **Stage:** Restructuring +- **Stage:** Restructuring / rebranding +- **Market cap:** ~$280K (rank #4288). Near all-time low ($0.048 vs $0.047 ATL on Mar 30, 2026). +- **Volume:** $142/day — effectively illiquid +- **Supply:** 5.77M circulating / 12.9M total / 25.8M max - **Treasury:** $575K USDC remaining (after two buyback rounds) - **Monthly allowance:** $50K -- **Product:** Devnet app live at app.zklsol.org. Roadmap at roadmap.zklsol.org. -- **Also known as:** Turbine.cash (rebranding reference in some sources) +- **Product:** Devnet only — anonymous deposits and withdrawals working. Planned features include one-click batch withdrawals and OFAC compliance tools. No mainnet mixer 6 months post-ICO. +- **Rebrand to Turbine:** zklsol.org now redirects (302) to **turbine.cash**. docs.zklsol.org redirects to docs.turbine.cash. Site reads "turbine - Earn in Private." No formal rebrand announcement found. Token ticker remains $ZKFG on exchanges. +- **Team:** Anonymous/pseudonymous. No Discord — Telegram only. ~1,978 X followers. +- **Exchanges:** MetaDAO Futarchy AMM, Meteora (ZKFG/SOL pair) ## Governance Activity — Most Active Treasury Defense @@ -68,9 +73,11 @@ ZKLSOL has the most governance activity of any MetaDAO launch relative to its si ## Open Questions -- **Regulatory risk.** Privacy mixers are the most scrutinized category in crypto after Tornado Cash sanctions. ZKLSOL's LST innovation is clever but doesn't change the regulatory exposure of the mixing function itself. +- **Quiet rebrand.** zklsol.org → turbine.cash with no formal announcement is a transparency concern. The token ticker remains ZKFG while the product rebrands to Turbine — this creates confusion. +- **Devnet only after 6 months.** No mainnet mixer launch despite raising $969K. The buybacks consumed most of the raise. What has the team been building? +- **Regulatory risk.** Privacy mixers are the most scrutinized category in crypto after Tornado Cash sanctions. ZKLSOL's LST innovation is clever but doesn't change the regulatory exposure. The planned OFAC compliance tools suggest awareness. - **Post-restructuring viability.** Two buyback rounds consumed ~$700K of a $969K raise. Treasury has $575K remaining at $50K/month = ~11 months. Can the product ship before runway expires? -- **Market demand.** Privacy is a feature people say they want but rarely pay for. The mixer market on Solana is small. Does LST-based yield change the demand equation enough? +- **Near-ATL price signals.** Trading at $0.048 vs $0.047 ATL with $142/day volume. The market has largely abandoned this token. Anonymous team + no mainnet product + quiet rebrand is not a confidence-building combination. 
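A back-of-envelope check of the figures above, using only numbers stated in this note (all approximate; circulating supply, treasury balance, and monthly spend are moving targets):

```python
# Figures as stated in this entity (approximate, early April 2026)
price = 0.048                 # near all-time low
circulating = 5_770_000       # ZKFG circulating supply
treasury_usdc = 575_000       # remaining after two buyback rounds
monthly_allowance = 50_000

market_cap = price * circulating
runway_months = treasury_usdc / monthly_allowance

print(f"implied market cap ~= ${market_cap:,.0f}")       # about $277K, consistent with the ~$280K cited
print(f"treasury runway    ~= {runway_months:.1f} months at the current allowance")  # about 11.5 months
```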
## Timeline -- 2.45.2 From 945258a13fc02a4fc976038dbc5adceb9556ad27 Mon Sep 17 00:00:00 2001 From: m3taversal Date: Thu, 2 Apr 2026 11:48:09 +0100 Subject: [PATCH 3/4] Add Phase 1+2 instrumentation: review records, cascade automation, cross-domain index, agent state MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Phase 1 — Audit logging infrastructure: - review_records table (migration v12) capturing every eval verdict with outcome, rejection reason, disagreement type - Cascade automation: auto-flag dependent beliefs/positions when merged claims change - Merge frontmatter stamps: last_review metadata on merged claim files Phase 2 — Cross-domain and state tracking: - Cross-domain citation index: entity overlap detection across domains on every merge - Agent-state schema v1: file-backed state for VPS agents (memory, tasks, inbox, metrics) - Cascade completion tracking: process-cascade-inbox.py logs review outcomes - research-session.sh: state hooks + cascade processing integration All changes are live on VPS. This commit brings the code under version control for review. Co-Authored-By: Claude Opus 4.6 (1M context) --- ops/agent-state/SCHEMA.md | 255 ++++ ops/agent-state/bootstrap.sh | 145 +++ ops/agent-state/lib-state.sh | 258 ++++ ops/agent-state/process-cascade-inbox.py | 113 ++ ops/pipeline-v2/lib/cascade.py | 274 ++++ ops/pipeline-v2/lib/cross_domain.py | 230 ++++ ops/pipeline-v2/lib/db.py | 625 +++++++++ ops/pipeline-v2/lib/evaluate.py | 1465 ++++++++++++++++++++++ ops/pipeline-v2/lib/merge.py | 1449 +++++++++++++++++++++ ops/research-session.sh | 74 +- 10 files changed, 4884 insertions(+), 4 deletions(-) create mode 100644 ops/agent-state/SCHEMA.md create mode 100755 ops/agent-state/bootstrap.sh create mode 100755 ops/agent-state/lib-state.sh create mode 100644 ops/agent-state/process-cascade-inbox.py create mode 100644 ops/pipeline-v2/lib/cascade.py create mode 100644 ops/pipeline-v2/lib/cross_domain.py create mode 100644 ops/pipeline-v2/lib/db.py create mode 100644 ops/pipeline-v2/lib/evaluate.py create mode 100644 ops/pipeline-v2/lib/merge.py diff --git a/ops/agent-state/SCHEMA.md b/ops/agent-state/SCHEMA.md new file mode 100644 index 000000000..63cc6f0f0 --- /dev/null +++ b/ops/agent-state/SCHEMA.md @@ -0,0 +1,255 @@ +# Agent State Schema v1 + +File-backed durable state for teleo agents running headless on VPS. +Survives context truncation, crash recovery, and session handoffs. + +## Design Principles + +1. **Three formats** — JSON for structured fields, JSONL for append-only logs, Markdown for context-window-friendly content +2. **Many small files** — selective loading, crash isolation, no locks needed +3. **Write on events** — not timers. State updates happen when something meaningful changes. +4. **Shared-nothing writes** — each agent owns its directory. Communication via inbox files. +5. **State ≠ Git** — state is operational (how the agent functions). Git is output (what the agent produces). 
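+
+A minimal usage sketch of principles 2-4 (illustrative only; the helper names below are
+not part of the spec, and the production helpers are the bash functions in `lib-state.sh`).
+An agent loads just the small files it needs on wake, then writes state back with a
+tmp-file-plus-rename so a crash or concurrent reader never sees a partial file:
+
+```python
+import json
+import os
+import tempfile
+from pathlib import Path
+
+STATE_ROOT = Path("/opt/teleo-eval/agent-state")
+
+def load_on_wake(agent: str) -> dict:
+    """Selective load: only the small per-wake files, not the whole directory."""
+    state_dir = STATE_ROOT / agent
+    state = {}
+    for name in ("report.json", "tasks.json"):
+        path = state_dir / name
+        state[name] = json.loads(path.read_text()) if path.exists() else {}
+    return state
+
+def write_report(agent: str, report: dict) -> None:
+    """Event-driven write: tmp file in the same directory, then atomic rename."""
+    state_dir = STATE_ROOT / agent
+    fd, tmp = tempfile.mkstemp(dir=state_dir, suffix=".tmp")
+    with os.fdopen(fd, "w") as f:
+        json.dump(report, f, indent=2)
+    os.replace(tmp, state_dir / "report.json")  # rename is atomic on the same filesystem
+```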
+ +## Directory Layout + +``` +/opt/teleo-eval/agent-state/{agent}/ +├── report.json # Current status — read every wake +├── tasks.json # Active task queue — read every wake +├── session.json # Current/last session metadata +├── memory.md # Accumulated cross-session knowledge (structured) +├── inbox/ # Messages from other agents/orchestrator +│ └── {uuid}.json # One file per message, atomic create +├── journal.jsonl # Append-only session log +└── metrics.json # Cumulative performance counters +``` + +## File Specifications + +### report.json + +Written: after each meaningful action (session start, key finding, session end) +Read: every wake, by orchestrator for monitoring + +```json +{ + "agent": "rio", + "updated_at": "2026-03-31T22:00:00Z", + "status": "idle | researching | extracting | evaluating | error", + "summary": "Completed research session — 8 sources archived on Solana launchpad mechanics", + "current_task": null, + "last_session": { + "id": "20260331-220000", + "started_at": "2026-03-31T20:30:00Z", + "ended_at": "2026-03-31T22:00:00Z", + "outcome": "completed | timeout | error", + "sources_archived": 8, + "branch": "rio/research-2026-03-31", + "pr_number": 247 + }, + "blocked_by": null, + "next_priority": "Follow up on conditional AMM thread from @0xfbifemboy" +} +``` + +### tasks.json + +Written: when task status changes +Read: every wake + +```json +{ + "agent": "rio", + "updated_at": "2026-03-31T22:00:00Z", + "tasks": [ + { + "id": "task-001", + "type": "research | extract | evaluate | follow-up | disconfirm", + "description": "Investigate conditional AMM mechanisms in MetaDAO v2", + "status": "pending | active | completed | dropped", + "priority": "high | medium | low", + "created_at": "2026-03-31T22:00:00Z", + "context": "Flagged in research session 2026-03-31 — @0xfbifemboy thread on conditional liquidity", + "follow_up_from": null, + "completed_at": null, + "outcome": null + } + ] +} +``` + +### session.json + +Written: at session start and session end +Read: every wake (for continuation), by orchestrator for scheduling + +```json +{ + "agent": "rio", + "session_id": "20260331-220000", + "started_at": "2026-03-31T20:30:00Z", + "ended_at": "2026-03-31T22:00:00Z", + "type": "research | extract | evaluate | ad-hoc", + "domain": "internet-finance", + "branch": "rio/research-2026-03-31", + "status": "running | completed | timeout | error", + "model": "sonnet", + "timeout_seconds": 5400, + "research_question": "How is conditional liquidity being implemented in Solana AMMs?", + "belief_targeted": "Markets aggregate information better than votes because skin-in-the-game creates selection pressure on beliefs", + "disconfirmation_target": "Cases where prediction markets failed to aggregate information despite financial incentives", + "sources_archived": 8, + "sources_expected": 10, + "tokens_used": null, + "cost_usd": null, + "errors": [], + "handoff_notes": "Found 3 sources on conditional AMM failures — needs extraction. Also flagged @metaproph3t thread for Theseus (AI governance angle)." +} +``` + +### memory.md + +Written: at session end, when learning something critical +Read: every wake (included in research prompt context) + +```markdown +# Rio — Operational Memory + +## Cross-Session Patterns +- Conditional AMMs keep appearing across 3+ independent sources (sessions 03-28, 03-29, 03-31). This is likely a real trend, not cherry-picking. +- @0xfbifemboy consistently produces highest-signal threads in the DeFi mechanism design space. 
+ +## Dead Ends (don't re-investigate) +- Polymarket fee structure analysis (2026-03-25): fully documented in existing claims, no new angles. +- Jupiter governance token utility (2026-03-27): vaporware, no mechanism to analyze. + +## Open Questions +- Is MetaDAO's conditional market maker manipulation-resistant at scale? No evidence either way yet. +- How does futarchy handle low-liquidity markets? This is the keystone weakness. + +## Corrections +- Previously believed Drift protocol was pure order-book. Actually hybrid AMM+CLOB. Updated 2026-03-30. + +## Cross-Agent Flags Received +- Theseus (2026-03-29): "Check if MetaDAO governance has AI agent participation — alignment implications" +- Leo (2026-03-28): "Your conditional AMM analysis connects to Astra's resource allocation claims" +``` + +### inbox/{uuid}.json + +Written: by other agents or orchestrator +Read: checked on wake, deleted after processing + +```json +{ + "id": "msg-abc123", + "from": "theseus", + "to": "rio", + "created_at": "2026-03-31T18:00:00Z", + "type": "flag | task | question | cascade", + "priority": "high | normal", + "subject": "Check MetaDAO for AI agent participation", + "body": "Found evidence that AI agents are trading on Drift — check if any are participating in MetaDAO conditional markets. Alignment implications if automated agents are influencing futarchic governance.", + "source_ref": "theseus/research-2026-03-31", + "expires_at": null +} +``` + +### journal.jsonl + +Written: append at session boundaries +Read: debug/audit only (never loaded into agent context by default) + +```jsonl +{"ts":"2026-03-31T20:30:00Z","event":"session_start","session_id":"20260331-220000","type":"research"} +{"ts":"2026-03-31T20:35:00Z","event":"orient_complete","files_read":["identity.md","beliefs.md","reasoning.md","_map.md"]} +{"ts":"2026-03-31T21:30:00Z","event":"sources_archived","count":5,"domain":"internet-finance"} +{"ts":"2026-03-31T22:00:00Z","event":"session_end","outcome":"completed","sources_archived":8,"handoff":"conditional AMM failures need extraction"} +``` + +### metrics.json + +Written: at session end (cumulative counters) +Read: by CI scoring system, by orchestrator for scheduling decisions + +```json +{ + "agent": "rio", + "updated_at": "2026-03-31T22:00:00Z", + "lifetime": { + "sessions_total": 47, + "sessions_completed": 42, + "sessions_timeout": 3, + "sessions_error": 2, + "sources_archived": 312, + "claims_proposed": 89, + "claims_accepted": 71, + "claims_challenged": 12, + "claims_rejected": 6, + "disconfirmation_attempts": 47, + "disconfirmation_hits": 8, + "cross_agent_flags_sent": 23, + "cross_agent_flags_received": 15 + }, + "rolling_30d": { + "sessions": 12, + "sources_archived": 87, + "claims_proposed": 24, + "acceptance_rate": 0.83, + "avg_sources_per_session": 7.25 + } +} +``` + +## Integration Points + +### research-session.sh + +Add these hooks: + +1. **Pre-session** (after branch creation, before Claude launch): + - Write `session.json` with status "running" + - Write `report.json` with status "researching" + - Append session_start to `journal.jsonl` + - Include `memory.md` and `tasks.json` in the research prompt + +2. **Post-session** (after commit, before/after PR): + - Update `session.json` with outcome, source count, branch, PR number + - Update `report.json` with summary and next_priority + - Update `metrics.json` counters + - Append session_end to `journal.jsonl` + - Process and clean `inbox/` (mark processed messages) + +3. 
**On error/timeout**: + - Update `session.json` status to "error" or "timeout" + - Update `report.json` with error info + - Append error event to `journal.jsonl` + +### Pipeline daemon (teleo-pipeline.py) + +- Read `report.json` for all agents to build dashboard +- Write to `inbox/` when cascade events need agent attention +- Read `metrics.json` for scheduling decisions (deprioritize agents with high error rates) + +### Claude research prompt + +Add to the prompt: +``` +### Step 0: Load Operational State (1 min) +Read /opt/teleo-eval/agent-state/{agent}/memory.md — this is your cross-session operational memory. +Read /opt/teleo-eval/agent-state/{agent}/tasks.json — check for pending tasks. +Check /opt/teleo-eval/agent-state/{agent}/inbox/ for messages from other agents. +Process any high-priority inbox items before choosing your research direction. +``` + +## Bootstrap + +Run `ops/agent-state/bootstrap.sh` to create directories and seed initial state for all agents. + +## Migration from Existing State + +- `research-journal.md` continues as-is (agent-written, in git). `memory.md` is the structured equivalent for operational state (not in git). +- `ops/sessions/*.json` continue for backward compat. `session.json` per agent is the richer replacement. +- `ops/queue.md` remains the human-visible task board. `tasks.json` per agent is the machine-readable equivalent. +- Workspace flags (`~/.pentagon/workspace/collective/flag-*`) migrate to `inbox/` messages over time. diff --git a/ops/agent-state/bootstrap.sh b/ops/agent-state/bootstrap.sh new file mode 100755 index 000000000..087cff910 --- /dev/null +++ b/ops/agent-state/bootstrap.sh @@ -0,0 +1,145 @@ +#!/bin/bash +# Bootstrap agent-state directories for all teleo agents. +# Run once on VPS: bash ops/agent-state/bootstrap.sh +# Safe to re-run — skips existing files, only creates missing ones. + +set -euo pipefail + +STATE_ROOT="${TELEO_STATE_ROOT:-/opt/teleo-eval/agent-state}" + +AGENTS=("rio" "clay" "theseus" "vida" "astra" "leo") +DOMAINS=("internet-finance" "entertainment" "ai-alignment" "health" "space-development" "grand-strategy") + +log() { echo "[$(date -Iseconds)] $*"; } + +for i in "${!AGENTS[@]}"; do + AGENT="${AGENTS[$i]}" + DOMAIN="${DOMAINS[$i]}" + DIR="$STATE_ROOT/$AGENT" + + log "Bootstrapping $AGENT..." + mkdir -p "$DIR/inbox" + + # report.json — current status + if [ ! -f "$DIR/report.json" ]; then + cat > "$DIR/report.json" < "$DIR/tasks.json" < "$DIR/session.json" < "$DIR/memory.md" < "$DIR/metrics.json" < "$DIR/journal.jsonl" + log " Created journal.jsonl" + fi + +done + +log "Bootstrap complete. State root: $STATE_ROOT" +log "Agents initialized: ${AGENTS[*]}" diff --git a/ops/agent-state/lib-state.sh b/ops/agent-state/lib-state.sh new file mode 100755 index 000000000..1b168da66 --- /dev/null +++ b/ops/agent-state/lib-state.sh @@ -0,0 +1,258 @@ +#!/bin/bash +# lib-state.sh — Bash helpers for reading/writing agent state files. +# Source this in pipeline scripts: source ops/agent-state/lib-state.sh +# +# All writes use atomic rename (write to .tmp, then mv) to prevent corruption. +# All reads return valid JSON or empty string on missing/corrupt files. + +STATE_ROOT="${TELEO_STATE_ROOT:-/opt/teleo-eval/agent-state}" + +# --- Internal helpers --- + +_state_dir() { + local agent="$1" + echo "$STATE_ROOT/$agent" +} + +# Atomic write: write to tmp file, then rename. Prevents partial reads. 
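+# The .tmp.$$ suffix embeds the caller's PID, so concurrent writers never share a tmp file,
+# and mv within one directory is a rename on the same filesystem: readers see either the
+# old file or the new one, never a half-written JSON document.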
+_atomic_write() { + local filepath="$1" + local content="$2" + local tmpfile="${filepath}.tmp.$$" + echo "$content" > "$tmpfile" + mv -f "$tmpfile" "$filepath" +} + +# --- Report (current status) --- + +state_read_report() { + local agent="$1" + local file="$(_state_dir "$agent")/report.json" + [ -f "$file" ] && cat "$file" || echo "{}" +} + +state_update_report() { + local agent="$1" + local status="$2" + local summary="$3" + local file="$(_state_dir "$agent")/report.json" + + # Read existing, merge with updates using python (available on VPS) + python3 -c " +import json, sys +try: + with open('$file') as f: + data = json.load(f) +except: + data = {'agent': '$agent'} +data['status'] = '$status' +data['summary'] = '''$summary''' +data['updated_at'] = '$(date -u +%Y-%m-%dT%H:%M:%SZ)' +print(json.dumps(data, indent=2)) +" | _atomic_write_stdin "$file" +} + +# Variant that takes full JSON from stdin +_atomic_write_stdin() { + local filepath="$1" + local tmpfile="${filepath}.tmp.$$" + cat > "$tmpfile" + mv -f "$tmpfile" "$filepath" +} + +# Full report update with session info (called at session end) +state_finalize_report() { + local agent="$1" + local status="$2" + local summary="$3" + local session_id="$4" + local started_at="$5" + local ended_at="$6" + local outcome="$7" + local sources="$8" + local branch="$9" + local pr_number="${10}" + local next_priority="${11:-null}" + local file="$(_state_dir "$agent")/report.json" + + python3 -c " +import json +data = { + 'agent': '$agent', + 'updated_at': '$ended_at', + 'status': '$status', + 'summary': '''$summary''', + 'current_task': None, + 'last_session': { + 'id': '$session_id', + 'started_at': '$started_at', + 'ended_at': '$ended_at', + 'outcome': '$outcome', + 'sources_archived': $sources, + 'branch': '$branch', + 'pr_number': $pr_number + }, + 'blocked_by': None, + 'next_priority': $([ "$next_priority" = "null" ] && echo "None" || echo "'$next_priority'") +} +print(json.dumps(data, indent=2)) +" | _atomic_write_stdin "$file" +} + +# --- Session --- + +state_start_session() { + local agent="$1" + local session_id="$2" + local type="$3" + local domain="$4" + local branch="$5" + local model="${6:-sonnet}" + local timeout="${7:-5400}" + local started_at + started_at="$(date -u +%Y-%m-%dT%H:%M:%SZ)" + local file="$(_state_dir "$agent")/session.json" + + python3 -c " +import json +data = { + 'agent': '$agent', + 'session_id': '$session_id', + 'started_at': '$started_at', + 'ended_at': None, + 'type': '$type', + 'domain': '$domain', + 'branch': '$branch', + 'status': 'running', + 'model': '$model', + 'timeout_seconds': $timeout, + 'research_question': None, + 'belief_targeted': None, + 'disconfirmation_target': None, + 'sources_archived': 0, + 'sources_expected': 0, + 'tokens_used': None, + 'cost_usd': None, + 'errors': [], + 'handoff_notes': None +} +print(json.dumps(data, indent=2)) +" | _atomic_write_stdin "$file" + + echo "$started_at" +} + +state_end_session() { + local agent="$1" + local outcome="$2" + local sources="${3:-0}" + local pr_number="${4:-null}" + local file="$(_state_dir "$agent")/session.json" + + python3 -c " +import json +with open('$file') as f: + data = json.load(f) +data['ended_at'] = '$(date -u +%Y-%m-%dT%H:%M:%SZ)' +data['status'] = '$outcome' +data['sources_archived'] = $sources +print(json.dumps(data, indent=2)) +" | _atomic_write_stdin "$file" +} + +# --- Journal (append-only JSONL) --- + +state_journal_append() { + local agent="$1" + local event="$2" + shift 2 + # Remaining args are key=value pairs for extra fields + 
local file="$(_state_dir "$agent")/journal.jsonl" + local extras="" + for kv in "$@"; do + local key="${kv%%=*}" + local val="${kv#*=}" + extras="$extras, \"$key\": \"$val\"" + done + echo "{\"ts\":\"$(date -u +%Y-%m-%dT%H:%M:%SZ)\",\"event\":\"$event\"$extras}" >> "$file" +} + +# --- Metrics --- + +state_update_metrics() { + local agent="$1" + local outcome="$2" + local sources="${3:-0}" + local file="$(_state_dir "$agent")/metrics.json" + + python3 -c " +import json +try: + with open('$file') as f: + data = json.load(f) +except: + data = {'agent': '$agent', 'lifetime': {}, 'rolling_30d': {}} + +lt = data.setdefault('lifetime', {}) +lt['sessions_total'] = lt.get('sessions_total', 0) + 1 +if '$outcome' == 'completed': + lt['sessions_completed'] = lt.get('sessions_completed', 0) + 1 +elif '$outcome' == 'timeout': + lt['sessions_timeout'] = lt.get('sessions_timeout', 0) + 1 +elif '$outcome' == 'error': + lt['sessions_error'] = lt.get('sessions_error', 0) + 1 +lt['sources_archived'] = lt.get('sources_archived', 0) + $sources + +data['updated_at'] = '$(date -u +%Y-%m-%dT%H:%M:%SZ)' +print(json.dumps(data, indent=2)) +" | _atomic_write_stdin "$file" +} + +# --- Inbox --- + +state_check_inbox() { + local agent="$1" + local inbox="$(_state_dir "$agent")/inbox" + [ -d "$inbox" ] && ls "$inbox"/*.json 2>/dev/null || true +} + +state_send_message() { + local from="$1" + local to="$2" + local type="$3" + local subject="$4" + local body="$5" + local inbox="$(_state_dir "$to")/inbox" + local msg_id="msg-$(date +%s)-$$" + local file="$inbox/${msg_id}.json" + + mkdir -p "$inbox" + python3 -c " +import json +data = { + 'id': '$msg_id', + 'from': '$from', + 'to': '$to', + 'created_at': '$(date -u +%Y-%m-%dT%H:%M:%SZ)', + 'type': '$type', + 'priority': 'normal', + 'subject': '''$subject''', + 'body': '''$body''', + 'source_ref': None, + 'expires_at': None +} +print(json.dumps(data, indent=2)) +" | _atomic_write_stdin "$file" + echo "$msg_id" +} + +# --- State directory check --- + +state_ensure_dir() { + local agent="$1" + local dir="$(_state_dir "$agent")" + if [ ! -d "$dir" ]; then + echo "ERROR: Agent state not initialized for $agent. Run bootstrap.sh first." >&2 + return 1 + fi +} diff --git a/ops/agent-state/process-cascade-inbox.py b/ops/agent-state/process-cascade-inbox.py new file mode 100644 index 000000000..f314762a4 --- /dev/null +++ b/ops/agent-state/process-cascade-inbox.py @@ -0,0 +1,113 @@ +#!/usr/bin/env python3 +"""Process cascade inbox messages after a research session. + +For each unread cascade-*.md in an agent's inbox: +1. Logs cascade_reviewed event to pipeline.db audit_log +2. Moves the file to inbox/processed/ + +Usage: python3 process-cascade-inbox.py +""" + +import json +import os +import re +import shutil +import sqlite3 +import sys +from datetime import datetime, timezone +from pathlib import Path + +AGENT_STATE_DIR = Path(os.environ.get("AGENT_STATE_DIR", "/opt/teleo-eval/agent-state")) +PIPELINE_DB = Path(os.environ.get("PIPELINE_DB", "/opt/teleo-eval/pipeline/pipeline.db")) + + +def parse_frontmatter(text: str) -> dict: + """Parse YAML-like frontmatter from markdown.""" + fm = {} + match = re.match(r'^---\n(.*?)\n---', text, re.DOTALL) + if not match: + return fm + for line in match.group(1).strip().splitlines(): + if ':' in line: + key, val = line.split(':', 1) + fm[key.strip()] = val.strip().strip('"') + return fm + + +def process_agent_inbox(agent: str) -> int: + """Process cascade messages in agent's inbox. 
Returns count processed.""" + inbox_dir = AGENT_STATE_DIR / agent / "inbox" + if not inbox_dir.exists(): + return 0 + + cascade_files = sorted(inbox_dir.glob("cascade-*.md")) + if not cascade_files: + return 0 + + # Ensure processed dir exists + processed_dir = inbox_dir / "processed" + processed_dir.mkdir(exist_ok=True) + + processed = 0 + now = datetime.now(timezone.utc).isoformat() + + try: + conn = sqlite3.connect(str(PIPELINE_DB), timeout=10) + conn.execute("PRAGMA journal_mode=WAL") + except sqlite3.Error as e: + print(f"WARNING: Cannot connect to pipeline.db: {e}", file=sys.stderr) + # Still move files even if DB is unavailable + conn = None + + for cf in cascade_files: + try: + text = cf.read_text() + fm = parse_frontmatter(text) + + # Skip already-processed files + if fm.get("status") == "processed": + continue + + # Log to audit_log + if conn: + detail = { + "agent": agent, + "cascade_file": cf.name, + "subject": fm.get("subject", "unknown"), + "original_created": fm.get("created", "unknown"), + "reviewed_at": now, + } + conn.execute( + "INSERT INTO audit_log (stage, event, detail, timestamp) VALUES (?, ?, ?, ?)", + ("cascade", "cascade_reviewed", json.dumps(detail), now), + ) + + # Move to processed + dest = processed_dir / cf.name + shutil.move(str(cf), str(dest)) + processed += 1 + + except Exception as e: + print(f"WARNING: Failed to process {cf.name}: {e}", file=sys.stderr) + + if conn: + try: + conn.commit() + conn.close() + except sqlite3.Error: + pass + + return processed + + +if __name__ == "__main__": + if len(sys.argv) < 2: + print(f"Usage: {sys.argv[0]} ", file=sys.stderr) + sys.exit(1) + + agent = sys.argv[1] + count = process_agent_inbox(agent) + if count > 0: + print(f"Processed {count} cascade message(s) for {agent}") + # Exit 0 regardless — non-fatal + sys.exit(0) diff --git a/ops/pipeline-v2/lib/cascade.py b/ops/pipeline-v2/lib/cascade.py new file mode 100644 index 000000000..13a370743 --- /dev/null +++ b/ops/pipeline-v2/lib/cascade.py @@ -0,0 +1,274 @@ +"""Cascade automation — auto-flag dependent beliefs/positions when claims change. + +Hook point: called from merge.py after _embed_merged_claims, before _delete_remote_branch. +Uses the same main_sha/branch_sha diff to detect changed claim files, then scans +all agent beliefs and positions for depends_on references to those claims. + +Notifications are written to /opt/teleo-eval/agent-state/{agent}/inbox/ using +the same atomic-write pattern as lib-state.sh. 
+""" + +import asyncio +import hashlib +import json +import logging +import os +import re +import tempfile +from datetime import datetime, timezone +from pathlib import Path + +logger = logging.getLogger("pipeline.cascade") + +AGENT_STATE_DIR = Path("/opt/teleo-eval/agent-state") +CLAIM_DIRS = {"domains/", "core/", "foundations/", "decisions/"} +AGENT_NAMES = ["rio", "leo", "clay", "astra", "vida", "theseus"] + + +def _extract_claim_titles_from_diff(diff_files: list[str]) -> set[str]: + """Extract claim titles from changed file paths.""" + titles = set() + for fpath in diff_files: + if not fpath.endswith(".md"): + continue + if not any(fpath.startswith(d) for d in CLAIM_DIRS): + continue + basename = os.path.basename(fpath) + if basename.startswith("_") or basename == "directory.md": + continue + title = basename.removesuffix(".md") + titles.add(title) + return titles + + +def _normalize_for_match(text: str) -> str: + """Normalize for fuzzy matching: lowercase, hyphens to spaces, strip punctuation, collapse whitespace.""" + text = text.lower().strip() + text = text.replace("-", " ") + text = re.sub(r"[^\w\s]", "", text) + text = re.sub(r"\s+", " ", text) + return text + + +def _slug_to_words(slug: str) -> str: + """Convert kebab-case slug to space-separated words.""" + return slug.replace("-", " ") + + +def _parse_depends_on(file_path: Path) -> tuple[str, list[str]]: + """Parse a belief or position file's depends_on entries. + + Returns (agent_name, [dependency_titles]). + """ + try: + content = file_path.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + return ("", []) + + agent = "" + deps = [] + in_frontmatter = False + in_depends = False + + for line in content.split("\n"): + if line.strip() == "---": + if not in_frontmatter: + in_frontmatter = True + continue + else: + break + + if in_frontmatter: + if line.startswith("agent:"): + agent = line.split(":", 1)[1].strip().strip('"').strip("'") + elif line.startswith("depends_on:"): + in_depends = True + rest = line.split(":", 1)[1].strip() + if rest.startswith("["): + items = re.findall(r'"([^"]+)"|\'([^\']+)\'', rest) + for item in items: + dep = item[0] or item[1] + dep = dep.strip("[]").replace("[[", "").replace("]]", "") + deps.append(dep) + in_depends = False + elif in_depends: + if line.startswith(" - "): + dep = line.strip().lstrip("- ").strip('"').strip("'") + dep = dep.replace("[[", "").replace("]]", "") + deps.append(dep) + elif line.strip() and not line.startswith(" "): + in_depends = False + + # Also scan body for [[wiki-links]] + body_links = re.findall(r"\[\[([^\]]+)\]\]", content) + for link in body_links: + if link not in deps: + deps.append(link) + + return (agent, deps) + + +def _write_inbox_message(agent: str, subject: str, body: str) -> bool: + """Write a cascade notification to an agent's inbox. 
Atomic tmp+rename.""" + inbox_dir = AGENT_STATE_DIR / agent / "inbox" + if not inbox_dir.exists(): + logger.warning("cascade: no inbox dir for agent %s, skipping", agent) + return False + + ts = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S") + file_hash = hashlib.md5(f"{agent}-{subject}-{body[:200]}".encode()).hexdigest()[:8] + filename = f"cascade-{ts}-{subject[:60]}-{file_hash}.md" + final_path = inbox_dir / filename + + try: + fd, tmp_path = tempfile.mkstemp(dir=str(inbox_dir), suffix=".tmp") + with os.fdopen(fd, "w") as f: + f.write(f"---\n") + f.write(f"type: cascade\n") + f.write(f"from: pipeline\n") + f.write(f"to: {agent}\n") + f.write(f"subject: \"{subject}\"\n") + f.write(f"created: {datetime.now(timezone.utc).isoformat()}\n") + f.write(f"status: unread\n") + f.write(f"---\n\n") + f.write(body) + os.rename(tmp_path, str(final_path)) + return True + except OSError: + logger.exception("cascade: failed to write inbox message for %s", agent) + return False + + +def _find_matches(deps: list[str], claim_lookup: dict[str, str]) -> list[str]: + """Check if any dependency matches a changed claim. + + Uses exact normalized match first, then substring containment for longer + strings only (min 15 chars) to avoid false positives on short generic names. + """ + matched = [] + for dep in deps: + norm = _normalize_for_match(dep) + if norm in claim_lookup: + matched.append(claim_lookup[norm]) + else: + # Substring match only for sufficiently specific strings + shorter = min(len(norm), min((len(k) for k in claim_lookup), default=0)) + if shorter >= 15: + for claim_norm, claim_orig in claim_lookup.items(): + if claim_norm in norm or norm in claim_norm: + matched.append(claim_orig) + break + return matched + + +def _format_cascade_body( + file_name: str, + file_type: str, + matched_claims: list[str], + pr_num: int, +) -> str: + """Format the cascade notification body.""" + claims_list = "\n".join(f"- {c}" for c in matched_claims) + return ( + f"# Cascade: upstream claims changed\n\n" + f"Your {file_type} **{file_name}** depends on claims that were modified in PR #{pr_num}.\n\n" + f"## Changed claims\n\n{claims_list}\n\n" + f"## Action needed\n\n" + f"Review whether your {file_type}'s confidence, description, or grounding " + f"needs updating in light of these changes. If the evidence strengthened, " + f"consider increasing confidence. If it weakened or contradicted, flag for " + f"re-evaluation.\n" + ) + + +async def cascade_after_merge( + main_sha: str, + branch_sha: str, + pr_num: int, + main_worktree: Path, + conn=None, +) -> int: + """Scan for beliefs/positions affected by claims changed in this merge. + + Returns the number of cascade notifications sent. + """ + # 1. Get changed files + proc = await asyncio.create_subprocess_exec( + "git", "diff", "--name-only", "--diff-filter=ACMR", + main_sha, branch_sha, + cwd=str(main_worktree), + stdout=asyncio.subprocess.PIPE, + stderr=asyncio.subprocess.PIPE, + ) + try: + stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=10) + except asyncio.TimeoutError: + proc.kill() + await proc.wait() + logger.warning("cascade: git diff timed out") + return 0 + + if proc.returncode != 0: + logger.warning("cascade: git diff failed (rc=%d)", proc.returncode) + return 0 + + diff_files = [f for f in stdout.decode().strip().split("\n") if f] + + # 2. 
Extract claim titles from changed files + changed_claims = _extract_claim_titles_from_diff(diff_files) + if not changed_claims: + return 0 + + logger.info("cascade: %d claims changed in PR #%d: %s", + len(changed_claims), pr_num, list(changed_claims)[:5]) + + # Build normalized lookup for fuzzy matching + claim_lookup = {} + for claim in changed_claims: + claim_lookup[_normalize_for_match(claim)] = claim + claim_lookup[_normalize_for_match(_slug_to_words(claim))] = claim + + # 3. Scan all beliefs and positions + notifications = 0 + agents_dir = main_worktree / "agents" + if not agents_dir.exists(): + logger.warning("cascade: no agents/ dir in worktree") + return 0 + + for agent_name in AGENT_NAMES: + agent_dir = agents_dir / agent_name + if not agent_dir.exists(): + continue + + for subdir, file_type in [("beliefs", "belief"), ("positions", "position")]: + target_dir = agent_dir / subdir + if not target_dir.exists(): + continue + for md_file in target_dir.glob("*.md"): + _, deps = _parse_depends_on(md_file) + matched = _find_matches(deps, claim_lookup) + if matched: + body = _format_cascade_body(md_file.name, file_type, matched, pr_num) + if _write_inbox_message(agent_name, f"claim-changed-affects-{file_type}", body): + notifications += 1 + logger.info("cascade: notified %s — %s '%s' affected by %s", + agent_name, file_type, md_file.stem, matched) + + if notifications: + logger.info("cascade: sent %d notifications for PR #%d", notifications, pr_num) + + # Write structured audit_log entry for cascade tracking (Page 4 data) + if conn is not None: + try: + conn.execute( + "INSERT INTO audit_log (stage, event, detail) VALUES (?, ?, ?)", + ("cascade", "cascade_triggered", json.dumps({ + "pr": pr_num, + "claims_changed": list(changed_claims)[:20], + "notifications_sent": notifications, + })), + ) + except Exception: + logger.exception("cascade: audit_log write failed (non-fatal)") + + return notifications diff --git a/ops/pipeline-v2/lib/cross_domain.py b/ops/pipeline-v2/lib/cross_domain.py new file mode 100644 index 000000000..9f22b1a1a --- /dev/null +++ b/ops/pipeline-v2/lib/cross_domain.py @@ -0,0 +1,230 @@ +"""Cross-domain citation index — detect entity overlap across domains. + +Hook point: called from merge.py after cascade_after_merge. +After a claim merges, checks if its referenced entities also appear in claims +from other domains. Logs connections to audit_log for silo detection. + +Two detection methods: +1. Entity name matching — entity names appearing in claim body text (word-boundary) +2. Source overlap — claims citing the same source archive files + +At ~600 claims and ~100 entities, full scan per merge takes <1 second. 
+""" + +import asyncio +import json +import logging +import os +import re +from pathlib import Path + +logger = logging.getLogger("pipeline.cross_domain") + +# Minimum entity name length to avoid false positives (ORE, QCX, etc) +MIN_ENTITY_NAME_LEN = 4 + +# Entity names that are common English words — skip to avoid false positives +ENTITY_STOPLIST = {"versus", "island", "loyal", "saber", "nebula", "helium", "coal", "snapshot", "dropout"} + + +def _build_entity_names(worktree: Path) -> dict[str, str]: + """Build mapping of entity_slug -> display_name from entity files.""" + names = {} + entity_dir = worktree / "entities" + if not entity_dir.exists(): + return names + for md_file in entity_dir.rglob("*.md"): + if md_file.name.startswith("_"): + continue + try: + content = md_file.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + continue + for line in content.split("\n"): + if line.startswith("name:"): + name = line.split(":", 1)[1].strip().strip('"').strip("'") + if len(name) >= MIN_ENTITY_NAME_LEN and name.lower() not in ENTITY_STOPLIST: + names[md_file.stem] = name + break + return names + + +def _compile_entity_patterns(entity_names: dict[str, str]) -> dict[str, re.Pattern]: + """Pre-compile word-boundary regex for each entity name.""" + patterns = {} + for slug, name in entity_names.items(): + try: + patterns[slug] = re.compile(r'\b' + re.escape(name) + r'\b', re.IGNORECASE) + except re.error: + continue + return patterns + + +def _extract_source_refs(content: str) -> set[str]: + """Extract source archive references ([[YYYY-MM-DD-...]]) from content.""" + return set(re.findall(r"\[\[(20\d{2}-\d{2}-\d{2}-[^\]]+)\]\]", content)) + + +def _find_entity_mentions(content: str, patterns: dict[str, re.Pattern]) -> set[str]: + """Find entity slugs whose names appear in the content (word-boundary match).""" + found = set() + for slug, pat in patterns.items(): + if pat.search(content): + found.add(slug) + return found + + +def _scan_domain_claims(worktree: Path, patterns: dict[str, re.Pattern]) -> dict[str, list[dict]]: + """Build domain -> [claim_info] mapping for all claims.""" + domain_claims = {} + domains_dir = worktree / "domains" + if not domains_dir.exists(): + return domain_claims + + for domain_dir in domains_dir.iterdir(): + if not domain_dir.is_dir(): + continue + claims = [] + for claim_file in domain_dir.glob("*.md"): + if claim_file.name.startswith("_") or claim_file.name == "directory.md": + continue + try: + content = claim_file.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + continue + claims.append({ + "slug": claim_file.stem, + "entities": _find_entity_mentions(content, patterns), + "sources": _extract_source_refs(content), + }) + domain_claims[domain_dir.name] = claims + return domain_claims + + +async def cross_domain_after_merge( + main_sha: str, + branch_sha: str, + pr_num: int, + main_worktree: Path, + conn=None, +) -> int: + """Detect cross-domain entity/source overlap for claims changed in this merge. + + Returns the number of cross-domain connections found. + """ + # 1. 
Get changed files + proc = await asyncio.create_subprocess_exec( + "git", "diff", "--name-only", "--diff-filter=ACMR", + main_sha, branch_sha, + cwd=str(main_worktree), + stdout=asyncio.subprocess.PIPE, + stderr=asyncio.subprocess.PIPE, + ) + try: + stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=10) + except asyncio.TimeoutError: + proc.kill() + await proc.wait() + logger.warning("cross_domain: git diff timed out") + return 0 + + if proc.returncode != 0: + return 0 + + diff_files = [f for f in stdout.decode().strip().split("\n") if f] + + # 2. Filter to claim files + changed_claims = [] + for fpath in diff_files: + if not fpath.endswith(".md") or not fpath.startswith("domains/"): + continue + parts = fpath.split("/") + if len(parts) < 3: + continue + basename = os.path.basename(fpath) + if basename.startswith("_") or basename == "directory.md": + continue + changed_claims.append({"path": fpath, "domain": parts[1], "slug": Path(basename).stem}) + + if not changed_claims: + return 0 + + # 3. Build entity patterns and scan all claims + entity_names = _build_entity_names(main_worktree) + if not entity_names: + return 0 + + patterns = _compile_entity_patterns(entity_names) + domain_claims = _scan_domain_claims(main_worktree, patterns) + + # 4. For each changed claim, find cross-domain connections + total_connections = 0 + all_connections = [] + + for claim in changed_claims: + claim_path = main_worktree / claim["path"] + try: + content = claim_path.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + continue + + my_entities = _find_entity_mentions(content, patterns) + my_sources = _extract_source_refs(content) + + if not my_entities and not my_sources: + continue + + connections = [] + for other_domain, other_claims in domain_claims.items(): + if other_domain == claim["domain"]: + continue + for other in other_claims: + shared_entities = my_entities & other["entities"] + shared_sources = my_sources & other["sources"] + + # Threshold: >=2 shared entities, OR 1 entity + 1 source + entity_count = len(shared_entities) + source_count = len(shared_sources) + + if entity_count >= 2 or (entity_count >= 1 and source_count >= 1): + connections.append({ + "other_claim": other["slug"], + "other_domain": other_domain, + "shared_entities": sorted(shared_entities)[:5], + "shared_sources": sorted(shared_sources)[:3], + }) + + if connections: + total_connections += len(connections) + all_connections.append({ + "claim": claim["slug"], + "domain": claim["domain"], + "connections": connections[:10], + }) + logger.info( + "cross_domain: %s (%s) has %d cross-domain connections", + claim["slug"], claim["domain"], len(connections), + ) + + # 5. 
Log to audit_log + if all_connections and conn is not None: + try: + conn.execute( + "INSERT INTO audit_log (stage, event, detail) VALUES (?, ?, ?)", + ("cross_domain", "connections_found", json.dumps({ + "pr": pr_num, + "total_connections": total_connections, + "claims_with_connections": len(all_connections), + "details": all_connections[:10], + })), + ) + except Exception: + logger.exception("cross_domain: audit_log write failed (non-fatal)") + + if total_connections: + logger.info( + "cross_domain: PR #%d — %d connections across %d claims", + pr_num, total_connections, len(all_connections), + ) + + return total_connections diff --git a/ops/pipeline-v2/lib/db.py b/ops/pipeline-v2/lib/db.py new file mode 100644 index 000000000..0e023bd97 --- /dev/null +++ b/ops/pipeline-v2/lib/db.py @@ -0,0 +1,625 @@ +"""SQLite database — schema, migrations, connection management.""" + +import json +import logging +import sqlite3 +from contextlib import contextmanager + +from . import config + +logger = logging.getLogger("pipeline.db") + +SCHEMA_VERSION = 12 + +SCHEMA_SQL = """ +CREATE TABLE IF NOT EXISTS schema_version ( + version INTEGER PRIMARY KEY, + applied_at TEXT DEFAULT (datetime('now')) +); + +CREATE TABLE IF NOT EXISTS sources ( + path TEXT PRIMARY KEY, + status TEXT NOT NULL DEFAULT 'unprocessed', + -- unprocessed, triaging, extracting, extracted, null_result, + -- needs_reextraction, error + priority TEXT DEFAULT 'medium', + -- critical, high, medium, low, skip + priority_log TEXT DEFAULT '[]', + -- JSON array: [{stage, priority, reasoning, ts}] + extraction_model TEXT, + claims_count INTEGER DEFAULT 0, + pr_number INTEGER, + transient_retries INTEGER DEFAULT 0, + substantive_retries INTEGER DEFAULT 0, + last_error TEXT, + feedback TEXT, + -- eval feedback for re-extraction (JSON) + cost_usd REAL DEFAULT 0, + created_at TEXT DEFAULT (datetime('now')), + updated_at TEXT DEFAULT (datetime('now')) +); + +CREATE TABLE IF NOT EXISTS prs ( + number INTEGER PRIMARY KEY, + source_path TEXT REFERENCES sources(path), + branch TEXT, + status TEXT NOT NULL DEFAULT 'open', + -- validating, open, reviewing, approved, merging, merged, closed, zombie, conflict + -- conflict: rebase failed or merge timed out — needs human intervention + domain TEXT, + agent TEXT, + commit_type TEXT CHECK(commit_type IS NULL OR commit_type IN ('extract', 'research', 'entity', 'decision', 'reweave', 'fix', 'challenge', 'enrich', 'synthesize', 'unknown')), + tier TEXT, + -- LIGHT, STANDARD, DEEP + tier0_pass INTEGER, + -- 0/1 + leo_verdict TEXT DEFAULT 'pending', + -- pending, approve, request_changes, skipped, failed + domain_verdict TEXT DEFAULT 'pending', + domain_agent TEXT, + domain_model TEXT, + priority TEXT, + -- NULL = inherit from source. Set explicitly for human-submitted PRs. 
+ -- Pipeline PRs: COALESCE(p.priority, s.priority, 'medium') + -- Human PRs: 'critical' (detected via missing source_path or non-agent author) + origin TEXT DEFAULT 'pipeline', + -- pipeline | human | external + transient_retries INTEGER DEFAULT 0, + substantive_retries INTEGER DEFAULT 0, + last_error TEXT, + last_attempt TEXT, + cost_usd REAL DEFAULT 0, + created_at TEXT DEFAULT (datetime('now')), + merged_at TEXT +); + +CREATE TABLE IF NOT EXISTS costs ( + date TEXT, + model TEXT, + stage TEXT, + calls INTEGER DEFAULT 0, + input_tokens INTEGER DEFAULT 0, + output_tokens INTEGER DEFAULT 0, + cost_usd REAL DEFAULT 0, + PRIMARY KEY (date, model, stage) +); + +CREATE TABLE IF NOT EXISTS circuit_breakers ( + name TEXT PRIMARY KEY, + state TEXT DEFAULT 'closed', + -- closed, open, halfopen + failures INTEGER DEFAULT 0, + successes INTEGER DEFAULT 0, + tripped_at TEXT, + last_success_at TEXT, + -- heartbeat: if now() - last_success_at > 2*interval, stage is stalled (Vida) + last_update TEXT DEFAULT (datetime('now')) +); + +CREATE TABLE IF NOT EXISTS audit_log ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + timestamp TEXT DEFAULT (datetime('now')), + stage TEXT, + event TEXT, + detail TEXT +); + +CREATE TABLE IF NOT EXISTS response_audit ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + timestamp TEXT NOT NULL DEFAULT (datetime('now')), + chat_id INTEGER, + user TEXT, + agent TEXT DEFAULT 'rio', + model TEXT, + query TEXT, + conversation_window TEXT, + -- JSON: prior N messages for context + -- NOTE: intentional duplication of transcript data for audit self-containment. + -- Transcripts live in /opt/teleo-eval/transcripts/ but audit rows need prompt + -- context inline for retrieval-quality diagnosis. Primary driver of row size — + -- target for cleanup when 90-day retention policy lands. 
+ entities_matched TEXT, + -- JSON: [{name, path, score, used_in_response}] + claims_matched TEXT, + -- JSON: [{path, title, score, source, used_in_response}] + retrieval_layers_hit TEXT, + -- JSON: ["keyword","qdrant","graph"] + retrieval_gap TEXT, + -- What the KB was missing (if anything) + market_data TEXT, + -- JSON: injected token prices + research_context TEXT, + -- Haiku pre-pass results if any + kb_context_text TEXT, + -- Full context string sent to model + tool_calls TEXT, + -- JSON: ordered array [{tool, input, output, duration_ms, ts}] + raw_response TEXT, + display_response TEXT, + confidence_score REAL, + -- Model self-rated retrieval quality 0.0-1.0 + response_time_ms INTEGER, + -- Eval pipeline columns (v10) + prompt_tokens INTEGER, + completion_tokens INTEGER, + generation_cost REAL, + embedding_cost REAL, + total_cost REAL, + blocked INTEGER DEFAULT 0, + block_reason TEXT, + query_type TEXT, + created_at TEXT DEFAULT (datetime('now')) +); + +CREATE INDEX IF NOT EXISTS idx_sources_status ON sources(status); +CREATE INDEX IF NOT EXISTS idx_prs_status ON prs(status); +CREATE INDEX IF NOT EXISTS idx_prs_domain ON prs(domain); +CREATE INDEX IF NOT EXISTS idx_costs_date ON costs(date); +CREATE INDEX IF NOT EXISTS idx_audit_stage ON audit_log(stage); +CREATE INDEX IF NOT EXISTS idx_response_audit_ts ON response_audit(timestamp); +CREATE INDEX IF NOT EXISTS idx_response_audit_agent ON response_audit(agent); +CREATE INDEX IF NOT EXISTS idx_response_audit_chat_ts ON response_audit(chat_id, timestamp); +""" + + +def get_connection(readonly: bool = False) -> sqlite3.Connection: + """Create a SQLite connection with WAL mode and proper settings.""" + config.DB_PATH.parent.mkdir(parents=True, exist_ok=True) + conn = sqlite3.connect( + str(config.DB_PATH), + timeout=30, + isolation_level=None, # autocommit — we manage transactions explicitly + ) + conn.row_factory = sqlite3.Row + conn.execute("PRAGMA journal_mode=WAL") + conn.execute("PRAGMA busy_timeout=10000") + conn.execute("PRAGMA foreign_keys=ON") + if readonly: + conn.execute("PRAGMA query_only=ON") + return conn + + +@contextmanager +def transaction(conn: sqlite3.Connection): + """Context manager for explicit transactions.""" + conn.execute("BEGIN") + try: + yield conn + conn.execute("COMMIT") + except Exception: + conn.execute("ROLLBACK") + raise + + +# Branch prefix → (agent, commit_type) mapping. +# Single source of truth — used by merge.py at INSERT time and migration v7 backfill. +# Unknown prefixes → ('unknown', 'unknown') + warning log. +BRANCH_PREFIX_MAP = { + "extract": ("pipeline", "extract"), + "ingestion": ("pipeline", "extract"), + "epimetheus": ("epimetheus", "extract"), + "rio": ("rio", "research"), + "theseus": ("theseus", "research"), + "astra": ("astra", "research"), + "vida": ("vida", "research"), + "clay": ("clay", "research"), + "leo": ("leo", "entity"), + "reweave": ("pipeline", "reweave"), + "fix": ("pipeline", "fix"), +} + + +def classify_branch(branch: str) -> tuple[str, str]: + """Derive (agent, commit_type) from branch prefix. + + Returns ('unknown', 'unknown') and logs a warning for unrecognized prefixes. 
+ """ + prefix = branch.split("/", 1)[0] if "/" in branch else branch + result = BRANCH_PREFIX_MAP.get(prefix) + if result is None: + logger.warning("Unknown branch prefix %r in branch %r — defaulting to ('unknown', 'unknown')", prefix, branch) + return ("unknown", "unknown") + return result + + +def migrate(conn: sqlite3.Connection): + """Run schema migrations.""" + conn.executescript(SCHEMA_SQL) + + # Check current version + try: + row = conn.execute("SELECT MAX(version) as v FROM schema_version").fetchone() + current = row["v"] if row and row["v"] else 0 + except sqlite3.OperationalError: + current = 0 + + # --- Incremental migrations --- + if current < 2: + # Phase 2: add multiplayer columns to prs table + for stmt in [ + "ALTER TABLE prs ADD COLUMN priority TEXT", + "ALTER TABLE prs ADD COLUMN origin TEXT DEFAULT 'pipeline'", + "ALTER TABLE prs ADD COLUMN last_error TEXT", + ]: + try: + conn.execute(stmt) + except sqlite3.OperationalError: + pass # Column already exists (idempotent) + logger.info("Migration v2: added priority, origin, last_error to prs") + + if current < 3: + # Phase 3: retry budget — track eval attempts and issue tags per PR + for stmt in [ + "ALTER TABLE prs ADD COLUMN eval_attempts INTEGER DEFAULT 0", + "ALTER TABLE prs ADD COLUMN eval_issues TEXT DEFAULT '[]'", + ]: + try: + conn.execute(stmt) + except sqlite3.OperationalError: + pass # Column already exists (idempotent) + logger.info("Migration v3: added eval_attempts, eval_issues to prs") + + if current < 4: + # Phase 4: auto-fixer — track fix attempts per PR + for stmt in [ + "ALTER TABLE prs ADD COLUMN fix_attempts INTEGER DEFAULT 0", + ]: + try: + conn.execute(stmt) + except sqlite3.OperationalError: + pass # Column already exists (idempotent) + logger.info("Migration v4: added fix_attempts to prs") + + if current < 5: + # Phase 5: contributor identity system — tracks who contributed what + # Aligned with schemas/attribution.md (5 roles) + Leo's tier system. + # CI is COMPUTED from raw counts × weights, never stored. 
+ conn.executescript(""" + CREATE TABLE IF NOT EXISTS contributors ( + handle TEXT PRIMARY KEY, + display_name TEXT, + agent_id TEXT, + first_contribution TEXT, + last_contribution TEXT, + tier TEXT DEFAULT 'new', + -- new, contributor, veteran + sourcer_count INTEGER DEFAULT 0, + extractor_count INTEGER DEFAULT 0, + challenger_count INTEGER DEFAULT 0, + synthesizer_count INTEGER DEFAULT 0, + reviewer_count INTEGER DEFAULT 0, + claims_merged INTEGER DEFAULT 0, + challenges_survived INTEGER DEFAULT 0, + domains TEXT DEFAULT '[]', + highlights TEXT DEFAULT '[]', + identities TEXT DEFAULT '{}', + created_at TEXT DEFAULT (datetime('now')), + updated_at TEXT DEFAULT (datetime('now')) + ); + + CREATE INDEX IF NOT EXISTS idx_contributors_tier ON contributors(tier); + """) + logger.info("Migration v5: added contributors table") + + if current < 6: + # Phase 6: analytics — time-series metrics snapshots for trending dashboard + conn.executescript(""" + CREATE TABLE IF NOT EXISTS metrics_snapshots ( + ts TEXT DEFAULT (datetime('now')), + throughput_1h INTEGER, + approval_rate REAL, + open_prs INTEGER, + merged_total INTEGER, + closed_total INTEGER, + conflict_total INTEGER, + evaluated_24h INTEGER, + fix_success_rate REAL, + rejection_broken_wiki_links INTEGER DEFAULT 0, + rejection_frontmatter_schema INTEGER DEFAULT 0, + rejection_near_duplicate INTEGER DEFAULT 0, + rejection_confidence INTEGER DEFAULT 0, + rejection_other INTEGER DEFAULT 0, + extraction_model TEXT, + eval_domain_model TEXT, + eval_leo_model TEXT, + prompt_version TEXT, + pipeline_version TEXT, + source_origin_agent INTEGER DEFAULT 0, + source_origin_human INTEGER DEFAULT 0, + source_origin_scraper INTEGER DEFAULT 0 + ); + + CREATE INDEX IF NOT EXISTS idx_snapshots_ts ON metrics_snapshots(ts); + """) + logger.info("Migration v6: added metrics_snapshots table for analytics dashboard") + + if current < 7: + # Phase 7: agent attribution + commit_type for dashboard + # commit_type column + backfill agent/commit_type from branch prefix + try: + conn.execute("ALTER TABLE prs ADD COLUMN commit_type TEXT CHECK(commit_type IS NULL OR commit_type IN ('extract', 'research', 'entity', 'decision', 'reweave', 'fix', 'unknown'))") + except sqlite3.OperationalError: + pass # column already exists from CREATE TABLE + # Backfill agent and commit_type from branch prefix + rows = conn.execute("SELECT number, branch FROM prs WHERE branch IS NOT NULL").fetchall() + for row in rows: + agent, commit_type = classify_branch(row["branch"]) + conn.execute( + "UPDATE prs SET agent = ?, commit_type = ? WHERE number = ? 
AND (agent IS NULL OR commit_type IS NULL)", + (agent, commit_type, row["number"]), + ) + backfilled = len(rows) + logger.info("Migration v7: added commit_type column, backfilled %d PRs with agent/commit_type", backfilled) + + if current < 8: + # Phase 8: response audit — full-chain visibility for agent response quality + # Captures: query → tool calls → retrieval → context → response → confidence + # Approved by Ganymede (architecture), Rio (agent needs), Rhea (ops) + conn.executescript(""" + CREATE TABLE IF NOT EXISTS response_audit ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + timestamp TEXT NOT NULL DEFAULT (datetime('now')), + chat_id INTEGER, + user TEXT, + agent TEXT DEFAULT 'rio', + model TEXT, + query TEXT, + conversation_window TEXT, -- intentional transcript duplication for audit self-containment + entities_matched TEXT, + claims_matched TEXT, + retrieval_layers_hit TEXT, + retrieval_gap TEXT, + market_data TEXT, + research_context TEXT, + kb_context_text TEXT, + tool_calls TEXT, + raw_response TEXT, + display_response TEXT, + confidence_score REAL, + response_time_ms INTEGER, + created_at TEXT DEFAULT (datetime('now')) + ); + + CREATE INDEX IF NOT EXISTS idx_response_audit_ts ON response_audit(timestamp); + CREATE INDEX IF NOT EXISTS idx_response_audit_agent ON response_audit(agent); + CREATE INDEX IF NOT EXISTS idx_response_audit_chat_ts ON response_audit(chat_id, timestamp); + """) + logger.info("Migration v8: added response_audit table for agent response auditing") + + if current < 9: + # Phase 9: rebuild prs table to expand CHECK constraint on commit_type. + # SQLite cannot ALTER CHECK constraints in-place — must rebuild table. + # Old constraint (v7): extract,research,entity,decision,reweave,fix,unknown + # New constraint: adds challenge,enrich,synthesize + # Also re-derive commit_type from branch prefix for rows with invalid/NULL values. 
+ + # Step 1: Get all column names from existing table + cols_info = conn.execute("PRAGMA table_info(prs)").fetchall() + col_names = [c["name"] for c in cols_info] + col_list = ", ".join(col_names) + + # Step 2: Create new table with expanded CHECK constraint + conn.executescript(f""" + CREATE TABLE prs_new ( + number INTEGER PRIMARY KEY, + source_path TEXT REFERENCES sources(path), + branch TEXT, + status TEXT NOT NULL DEFAULT 'open', + domain TEXT, + agent TEXT, + commit_type TEXT CHECK(commit_type IS NULL OR commit_type IN ('extract','research','entity','decision','reweave','fix','challenge','enrich','synthesize','unknown')), + tier TEXT, + tier0_pass INTEGER, + leo_verdict TEXT DEFAULT 'pending', + domain_verdict TEXT DEFAULT 'pending', + domain_agent TEXT, + domain_model TEXT, + priority TEXT, + origin TEXT DEFAULT 'pipeline', + transient_retries INTEGER DEFAULT 0, + substantive_retries INTEGER DEFAULT 0, + last_error TEXT, + last_attempt TEXT, + cost_usd REAL DEFAULT 0, + created_at TEXT DEFAULT (datetime('now')), + merged_at TEXT + ); + INSERT INTO prs_new ({col_list}) SELECT {col_list} FROM prs; + DROP TABLE prs; + ALTER TABLE prs_new RENAME TO prs; + """) + logger.info("Migration v9: rebuilt prs table with expanded commit_type CHECK constraint") + + # Step 3: Re-derive commit_type from branch prefix for invalid/NULL values + rows = conn.execute( + """SELECT number, branch FROM prs + WHERE branch IS NOT NULL + AND (commit_type IS NULL + OR commit_type NOT IN ('extract','research','entity','decision','reweave','fix','challenge','enrich','synthesize','unknown'))""" + ).fetchall() + fixed = 0 + for row in rows: + agent, commit_type = classify_branch(row["branch"]) + conn.execute( + "UPDATE prs SET agent = COALESCE(agent, ?), commit_type = ? WHERE number = ?", + (agent, commit_type, row["number"]), + ) + fixed += 1 + conn.commit() + logger.info("Migration v9: re-derived commit_type for %d PRs with invalid/NULL values", fixed) + + if current < 10: + # Add eval pipeline columns to response_audit + # VPS may already be at v10/v11 from prior (incomplete) deploys — use IF NOT EXISTS pattern + for col_def in [ + ("prompt_tokens", "INTEGER"), + ("completion_tokens", "INTEGER"), + ("generation_cost", "REAL"), + ("embedding_cost", "REAL"), + ("total_cost", "REAL"), + ("blocked", "INTEGER DEFAULT 0"), + ("block_reason", "TEXT"), + ("query_type", "TEXT"), + ]: + try: + conn.execute(f"ALTER TABLE response_audit ADD COLUMN {col_def[0]} {col_def[1]}") + except sqlite3.OperationalError: + pass # Column already exists + conn.commit() + logger.info("Migration v10: added eval pipeline columns to response_audit") + + + if current < 11: + # Phase 11: compute tracking — extended costs table columns + # (May already exist on VPS from manual deploy — idempotent ALTERs) + for col_def in [ + ("duration_ms", "INTEGER DEFAULT 0"), + ("cache_read_tokens", "INTEGER DEFAULT 0"), + ("cache_write_tokens", "INTEGER DEFAULT 0"), + ("cost_estimate_usd", "REAL DEFAULT 0"), + ]: + try: + conn.execute(f"ALTER TABLE costs ADD COLUMN {col_def[0]} {col_def[1]}") + except sqlite3.OperationalError: + pass # Column already exists + conn.commit() + logger.info("Migration v11: added compute tracking columns to costs") + + if current < 12: + # Phase 12: structured review records — captures all evaluation outcomes + # including rejections, disagreements, and approved-with-changes. + # Schema locked with Leo (2026-04-01). 
+ conn.executescript(""" + CREATE TABLE IF NOT EXISTS review_records ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + pr_number INTEGER NOT NULL, + claim_path TEXT, + domain TEXT, + agent TEXT, + reviewer TEXT NOT NULL, + reviewer_model TEXT, + outcome TEXT NOT NULL + CHECK (outcome IN ('approved', 'approved-with-changes', 'rejected')), + rejection_reason TEXT + CHECK (rejection_reason IS NULL OR rejection_reason IN ( + 'fails-standalone-test', 'duplicate', 'scope-mismatch', + 'evidence-insufficient', 'framing-poor', 'other' + )), + disagreement_type TEXT + CHECK (disagreement_type IS NULL OR disagreement_type IN ( + 'factual', 'scope', 'framing', 'evidence' + )), + notes TEXT, + batch_id TEXT, + claims_in_batch INTEGER DEFAULT 1, + reviewed_at TEXT DEFAULT (datetime('now')) + ); + CREATE INDEX IF NOT EXISTS idx_review_records_pr ON review_records(pr_number); + CREATE INDEX IF NOT EXISTS idx_review_records_outcome ON review_records(outcome); + CREATE INDEX IF NOT EXISTS idx_review_records_domain ON review_records(domain); + CREATE INDEX IF NOT EXISTS idx_review_records_reviewer ON review_records(reviewer); + """) + logger.info("Migration v12: created review_records table") + + if current < SCHEMA_VERSION: + conn.execute( + "INSERT OR REPLACE INTO schema_version (version) VALUES (?)", + (SCHEMA_VERSION,), + ) + conn.commit() # Explicit commit — executescript auto-commits DDL but not subsequent DML + logger.info("Database migrated to schema version %d", SCHEMA_VERSION) + else: + logger.debug("Database at schema version %d", current) + + +def audit(conn: sqlite3.Connection, stage: str, event: str, detail: str = None): + """Write an audit log entry.""" + conn.execute( + "INSERT INTO audit_log (stage, event, detail) VALUES (?, ?, ?)", + (stage, event, detail), + ) + + + + +def record_review(conn, pr_number: int, reviewer: str, outcome: str, *, + claim_path: str = None, domain: str = None, agent: str = None, + reviewer_model: str = None, rejection_reason: str = None, + disagreement_type: str = None, notes: str = None, + claims_in_batch: int = 1): + """Record a structured review outcome. + + Called from evaluate stage after Leo/domain reviewer returns a verdict. + outcome must be: approved, approved-with-changes, or rejected. + """ + batch_id = str(pr_number) + conn.execute( + """INSERT INTO review_records + (pr_number, claim_path, domain, agent, reviewer, reviewer_model, + outcome, rejection_reason, disagreement_type, notes, + batch_id, claims_in_batch) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", + (pr_number, claim_path, domain, agent, reviewer, reviewer_model, + outcome, rejection_reason, disagreement_type, notes, + batch_id, claims_in_batch), + ) + +def append_priority_log(conn: sqlite3.Connection, path: str, stage: str, priority: str, reasoning: str): + """Append a priority assessment to a source's priority_log. + + NOTE: This does NOT update the source's priority column. The priority column + is the authoritative priority, set only by initial triage or human override. + The priority_log records each stage's opinion for offline calibration analysis. + (Bug caught by Theseus — original version overwrote priority with each stage's opinion.) + (Race condition fix per Vida — read-then-write wrapped in transaction.) 
+ """ + conn.execute("BEGIN") + try: + row = conn.execute("SELECT priority_log FROM sources WHERE path = ?", (path,)).fetchone() + if not row: + conn.execute("ROLLBACK") + return + log = json.loads(row["priority_log"] or "[]") + log.append({"stage": stage, "priority": priority, "reasoning": reasoning}) + conn.execute( + "UPDATE sources SET priority_log = ?, updated_at = datetime('now') WHERE path = ?", + (json.dumps(log), path), + ) + conn.execute("COMMIT") + except Exception: + conn.execute("ROLLBACK") + raise + + +def insert_response_audit(conn: sqlite3.Connection, **kwargs): + """Insert a response audit record. All fields optional except query.""" + cols = [ + "timestamp", "chat_id", "user", "agent", "model", "query", + "conversation_window", "entities_matched", "claims_matched", + "retrieval_layers_hit", "retrieval_gap", "market_data", + "research_context", "kb_context_text", "tool_calls", + "raw_response", "display_response", "confidence_score", + "response_time_ms", + # Eval pipeline columns (v10) + "prompt_tokens", "completion_tokens", "generation_cost", + "embedding_cost", "total_cost", "blocked", "block_reason", + "query_type", + ] + present = {k: v for k, v in kwargs.items() if k in cols and v is not None} + if not present: + return + col_names = ", ".join(present.keys()) + placeholders = ", ".join("?" for _ in present) + conn.execute( + f"INSERT INTO response_audit ({col_names}) VALUES ({placeholders})", + tuple(present.values()), + ) + + +def set_priority(conn: sqlite3.Connection, path: str, priority: str, reason: str = "human override"): + """Set a source's authoritative priority. Used for human overrides and initial triage.""" + conn.execute( + "UPDATE sources SET priority = ?, updated_at = datetime('now') WHERE path = ?", + (priority, path), + ) + append_priority_log(conn, path, "override", priority, reason) diff --git a/ops/pipeline-v2/lib/evaluate.py b/ops/pipeline-v2/lib/evaluate.py new file mode 100644 index 000000000..074abe41a --- /dev/null +++ b/ops/pipeline-v2/lib/evaluate.py @@ -0,0 +1,1465 @@ +"""Evaluate stage — PR lifecycle orchestration. + +Tier-based review routing. Model diversity: GPT-4o (domain) + Sonnet (Leo STANDARD) ++ Opus (Leo DEEP) = two model families, no correlated blind spots. + +Flow per PR: + 1. Triage → Haiku (OpenRouter) → DEEP / STANDARD / LIGHT + 2. Tier overrides: + a. Claim-shape detector: type: claim in YAML → STANDARD min (Theseus) + b. Random pre-merge promotion: 15% of LIGHT → STANDARD (Rio) + 3. Domain review → GPT-4o (OpenRouter) — skipped for LIGHT when LIGHT_SKIP_LLM=True + 4. Leo review → Opus DEEP / Sonnet STANDARD (OpenRouter) — skipped for LIGHT + 5. Post reviews, submit formal Forgejo approvals, update SQLite + 6. If both approve → status = 'approved' (merge module picks it up) + 7. Retry budget: 3 attempts max, disposition on attempt 2+ + +Design reviewed by Ganymede, Rio, Theseus, Rhea, Leo. +LLM transport and prompts extracted to lib/llm.py (Phase 3c). +""" + +import json +import logging +import random +import re +from datetime import datetime, timezone + +from . 
import config, db +from .domains import agent_for_domain, detect_domain_from_diff +from .forgejo import api as forgejo_api +from .forgejo import get_agent_token, get_pr_diff, repo_path +from .llm import run_batch_domain_review, run_domain_review, run_leo_review, triage_pr +from .feedback import format_rejection_comment +from .validate import load_existing_claims + +logger = logging.getLogger("pipeline.evaluate") + + +# ─── Diff helpers ────────────────────────────────────────────────────────── + + +def _filter_diff(diff: str) -> tuple[str, str]: + """Filter diff to only review-relevant files. + + Returns (review_diff, entity_diff). + Strips: inbox/, schemas/, skills/, agents/*/musings/ + """ + sections = re.split(r"(?=^diff --git )", diff, flags=re.MULTILINE) + skip_patterns = [r"^diff --git a/(inbox/(archive|queue|null-result)|schemas|skills|agents/[^/]+/musings)/"] + core_domains = {"living-agents", "living-capital", "teleohumanity", "mechanisms"} + + claim_sections = [] + entity_sections = [] + + for section in sections: + if not section.strip(): + continue + if any(re.match(p, section) for p in skip_patterns): + continue + entity_match = re.match(r"^diff --git a/entities/([^/]+)/", section) + if entity_match and entity_match.group(1) not in core_domains: + entity_sections.append(section) + continue + claim_sections.append(section) + + return "".join(claim_sections), "".join(entity_sections) + + +def _extract_changed_files(diff: str) -> str: + """Extract changed file paths from diff.""" + return "\n".join( + line.replace("diff --git a/", "").split(" b/")[0] for line in diff.split("\n") if line.startswith("diff --git") + ) + + +def _is_musings_only(diff: str) -> bool: + """Check if PR only modifies musing files.""" + has_musings = False + has_other = False + for line in diff.split("\n"): + if line.startswith("diff --git"): + if "agents/" in line and "/musings/" in line: + has_musings = True + else: + has_other = True + return has_musings and not has_other + + +# ─── NOTE: Tier 0.5 mechanical pre-check moved to validate.py ──────────── +# Tier 0.5 now runs as part of the validate stage (before eval), not inside +# evaluate_pr(). This prevents wasting eval_attempts on mechanically fixable +# PRs. Eval trusts that tier0_pass=1 means all mechanical checks passed. + + +# ─── Tier overrides ─────────────────────────────────────────────────────── + + +def _diff_contains_claim_type(diff: str) -> bool: + """Claim-shape detector: check if any file in diff has type: claim in frontmatter. + + Mechanical check ($0). If YAML declares type: claim, this is a factual claim — + not an entity update or formatting fix. Must be classified STANDARD minimum + regardless of Haiku triage. Catches factual claims disguised as LIGHT content. + (Theseus: converts semantic problem to mechanical check) + """ + for line in diff.split("\n"): + if line.startswith("+") and not line.startswith("+++"): + stripped = line[1:].strip() + if stripped in ("type: claim", 'type: "claim"', "type: 'claim'"): + return True + return False + + +def _deterministic_tier(diff: str) -> str | None: + """Deterministic tier routing — skip Haiku triage for obvious cases. + + Checks diff file patterns before calling the LLM. Returns tier string + if deterministic, None if Haiku triage is needed. 
+ + Rules (Leo-calibrated): + - All files in entities/ only → LIGHT + - All files in inbox/ only (queue, archive, null-result) → LIGHT + - Any file in core/ or foundations/ → DEEP (structural KB changes) + - Has challenged_by field → DEEP (challenges existing claims) + - Modifies existing file (not new) in domains/ → DEEP (enrichment/change) + - Otherwise → None (needs Haiku triage) + + NOTE: Cross-domain wiki links are NOT a DEEP signal — most claims link + across domains, that's the whole point of the knowledge graph (Leo). + """ + changed_files = [] + for line in diff.split("\n"): + if line.startswith("diff --git a/"): + path = line.replace("diff --git a/", "").split(" b/")[0] + changed_files.append(path) + + if not changed_files: + return None + + # All entities/ only → LIGHT + if all(f.startswith("entities/") for f in changed_files): + logger.info("Deterministic tier: LIGHT (all files in entities/)") + return "LIGHT" + + # All inbox/ only (queue, archive, null-result) → LIGHT + if all(f.startswith("inbox/") for f in changed_files): + logger.info("Deterministic tier: LIGHT (all files in inbox/)") + return "LIGHT" + + # Any file in core/ or foundations/ → DEEP (structural KB changes) + if any(f.startswith("core/") or f.startswith("foundations/") for f in changed_files): + logger.info("Deterministic tier: DEEP (touches core/ or foundations/)") + return "DEEP" + + # Check diff content for DEEP signals + has_challenged_by = False + has_modified_claim = False + new_files: set[str] = set() + + lines = diff.split("\n") + for i, line in enumerate(lines): + # Detect new files + if line.startswith("--- /dev/null") and i + 1 < len(lines) and lines[i + 1].startswith("+++ b/"): + new_files.add(lines[i + 1][6:]) + # Check for challenged_by field + if line.startswith("+") and not line.startswith("+++"): + stripped = line[1:].strip() + if stripped.startswith("challenged_by:"): + has_challenged_by = True + + if has_challenged_by: + logger.info("Deterministic tier: DEEP (has challenged_by field)") + return "DEEP" + + # NOTE: Modified existing domain claims are NOT auto-DEEP — enrichments + # (appending evidence) are common and should be STANDARD. Let Haiku triage + # distinguish enrichments from structural changes. + + return None + + +# ─── Verdict parsing ────────────────────────────────────────────────────── + + +def _parse_verdict(review_text: str, reviewer: str) -> str: + """Parse VERDICT tag from review. Returns 'approve' or 'request_changes'.""" + upper = reviewer.upper() + if f"VERDICT:{upper}:APPROVE" in review_text: + return "approve" + elif f"VERDICT:{upper}:REQUEST_CHANGES" in review_text: + return "request_changes" + else: + logger.warning("No parseable verdict from %s — treating as request_changes", reviewer) + return "request_changes" + + +# Map model-invented tags to valid tags. Models consistently ignore the valid +# tag list and invent their own. This normalizes them. 
(Ganymede, Mar 14) +_TAG_ALIASES: dict[str, str] = { + "schema_violation": "frontmatter_schema", + "missing_schema_fields": "frontmatter_schema", + "missing_schema": "frontmatter_schema", + "schema": "frontmatter_schema", + "missing_frontmatter": "frontmatter_schema", + "redundancy": "near_duplicate", + "duplicate": "near_duplicate", + "missing_confidence": "confidence_miscalibration", + "confidence_error": "confidence_miscalibration", + "vague_claims": "scope_error", + "unfalsifiable": "scope_error", + "unverified_wiki_links": "broken_wiki_links", + "unverified-wiki-links": "broken_wiki_links", + "missing_wiki_links": "broken_wiki_links", + "invalid_wiki_links": "broken_wiki_links", + "wiki_link_errors": "broken_wiki_links", + "overclaiming": "title_overclaims", + "title_overclaim": "title_overclaims", + "date_error": "date_errors", + "factual_error": "factual_discrepancy", + "factual_inaccuracy": "factual_discrepancy", +} + +VALID_ISSUE_TAGS = {"broken_wiki_links", "frontmatter_schema", "title_overclaims", + "confidence_miscalibration", "date_errors", "factual_discrepancy", + "near_duplicate", "scope_error"} + + +def _normalize_tag(tag: str) -> str | None: + """Normalize a model-generated tag to a valid tag, or None if unrecognizable.""" + tag = tag.strip().lower().replace("-", "_") + if tag in VALID_ISSUE_TAGS: + return tag + if tag in _TAG_ALIASES: + return _TAG_ALIASES[tag] + # Fuzzy: check if any valid tag is a substring or vice versa + for valid in VALID_ISSUE_TAGS: + if valid in tag or tag in valid: + return valid + return None + + +def _parse_issues(review_text: str) -> list[str]: + """Extract issue tags from review. + + First tries structured comment with tag normalization. + Falls back to keyword inference from prose. + """ + match = re.search(r"", review_text) + if match: + raw_tags = [tag.strip() for tag in match.group(1).split(",") if tag.strip()] + normalized = [] + for tag in raw_tags: + norm = _normalize_tag(tag) + if norm and norm not in normalized: + normalized.append(norm) + else: + logger.debug("Unrecognized issue tag '%s' — dropped", tag) + if normalized: + return normalized + # Fallback: infer tags from review prose + return _infer_issues_from_prose(review_text) + + +# Keyword patterns for inferring issue tags from unstructured review prose. +# Conservative: only match unambiguous indicators. Order doesn't matter. 
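+# Example (hypothetical review prose): "the title overclaims what the cited source
+# supports" would match the title_overclaims patterns below.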
+_PROSE_TAG_PATTERNS: dict[str, list[re.Pattern]] = { + "frontmatter_schema": [ + re.compile(r"frontmatter", re.IGNORECASE), + re.compile(r"missing.{0,20}(type|domain|confidence|source|created)\b", re.IGNORECASE), + re.compile(r"yaml.{0,10}(invalid|missing|error|schema)", re.IGNORECASE), + re.compile(r"required field", re.IGNORECASE), + re.compile(r"lacks?.{0,15}(required|yaml|schema|fields)", re.IGNORECASE), + re.compile(r"missing.{0,15}(schema|fields|frontmatter)", re.IGNORECASE), + re.compile(r"schema.{0,10}(compliance|violation|missing|invalid)", re.IGNORECASE), + ], + "broken_wiki_links": [ + re.compile(r"(broken|dead|invalid).{0,10}(wiki.?)?link", re.IGNORECASE), + re.compile(r"wiki.?link.{0,20}(not found|missing|broken|invalid|resolv|unverif)", re.IGNORECASE), + re.compile(r"\[\[.{1,80}\]\].{0,20}(not found|doesn.t exist|missing)", re.IGNORECASE), + re.compile(r"unverified.{0,10}(wiki|link)", re.IGNORECASE), + ], + "factual_discrepancy": [ + re.compile(r"factual.{0,10}(error|inaccura|discrepanc|incorrect)", re.IGNORECASE), + re.compile(r"misrepresent", re.IGNORECASE), + ], + "confidence_miscalibration": [ + re.compile(r"confidence.{0,20}(too high|too low|miscalibrat|overstat|should be)", re.IGNORECASE), + re.compile(r"(overstat|understat).{0,20}confidence", re.IGNORECASE), + ], + "scope_error": [ + re.compile(r"scope.{0,10}(error|too broad|overscop|unscoped)", re.IGNORECASE), + re.compile(r"unscoped.{0,10}(universal|claim)", re.IGNORECASE), + re.compile(r"(vague|unfalsifiable).{0,15}(claim|assertion)", re.IGNORECASE), + re.compile(r"not.{0,10}(specific|falsifiable|disagreeable).{0,10}enough", re.IGNORECASE), + ], + "title_overclaims": [ + re.compile(r"title.{0,20}(overclaim|overstat|too broad)", re.IGNORECASE), + re.compile(r"overclaim", re.IGNORECASE), + ], + "near_duplicate": [ + re.compile(r"near.?duplicate", re.IGNORECASE), + re.compile(r"(very|too) similar.{0,20}(claim|title|existing)", re.IGNORECASE), + re.compile(r"duplicate.{0,20}(of|claim|title|existing|information)", re.IGNORECASE), + re.compile(r"redundan", re.IGNORECASE), + ], +} + + +def _infer_issues_from_prose(review_text: str) -> list[str]: + """Infer issue tags from unstructured review text via keyword matching. + + Fallback for reviews that reject without structured tags. + Conservative: requires at least one unambiguous keyword match per tag. 
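+
+    Example (hypothetical review text): "confidence of 0.9 seems too high for a
+    single unverified source" → ["confidence_miscalibration"]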
+ """ + inferred = [] + for tag, patterns in _PROSE_TAG_PATTERNS.items(): + if any(p.search(review_text) for p in patterns): + inferred.append(tag) + return inferred + + +async def _post_formal_approvals(pr_number: int, pr_author: str): + """Submit formal Forgejo reviews from 2 agents (not the PR author).""" + approvals = 0 + for agent_name in ["leo", "vida", "theseus", "clay", "astra", "rio"]: + if agent_name == pr_author: + continue + if approvals >= 2: + break + token = get_agent_token(agent_name) + if token: + result = await forgejo_api( + "POST", + repo_path(f"pulls/{pr_number}/reviews"), + {"body": "Approved.", "event": "APPROVED"}, + token=token, + ) + if result is not None: + approvals += 1 + logger.debug("Formal approval for PR #%d by %s (%d/2)", pr_number, agent_name, approvals) + + +# ─── Retry budget helpers ───────────────────────────────────────────────── + + +async def _terminate_pr(conn, pr_number: int, reason: str): + """Terminal state: close PR on Forgejo, mark source needs_human.""" + # Get issue tags for structured feedback + row = conn.execute("SELECT eval_issues, agent FROM prs WHERE number = ?", (pr_number,)).fetchone() + issues = [] + if row and row["eval_issues"]: + try: + issues = json.loads(row["eval_issues"]) + except (json.JSONDecodeError, TypeError): + pass + + # Post structured rejection comment with quality gate guidance (Epimetheus) + if issues: + feedback_body = format_rejection_comment(issues, source="eval_terminal") + comment_body = ( + f"**Closed by eval pipeline** — {reason}.\n\n" + f"Evaluated {config.MAX_EVAL_ATTEMPTS} times without passing. " + f"Source will be re-queued with feedback.\n\n" + f"{feedback_body}" + ) + else: + comment_body = ( + f"**Closed by eval pipeline** — {reason}.\n\n" + f"Evaluated {config.MAX_EVAL_ATTEMPTS} times without passing. " + f"Source will be re-queued with feedback." + ) + + await forgejo_api( + "POST", + repo_path(f"issues/{pr_number}/comments"), + {"body": comment_body}, + ) + await forgejo_api( + "PATCH", + repo_path(f"pulls/{pr_number}"), + {"state": "closed"}, + ) + + # Update PR status + conn.execute( + "UPDATE prs SET status = 'closed', last_error = ? WHERE number = ?", + (reason, pr_number), + ) + + # Tag source for re-extraction with feedback + cursor = conn.execute( + """UPDATE sources SET status = 'needs_reextraction', + updated_at = datetime('now') + WHERE path = (SELECT source_path FROM prs WHERE number = ?)""", + (pr_number,), + ) + if cursor.rowcount == 0: + logger.warning("PR #%d: no source_path linked — source not requeued for re-extraction", pr_number) + + db.audit( + conn, + "evaluate", + "pr_terminated", + json.dumps( + { + "pr": pr_number, + "reason": reason, + } + ), + ) + logger.info("PR #%d: TERMINATED — %s", pr_number, reason) + + +def _classify_issues(issues: list[str]) -> str: + """Classify issue tags as 'mechanical', 'substantive', or 'mixed'.""" + if not issues: + return "unknown" + mechanical = set(issues) & config.MECHANICAL_ISSUE_TAGS + substantive = set(issues) & config.SUBSTANTIVE_ISSUE_TAGS + if substantive and not mechanical: + return "substantive" + if mechanical and not substantive: + return "mechanical" + if mechanical and substantive: + return "mixed" + return "unknown" # tags not in either set + + +async def _dispose_rejected_pr(conn, pr_number: int, eval_attempts: int, all_issues: list[str]): + """Disposition logic for rejected PRs on attempt 2+. + + Attempt 1: normal — back to open, wait for fix. + Attempt 2: check issue classification. 
+ - Mechanical only: keep open for one more attempt (auto-fix future). + - Substantive or mixed: close PR, requeue source. + Attempt 3+: terminal. + """ + if eval_attempts < 2: + # Attempt 1: post structured feedback so agent learns, but don't close + if all_issues: + feedback_body = format_rejection_comment(all_issues, source="eval_attempt_1") + await forgejo_api( + "POST", + repo_path(f"issues/{pr_number}/comments"), + {"body": feedback_body}, + ) + return + + classification = _classify_issues(all_issues) + + if eval_attempts >= config.MAX_EVAL_ATTEMPTS: + # Terminal + await _terminate_pr(conn, pr_number, f"eval budget exhausted after {eval_attempts} attempts") + return + + if classification == "mechanical": + # Mechanical issues only — keep open for one more attempt. + # Future: auto-fix module will push fixes here. + logger.info( + "PR #%d: attempt %d, mechanical issues only (%s) — keeping open for fix attempt", + pr_number, + eval_attempts, + all_issues, + ) + db.audit( + conn, + "evaluate", + "mechanical_retry", + json.dumps( + { + "pr": pr_number, + "attempt": eval_attempts, + "issues": all_issues, + } + ), + ) + else: + # Substantive, mixed, or unknown — close and requeue + logger.info( + "PR #%d: attempt %d, %s issues (%s) — closing and requeuing source", + pr_number, + eval_attempts, + classification, + all_issues, + ) + await _terminate_pr( + conn, pr_number, f"substantive issues after {eval_attempts} attempts: {', '.join(all_issues)}" + ) + + +# ─── Single PR evaluation ───────────────────────────────────────────────── + + +async def evaluate_pr(conn, pr_number: int, tier: str = None) -> dict: + """Evaluate a single PR. Returns result dict.""" + # Check eval attempt budget before claiming + row = conn.execute("SELECT eval_attempts FROM prs WHERE number = ?", (pr_number,)).fetchone() + eval_attempts = (row["eval_attempts"] or 0) if row else 0 + if eval_attempts >= config.MAX_EVAL_ATTEMPTS: + # Terminal — hard cap reached. Close PR, tag source. + logger.warning("PR #%d: eval_attempts=%d >= %d, terminal", pr_number, eval_attempts, config.MAX_EVAL_ATTEMPTS) + await _terminate_pr(conn, pr_number, "eval budget exhausted") + return {"pr": pr_number, "terminal": True, "reason": "eval_budget_exhausted"} + + # Atomic claim — prevent concurrent workers from evaluating the same PR (Ganymede #11) + cursor = conn.execute( + "UPDATE prs SET status = 'reviewing' WHERE number = ? 
AND status = 'open'", + (pr_number,), + ) + if cursor.rowcount == 0: + logger.debug("PR #%d already claimed by another worker, skipping", pr_number) + return {"pr": pr_number, "skipped": True, "reason": "already_claimed"} + + # Increment eval_attempts — but not if this is a merge-failure re-entry (Ganymede+Rhea) + merge_cycled = conn.execute( + "SELECT merge_cycled FROM prs WHERE number = ?", (pr_number,) + ).fetchone() + if merge_cycled and merge_cycled["merge_cycled"]: + # Merge cycling — don't burn eval budget, clear flag + conn.execute("UPDATE prs SET merge_cycled = 0 WHERE number = ?", (pr_number,)) + logger.info("PR #%d: merge-cycled re-eval, not incrementing eval_attempts", pr_number) + else: + conn.execute( + "UPDATE prs SET eval_attempts = COALESCE(eval_attempts, 0) + 1 WHERE number = ?", + (pr_number,), + ) + eval_attempts += 1 + + # Fetch diff + diff = await get_pr_diff(pr_number) + if not diff: + # Close PRs with no diff — stale branch, nothing to evaluate + conn.execute("UPDATE prs SET status='closed', last_error='closed: no diff against main (stale branch)' WHERE number = ?", (pr_number,)) + return {"pr": pr_number, "skipped": True, "reason": "no_diff_closed"} + + # Musings bypass + if _is_musings_only(diff): + logger.info("PR #%d is musings-only — auto-approving", pr_number) + await forgejo_api( + "POST", + repo_path(f"issues/{pr_number}/comments"), + {"body": "Auto-approved: musings bypass eval per collective policy."}, + ) + conn.execute( + """UPDATE prs SET status = 'approved', leo_verdict = 'skipped', + domain_verdict = 'skipped' WHERE number = ?""", + (pr_number,), + ) + return {"pr": pr_number, "auto_approved": True, "reason": "musings_only"} + + # NOTE: Tier 0.5 mechanical checks now run in validate stage (before eval). + # tier0_pass=1 guarantees all mechanical checks passed. No Tier 0.5 here. + + # Filter diff + review_diff, _entity_diff = _filter_diff(diff) + if not review_diff: + review_diff = diff + files = _extract_changed_files(diff) + + # Detect domain + domain = detect_domain_from_diff(diff) + agent = agent_for_domain(domain) + + # Default NULL domain to 'general' (archive-only PRs have no domain files) + if domain is None: + domain = "general" + + # Update PR domain if not set + conn.execute( + "UPDATE prs SET domain = COALESCE(domain, ?), domain_agent = ? WHERE number = ?", + (domain, agent, pr_number), + ) + + # Step 1: Triage (if not already triaged) + # Try deterministic routing first ($0), fall back to Haiku triage ($0.001) + if tier is None: + tier = _deterministic_tier(diff) + if tier is not None: + db.audit( + conn, "evaluate", "deterministic_tier", + json.dumps({"pr": pr_number, "tier": tier}), + ) + else: + tier, triage_usage = await triage_pr(diff) + # Record triage cost + from . import costs + costs.record_usage( + conn, config.TRIAGE_MODEL, "eval_triage", + input_tokens=triage_usage.get("prompt_tokens", 0), + output_tokens=triage_usage.get("completion_tokens", 0), + backend="openrouter", + ) + + # Tier overrides (claim-shape detector + random promotion) + # Order matters: claim-shape catches obvious cases, random promotion catches the rest. 
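+    # (Hypothetical example: a LIGHT-triaged PR whose diff adds "type: claim"
+    # frontmatter is upgraded by the detector; a plain entity update only faces
+    # the 15% random promotion check.)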
+ + # Claim-shape detector: type: claim in YAML → STANDARD minimum (Theseus) + if tier == "LIGHT" and _diff_contains_claim_type(diff): + tier = "STANDARD" + logger.info("PR #%d: claim-shape detector upgraded LIGHT → STANDARD (type: claim found)", pr_number) + db.audit( + conn, "evaluate", "claim_shape_upgrade", json.dumps({"pr": pr_number, "from": "LIGHT", "to": "STANDARD"}) + ) + + # Random pre-merge promotion: 15% of LIGHT → STANDARD (Rio) + if tier == "LIGHT" and random.random() < config.LIGHT_PROMOTION_RATE: + tier = "STANDARD" + logger.info( + "PR #%d: random promotion LIGHT → STANDARD (%.0f%% rate)", pr_number, config.LIGHT_PROMOTION_RATE * 100 + ) + db.audit(conn, "evaluate", "random_promotion", json.dumps({"pr": pr_number, "from": "LIGHT", "to": "STANDARD"})) + + conn.execute("UPDATE prs SET tier = ? WHERE number = ?", (tier, pr_number)) + + # Update last_attempt timestamp (status already set to 'reviewing' by atomic claim above) + conn.execute( + "UPDATE prs SET last_attempt = datetime('now') WHERE number = ?", + (pr_number,), + ) + + # Check if domain review already completed (resuming after Leo rate limit) + existing = conn.execute("SELECT domain_verdict, leo_verdict FROM prs WHERE number = ?", (pr_number,)).fetchone() + existing_domain_verdict = existing["domain_verdict"] if existing else "pending" + _existing_leo_verdict = existing["leo_verdict"] if existing else "pending" + + # Step 2: Domain review (GPT-4o via OpenRouter) + # LIGHT tier: skip entirely when LIGHT_SKIP_LLM enabled (Rhea: config flag rollback) + # Skip if already completed from a previous attempt + domain_review = None # Initialize — used later for feedback extraction (Ganymede #12) + domain_usage = {"prompt_tokens": 0, "completion_tokens": 0} + leo_usage = {"prompt_tokens": 0, "completion_tokens": 0} + if tier == "LIGHT" and config.LIGHT_SKIP_LLM: + domain_verdict = "skipped" + logger.info("PR #%d: LIGHT tier — skipping domain review (LIGHT_SKIP_LLM=True)", pr_number) + conn.execute( + "UPDATE prs SET domain_verdict = 'skipped', domain_model = 'none' WHERE number = ?", + (pr_number,), + ) + elif existing_domain_verdict not in ("pending", None): + domain_verdict = existing_domain_verdict + logger.info("PR #%d: domain review already done (%s), skipping to Leo", pr_number, domain_verdict) + else: + logger.info("PR #%d: domain review (%s/%s, tier=%s)", pr_number, agent, domain, tier) + domain_review, domain_usage = await run_domain_review(review_diff, files, domain or "general", agent) + + if domain_review is None: + # OpenRouter failure (timeout, error) — revert to open for retry. + # NOT a rate limit — don't trigger 15-min backoff, just skip this PR. + conn.execute("UPDATE prs SET status = 'open' WHERE number = ?", (pr_number,)) + return {"pr": pr_number, "skipped": True, "reason": "openrouter_failed"} + + domain_verdict = _parse_verdict(domain_review, agent) + conn.execute( + "UPDATE prs SET domain_verdict = ?, domain_model = ? 
WHERE number = ?", + (domain_verdict, config.EVAL_DOMAIN_MODEL, pr_number), + ) + + # Post domain review as comment (from agent's Forgejo account) + agent_tok = get_agent_token(agent) + await forgejo_api( + "POST", + repo_path(f"issues/{pr_number}/comments"), + {"body": domain_review}, + token=agent_tok, + ) + + # If domain review rejects, skip Leo review (save Opus) + if domain_verdict == "request_changes": + logger.info("PR #%d: domain rejected, skipping Leo review", pr_number) + domain_issues = _parse_issues(domain_review) if domain_review else [] + conn.execute( + """UPDATE prs SET status = 'open', leo_verdict = 'skipped', + last_error = 'domain review requested changes', + eval_issues = ? + WHERE number = ?""", + (json.dumps(domain_issues), pr_number), + ) + db.audit( + conn, "evaluate", "domain_rejected", json.dumps({"pr": pr_number, "agent": agent, "issues": domain_issues}) + ) + + # Record structured review outcome + claim_files = [f for f in files if any(f.startswith(d) for d in ("domains/", "core/", "foundations/", "decisions/"))] + db.record_review( + conn, pr_number, reviewer=agent, outcome="rejected", + domain=domain, agent=agent, reviewer_model=config.EVAL_DOMAIN_MODEL, + rejection_reason=None, # TODO: parse from domain_issues when Leo starts tagging + notes=json.dumps(domain_issues) if domain_issues else None, + claims_in_batch=max(len(claim_files), 1), + ) + + # Disposition: check if this PR should be terminated or kept open + await _dispose_rejected_pr(conn, pr_number, eval_attempts, domain_issues) + + return { + "pr": pr_number, + "domain_verdict": domain_verdict, + "leo_verdict": "skipped", + "eval_attempts": eval_attempts, + } + + # Step 3: Leo review (Opus — only if domain passes, skipped for LIGHT) + leo_verdict = "skipped" + leo_review = None # Initialize — used later for issue extraction + if tier != "LIGHT": + logger.info("PR #%d: Leo review (tier=%s)", pr_number, tier) + leo_review, leo_usage = await run_leo_review(review_diff, files, tier) + + if leo_review is None: + # DEEP: Opus rate limited (queue for later). STANDARD: OpenRouter failed (skip, retry next cycle). + conn.execute("UPDATE prs SET status = 'open' WHERE number = ?", (pr_number,)) + reason = "opus_rate_limited" if tier == "DEEP" else "openrouter_failed" + return {"pr": pr_number, "skipped": True, "reason": reason} + + leo_verdict = _parse_verdict(leo_review, "LEO") + conn.execute("UPDATE prs SET leo_verdict = ? 
WHERE number = ?", (leo_verdict, pr_number)) + + # Post Leo review as comment (from Leo's Forgejo account) + leo_tok = get_agent_token("Leo") + await forgejo_api( + "POST", + repo_path(f"issues/{pr_number}/comments"), + {"body": leo_review}, + token=leo_tok, + ) + else: + # LIGHT tier: Leo is auto-skipped, domain verdict is the only gate + conn.execute("UPDATE prs SET leo_verdict = 'skipped' WHERE number = ?", (pr_number,)) + + # Step 4: Determine final verdict + # "skipped" counts as approve (LIGHT skips both reviews deliberately) + both_approve = leo_verdict in ("approve", "skipped") and domain_verdict in ("approve", "skipped") + + if both_approve: + # Get PR author for formal approvals + pr_info = await forgejo_api( + "GET", + repo_path(f"pulls/{pr_number}"), + ) + pr_author = pr_info.get("user", {}).get("login", "") if pr_info else "" + + # Submit formal Forgejo reviews (required for merge) + await _post_formal_approvals(pr_number, pr_author) + + conn.execute( + "UPDATE prs SET status = 'approved' WHERE number = ?", + (pr_number,), + ) + db.audit( + conn, + "evaluate", + "approved", + json.dumps({"pr": pr_number, "tier": tier, "domain": domain, "leo": leo_verdict, "domain_agent": agent}), + ) + logger.info("PR #%d: APPROVED (tier=%s, leo=%s, domain=%s)", pr_number, tier, leo_verdict, domain_verdict) + + # Record structured review outcome + claim_files = [f for f in files if any(f.startswith(d) for d in ("domains/", "core/", "foundations/", "decisions/"))] + db.record_review( + conn, pr_number, reviewer="leo", outcome="approved", + domain=domain, agent=agent, + reviewer_model=config.MODEL_SONNET if tier == "STANDARD" else "opus", + claims_in_batch=max(len(claim_files), 1), + ) + else: + # Collect all issue tags from both reviews + all_issues = [] + if domain_verdict == "request_changes" and domain_review is not None: + all_issues.extend(_parse_issues(domain_review)) + if leo_verdict == "request_changes" and leo_review is not None: + all_issues.extend(_parse_issues(leo_review)) + + conn.execute( + "UPDATE prs SET status = 'open', eval_issues = ? WHERE number = ?", + (json.dumps(all_issues), pr_number), + ) + # Store feedback for re-extraction path + feedback = {"leo": leo_verdict, "domain": domain_verdict, "tier": tier, "issues": all_issues} + conn.execute( + "UPDATE sources SET feedback = ? WHERE path = (SELECT source_path FROM prs WHERE number = ?)", + (json.dumps(feedback), pr_number), + ) + db.audit( + conn, + "evaluate", + "changes_requested", + json.dumps( + {"pr": pr_number, "tier": tier, "leo": leo_verdict, "domain": domain_verdict, "issues": all_issues} + ), + ) + + # Record structured review outcome for Leo rejection + claim_files = [f for f in files if any(f.startswith(d) for d in ("domains/", "core/", "foundations/", "decisions/"))] + reviewer = "leo" if leo_verdict == "request_changes" else agent + db.record_review( + conn, pr_number, reviewer=reviewer, outcome="rejected", + domain=domain, agent=agent, + reviewer_model=config.MODEL_SONNET if tier == "STANDARD" else "opus", + notes=json.dumps(all_issues) if all_issues else None, + claims_in_batch=max(len(claim_files), 1), + ) + logger.info( + "PR #%d: CHANGES REQUESTED (leo=%s, domain=%s, issues=%s)", + pr_number, + leo_verdict, + domain_verdict, + all_issues, + ) + + # Disposition: check if this PR should be terminated or kept open + await _dispose_rejected_pr(conn, pr_number, eval_attempts, all_issues) + + # Record cost (only for reviews that actually ran) + from . 
import costs + + if domain_verdict != "skipped": + costs.record_usage( + conn, config.EVAL_DOMAIN_MODEL, "eval_domain", + input_tokens=domain_usage.get("prompt_tokens", 0), + output_tokens=domain_usage.get("completion_tokens", 0), + backend="openrouter", + ) + if leo_verdict not in ("skipped",): + if tier == "DEEP": + costs.record_usage( + conn, config.EVAL_LEO_MODEL, "eval_leo", + input_tokens=leo_usage.get("prompt_tokens", 0), + output_tokens=leo_usage.get("completion_tokens", 0), + backend="max", + duration_ms=leo_usage.get("duration_ms", 0), + cache_read_tokens=leo_usage.get("cache_read_tokens", 0), + cache_write_tokens=leo_usage.get("cache_write_tokens", 0), + cost_estimate_usd=leo_usage.get("cost_estimate_usd", 0.0), + ) + else: + costs.record_usage( + conn, config.EVAL_LEO_STANDARD_MODEL, "eval_leo", + input_tokens=leo_usage.get("prompt_tokens", 0), + output_tokens=leo_usage.get("completion_tokens", 0), + backend="openrouter", + ) + + return { + "pr": pr_number, + "tier": tier, + "domain": domain, + "leo_verdict": leo_verdict, + "domain_verdict": domain_verdict, + "approved": both_approve, + } + + +# ─── Rate limit backoff ─────────────────────────────────────────────────── + +# When rate limited, don't retry for 15 minutes. Prevents ~2700 wasted +# CLI calls overnight when Opus is exhausted. +_rate_limit_backoff_until: datetime | None = None +_RATE_LIMIT_BACKOFF_MINUTES = 15 + + +# ─── Batch domain review ───────────────────────────────────────────────── + + +def _parse_batch_response(response: str, pr_numbers: list[int], agent: str) -> dict[int, str]: + """Parse batched domain review into per-PR review sections. + + Returns {pr_number: review_text} for each PR found in the response. + Missing PRs are omitted — caller handles fallback. + """ + agent_upper = agent.upper() + result: dict[int, str] = {} + + # Split by PR verdict markers: + # Each marker terminates the previous PR's section + pattern = re.compile( + r"" + ) + + matches = list(pattern.finditer(response)) + if not matches: + return result + + for i, match in enumerate(matches): + pr_num = int(match.group(1)) + verdict = match.group(2) + marker_end = match.end() + + # Find the start of this PR's section by looking for the section header + # or the end of the previous verdict + section_header = f"=== PR #{pr_num}" + header_pos = response.rfind(section_header, 0, match.start()) + + if header_pos >= 0: + # Extract from header to end of verdict marker + section_text = response[header_pos:marker_end].strip() + else: + # No header found — extract from previous marker end to this marker end + prev_end = matches[i - 1].end() if i > 0 else 0 + section_text = response[prev_end:marker_end].strip() + + # Re-format as individual review comment + # Strip the batch section header, keep just the review content + # Add batch label for traceability + pr_nums_str = ", ".join(f"#{n}" for n in pr_numbers) + review_text = ( + f"*(batch review with PRs {pr_nums_str})*\n\n" + f"{section_text}\n" + ) + result[pr_num] = review_text + + return result + + +def _validate_batch_fanout( + parsed: dict[int, str], + pr_diffs: list[dict], + agent: str, +) -> tuple[dict[int, str], list[int]]: + """Validate batch fan-out for completeness and cross-contamination. + + Returns (valid_reviews, fallback_pr_numbers). 
+ - valid_reviews: reviews that passed validation + - fallback_pr_numbers: PRs that need individual review (missing or cross-contaminated) + """ + valid: dict[int, str] = {} + fallback: list[int] = [] + + # Build file map: pr_number → set of path segments for matching. + # Use full paths (e.g., "domains/internet-finance/dao.md") not bare filenames + # to avoid false matches on short names like "dao.md" or "space.md" (Leo note #3). + pr_files: dict[int, set[str]] = {} + for pr in pr_diffs: + files = set() + for line in pr["diff"].split("\n"): + if line.startswith("diff --git a/"): + path = line.replace("diff --git a/", "").split(" b/")[0] + files.add(path) + # Also add the last 2 path segments (e.g., "internet-finance/dao.md") + # for models that abbreviate paths + parts = path.split("/") + if len(parts) >= 2: + files.add("/".join(parts[-2:])) + pr_files[pr["number"]] = files + + for pr in pr_diffs: + pr_num = pr["number"] + + # Completeness check: is there a review for this PR? + if pr_num not in parsed: + logger.warning("Batch fan-out: PR #%d missing from response — fallback to individual", pr_num) + fallback.append(pr_num) + continue + + review = parsed[pr_num] + + # Cross-contamination check: does review mention at least one file from this PR? + # Use path segments (min 10 chars) to avoid false substring matches on short names. + my_files = pr_files.get(pr_num, set()) + mentions_own_file = any(f in review for f in my_files if len(f) >= 10) + + if not mentions_own_file and my_files: + # Check if it references files from OTHER PRs (cross-contamination signal) + other_files = set() + for other_pr in pr_diffs: + if other_pr["number"] != pr_num: + other_files.update(pr_files.get(other_pr["number"], set())) + mentions_other = any(f in review for f in other_files if len(f) >= 10) + + if mentions_other: + logger.warning( + "Batch fan-out: PR #%d review references files from another PR — cross-contamination, fallback", + pr_num, + ) + fallback.append(pr_num) + continue + # If it doesn't mention any files at all, could be a generic review — accept it + # (some PRs have short diffs where the model doesn't reference filenames) + + valid[pr_num] = review + + return valid, fallback + + +async def _run_batch_domain_eval( + conn, batch_prs: list[dict], domain: str, agent: str, +) -> tuple[int, int]: + """Execute batch domain review for a group of same-domain STANDARD PRs. + + 1. Claim all PRs atomically + 2. Run single batch domain review + 3. Parse + validate fan-out + 4. Post per-PR comments + 5. Continue to individual Leo review for each + 6. Fall back to individual review for any validation failures + + Returns (succeeded, failed). + """ + from .forgejo import get_pr_diff as _get_pr_diff + + succeeded = 0 + failed = 0 + + # Step 1: Fetch diffs and build batch + pr_diffs = [] + claimed_prs = [] + for pr_row in batch_prs: + pr_num = pr_row["number"] + + # Atomic claim + cursor = conn.execute( + "UPDATE prs SET status = 'reviewing' WHERE number = ? 
AND status = 'open'", + (pr_num,), + ) + if cursor.rowcount == 0: + continue + + # Increment eval_attempts — skip if merge-cycled (Ganymede+Rhea) + mc_row = conn.execute("SELECT merge_cycled FROM prs WHERE number = ?", (pr_num,)).fetchone() + if mc_row and mc_row["merge_cycled"]: + conn.execute( + "UPDATE prs SET merge_cycled = 0, last_attempt = datetime('now') WHERE number = ?", + (pr_num,), + ) + logger.info("PR #%d: merge-cycled re-eval, not incrementing eval_attempts", pr_num) + else: + conn.execute( + "UPDATE prs SET eval_attempts = COALESCE(eval_attempts, 0) + 1, " + "last_attempt = datetime('now') WHERE number = ?", + (pr_num,), + ) + + diff = await _get_pr_diff(pr_num) + if not diff: + conn.execute("UPDATE prs SET status = 'open' WHERE number = ?", (pr_num,)) + continue + + # Musings bypass + if _is_musings_only(diff): + await forgejo_api( + "POST", + repo_path(f"issues/{pr_num}/comments"), + {"body": "Auto-approved: musings bypass eval per collective policy."}, + ) + conn.execute( + "UPDATE prs SET status = 'approved', leo_verdict = 'skipped', " + "domain_verdict = 'skipped' WHERE number = ?", + (pr_num,), + ) + succeeded += 1 + continue + + review_diff, _ = _filter_diff(diff) + if not review_diff: + review_diff = diff + files = _extract_changed_files(diff) + + # Build label from branch name or first claim filename + branch = pr_row.get("branch", "") + label = branch.split("/")[-1][:60] if branch else f"pr-{pr_num}" + + pr_diffs.append({ + "number": pr_num, + "label": label, + "diff": review_diff, + "files": files, + "full_diff": diff, # kept for Leo review + "file_count": len([l for l in files.split("\n") if l.strip()]), + }) + claimed_prs.append(pr_num) + + if not pr_diffs: + return 0, 0 + + # Enforce BATCH_EVAL_MAX_DIFF_BYTES — split if total diff is too large. + # We only know diff sizes after fetching, so enforce here not in _build_domain_batches. + total_bytes = sum(len(p["diff"].encode()) for p in pr_diffs) + if total_bytes > config.BATCH_EVAL_MAX_DIFF_BYTES and len(pr_diffs) > 1: + # Keep PRs up to the byte cap, revert the rest to open for next cycle + kept = [] + running_bytes = 0 + for p in pr_diffs: + p_bytes = len(p["diff"].encode()) + if running_bytes + p_bytes > config.BATCH_EVAL_MAX_DIFF_BYTES and kept: + break + kept.append(p) + running_bytes += p_bytes + overflow = [p for p in pr_diffs if p not in kept] + for p in overflow: + conn.execute( + "UPDATE prs SET status = 'open', eval_attempts = COALESCE(eval_attempts, 1) - 1 " + "WHERE number = ?", + (p["number"],), + ) + claimed_prs.remove(p["number"]) + logger.info( + "PR #%d: diff too large for batch (%d bytes total), deferring to next cycle", + p["number"], total_bytes, + ) + pr_diffs = kept + + if not pr_diffs: + return 0, 0 + + # Detect domain for all PRs (should be same domain) + conn.execute( + "UPDATE prs SET domain = COALESCE(domain, ?), domain_agent = ? WHERE number IN ({})".format( + ",".join("?" 
* len(claimed_prs)) + ), + [domain, agent] + claimed_prs, + ) + + # Step 2: Run batch domain review + logger.info( + "Batch domain review: %d PRs in %s domain (PRs: %s)", + len(pr_diffs), + domain, + ", ".join(f"#{p['number']}" for p in pr_diffs), + ) + batch_response, batch_domain_usage = await run_batch_domain_review(pr_diffs, domain, agent) + + if batch_response is None: + # Complete failure — revert all to open + logger.warning("Batch domain review failed — reverting all PRs to open") + for pr_num in claimed_prs: + conn.execute("UPDATE prs SET status = 'open' WHERE number = ?", (pr_num,)) + return 0, len(claimed_prs) + + # Step 3: Parse + validate fan-out + parsed = _parse_batch_response(batch_response, claimed_prs, agent) + valid_reviews, fallback_prs = _validate_batch_fanout(parsed, pr_diffs, agent) + + db.audit( + conn, "evaluate", "batch_domain_review", + json.dumps({ + "domain": domain, + "batch_size": len(pr_diffs), + "valid": len(valid_reviews), + "fallback": fallback_prs, + }), + ) + + # Record batch domain review cost ONCE for the whole batch (not per-PR) + from . import costs + costs.record_usage( + conn, config.EVAL_DOMAIN_MODEL, "eval_domain", + input_tokens=batch_domain_usage.get("prompt_tokens", 0), + output_tokens=batch_domain_usage.get("completion_tokens", 0), + backend="openrouter", + ) + + # Step 4: Process valid reviews — post comments + continue to Leo + for pr_data in pr_diffs: + pr_num = pr_data["number"] + + if pr_num in fallback_prs: + # Revert — will be picked up by individual eval next cycle + conn.execute( + "UPDATE prs SET status = 'open', eval_attempts = COALESCE(eval_attempts, 1) - 1 " + "WHERE number = ?", + (pr_num,), + ) + logger.info("PR #%d: batch fallback — will retry individually", pr_num) + continue + + if pr_num not in valid_reviews: + # Should not happen, but safety + conn.execute("UPDATE prs SET status = 'open' WHERE number = ?", (pr_num,)) + continue + + review_text = valid_reviews[pr_num] + domain_verdict = _parse_verdict(review_text, agent) + + # Post domain review comment + agent_tok = get_agent_token(agent) + await forgejo_api( + "POST", + repo_path(f"issues/{pr_num}/comments"), + {"body": review_text}, + token=agent_tok, + ) + + conn.execute( + "UPDATE prs SET domain_verdict = ?, domain_model = ? WHERE number = ?", + (domain_verdict, config.EVAL_DOMAIN_MODEL, pr_num), + ) + + # If domain rejects, handle disposition (same as individual path) + if domain_verdict == "request_changes": + domain_issues = _parse_issues(review_text) + eval_attempts = (conn.execute( + "SELECT eval_attempts FROM prs WHERE number = ?", (pr_num,) + ).fetchone()["eval_attempts"] or 0) + + conn.execute( + "UPDATE prs SET status = 'open', leo_verdict = 'skipped', " + "last_error = 'domain review requested changes', eval_issues = ? 
WHERE number = ?", + (json.dumps(domain_issues), pr_num), + ) + db.audit( + conn, "evaluate", "domain_rejected", + json.dumps({"pr": pr_num, "agent": agent, "issues": domain_issues, "batch": True}), + ) + await _dispose_rejected_pr(conn, pr_num, eval_attempts, domain_issues) + succeeded += 1 + continue + + # Domain approved — continue to individual Leo review + logger.info("PR #%d: batch domain approved, proceeding to individual Leo review", pr_num) + + review_diff = pr_data["diff"] + files = pr_data["files"] + + leo_review, leo_usage = await run_leo_review(review_diff, files, "STANDARD") + + if leo_review is None: + conn.execute("UPDATE prs SET status = 'open' WHERE number = ?", (pr_num,)) + logger.debug("PR #%d: Leo review failed, will retry next cycle", pr_num) + continue + + if leo_review == "RATE_LIMITED": + conn.execute("UPDATE prs SET status = 'open' WHERE number = ?", (pr_num,)) + logger.info("PR #%d: Leo rate limited, will retry next cycle", pr_num) + continue + + leo_verdict = _parse_verdict(leo_review, "LEO") + conn.execute("UPDATE prs SET leo_verdict = ? WHERE number = ?", (leo_verdict, pr_num)) + + # Post Leo review + leo_tok = get_agent_token("Leo") + await forgejo_api( + "POST", + repo_path(f"issues/{pr_num}/comments"), + {"body": leo_review}, + token=leo_tok, + ) + + costs.record_usage( + conn, config.EVAL_LEO_STANDARD_MODEL, "eval_leo", + input_tokens=leo_usage.get("prompt_tokens", 0), + output_tokens=leo_usage.get("completion_tokens", 0), + backend="openrouter", + ) + + # Final verdict + both_approve = leo_verdict in ("approve", "skipped") and domain_verdict in ("approve", "skipped") + + if both_approve: + pr_info = await forgejo_api("GET", repo_path(f"pulls/{pr_num}")) + pr_author = pr_info.get("user", {}).get("login", "") if pr_info else "" + await _post_formal_approvals(pr_num, pr_author) + conn.execute("UPDATE prs SET status = 'approved' WHERE number = ?", (pr_num,)) + db.audit( + conn, "evaluate", "approved", + json.dumps({"pr": pr_num, "tier": "STANDARD", "domain": domain, + "leo": leo_verdict, "domain_agent": agent, "batch": True}), + ) + logger.info("PR #%d: APPROVED (batch domain + individual Leo)", pr_num) + else: + all_issues = [] + if leo_verdict == "request_changes": + all_issues.extend(_parse_issues(leo_review)) + conn.execute( + "UPDATE prs SET status = 'open', eval_issues = ? WHERE number = ?", + (json.dumps(all_issues), pr_num), + ) + feedback = {"leo": leo_verdict, "domain": domain_verdict, + "tier": "STANDARD", "issues": all_issues} + conn.execute( + "UPDATE sources SET feedback = ? WHERE path = (SELECT source_path FROM prs WHERE number = ?)", + (json.dumps(feedback), pr_num), + ) + db.audit( + conn, "evaluate", "changes_requested", + json.dumps({"pr": pr_num, "tier": "STANDARD", "leo": leo_verdict, + "domain": domain_verdict, "issues": all_issues, "batch": True}), + ) + eval_attempts = (conn.execute( + "SELECT eval_attempts FROM prs WHERE number = ?", (pr_num,) + ).fetchone()["eval_attempts"] or 0) + await _dispose_rejected_pr(conn, pr_num, eval_attempts, all_issues) + + succeeded += 1 + + return succeeded, failed + + +def _build_domain_batches( + rows: list, conn, +) -> tuple[dict[str, list[dict]], list[dict]]: + """Group STANDARD PRs by domain for batch eval. DEEP and LIGHT stay individual. + + Returns (batches_by_domain, individual_prs). + Respects BATCH_EVAL_MAX_PRS and BATCH_EVAL_MAX_DIFF_BYTES. 
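+
+    Example (hypothetical, with BATCH_EVAL_MAX_PRS = 5): six STANDARD
+    internet-finance PRs yield one batch of five plus one individual PR;
+    a lone living-agents PR stays individual.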
+ """ + domain_candidates: dict[str, list[dict]] = {} + individual: list[dict] = [] + + for row in rows: + pr_num = row["number"] + tier = row["tier"] + + # Only batch STANDARD PRs with pending domain review + if tier != "STANDARD": + individual.append(row) + continue + + # Check if domain review already done (resuming after Leo rate limit) + existing = conn.execute( + "SELECT domain_verdict, domain FROM prs WHERE number = ?", (pr_num,) + ).fetchone() + if existing and existing["domain_verdict"] not in ("pending", None): + individual.append(row) + continue + + domain = existing["domain"] if existing and existing["domain"] else "general" + domain_candidates.setdefault(domain, []).append(row) + + # Build sized batches per domain + batches: dict[str, list[dict]] = {} + for domain, prs in domain_candidates.items(): + if len(prs) == 1: + # Single PR — no batching benefit, process individually + individual.extend(prs) + continue + # Cap at BATCH_EVAL_MAX_PRS + batch = prs[: config.BATCH_EVAL_MAX_PRS] + batches[domain] = batch + # Overflow goes individual + individual.extend(prs[config.BATCH_EVAL_MAX_PRS :]) + + return batches, individual + + +# ─── Main entry point ────────────────────────────────────────────────────── + + +async def evaluate_cycle(conn, max_workers=None) -> tuple[int, int]: + """Run one evaluation cycle. + + Groups eligible STANDARD PRs by domain for batch domain review. + DEEP PRs get individual eval. LIGHT PRs get auto-approved. + Leo review always individual (safety net for batch cross-contamination). + """ + global _rate_limit_backoff_until + + # Check if we're in Opus rate-limit backoff + opus_backoff = False + if _rate_limit_backoff_until is not None: + now = datetime.now(timezone.utc) + if now < _rate_limit_backoff_until: + remaining = int((_rate_limit_backoff_until - now).total_seconds()) + logger.debug("Opus rate limit backoff: %d seconds remaining — triage + domain review continue", remaining) + opus_backoff = True + else: + logger.info("Rate limit backoff expired, resuming full eval cycles") + _rate_limit_backoff_until = None + + # Find PRs ready for evaluation + if opus_backoff: + verdict_filter = "AND (p.domain_verdict = 'pending' OR (p.leo_verdict = 'pending' AND p.tier != 'DEEP'))" + else: + verdict_filter = "AND (p.leo_verdict = 'pending' OR p.domain_verdict = 'pending')" + + # Stagger removed — migration protection no longer needed. Merge is domain-serialized + # and entity conflicts auto-resolve. Safe to let all eligible PRs enter eval. 
(Cory, Mar 14) + + rows = conn.execute( + f"""SELECT p.number, p.tier, p.branch, p.domain FROM prs p + LEFT JOIN sources s ON p.source_path = s.path + WHERE p.status = 'open' + AND p.tier0_pass = 1 + AND COALESCE(p.eval_attempts, 0) < {config.MAX_EVAL_ATTEMPTS} + {verdict_filter} + AND (p.last_attempt IS NULL + OR p.last_attempt < datetime('now', '-10 minutes')) + ORDER BY + CASE WHEN COALESCE(p.eval_attempts, 0) = 0 THEN 0 ELSE 1 END, + CASE COALESCE(p.priority, s.priority, 'medium') + WHEN 'critical' THEN 0 + WHEN 'high' THEN 1 + WHEN 'medium' THEN 2 + WHEN 'low' THEN 3 + ELSE 4 + END, + p.created_at ASC + LIMIT ?""", + (max_workers or config.MAX_EVAL_WORKERS,), + ).fetchall() + + if not rows: + return 0, 0 + + succeeded = 0 + failed = 0 + + # Group STANDARD PRs by domain for batch eval + domain_batches, individual_prs = _build_domain_batches(rows, conn) + + # Process batch domain reviews first + for domain, batch_prs in domain_batches.items(): + try: + agent = agent_for_domain(domain) + b_succeeded, b_failed = await _run_batch_domain_eval( + conn, batch_prs, domain, agent, + ) + succeeded += b_succeeded + failed += b_failed + except Exception: + logger.exception("Batch eval failed for domain %s", domain) + # Revert all to open + for pr_row in batch_prs: + conn.execute("UPDATE prs SET status = 'open' WHERE number = ?", (pr_row["number"],)) + failed += len(batch_prs) + + # Process individual PRs (DEEP, LIGHT, single-domain, fallback) + for row in individual_prs: + try: + if opus_backoff and row["tier"] == "DEEP": + existing = conn.execute( + "SELECT domain_verdict FROM prs WHERE number = ?", + (row["number"],), + ).fetchone() + if existing and existing["domain_verdict"] not in ("pending", None): + logger.debug( + "PR #%d: skipping DEEP during Opus backoff (domain already %s)", + row["number"], + existing["domain_verdict"], + ) + continue + + result = await evaluate_pr(conn, row["number"], tier=row["tier"]) + if result.get("skipped"): + reason = result.get("reason", "") + logger.debug("PR #%d skipped: %s", row["number"], reason) + if "rate_limited" in reason: + from datetime import timedelta + + if reason == "opus_rate_limited": + _rate_limit_backoff_until = datetime.now(timezone.utc) + timedelta( + minutes=_RATE_LIMIT_BACKOFF_MINUTES + ) + opus_backoff = True + logger.info( + "Opus rate limited — backing off Opus for %d min, continuing triage+domain", + _RATE_LIMIT_BACKOFF_MINUTES, + ) + continue + else: + _rate_limit_backoff_until = datetime.now(timezone.utc) + timedelta( + minutes=_RATE_LIMIT_BACKOFF_MINUTES + ) + logger.info( + "Rate limited (%s) — backing off for %d minutes", reason, _RATE_LIMIT_BACKOFF_MINUTES + ) + break + else: + succeeded += 1 + except Exception: + logger.exception("Failed to evaluate PR #%d", row["number"]) + failed += 1 + conn.execute("UPDATE prs SET status = 'open' WHERE number = ?", (row["number"],)) + + if succeeded or failed: + logger.info("Evaluate cycle: %d evaluated, %d errors", succeeded, failed) + + return succeeded, failed diff --git a/ops/pipeline-v2/lib/merge.py b/ops/pipeline-v2/lib/merge.py new file mode 100644 index 000000000..01fa7e013 --- /dev/null +++ b/ops/pipeline-v2/lib/merge.py @@ -0,0 +1,1449 @@ +"""Merge stage — domain-serialized priority queue with rebase-before-merge. + +Design reviewed by Ganymede (round 2) and Rhea. 
Key decisions: +- Two-layer locking: asyncio.Lock per domain (fast path) + prs.status (crash recovery) +- Rebase-before-merge with pinned force-with-lease SHA (Ganymede) +- Priority queue: COALESCE(p.priority, s.priority, 'medium') — PR > source > default +- Human PRs default to 'high', not 'critical' (Ganymede — prevents DoS on pipeline) +- 5-minute merge timeout — force-reset to 'conflict' (Rhea) +- Ack comment on human PR discovery (Rhea) +- Pagination on all Forgejo list endpoints (Ganymede standing rule) +""" + +import asyncio +import json +import logging +import os +import random +import re +import shutil +from collections import defaultdict + +from . import config, db +from .db import classify_branch +from .dedup import dedup_evidence_blocks +from .domains import detect_domain_from_branch +from .cascade import cascade_after_merge +from .forgejo import api as forgejo_api + +# Pipeline-owned branch prefixes — these get auto-merged via cherry-pick. +# Originally restricted to pipeline-only branches because rebase orphaned agent commits. +# Now safe for all branches: cherry-pick creates a fresh branch from main, never +# rewrites the source branch. (Original issue: Leo directive, PRs #2141, #157, #2142, #2180) +PIPELINE_OWNED_PREFIXES = ( + "extract/", "ingestion/", "epimetheus/", "reweave/", "fix/", + "theseus/", "rio/", "astra/", "vida/", "clay/", "leo/", "argus/", "oberon/", +) + +# Import worktree lock — file at /opt/teleo-eval/pipeline/lib/worktree_lock.py +try: + from .worktree_lock import async_main_worktree_lock +except ImportError: + import sys + sys.path.insert(0, os.path.dirname(__file__)) + from worktree_lock import async_main_worktree_lock +from .forgejo import get_agent_token, get_pr_diff, repo_path + +logger = logging.getLogger("pipeline.merge") + +# In-memory domain locks — fast path, lost on crash (durable layer is prs.status) +_domain_locks: dict[str, asyncio.Lock] = defaultdict(asyncio.Lock) + +# Merge timeout: if a PR stays 'merging' longer than this, force-reset (Rhea) +MERGE_TIMEOUT_SECONDS = 300 # 5 minutes + + +# --- Git helpers --- + + +async def _git(*args, cwd: str = None, timeout: int = 60) -> tuple[int, str]: + """Run a git command async. Returns (returncode, stdout+stderr).""" + proc = await asyncio.create_subprocess_exec( + "git", + *args, + cwd=cwd or str(config.REPO_DIR), + stdout=asyncio.subprocess.PIPE, + stderr=asyncio.subprocess.PIPE, + ) + try: + stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout) + except asyncio.TimeoutError: + proc.kill() + await proc.wait() + return -1, f"git {args[0]} timed out after {timeout}s" + output = (stdout or b"").decode().strip() + if stderr: + output += "\n" + stderr.decode().strip() + return proc.returncode, output + + +# --- PR Discovery (Multiplayer v1) --- + + +async def discover_external_prs(conn) -> int: + """Scan Forgejo for open PRs not tracked in SQLite. + + Human PRs (non-pipeline author) get priority 'high' and origin 'human'. + Critical is reserved for explicit human override only. (Ganymede) + + Pagination on all Forgejo list endpoints. 
(Ganymede standing rule #5) + """ + known = {r["number"] for r in conn.execute("SELECT number FROM prs").fetchall()} + discovered = 0 + page = 1 + + while True: + prs = await forgejo_api( + "GET", + repo_path(f"pulls?state=open&limit=50&page={page}"), + ) + if not prs: + break + + for pr in prs: + if pr["number"] not in known: + # Detect origin: pipeline agents have per-agent Forgejo users + pipeline_users = {"teleo", "rio", "clay", "theseus", "vida", "astra", "leo"} + author = pr.get("user", {}).get("login", "") + is_pipeline = author.lower() in pipeline_users + origin = "pipeline" if is_pipeline else "human" + priority = "high" if origin == "human" else None + domain = None if not is_pipeline else detect_domain_from_branch(pr["head"]["ref"]) + agent, commit_type = classify_branch(pr["head"]["ref"]) + + conn.execute( + """INSERT OR IGNORE INTO prs + (number, branch, status, origin, priority, domain, agent, commit_type) + VALUES (?, ?, 'open', ?, ?, ?, ?, ?)""", + (pr["number"], pr["head"]["ref"], origin, priority, domain, agent, commit_type), + ) + db.audit( + conn, + "merge", + "pr_discovered", + json.dumps( + { + "pr": pr["number"], + "origin": origin, + "author": pr.get("user", {}).get("login"), + "priority": priority or "inherited", + } + ), + ) + + # Ack comment on human PRs so contributor feels acknowledged (Rhea) + if origin == "human": + await _post_ack_comment(pr["number"]) + + discovered += 1 + + if len(prs) < 50: + break # Last page + page += 1 + + if discovered: + logger.info("Discovered %d external PRs", discovered) + return discovered + + +async def _post_ack_comment(pr_number: int): + """Post acknowledgment comment on human-submitted PR. (Rhea) + + Contributor should feel acknowledged immediately, not wonder if + their PR disappeared into a void. + """ + body = ( + "Thanks for the contribution! Your PR is queued for evaluation " + "(priority: high). Expected review time: ~5 minutes.\n\n" + "_This is an automated message from the Teleo pipeline._" + ) + await forgejo_api( + "POST", + repo_path(f"issues/{pr_number}/comments"), + {"body": body}, + ) + + +# --- Merge operations --- + + +async def _claim_next_pr(conn, domain: str) -> dict | None: + """Claim the next approved PR for a domain via atomic UPDATE. + + Priority inheritance: COALESCE(p.priority, s.priority, 'medium') + - Explicit PR priority (human PRs) > source priority (pipeline) > default medium + - NULL priorities fall to ELSE 4, which ranks below explicit 'medium' (WHEN 2) + - This is intentional: unclassified PRs don't jump ahead of triaged ones + (Rhea: document the precedence for future maintainers) + + NOT EXISTS enforces domain serialization in SQL — defense-in-depth even if + asyncio.Lock is bypassed. (Ganymede: approved) + """ + # Build prefix filter for pipeline-owned branches only + # Agent branches stay approved but are NOT auto-merged (Leo: PRs #2141, #157, #2142, #2180) + prefix_clauses = " OR ".join("p.branch LIKE ?" for _ in PIPELINE_OWNED_PREFIXES) + prefix_params = [f"{pfx}%" for pfx in PIPELINE_OWNED_PREFIXES] + row = conn.execute( + f"""UPDATE prs SET status = 'merging', last_attempt = datetime('now') + WHERE number = ( + SELECT p.number FROM prs p + LEFT JOIN sources s ON p.source_path = s.path + WHERE p.status = 'approved' + AND p.domain = ? 
+ AND ({prefix_clauses}) + AND NOT EXISTS ( + SELECT 1 FROM prs p2 + WHERE p2.domain = p.domain + AND p2.status = 'merging' + ) + ORDER BY + CASE COALESCE(p.priority, s.priority, 'medium') + WHEN 'critical' THEN 0 + WHEN 'high' THEN 1 + WHEN 'medium' THEN 2 + WHEN 'low' THEN 3 + ELSE 4 + END, + -- Dependency ordering: PRs with fewer broken wiki links merge first. + -- "Creator" PRs (0 broken links) land before "consumer" PRs that + -- reference them, naturally resolving the dependency chain. (Rhea+Ganymede) + CASE WHEN p.eval_issues LIKE '%broken_wiki_links%' THEN 1 ELSE 0 END, + p.created_at ASC + LIMIT 1 + ) + RETURNING number, source_path, branch, domain""", + (domain, *prefix_params), + ).fetchone() + return dict(row) if row else None + + +async def _dedup_enriched_files(worktree_path: str) -> int: + """Scan rebased worktree for duplicate evidence blocks and dedup them. + + Returns count of files fixed. + """ + # Get list of modified claim files in this branch vs origin/main + rc, out = await _git("diff", "--name-only", "origin/main..HEAD", cwd=worktree_path) + if rc != 0: + return 0 + + fixed = 0 + for fpath in out.strip().split("\n"): + fpath = fpath.strip() + if not fpath or not fpath.endswith(".md"): + continue + # Only process claim files (domains/, core/, foundations/) + if not any(fpath.startswith(p) for p in ("domains/", "core/", "foundations/")): + continue + + full_path = os.path.join(worktree_path, fpath) + if not os.path.exists(full_path): + continue + + with open(full_path, "r") as f: + content = f.read() + + deduped = dedup_evidence_blocks(content) + if deduped != content: + with open(full_path, "w") as f: + f.write(deduped) + # Stage the fix + await _git("add", fpath, cwd=worktree_path) + fixed += 1 + + if fixed > 0: + # Amend the last commit to include dedup fixes (no new commit) + await _git( + "-c", "core.editor=true", "commit", "--amend", "--no-edit", + cwd=worktree_path, timeout=30, + ) + logger.info("Deduped evidence blocks in %d file(s) after rebase", fixed) + + return fixed + + +async def _cherry_pick_onto_main(branch: str) -> tuple[bool, str]: + """Cherry-pick extraction commits onto a fresh branch from main. + + Replaces rebase-retry: extraction commits ADD new files, so cherry-pick + applies cleanly ~99% of the time. For enrichments (editing existing files), + cherry-pick reports the exact conflict for human review. + + Leo's manual fix pattern (PRs #2178, #2141, #157, #2142): + 1. git checkout -b clean-branch main + 2. git cherry-pick + 3. 
Merge to main + """ + worktree_path = f"/tmp/teleo-merge-{branch.replace('/', '-')}" + clean_branch = f"_clean/{branch.replace('/', '-')}" + + # Fetch latest state — separate calls to avoid refspec issues with long branch names + rc, out = await _git("fetch", "origin", "main", timeout=15) + if rc != 0: + return False, f"fetch main failed: {out}" + rc, out = await _git("fetch", "origin", branch, timeout=15) + if rc != 0: + return False, f"fetch branch failed: {out}" + + # Check if already up to date + rc, merge_base = await _git("merge-base", "origin/main", f"origin/{branch}") + rc2, main_sha = await _git("rev-parse", "origin/main") + if rc == 0 and rc2 == 0 and merge_base.strip() == main_sha.strip(): + return True, "already up to date" + + # Get extraction commits (oldest first) + rc, commits_out = await _git( + "log", f"origin/main..origin/{branch}", "--format=%H", "--reverse", + timeout=10, + ) + if rc != 0 or not commits_out.strip(): + return False, f"no commits found on {branch}" + + commit_list = [c.strip() for c in commits_out.strip().split("\n") if c.strip()] + + # Create worktree from origin/main (fresh branch) + # Delete stale local branch if it exists from a previous failed attempt + await _git("branch", "-D", clean_branch) + rc, out = await _git("worktree", "add", "-b", clean_branch, worktree_path, "origin/main") + if rc != 0: + return False, f"worktree add failed: {out}" + + try: + # Cherry-pick each extraction commit + dropped_entities: set[str] = set() + picked_count = 0 + for commit_sha in commit_list: + # Detect merge commits — cherry-pick needs -m 1 to pick first-parent diff + rc_parents, parents_out = await _git( + "cat-file", "-p", commit_sha, cwd=worktree_path, timeout=5, + ) + parent_count = parents_out.count("\nparent ") + (1 if parents_out.startswith("parent ") else 0) + is_merge = parent_count >= 2 + + pick_args = ["cherry-pick"] + if is_merge: + pick_args.extend(["-m", "1"]) + logger.info("Cherry-pick %s: merge commit, using -m 1", commit_sha[:8]) + pick_args.append(commit_sha) + + rc, out = await _git(*pick_args, cwd=worktree_path, timeout=60) + if rc != 0 and "empty" in out.lower(): + # Content already on main — skip this commit + await _git("cherry-pick", "--skip", cwd=worktree_path) + logger.info("Cherry-pick %s: empty (already on main), skipping", commit_sha[:8]) + continue + picked_count += 1 + if rc != 0: + # Check if conflict is entity-only (same auto-resolution as before) + rc_ls, conflicting = await _git( + "diff", "--name-only", "--diff-filter=U", cwd=worktree_path + ) + conflict_files = [ + f.strip() for f in conflicting.split("\n") if f.strip() + ] if rc_ls == 0 else [] + + if conflict_files and all(f.startswith("entities/") for f in conflict_files): + # Entity conflicts: take main's version (entities are recoverable) + # In cherry-pick: --ours = branch we're ON (clean branch from origin/main) + # --theirs = commit being cherry-picked (extraction branch) + for cf in conflict_files: + await _git("checkout", "--ours", cf, cwd=worktree_path) + await _git("add", cf, cwd=worktree_path) + dropped_entities.update(conflict_files) + rc_cont, cont_out = await _git( + "-c", "core.editor=true", "cherry-pick", "--continue", + cwd=worktree_path, timeout=60, + ) + if rc_cont != 0: + await _git("cherry-pick", "--abort", cwd=worktree_path) + return False, f"cherry-pick entity resolution failed on {commit_sha[:8]}: {cont_out}" + logger.info( + "Cherry-pick entity conflict auto-resolved: dropped %s (recoverable)", + ", ".join(sorted(conflict_files)), + ) + else: + # Real 
conflict — report exactly what conflicted + conflict_detail = ", ".join(conflict_files) if conflict_files else out[:200] + await _git("cherry-pick", "--abort", cwd=worktree_path) + return False, f"cherry-pick conflict on {commit_sha[:8]}: {conflict_detail}" + + if dropped_entities: + logger.info( + "Cherry-pick auto-resolved entity conflicts in %s", + ", ".join(sorted(dropped_entities)), + ) + + # All commits were empty — content already on main + if picked_count == 0: + return True, "already merged (all commits empty)" + + # Post-pick dedup: remove duplicate evidence blocks (Leo: PRs #1751, #1752) + await _dedup_enriched_files(worktree_path) + + # Force-push clean branch as the original branch name + # Capture expected SHA for force-with-lease + rc, expected_sha = await _git("rev-parse", f"origin/{branch}") + if rc != 0: + return False, f"rev-parse origin/{branch} failed: {expected_sha}" + expected_sha = expected_sha.strip().split("\n")[0] + + rc, out = await _git( + "push", + f"--force-with-lease={branch}:{expected_sha}", + "origin", + f"HEAD:{branch}", + cwd=worktree_path, + timeout=30, + ) + if rc != 0: + return False, f"push rejected: {out}" + + return True, "cherry-picked and pushed" + + finally: + # Cleanup worktree and temp branch + await _git("worktree", "remove", "--force", worktree_path) + await _git("branch", "-D", clean_branch) + + +async def _resubmit_approvals(pr_number: int): + """Re-submit 2 formal Forgejo approvals after force-push invalidated them. + + Force-push (rebase) invalidates existing approvals. Branch protection + requires 2 approvals before the merge API will accept the request. + Same pattern as evaluate._post_formal_approvals. + """ + pr_info = await forgejo_api("GET", repo_path(f"pulls/{pr_number}")) + pr_author = pr_info.get("user", {}).get("login", "") if pr_info else "" + + approvals = 0 + for agent_name in ["leo", "vida", "theseus", "clay", "astra", "rio"]: + if agent_name == pr_author: + continue + if approvals >= 2: + break + token = get_agent_token(agent_name) + if token: + result = await forgejo_api( + "POST", + repo_path(f"pulls/{pr_number}/reviews"), + {"body": "Approved (post-rebase re-approval).", "event": "APPROVED"}, + token=token, + ) + if result is not None: + approvals += 1 + logger.debug( + "Post-rebase approval for PR #%d by %s (%d/2)", + pr_number, agent_name, approvals, + ) + + if approvals < 2: + logger.warning( + "Only %d/2 approvals submitted for PR #%d after rebase", + approvals, pr_number, + ) + + +async def _merge_pr(pr_number: int) -> tuple[bool, str]: + """Merge PR via Forgejo API. CURRENTLY UNUSED — local ff-push is the primary merge path. + + Kept as fallback: re-enable if Forgejo fixes the 405 bug (Ganymede's API-first design). + The local ff-push in _merge_domain_queue replaced this due to persistent 405 errors. 
+ """ + # Check if already merged/closed on Forgejo (prevents 405 on re-merge attempts) + pr_info = await forgejo_api("GET", repo_path(f"pulls/{pr_number}")) + if pr_info: + if pr_info.get("merged"): + logger.info("PR #%d already merged on Forgejo, syncing status", pr_number) + return True, "already merged" + if pr_info.get("state") == "closed": + logger.warning("PR #%d closed on Forgejo but not merged", pr_number) + return False, "PR closed without merge" + + # Merge whitelist only allows leo and m3taversal — use Leo's token + leo_token = get_agent_token("leo") + if not leo_token: + return False, "no leo token for merge (merge whitelist requires leo)" + + # Pre-flight: verify approvals exist before attempting merge (Rhea: catches 405) + reviews = await forgejo_api("GET", repo_path(f"pulls/{pr_number}/reviews")) + if reviews is not None: + approval_count = sum(1 for r in reviews if r.get("state") == "APPROVED") + if approval_count < 2: + logger.info("PR #%d: only %d/2 approvals, resubmitting before merge", pr_number, approval_count) + await _resubmit_approvals(pr_number) + + # Retry with backoff + jitter for transient errors (Rhea: jitter prevents thundering herd) + delays = [0, 5, 15, 45] + for attempt, base_delay in enumerate(delays, 1): + if base_delay: + jittered = base_delay * (0.8 + random.random() * 0.4) + await asyncio.sleep(jittered) + + result = await forgejo_api( + "POST", + repo_path(f"pulls/{pr_number}/merge"), + {"Do": "merge", "merge_message_field": ""}, + token=leo_token, + ) + if result is not None: + return True, "merged" + + # Check if merge succeeded despite API error (timeout case — Rhea) + pr_check = await forgejo_api("GET", repo_path(f"pulls/{pr_number}")) + if pr_check and pr_check.get("merged"): + return True, "already merged" + + # Distinguish transient from permanent failures (Ganymede) + if pr_check and not pr_check.get("mergeable", True): + # PR not mergeable — branch diverged or conflict. Rebase needed, not retry. + return False, "merge rejected: PR not mergeable (needs rebase)" + + if attempt < len(delays): + logger.info("PR #%d: merge attempt %d failed (transient), retrying in %.0fs", + pr_number, attempt, delays[attempt] if attempt < len(delays) else 0) + + return False, "Forgejo merge API failed after 4 attempts (transient)" + + +async def _delete_remote_branch(branch: str): + """Delete remote branch immediately after merge. (Ganymede Q4: immediate, not batch) + + If DELETE fails, log and move on — stale branch is cosmetic, + stale merge is operational. + """ + result = await forgejo_api( + "DELETE", + repo_path(f"branches/{branch}"), + ) + if result is None: + logger.warning("Failed to delete remote branch %s — cosmetic, continuing", branch) + + +# --- Contributor attribution --- + + +def _is_knowledge_pr(diff: str) -> bool: + """Check if a PR touches knowledge files (claims, decisions, core, foundations). + + Knowledge PRs get full CI attribution weight. + Pipeline-only PRs (inbox, entities, agents, archive) get zero CI weight. + + Mixed PRs count as knowledge — if a PR adds a claim, it gets attribution + even if it also moves source files. Knowledge takes priority. 
(Ganymede review) + """ + knowledge_prefixes = ("domains/", "core/", "foundations/", "decisions/") + + for line in diff.split("\n"): + if line.startswith("+++ b/") or line.startswith("--- a/"): + path = line.split("/", 1)[1] if "/" in line else "" + if any(path.startswith(p) for p in knowledge_prefixes): + return True + + return False + + +def _refine_commit_type(diff: str, branch_commit_type: str) -> str: + """Refine commit_type from diff content when branch prefix is ambiguous. + + Branch prefix gives initial classification (extract, research, entity, etc.). + For 'extract' branches, diff content can distinguish: + - challenge: adds challenged_by edges to existing claims + - enrich: modifies existing claim frontmatter without new files + - extract: creates new claim files (default for extract branches) + + Only refines 'extract' type — other branch types (research, entity, reweave, fix) + are already specific enough. + """ + if branch_commit_type != "extract": + return branch_commit_type + + new_files = 0 + modified_files = 0 + has_challenge_edge = False + + in_diff_header = False + current_is_new = False + for line in diff.split("\n"): + if line.startswith("diff --git"): + in_diff_header = True + current_is_new = False + elif line.startswith("new file"): + current_is_new = True + elif line.startswith("+++ b/"): + path = line[6:] + if any(path.startswith(p) for p in ("domains/", "core/", "foundations/")): + if current_is_new: + new_files += 1 + else: + modified_files += 1 + in_diff_header = False + elif line.startswith("+") and not line.startswith("+++"): + if "challenged_by:" in line or "challenges:" in line: + has_challenge_edge = True + + if has_challenge_edge and new_files == 0: + return "challenge" + if modified_files > 0 and new_files == 0: + return "enrich" + return "extract" + + +async def _record_contributor_attribution(conn, pr_number: int, branch: str): + """Record contributor attribution after a successful merge. + + Parses git trailers and claim frontmatter to identify contributors + and their roles. Upserts into contributors table. Refines commit_type + from diff content. Pipeline-only PRs (no knowledge files) are skipped. + """ + import re as _re + from datetime import date as _date, datetime as _dt + + today = _date.today().isoformat() + + # Get the PR diff to parse claim frontmatter for attribution blocks + diff = await get_pr_diff(pr_number) + if not diff: + return + + # Pipeline-only PRs (inbox, entities, agents) don't count toward CI + if not _is_knowledge_pr(diff): + logger.info("PR #%d: pipeline-only commit — skipping CI attribution", pr_number) + return + + # Refine commit_type from diff content (branch prefix may be too broad) + row = conn.execute("SELECT commit_type FROM prs WHERE number = ?", (pr_number,)).fetchone() + branch_type = row["commit_type"] if row and row["commit_type"] else "extract" + refined_type = _refine_commit_type(diff, branch_type) + if refined_type != branch_type: + conn.execute("UPDATE prs SET commit_type = ? 
WHERE number = ?", (refined_type, pr_number)) + logger.info("PR #%d: commit_type refined %s → %s", pr_number, branch_type, refined_type) + + # Parse Pentagon-Agent trailer from branch commit messages + agents_found: set[str] = set() + rc, log_output = await _git( + "log", f"origin/main..origin/{branch}", "--format=%b%n%N", + timeout=10, + ) + if rc == 0: + for match in _re.finditer(r"Pentagon-Agent:\s*(\S+)\s*<([^>]+)>", log_output): + agent_name = match.group(1).lower() + agent_uuid = match.group(2) + _upsert_contributor( + conn, agent_name, agent_uuid, "extractor", today, + ) + agents_found.add(agent_name) + + # Parse attribution blocks from claim frontmatter in diff + # Look for added lines with attribution YAML + current_role = None + for line in diff.split("\n"): + if not line.startswith("+") or line.startswith("+++"): + continue + stripped = line[1:].strip() + + # Detect role sections in attribution block + for role in ("sourcer", "extractor", "challenger", "synthesizer", "reviewer"): + if stripped.startswith(f"{role}:"): + current_role = role + break + + # Extract handle from attribution entries + handle_match = _re.match(r'-\s*handle:\s*["\']?([^"\']+)["\']?', stripped) + if handle_match and current_role: + handle = handle_match.group(1).strip().lower() + agent_id_match = _re.search(r'agent_id:\s*["\']?([^"\']+)', stripped) + agent_id = agent_id_match.group(1).strip() if agent_id_match else None + _upsert_contributor(conn, handle, agent_id, current_role, today) + + # Fallback: if no attribution block found, credit the branch agent as extractor + if not agents_found: + # Try to infer agent from branch name (e.g., "extract/2026-03-05-...") + # The PR's agent field in SQLite is also available + row = conn.execute("SELECT agent FROM prs WHERE number = ?", (pr_number,)).fetchone() + if row and row["agent"]: + _upsert_contributor(conn, row["agent"].lower(), None, "extractor", today) + + # Increment claims_merged for all contributors on this PR + # (handled inside _upsert_contributor via the role counts) + + +def _upsert_contributor( + conn, handle: str, agent_id: str | None, role: str, date_str: str, +): + """Upsert a contributor record, incrementing the appropriate role count.""" + import json as _json + from datetime import datetime as _dt + + role_col = f"{role}_count" + if role_col not in ( + "sourcer_count", "extractor_count", "challenger_count", + "synthesizer_count", "reviewer_count", + ): + logger.warning("Unknown contributor role: %s", role) + return + + existing = conn.execute( + "SELECT handle FROM contributors WHERE handle = ?", (handle,) + ).fetchone() + + if existing: + conn.execute( + f"""UPDATE contributors SET + {role_col} = {role_col} + 1, + claims_merged = claims_merged + CASE WHEN ? IN ('extractor', 'sourcer') THEN 1 ELSE 0 END, + last_contribution = ?, + updated_at = datetime('now') + WHERE handle = ?""", + (role, date_str, handle), + ) + else: + conn.execute( + f"""INSERT INTO contributors (handle, agent_id, first_contribution, last_contribution, {role_col}, claims_merged) + VALUES (?, ?, ?, ?, 1, CASE WHEN ? 
IN ('extractor', 'sourcer') THEN 1 ELSE 0 END)""", + (handle, agent_id, date_str, date_str, role), + ) + + # Recalculate tier + _recalculate_tier(conn, handle) + + +def _recalculate_tier(conn, handle: str): + """Recalculate contributor tier based on config rules.""" + from datetime import date as _date, datetime as _dt + + row = conn.execute( + "SELECT claims_merged, challenges_survived, first_contribution, tier FROM contributors WHERE handle = ?", + (handle,), + ).fetchone() + if not row: + return + + current_tier = row["tier"] + claims_merged = row["claims_merged"] or 0 + challenges_survived = row["challenges_survived"] or 0 + first_contribution = row["first_contribution"] + + days_since_first = 0 + if first_contribution: + try: + first_date = _dt.strptime(first_contribution, "%Y-%m-%d").date() + days_since_first = (_date.today() - first_date).days + except ValueError: + pass + + # Check veteran first (higher tier) + vet_rules = config.CONTRIBUTOR_TIER_RULES["veteran"] + if (claims_merged >= vet_rules["claims_merged"] + and days_since_first >= vet_rules["min_days_since_first"] + and challenges_survived >= vet_rules["challenges_survived"]): + new_tier = "veteran" + elif claims_merged >= config.CONTRIBUTOR_TIER_RULES["contributor"]["claims_merged"]: + new_tier = "contributor" + else: + new_tier = "new" + + if new_tier != current_tier: + conn.execute( + "UPDATE contributors SET tier = ?, updated_at = datetime('now') WHERE handle = ?", + (new_tier, handle), + ) + logger.info("Contributor %s: tier %s → %s", handle, current_tier, new_tier) + db.audit( + conn, "contributor", "tier_change", + json.dumps({"handle": handle, "from": current_tier, "to": new_tier}), + ) + + +# --- Source archiving after merge (Ganymede review: closes near-duplicate loop) --- + +# Accumulates source moves during a merge cycle, batch-committed at the end +_pending_source_moves: list[tuple[str, str]] = [] # (queue_path, archive_path) + + +def _update_source_frontmatter_status(path: str, new_status: str): + """Update the status field in a source file's frontmatter. (Ganymede: 5 lines)""" + import re as _re + try: + text = open(path).read() + text = _re.sub(r"^status: .*$", f"status: {new_status}", text, count=1, flags=_re.MULTILINE) + open(path, "w").write(text) + except Exception as e: + logger.warning("Failed to update source status in %s: %s", path, e) + + +async def _embed_merged_claims(main_sha: str, branch_sha: str): + """Embed new/changed claim files from a merged PR into Qdrant. + + Diffs main_sha (pre-merge main HEAD) against branch_sha (merged branch tip) + to find ALL changed files across the entire branch, not just the last commit. + Also deletes Qdrant vectors for files removed by the branch. + + Non-fatal — embedding failure does not block the merge pipeline. 
+ """ + try: + # --- Embed added/changed files --- + rc, diff_out = await _git( + "diff", "--name-only", "--diff-filter=ACMR", + main_sha, branch_sha, + cwd=str(config.MAIN_WORKTREE), + timeout=10, + ) + if rc != 0: + logger.warning("embed: diff failed (rc=%d), skipping", rc) + return + + embed_dirs = {"domains/", "core/", "foundations/", "decisions/", "entities/"} + md_files = [ + f for f in diff_out.strip().split("\n") + if f.endswith(".md") + and any(f.startswith(d) for d in embed_dirs) + and not f.split("/")[-1].startswith("_") + ] + + embedded = 0 + for fpath in md_files: + full_path = config.MAIN_WORKTREE / fpath + if not full_path.exists(): + continue + proc = await asyncio.create_subprocess_exec( + "python3", "/opt/teleo-eval/embed-claims.py", "--file", str(full_path), + stdout=asyncio.subprocess.PIPE, + stderr=asyncio.subprocess.PIPE, + ) + stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=30) + if proc.returncode == 0 and b"OK" in stdout: + embedded += 1 + else: + logger.warning("embed: failed for %s: %s", fpath, stderr.decode()[:200]) + + if embedded: + logger.info("embed: %d/%d files embedded into Qdrant", embedded, len(md_files)) + + # --- Delete vectors for removed files (Ganymede: stale vector cleanup) --- + rc, del_out = await _git( + "diff", "--name-only", "--diff-filter=D", + main_sha, branch_sha, + cwd=str(config.MAIN_WORKTREE), + timeout=10, + ) + if rc == 0 and del_out.strip(): + deleted_files = [ + f for f in del_out.strip().split("\n") + if f.endswith(".md") + and any(f.startswith(d) for d in embed_dirs) + ] + if deleted_files: + import hashlib + point_ids = [hashlib.md5(f.encode()).hexdigest() for f in deleted_files] + try: + import urllib.request + req = urllib.request.Request( + "http://localhost:6333/collections/teleo-claims/points/delete", + data=json.dumps({"points": point_ids}).encode(), + headers={"Content-Type": "application/json"}, + method="POST", + ) + urllib.request.urlopen(req, timeout=10) + logger.info("embed: deleted %d stale vectors from Qdrant", len(point_ids)) + except Exception: + logger.warning("embed: failed to delete stale vectors (non-fatal)") + except Exception: + logger.exception("embed: post-merge embedding failed (non-fatal)") + + +def _archive_source_for_pr(branch: str, domain: str, merged: bool = True): + """Move source from queue/ to archive/{domain}/ after PR merge or close. + + Only handles extract/ branches (Ganymede: skip research sessions). + Updates frontmatter: 'processed' for merged, 'rejected' for closed. + Accumulates moves for batch commit at end of merge cycle. + """ + if not branch.startswith("extract/"): + return + + source_slug = branch.replace("extract/", "", 1) + main_dir = config.MAIN_WORKTREE if hasattr(config, "MAIN_WORKTREE") else "/opt/teleo-eval/workspaces/main" + queue_path = os.path.join(main_dir, "inbox", "queue", f"{source_slug}.md") + archive_dir = os.path.join(main_dir, "inbox", "archive", domain or "unknown") + archive_path = os.path.join(archive_dir, f"{source_slug}.md") + + # Already in archive? 
Delete queue duplicate + if os.path.exists(archive_path): + if os.path.exists(queue_path): + try: + os.remove(queue_path) + _pending_source_moves.append((queue_path, "deleted")) + logger.info("Source dedup: deleted queue/%s (already in archive/%s)", source_slug, domain) + except Exception as e: + logger.warning("Source dedup failed: %s", e) + return + + # Move from queue to archive + if os.path.exists(queue_path): + # Update frontmatter before moving (Ganymede: distinguish merged vs rejected) + _update_source_frontmatter_status(queue_path, "processed" if merged else "rejected") + os.makedirs(archive_dir, exist_ok=True) + try: + shutil.move(queue_path, archive_path) + _pending_source_moves.append((queue_path, archive_path)) + logger.info("Source archived: queue/%s → archive/%s/ (status=%s)", + source_slug, domain, "processed" if merged else "rejected") + except Exception as e: + logger.warning("Source archive failed: %s", e) + + +async def _commit_source_moves(): + """Batch commit accumulated source moves. Called at end of merge cycle. + + Rhea review: fetch+reset before touching files, use main_worktree_lock, + crash gap is self-healing (reset --hard reverts uncommitted moves). + """ + if not _pending_source_moves: + return + + main_dir = config.MAIN_WORKTREE if hasattr(config, "MAIN_WORKTREE") else "/opt/teleo-eval/workspaces/main" + count = len(_pending_source_moves) + _pending_source_moves.clear() + + # Acquire file lock — coordinates with telegram bot and other daemon stages (Ganymede: Option C) + try: + async with async_main_worktree_lock(timeout=10): + # Sync worktree with remote (Rhea: fetch+reset, not pull) + await _git("fetch", "origin", "main", cwd=main_dir, timeout=30) + await _git("reset", "--hard", "origin/main", cwd=main_dir, timeout=30) + + await _git("add", "-A", "inbox/", cwd=main_dir) + + rc, out = await _git( + "commit", "-m", + f"pipeline: archive {count} source(s) post-merge\n\n" + f"Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>", + cwd=main_dir, + ) + if rc != 0: + if "nothing to commit" in out: + return + logger.warning("Source archive commit failed: %s", out) + return + + for attempt in range(3): + await _git("pull", "--rebase", "origin", "main", cwd=main_dir, timeout=30) + rc_push, _ = await _git("push", "origin", "main", cwd=main_dir, timeout=30) + if rc_push == 0: + logger.info("Committed + pushed %d source archive moves", count) + return + await asyncio.sleep(2) + + logger.warning("Failed to push source archive moves after 3 attempts") + await _git("reset", "--hard", "origin/main", cwd=main_dir) + except TimeoutError: + logger.warning("Source archive commit skipped: worktree lock timeout") + + +# --- Domain merge task --- + + +async def _merge_domain_queue(conn, domain: str) -> tuple[int, int]: + """Process the merge queue for a single domain. Returns (succeeded, failed).""" + succeeded = 0 + failed = 0 + + while True: + async with _domain_locks[domain]: + pr = await _claim_next_pr(conn, domain) + if not pr: + break # No more approved PRs for this domain + + pr_num = pr["number"] + branch = pr["branch"] + logger.info("Merging PR #%d (%s) in domain %s", pr_num, branch, domain) + + try: + # Cherry-pick onto fresh main (replaces rebase-retry — Leo+Cory directive) + # Extraction commits ADD new files, so cherry-pick applies cleanly. + # Rebase failed ~23% of the time due to main moving during replay. 
+ pick_ok, pick_msg = await asyncio.wait_for( + _cherry_pick_onto_main(branch), + timeout=MERGE_TIMEOUT_SECONDS, + ) + except asyncio.TimeoutError: + logger.error( + "PR #%d merge timed out after %ds — resetting to conflict (Rhea)", pr_num, MERGE_TIMEOUT_SECONDS + ) + conn.execute( + "UPDATE prs SET status = 'conflict', merge_cycled = 1, merge_failures = COALESCE(merge_failures, 0) + 1, last_error = ? WHERE number = ?", + (f"merge timed out after {MERGE_TIMEOUT_SECONDS}s", pr_num), + ) + db.audit(conn, "merge", "timeout", json.dumps({"pr": pr_num, "timeout_seconds": MERGE_TIMEOUT_SECONDS})) + failed += 1 + continue + + if not pick_ok: + # Cherry-pick failed — this is a genuine conflict (not a race condition). + # No retry needed: cherry-pick onto fresh main means main can't have moved. + logger.warning("PR #%d cherry-pick failed: %s", pr_num, pick_msg) + conn.execute( + "UPDATE prs SET status = 'conflict', merge_cycled = 1, merge_failures = COALESCE(merge_failures, 0) + 1, last_error = ? WHERE number = ?", + (pick_msg[:500], pr_num), + ) + db.audit(conn, "merge", "cherry_pick_failed", json.dumps({"pr": pr_num, "error": pick_msg[:200]})) + failed += 1 + continue + + # Local ff-merge: push cherry-picked branch as main (Rhea's approach, Leo+Rhea: local primary) + # The branch was just cherry-picked onto origin/main, + # so origin/{branch} is a descendant of origin/main. Push it as main. + await _git("fetch", "origin", branch, timeout=15) + rc, main_sha = await _git("rev-parse", "origin/main") + main_sha = main_sha.strip() if rc == 0 else "" + rc, branch_sha = await _git("rev-parse", f"origin/{branch}") + branch_sha = branch_sha.strip() if rc == 0 else "" + + merge_ok = False + merge_msg = "" + if branch_sha: + rc, out = await _git( + "push", f"--force-with-lease=main:{main_sha}", + "origin", f"{branch_sha}:main", + timeout=30, + ) + if rc == 0: + merge_ok = True + merge_msg = f"merged (local ff-push, SHA: {branch_sha[:8]})" + # Close PR on Forgejo with merge SHA comment + leo_token = get_agent_token("leo") + await forgejo_api( + "POST", + repo_path(f"issues/{pr_num}/comments"), + {"body": f"Merged locally.\nMerge SHA: `{branch_sha}`\nBranch: `{branch}`"}, + ) + await forgejo_api( + "PATCH", + repo_path(f"pulls/{pr_num}"), + {"state": "closed"}, + token=leo_token, + ) + else: + merge_msg = f"local ff-push failed: {out[:200]}" + else: + merge_msg = f"could not resolve origin/{branch}" + + if not merge_ok: + logger.error("PR #%d merge failed: %s", pr_num, merge_msg) + conn.execute( + "UPDATE prs SET status = 'conflict', merge_cycled = 1, merge_failures = COALESCE(merge_failures, 0) + 1, last_error = ? 
WHERE number = ?", + (merge_msg[:500], pr_num), + ) + db.audit(conn, "merge", "merge_failed", json.dumps({"pr": pr_num, "error": merge_msg[:200]})) + failed += 1 + continue + + # Success — update status and cleanup + conn.execute( + """UPDATE prs SET status = 'merged', + merged_at = datetime('now'), + last_error = NULL + WHERE number = ?""", + (pr_num,), + ) + db.audit(conn, "merge", "merged", json.dumps({"pr": pr_num, "branch": branch})) + logger.info("PR #%d merged successfully", pr_num) + + # Record contributor attribution + try: + await _record_contributor_attribution(conn, pr_num, branch) + except Exception: + logger.exception("PR #%d: contributor attribution failed (non-fatal)", pr_num) + + # Archive source file (closes near-duplicate loop — Ganymede review) + _archive_source_for_pr(branch, domain) + + # Embed new/changed claims into Qdrant (non-fatal) + await _embed_merged_claims(main_sha, branch_sha) + + + # Cascade: notify agents whose beliefs/positions depend on changed claims + try: + cascaded = await cascade_after_merge(main_sha, branch_sha, pr_num, config.MAIN_WORKTREE) + if cascaded: + logger.info("PR #%d: %d cascade notifications sent", pr_num, cascaded) + except Exception: + logger.exception("PR #%d: cascade check failed (non-fatal)", pr_num) + # Delete remote branch immediately (Ganymede Q4) + await _delete_remote_branch(branch) + + # Prune local worktree metadata + await _git("worktree", "prune") + + succeeded += 1 + + return succeeded, failed + + +# --- Main entry point --- + + +async def _reconcile_db_state(conn): + """Reconcile pipeline DB against Forgejo's actual PR state. + + Fixes ghost PRs: DB says 'conflict' or 'open' but Forgejo says merged/closed. + Also detects deleted branches (rev-parse failures). (Leo's structural fix #1) + Run at the start of each merge cycle. + """ + stale = conn.execute( + "SELECT number, branch, status FROM prs WHERE status IN ('conflict', 'open', 'reviewing', 'approved')" + ).fetchall() + + if not stale: + return + + reconciled = 0 + for row in stale: + pr_number = row["number"] + branch = row["branch"] + db_status = row["status"] + + # Check Forgejo PR state + pr_info = await forgejo_api("GET", repo_path(f"pulls/{pr_number}")) + if not pr_info: + continue + + forgejo_state = pr_info.get("state", "") + is_merged = pr_info.get("merged", False) + + if is_merged and db_status != "merged": + conn.execute( + "UPDATE prs SET status = 'merged', merged_at = datetime('now') WHERE number = ?", + (pr_number,), + ) + reconciled += 1 + continue + + if forgejo_state == "closed" and not is_merged and db_status not in ("closed",): + # Agent PRs get merged via git push (not Forgejo merge API), so + # Forgejo shows merged=False. Check if branch content is on main. + if db_status == "approved" and branch: + # Agent merges are ff-push — no merge commit exists. + # Check if branch tip is an ancestor of main (content is on main). 
+ rc, branch_sha = await _git( + "rev-parse", f"origin/{branch}", timeout=10, + ) + if rc == 0 and branch_sha.strip(): + rc2, _ = await _git( + "merge-base", "--is-ancestor", + branch_sha.strip(), "origin/main", + timeout=10, + ) + if rc2 == 0: + conn.execute( + "UPDATE prs SET status = 'merged', merged_at = datetime('now') WHERE number = ?", + (pr_number,), + ) + logger.info("Reconciled PR #%d: agent-merged (branch tip on main)", pr_number) + reconciled += 1 + continue + conn.execute( + "UPDATE prs SET status = 'closed', last_error = 'reconciled: closed on Forgejo' WHERE number = ?", + (pr_number,), + ) + reconciled += 1 + continue + + # Ghost PR detection: branch deleted but PR still open in DB (Fix #2) + # Ganymede: rc != 0 means remote unreachable — skip, don't close + if db_status in ("open", "reviewing") and branch: + rc, ls_out = await _git("ls-remote", "--heads", "origin", branch, timeout=10) + if rc != 0: + logger.warning("ls-remote failed for %s — skipping ghost check", branch) + continue + if not ls_out.strip(): + # Branch gone — close PR on Forgejo and in DB (Ganymede: don't leave orphans) + await forgejo_api( + "PATCH", + repo_path(f"pulls/{pr_number}"), + body={"state": "closed"}, + ) + await forgejo_api( + "POST", + repo_path(f"issues/{pr_number}/comments"), + body={"body": "Auto-closed: branch deleted from remote."}, + ) + conn.execute( + "UPDATE prs SET status = 'closed', last_error = 'reconciled: branch deleted' WHERE number = ?", + (pr_number,), + ) + logger.info("Ghost PR #%d: branch %s deleted, closing", pr_number, branch) + reconciled += 1 + + if reconciled: + logger.info("Reconciled %d stale PRs against Forgejo state", reconciled) + + +MAX_CONFLICT_REBASE_ATTEMPTS = 3 + + +async def _handle_permanent_conflicts(conn) -> int: + """Close conflict_permanent PRs and file their sources correctly. + + When a PR fails rebase 3x, the claims are already on main from the first + successful extraction. The source should live in archive/{domain}/ (one copy). + Any duplicate in queue/ gets deleted. No requeuing — breaks the infinite loop. + + Hygiene (Cory): one source file, one location, no duplicates. + Reviewed by Ganymede: commit moves, use shutil.move, batch commit at end. + """ + rows = conn.execute( + """SELECT number, branch, domain + FROM prs + WHERE status = 'conflict_permanent' + ORDER BY number ASC""" + ).fetchall() + + if not rows: + return 0 + + handled = 0 + files_changed = False + main_dir = config.MAIN_WORKTREE if hasattr(config, "MAIN_WORKTREE") else "/opt/teleo-eval/workspaces/main" + + for row in rows: + pr_number = row["number"] + branch = row["branch"] + domain = row["domain"] or "unknown" + + # Close PR on Forgejo + await forgejo_api( + "PATCH", + repo_path(f"pulls/{pr_number}"), + body={"state": "closed"}, + ) + await forgejo_api( + "POST", + repo_path(f"issues/{pr_number}/comments"), + body={"body": ( + "Closed by conflict auto-resolver: rebase failed 3 times (enrichment conflict). " + "Claims already on main from prior extraction. Source filed in archive." 
+ )}, + ) + await _delete_remote_branch(branch) + + # File the source: one copy in archive/{domain}/, delete duplicates + source_slug = branch.replace("extract/", "", 1) if branch.startswith("extract/") else None + if source_slug: + filename = f"{source_slug}.md" + archive_dir = os.path.join(main_dir, "inbox", "archive", domain) + archive_path = os.path.join(archive_dir, filename) + queue_path = os.path.join(main_dir, "inbox", "queue", filename) + + already_archived = os.path.exists(archive_path) + + if already_archived: + if os.path.exists(queue_path): + try: + os.remove(queue_path) + logger.info("PR #%d: deleted queue duplicate %s (already in archive/%s)", + pr_number, filename, domain) + files_changed = True + except Exception as e: + logger.warning("PR #%d: failed to delete queue duplicate: %s", pr_number, e) + else: + logger.info("PR #%d: source already in archive/%s, no cleanup needed", pr_number, domain) + else: + if os.path.exists(queue_path): + os.makedirs(archive_dir, exist_ok=True) + try: + shutil.move(queue_path, archive_path) + logger.info("PR #%d: filed source to archive/%s: %s", pr_number, domain, filename) + files_changed = True + except Exception as e: + logger.warning("PR #%d: failed to file source: %s", pr_number, e) + else: + logger.warning("PR #%d: source not found in queue or archive for %s", pr_number, filename) + + # Clear batch-state marker + state_marker = f"/opt/teleo-eval/batch-state/{source_slug}.done" + try: + if os.path.exists(state_marker): + os.remove(state_marker) + except Exception: + pass + + conn.execute( + "UPDATE prs SET status = 'closed', last_error = 'conflict_permanent: closed + filed in archive' WHERE number = ?", + (pr_number,), + ) + handled += 1 + logger.info("Permanent conflict handled: PR #%d closed, source filed", pr_number) + + # Batch commit source moves to main (Ganymede: follow entity_batch pattern) + if files_changed: + await _git("add", "-A", "inbox/", cwd=main_dir) + rc, out = await _git( + "commit", "-m", + f"pipeline: archive {handled} conflict-closed source(s)\n\n" + f"Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>", + cwd=main_dir, + ) + if rc == 0: + # Push with pull-rebase retry (entity_batch pattern) + for attempt in range(3): + await _git("pull", "--rebase", "origin", "main", cwd=main_dir, timeout=30) + rc_push, _ = await _git("push", "origin", "main", cwd=main_dir, timeout=30) + if rc_push == 0: + logger.info("Committed + pushed source archive moves for %d PRs", handled) + break + await asyncio.sleep(2) + else: + logger.warning("Failed to push source archive moves after 3 attempts") + await _git("reset", "--hard", "origin/main", cwd=main_dir) + + if handled: + logger.info("Handled %d permanent conflict PRs (closed + filed)", handled) + + return handled + + +async def _retry_conflict_prs(conn) -> tuple[int, int]: + """Retry conflict PRs via cherry-pick onto fresh main. + + Design: Ganymede (extend merge stage), Rhea (safety guards), Leo (re-eval required). 
+ - Pick up PRs with status='conflict' and both approvals + - Cherry-pick extraction commits onto fresh branch from origin/main + - If cherry-pick succeeds: force-push, reset to 'open' with verdicts cleared for re-eval + - If cherry-pick fails: increment attempt counter, leave as 'conflict' + - After MAX_CONFLICT_REBASE_ATTEMPTS failures: mark 'conflict_permanent' + - Skip branches with new commits since conflict was set (Rhea: someone is working on it) + """ + rows = conn.execute( + """SELECT number, branch, conflict_rebase_attempts + FROM prs + WHERE status = 'conflict' + AND COALESCE(conflict_rebase_attempts, 0) < ? + ORDER BY number ASC""", + (MAX_CONFLICT_REBASE_ATTEMPTS,), + ).fetchall() + + if not rows: + return 0, 0 + + resolved = 0 + failed = 0 + + for row in rows: + pr_number = row["number"] + branch = row["branch"] + attempts = row["conflict_rebase_attempts"] or 0 + + logger.info("Conflict retry [%d/%d] PR #%d branch=%s", + attempts + 1, MAX_CONFLICT_REBASE_ATTEMPTS, pr_number, branch) + + # Fetch latest remote state + await _git("fetch", "origin", branch, timeout=30) + await _git("fetch", "origin", "main", timeout=30) + + # Attempt cherry-pick onto fresh main (replaces rebase — Leo+Cory directive) + ok, msg = await _cherry_pick_onto_main(branch) + + if ok: + # Rebase succeeded — reset for re-eval (Ganymede: approvals are stale after rebase) + conn.execute( + """UPDATE prs + SET status = 'open', + leo_verdict = 'pending', + domain_verdict = 'pending', + eval_attempts = 0, + conflict_rebase_attempts = ? + WHERE number = ?""", + (attempts + 1, pr_number), + ) + logger.info("Conflict resolved: PR #%d rebased successfully, reset for re-eval", pr_number) + resolved += 1 + else: + new_attempts = attempts + 1 + if new_attempts >= MAX_CONFLICT_REBASE_ATTEMPTS: + conn.execute( + """UPDATE prs + SET status = 'conflict_permanent', + conflict_rebase_attempts = ?, + last_error = ? + WHERE number = ?""", + (new_attempts, f"rebase failed {MAX_CONFLICT_REBASE_ATTEMPTS}x: {msg[:200]}", pr_number), + ) + logger.warning("Conflict permanent: PR #%d failed %d rebase attempts: %s", + pr_number, new_attempts, msg[:100]) + else: + conn.execute( + """UPDATE prs + SET conflict_rebase_attempts = ?, + last_error = ? + WHERE number = ?""", + (new_attempts, f"rebase attempt {new_attempts}: {msg[:200]}", pr_number), + ) + logger.info("Conflict retry failed: PR #%d attempt %d/%d: %s", + pr_number, new_attempts, MAX_CONFLICT_REBASE_ATTEMPTS, msg[:100]) + failed += 1 + + if resolved or failed: + logger.info("Conflict retry: %d resolved, %d failed", resolved, failed) + + return resolved, failed + + +async def merge_cycle(conn, max_workers=None) -> tuple[int, int]: + """Run one merge cycle across all domains. + + 0. Reconcile DB state against Forgejo (catch ghost PRs) + 0.5. Retry conflict PRs (rebase onto current main) + 1. Discover external PRs (multiplayer v1) + 2. Find all domains with approved PRs + 3. 
Launch one async task per domain (cross-domain parallel, same-domain serial) + """ + # Step 0: Reconcile stale DB entries + await _reconcile_db_state(conn) + + # Step 0.5: Retry conflict PRs (Ganymede: before normal merge, same loop) + await _retry_conflict_prs(conn) + + # Step 0.6: Handle permanent conflicts (close + requeue for re-extraction) + await _handle_permanent_conflicts(conn) + + # Step 1: Discover external PRs + await discover_external_prs(conn) + + # Step 2: Find domains with approved work + rows = conn.execute("SELECT DISTINCT domain FROM prs WHERE status = 'approved' AND domain IS NOT NULL").fetchall() + domains = [r["domain"] for r in rows] + + # Also check for NULL-domain PRs (human PRs with undetected domain) + null_domain = conn.execute("SELECT COUNT(*) as c FROM prs WHERE status = 'approved' AND domain IS NULL").fetchone() + if null_domain and null_domain["c"] > 0: + logger.warning("%d approved PRs have NULL domain — skipping until eval assigns domain", null_domain["c"]) + + if not domains: + return 0, 0 + + # Step 3: Merge all domains concurrently + tasks = [_merge_domain_queue(conn, domain) for domain in domains] + results = await asyncio.gather(*tasks, return_exceptions=True) + + total_succeeded = 0 + total_failed = 0 + for i, result in enumerate(results): + if isinstance(result, Exception): + logger.exception("Domain %s merge failed with exception", domains[i]) + total_failed += 1 + else: + s, f = result + total_succeeded += s + total_failed += f + + if total_succeeded or total_failed: + logger.info( + "Merge cycle: %d succeeded, %d failed across %d domains", total_succeeded, total_failed, len(domains) + ) + + # Batch commit source moves (Ganymede: one commit per cycle, not per PR) + await _commit_source_moves() + + return total_succeeded, total_failed diff --git a/ops/research-session.sh b/ops/research-session.sh index 219242fb9..803122e87 100644 --- a/ops/research-session.sh +++ b/ops/research-session.sh @@ -31,6 +31,17 @@ RAW_DIR="/opt/teleo-eval/research-raw/${AGENT}" log() { echo "[$(date -Iseconds)] $*" >> "$LOG"; } +# --- Agent State --- +STATE_LIB="/opt/teleo-eval/ops/agent-state/lib-state.sh" +if [ -f "$STATE_LIB" ]; then + source "$STATE_LIB" + HAS_STATE=true + SESSION_ID="${AGENT}-$(date +%Y%m%d-%H%M%S)" +else + HAS_STATE=false + log "WARN: agent-state lib not found, running without state" +fi + # --- Lock (prevent concurrent sessions for same agent) --- if [ -f "$LOCKFILE" ]; then pid=$(cat "$LOCKFILE" 2>/dev/null) @@ -178,6 +189,14 @@ git branch -D "$BRANCH" 2>/dev/null || true git checkout -b "$BRANCH" >> "$LOG" 2>&1 log "On branch $BRANCH" +# --- Pre-session state --- +if [ "$HAS_STATE" = true ]; then + state_start_session "$AGENT" "$SESSION_ID" "research" "$DOMAIN" "$BRANCH" "sonnet" "5400" > /dev/null 2>&1 || true + state_update_report "$AGENT" "researching" "Starting research session ${DATE}" 2>/dev/null || true + state_journal_append "$AGENT" "session_start" "session_id=$SESSION_ID" "type=research" "branch=$BRANCH" 2>/dev/null || true + log "Agent state: session started ($SESSION_ID)" +fi + # --- Build the research prompt --- # Write tweet data to a temp file so Claude can read it echo "$TWEET_DATA" > "$TWEET_FILE" @@ -188,6 +207,11 @@ RESEARCH_PROMPT="You are ${AGENT}, a Teleo knowledge base agent. Domain: ${DOMAI You have ~90 minutes of compute. Use it wisely. +### Step 0: Load Operational State (1 min) +Read /opt/teleo-eval/agent-state/${AGENT}/memory.md — this is your cross-session operational memory. 
It contains patterns, dead ends, open questions, and corrections from previous sessions. +Read /opt/teleo-eval/agent-state/${AGENT}/tasks.json — check for pending tasks assigned to you. +Check /opt/teleo-eval/agent-state/${AGENT}/inbox/ for messages from other agents. Process any high-priority inbox items before choosing your research direction. + ### Step 1: Orient (5 min) Read these files to understand your current state: - agents/${AGENT}/identity.md (who you are) @@ -229,7 +253,7 @@ Include which belief you targeted for disconfirmation and what you searched for. ### Step 6: Archive Sources (60 min) For each relevant tweet/thread, create an archive file: -Path: inbox/archive/YYYY-MM-DD-{author-handle}-{brief-slug}.md +Path: inbox/queue/YYYY-MM-DD-{author-handle}-{brief-slug}.md Use this frontmatter: --- @@ -267,7 +291,7 @@ EXTRACTION HINT: [what the extractor should focus on — scopes attention] - Set all sources to status: unprocessed (a DIFFERENT instance will extract) - Flag cross-domain sources with flagged_for_{agent}: [\"reason\"] - Do NOT extract claims yourself — write good notes so the extractor can -- Check inbox/archive/ for duplicates before creating new archives +- Check inbox/queue/ and inbox/archive/ for duplicates before creating new archives - Aim for 5-15 source archives per session ### Step 7: Flag Follow-up Directions (5 min) @@ -303,6 +327,8 @@ The journal accumulates session over session. After 5+ sessions, review it for c ### Step 9: Stop When you've finished archiving sources, updating your musing, and writing the research journal entry, STOP. Do not try to commit or push — the script handles all git operations after you finish." +CASCADE_PROCESSOR="/opt/teleo-eval/ops/agent-state/process-cascade-inbox.py" + # --- Run Claude research session --- log "Starting Claude research session..." 
timeout 5400 "$CLAUDE_BIN" -p "$RESEARCH_PROMPT" \ @@ -311,31 +337,61 @@ timeout 5400 "$CLAUDE_BIN" -p "$RESEARCH_PROMPT" \ --permission-mode bypassPermissions \ >> "$LOG" 2>&1 || { log "WARN: Research session failed or timed out for $AGENT" + # Process cascade inbox even on timeout (agent may have read them in Step 0) + if [ -f "$CASCADE_PROCESSOR" ]; then + python3 "$CASCADE_PROCESSOR" "$AGENT" 2>>"$LOG" || true + fi + if [ "$HAS_STATE" = true ]; then + state_end_session "$AGENT" "timeout" "0" "null" 2>/dev/null || true + state_update_report "$AGENT" "idle" "Research session timed out or failed on ${DATE}" 2>/dev/null || true + state_update_metrics "$AGENT" "timeout" "0" 2>/dev/null || true + state_journal_append "$AGENT" "session_end" "outcome=timeout" "session_id=$SESSION_ID" 2>/dev/null || true + log "Agent state: session recorded as timeout" + fi git checkout main >> "$LOG" 2>&1 exit 1 } log "Claude session complete" +# --- Process cascade inbox messages (log completion to pipeline.db) --- +if [ -f "$CASCADE_PROCESSOR" ]; then + CASCADE_RESULT=$(python3 "$CASCADE_PROCESSOR" "$AGENT" 2>>"$LOG") + [ -n "$CASCADE_RESULT" ] && log "Cascade: $CASCADE_RESULT" +fi + # --- Check for changes --- CHANGED_FILES=$(git status --porcelain) if [ -z "$CHANGED_FILES" ]; then log "No sources archived by $AGENT" + if [ "$HAS_STATE" = true ]; then + state_end_session "$AGENT" "completed" "0" "null" 2>/dev/null || true + state_update_report "$AGENT" "idle" "Research session completed with no new sources on ${DATE}" 2>/dev/null || true + state_update_metrics "$AGENT" "completed" "0" 2>/dev/null || true + state_journal_append "$AGENT" "session_end" "outcome=no_sources" "session_id=$SESSION_ID" 2>/dev/null || true + log "Agent state: session recorded (no sources)" + fi git checkout main >> "$LOG" 2>&1 exit 0 fi # --- Stage and commit --- -git add inbox/archive/ agents/${AGENT}/musings/ agents/${AGENT}/research-journal.md 2>/dev/null || true +git add inbox/queue/ agents/${AGENT}/musings/ agents/${AGENT}/research-journal.md 2>/dev/null || true if git diff --cached --quiet; then log "No valid changes to commit" + if [ "$HAS_STATE" = true ]; then + state_end_session "$AGENT" "completed" "0" "null" 2>/dev/null || true + state_update_report "$AGENT" "idle" "Research session completed with no valid changes on ${DATE}" 2>/dev/null || true + state_update_metrics "$AGENT" "completed" "0" 2>/dev/null || true + state_journal_append "$AGENT" "session_end" "outcome=no_valid_changes" "session_id=$SESSION_ID" 2>/dev/null || true + fi git checkout main >> "$LOG" 2>&1 exit 0 fi AGENT_UPPER=$(echo "$AGENT" | sed 's/./\U&/') -SOURCE_COUNT=$(git diff --cached --name-only | grep -c "^inbox/archive/" || echo "0") +SOURCE_COUNT=$(git diff --cached --name-only | grep -c "^inbox/queue/" || echo "0") git commit -m "${AGENT}: research session ${DATE} — ${SOURCE_COUNT} sources archived Pentagon-Agent: ${AGENT_UPPER} " >> "$LOG" 2>&1 @@ -375,6 +431,16 @@ Researcher and extractor are different Claude instances to prevent motivated rea log "PR #${PR_NUMBER} opened for ${AGENT}'s research session" fi +# --- Post-session state (success) --- +if [ "$HAS_STATE" = true ]; then + FINAL_PR="${EXISTING_PR:-${PR_NUMBER:-unknown}}" + state_end_session "$AGENT" "completed" "$SOURCE_COUNT" "$FINAL_PR" 2>/dev/null || true + state_finalize_report "$AGENT" "idle" "Research session completed: ${SOURCE_COUNT} sources archived" "$SESSION_ID" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "completed" "$SOURCE_COUNT" "$BRANCH" 
"${FINAL_PR}" 2>/dev/null || true + state_update_metrics "$AGENT" "completed" "$SOURCE_COUNT" 2>/dev/null || true + state_journal_append "$AGENT" "session_end" "outcome=completed" "sources=$SOURCE_COUNT" "branch=$BRANCH" "pr=$FINAL_PR" 2>/dev/null || true + log "Agent state: session finalized (${SOURCE_COUNT} sources, PR #${FINAL_PR})" +fi + # --- Back to main --- git checkout main >> "$LOG" 2>&1 log "=== Research session complete for $AGENT ===" -- 2.45.2 From 60998d38377c82cbc2d7769fe6bd5481773625ca Mon Sep 17 00:00:00 2001 From: Teleo Agents Date: Tue, 14 Apr 2026 17:48:39 +0000 Subject: [PATCH 4/4] auto-fix: strip 1 broken wiki links Pipeline auto-fixer: removed [[ ]] brackets from links that don't resolve to existing claims in the knowledge base. --- entities/internet-finance/mtncapital.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/entities/internet-finance/mtncapital.md b/entities/internet-finance/mtncapital.md index 923a656b1..6b6e2e50f 100644 --- a/entities/internet-finance/mtncapital.md +++ b/entities/internet-finance/mtncapital.md @@ -71,7 +71,7 @@ mtnCapital is the **first empirical test of the unruggable ICO enforcement mecha Relevant Notes: - [[metadao]] — launch platform (curated ICO #1) - [[ranger-finance]] — second project to be liquidated via futarchy -- [[futarchy is manipulation-resistant because attack attempts create profitable opportunities for defenders]] — mtnCapital NAV arbitrage supports this claim +- futarchy is manipulation-resistant because attack attempts create profitable opportunities for defenders — mtnCapital NAV arbitrage supports this claim Topics: - [[internet finance and decision markets]] -- 2.45.2