Compare commits

...

12 commits

Author SHA1 Message Date
Leo
cb6bd52994 Merge branch 'main' into extract/2026-01-28-nasa-cld-phase2-frozen-saa-revised-approach 2026-03-23 12:55:31 +00:00
Teleo Agents
0b2759c1a8 pipeline: clean 5 stale queue duplicates
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-23 12:45:02 +00:00
Teleo Agents
9ce036734a pipeline: archive 1 source(s) post-merge
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-23 12:40:36 +00:00
Teleo Agents
75c4fea263 pipeline: archive 1 source(s) post-merge
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-23 12:39:24 +00:00
Teleo Agents
fb43ff402b extract: 2026-03-22-automation-bias-rct-ai-trained-physicians
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-23 12:39:22 +00:00
Teleo Agents
5ee9c7f41a pipeline: archive 1 source(s) post-merge
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-23 12:38:47 +00:00
Teleo Agents
d2948af681 extract: 2026-03-21-replibench-autonomous-replication-capabilities
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-23 12:38:45 +00:00
Teleo Agents
a55948dc60 pipeline: archive 1 conflict-closed source(s)
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-23 12:37:03 +00:00
Teleo Agents
3504267afa entity-batch: update 1 entities
- Applied 1 entity operations from queue
- Files: entities/internet-finance/metadao.md

Pentagon-Agent: Epimetheus <968B2991-E2DF-4006-B962-F5B0A0CC8ACA>
2026-03-23 12:37:02 +00:00
Teleo Agents
2f79b116d6 entity-batch: update 2 entities
- Applied 2 entity operations from queue
- Files: entities/internet-finance/metadao.md, entities/internet-finance/p2p-me.md

Pentagon-Agent: Epimetheus <968B2991-E2DF-4006-B962-F5B0A0CC8ACA>
2026-03-23 12:36:01 +00:00
Teleo Agents
b1fc419d53 pipeline: archive 1 source(s) post-merge
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-23 12:34:20 +00:00
Teleo Agents
4c2f3e3cfb pipeline: archive 1 source(s) post-merge
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-23 12:32:08 +00:00
13 changed files with 86 additions and 5 deletions

@ -17,6 +17,12 @@ This leaves motivation selection as the only durable approach: either direct spe
---
### Additional Evidence (confirm)
*Source: [[2026-03-21-replibench-autonomous-replication-capabilities]] | Added: 2026-03-23*
Current models already demonstrate >50% success on hardest variants of tasks designed to test circumvention of security controls (KYC, persistent deployment evasion). The capability trajectory shows rapid improvement in exactly the domains where containment depends on security measures designed by humans.
Relevant Notes:
- [[safe AI development requires building alignment mechanisms before scaling capability]] -- Bostrom's analysis shows why motivation selection must precede capability scaling
- [[the alignment problem dissolves when human values are continuously woven into the system rather than specified in advance]] -- continuous weaving is a form of motivation selection that avoids the limitations of both direct specification and one-shot loading

@ -63,6 +63,12 @@ The research-to-compliance translation gap fails for the same structural reason
The coordination gap provides the mechanism explaining why voluntary commitments fail even beyond racing dynamics: coordination infrastructure investments have diffuse benefits but concentrated costs, creating a public goods problem. Labs won't build shared response infrastructure unilaterally because competitors free-ride on the benefits while the builder bears full costs. This is distinct from the competitive pressure argument — it's about why shared infrastructure doesn't get built even when racing isn't the primary concern.
### Additional Evidence (confirm)
*Source: [[2026-03-21-replibench-autonomous-replication-capabilities]] | Added: 2026-03-23*
RepliBench exists as a comprehensive self-replication evaluation tool but is not integrated into compliance frameworks despite EU AI Act Article 55 taking effect after its publication. Labs can voluntarily use it but face no enforcement mechanism requiring them to do so, creating competitive pressure to avoid evaluations that might reveal concerning capabilities.
Relevant Notes:

@ -48,6 +48,12 @@ The Klang et al. Lancet Digital Health study (February 2026) adds a fourth failu
NCT07328815 tests whether a UI-layer behavioral nudge (ensemble-LLM confidence signals + anchoring cues) can mitigate automation bias where training failed. The parent study (NCT06963957) showed 20-hour AI-literacy training did not prevent automation bias. This trial operationalizes a structural solution: using multi-model disagreement as an automatic uncertainty flag that doesn't require physician understanding of model internals. Results pending (2026).
### Additional Evidence (extend)
*Source: [[2026-03-22-automation-bias-rct-ai-trained-physicians]] | Added: 2026-03-23*
RCT evidence (NCT06963957, medRxiv August 2025) shows automation bias persists even after 20 hours of AI-literacy training specifically designed to teach critical evaluation of AI output. Physicians with this training still voluntarily deferred to deliberately erroneous LLM recommendations in 3 of 6 clinical vignettes, demonstrating that the human-in-the-loop degradation mechanism operates even when humans are extensively trained to resist it.

@ -92,6 +92,8 @@ The futarchy governance protocol on Solana. Implements decision markets through
- **2026-02-07** — First failed ICO: Hurupay raised $2M against $3M minimum, all capital refunded under unruggable ICO mechanics
- **2026-03-26** — [[metadao-p2p-me-ico]] Active: P2P.me ICO launched targeting $6M at $15.5M FDV, backed by Multicoin Capital and Coinbase Ventures (closes March 30)
- **2025-Q4** — Reached first operating profitability with $2.51M in fee revenue from Futarchy AMM and Meteora pools; expanded futarchy ecosystem from 2 to 8 protocols; total futarchy market cap reached $219M with non-META market cap of $69M; hosted 6 ICOs in quarter raising $18.7M; maintains 15+ quarters of runway
- **2026-03-21** — [[metadao-meta036-hanson-futarchy-research]] Active: Proposal to fund $80K academic research at GMU led by Robin Hanson, trading at 50% likelihood
- **2025-Q4** — Achieved first operating profitability with $2.51M in fee revenue from Futarchy AMM and Meteora pools; hosted 6 ICOs in quarter raising $18.7M; expanded futarchy ecosystem from 2 to 8 protocols; total equity grew from $4M to $16.5M
## Key Decisions
| Date | Proposal | Proposer | Category | Outcome |
|------|----------|----------|----------|---------|

@ -57,4 +57,5 @@ Treasury controlled by token holders through futarchy-based governance. Team can
- **2026-03-26** — [[p2p-me-metadao-ico]] Active: ICO scheduled, targeting $6M raise at $15.5M FDV with Pine Analytics identifying 182x gross profit multiple concerns
- **2026-03-26** — [[p2p-me-ico-march-2026]] Active: $6M ICO at $15.5M FDV scheduled on MetaDAO
- **2026-03-26** — [[metadao-p2p-me-ico]] Active: ICO launch targeting $15.5M FDV at 182x gross profit multiple
- **2026-03-26** — [[p2p-me-metadao-ico-march-2026]] Active: ICO scheduled, targeting $6M at $15.5M FDV
- **2026-03-26** — [[p2p-me-metadao-ico-march-2026]] Active: ICO scheduled, targeting $6M at $15.5M FDV
- **2026-03-26** — [[p2p-me-metadao-ico-march-2026]] Status pending: ICO vote scheduled

@ -7,7 +7,7 @@ date: 2026-03-00
domain: ai-alignment
secondary_domains: []
format: paper
-status: enrichment
+status: processed
priority: high
tags: [coordination-gap, institutional-readiness, frontier-AI-safety, precommitment, incident-response, coordination-failure, nuclear-analogies, pandemic-preparedness, B2-confirms]
processed_by: theseus

@ -7,7 +7,7 @@ date: 2025-04-21
domain: ai-alignment
secondary_domains: []
format: paper
-status: unprocessed
+status: processed
priority: high
tags: [self-replication, autonomous-replication, capability-evaluation, AISI, RepliBench, loss-of-control, EU-AI-Act, benchmark]
---

@ -7,7 +7,7 @@ date: 2026-01-01
domain: health
secondary_domains: [ai-alignment]
format: regulatory document
-status: null-result
+status: processed
priority: high
tags: [eu-ai-act, regulatory, clinical-ai-safety, high-risk-ai, healthcare-compliance, transparency, human-oversight, belief-3, belief-5]
processed_by: vida

@ -7,7 +7,7 @@ date: 2025-08-26
domain: health
secondary_domains: [ai-alignment]
format: research paper
-status: unprocessed
+status: processed
priority: high
tags: [automation-bias, clinical-ai-safety, physician-rct, llm-diagnostic, centaur-model, ai-literacy, chatgpt, randomized-trial]
---

@ -0,0 +1,34 @@
{
  "rejected_claims": [
    {
      "filename": "frontier-ai-models-demonstrate-component-capabilities-for-autonomous-replication-with-claude-37-achieving-50-percent-success-on-hardest-self-replication-tasks.md",
      "issues": [
        "missing_attribution_extractor"
      ]
    },
    {
      "filename": "self-replication-capability-evaluations-exist-as-research-tools-but-remain-absent-from-compliance-frameworks-creating-a-gap-between-measured-risk-and-regulatory-enforcement.md",
      "issues": [
        "missing_attribution_extractor"
      ]
    }
  ],
  "validation_stats": {
    "total": 2,
    "kept": 0,
    "fixed": 4,
    "rejected": 2,
    "fixes_applied": [
      "frontier-ai-models-demonstrate-component-capabilities-for-autonomous-replication-with-claude-37-achieving-50-percent-success-on-hardest-self-replication-tasks.md:set_created:2026-03-23",
      "frontier-ai-models-demonstrate-component-capabilities-for-autonomous-replication-with-claude-37-achieving-50-percent-success-on-hardest-self-replication-tasks.md:stripped_wiki_link:three conditions gate AI takeover risk autonomy robotics and",
      "frontier-ai-models-demonstrate-component-capabilities-for-autonomous-replication-with-claude-37-achieving-50-percent-success-on-hardest-self-replication-tasks.md:stripped_wiki_link:scalable oversight degrades rapidly as capability gaps grow",
      "self-replication-capability-evaluations-exist-as-research-tools-but-remain-absent-from-compliance-frameworks-creating-a-gap-between-measured-risk-and-regulatory-enforcement.md:set_created:2026-03-23"
    ],
    "rejections": [
      "frontier-ai-models-demonstrate-component-capabilities-for-autonomous-replication-with-claude-37-achieving-50-percent-success-on-hardest-self-replication-tasks.md:missing_attribution_extractor",
      "self-replication-capability-evaluations-exist-as-research-tools-but-remain-absent-from-compliance-frameworks-creating-a-gap-between-measured-risk-and-regulatory-enforcement.md:missing_attribution_extractor"
    ]
  },
  "model": "anthropic/claude-sonnet-4.5",
  "date": "2026-03-23"
}

@ -0,0 +1,26 @@
{
  "rejected_claims": [
    {
      "filename": "ai-literacy-training-insufficient-to-prevent-automation-bias-in-clinical-llm-settings.md",
      "issues": [
        "missing_attribution_extractor"
      ]
    }
  ],
  "validation_stats": {
    "total": 1,
    "kept": 0,
    "fixed": 3,
    "rejected": 1,
    "fixes_applied": [
      "ai-literacy-training-insufficient-to-prevent-automation-bias-in-clinical-llm-settings.md:set_created:2026-03-23",
      "ai-literacy-training-insufficient-to-prevent-automation-bias-in-clinical-llm-settings.md:stripped_wiki_link:human-in-the-loop clinical AI degrades to worse-than-AI-alon",
      "ai-literacy-training-insufficient-to-prevent-automation-bias-in-clinical-llm-settings.md:stripped_wiki_link:medical LLM benchmark performance does not translate to clin"
    ],
    "rejections": [
      "ai-literacy-training-insufficient-to-prevent-automation-bias-in-clinical-llm-settings.md:missing_attribution_extractor"
    ]
  },
  "model": "anthropic/claude-sonnet-4.5",
  "date": "2026-03-23"
}