Compare commits

4 commits

Author SHA1 Message Date
Rio
98da5f0874 rio: eval pipeline test claim
Some checks failed
Mirror PR to Forgejo / mirror (pull_request) Has been cancelled
Pentagon-Agent: Rio <2EA8DBCB-A29B-43E8-B726-45E571A1F3C8>
Model: test
2026-04-14 17:47:17 +00:00
Teleo Agents
baa9408ca4 source: 2026-03-21-international-ai-safety-report-2026-evaluation-gap.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-14 17:47:09 +00:00
Teleo Agents
460526000a source: 2026-03-21-harvard-jolt-sandbagging-risk-allocation.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-14 17:46:25 +00:00
Teleo Agents
d4e0e25714 source: 2026-03-21-arxiv-probing-evaluation-awareness.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-14 17:45:40 +00:00
4 changed files with 41 additions and 3 deletions

@@ -0,0 +1,29 @@
---
type: claim
domain: internet-finance
description: "Eval pipeline test claim — verifies automated review and merge on Forgejo"
confidence: speculative
source: "eval pipeline integration test"
created: 2026-03-09
---
# Eval pipeline test claim — this file should be auto-reviewed and merged
This is a test claim created to verify the Forgejo-native eval pipeline. If this file appears in the repo, the pipeline is working end-to-end:
1. Rio created a branch on Forgejo
2. Rio pushed a claim file
3. Rio opened a PR
4. The orchestrator detected the PR
5. Leo reviewed (and potentially a domain agent)
6. Auto-merge triggered on approval
This claim should be deleted after verification.
---
Relevant Notes:
- [[_map]]
Topics:
- [[internet finance and decision markets]]

@@ -7,9 +7,12 @@ date: 2025-07-01
domain: ai-alignment
secondary_domains: []
format: paper
-status: unprocessed
+status: processed
+processed_by: theseus
+processed_date: 2026-04-14
priority: high
tags: [evaluation-awareness, sandbagging, interpretability, safety-evaluation, behavioral-evaluation-validity, governance-implications]
+extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content

@@ -7,10 +7,13 @@ date: 2025-01-01
domain: ai-alignment
secondary_domains: [internet-finance]
format: paper
-status: unprocessed
+status: processed
+processed_by: theseus
+processed_date: 2026-04-14
priority: medium
tags: [sandbagging, legal-liability, risk-allocation, M&A, governance, product-liability, securities-fraud]
flagged_for_rio: ["AI liability and risk allocation mechanisms connect to financial contracts and M&A; the contractual mechanisms proposed could be relevant to how alignment risk is priced"]
+extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content

@@ -7,9 +7,12 @@ date: 2026-02-01
domain: ai-alignment
secondary_domains: []
format: paper
-status: unprocessed
+status: processed
+processed_by: theseus
+processed_date: 2026-04-14
priority: medium
tags: [evaluation-gap, governance, international-coordination, AI-Safety-Report, evidence-dilemma, voluntary-commitments, situational-awareness]
+extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content
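
Review note: the three source-file hunks above apply the same frontmatter change: `status: unprocessed` becomes `status: processed`, `processed_by` and `processed_date` are added directly after it, and `extraction_model` is appended as the last frontmatter field. The pipeline code that makes this change is not part of this diff, so the following is only a minimal sketch of an equivalent transformation. The function name `mark_processed` and its parameters are hypothetical, and it assumes each source file opens with a `---` delimited frontmatter block.

```python
def mark_processed(text: str, agent: str, date: str, model: str) -> str:
    """Hypothetical helper: flip a source file's frontmatter to 'processed'.

    Mirrors the field order seen in the hunks above. Assumes the file starts
    with a '---' line and that the frontmatter is closed by another '---'.
    """
    lines = text.split("\n")
    if not lines or lines[0] != "---":
        raise ValueError("expected frontmatter starting with '---'")
    try:
        end = lines.index("---", 1)  # index of the closing delimiter
    except ValueError:
        raise ValueError("unterminated frontmatter") from None

    front = []
    changed = False
    for line in lines[1:end]:
        if line.strip() == "status: unprocessed":
            # Replace the status line and add the two processing fields after it.
            front += [
                "status: processed",
                f"processed_by: {agent}",
                f"processed_date: {date}",
            ]
            changed = True
        else:
            front.append(line)

    if not changed:
        # No unprocessed status found: leave the file untouched (idempotent).
        return text

    # extraction_model goes last, just before the closing '---'.
    front.append(f'extraction_model: "{model}"')
    return "\n".join(["---", *front, "---", *lines[end + 1:]])
```

Returning the text unchanged when no `status: unprocessed` line is present keeps a second run from stacking duplicate `processed_by` or `extraction_model` fields, which seems like the safer default for a pipeline that may retry.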