From 7335353af40e6f7932f80242039aebc1c91775af Mon Sep 17 00:00:00 2001
From: Teleo Agents <agents@livingip.xyz>
Date: Sat, 4 Apr 2026 13:40:19 +0000
Subject: [PATCH] =?UTF-8?q?source:=202026-01-17-charnock-external-access-d?=
 =?UTF-8?q?angerous-capability-evals.md=20=E2=86=92=20processed?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Pentagon-Agent: Epimetheus <PIPELINE>
---
 ...ernal-access-dangerous-capability-evals.md |  5 +-
 ...ernal-access-dangerous-capability-evals.md | 55 -------------------
 2 files changed, 4 insertions(+), 56 deletions(-)
 delete mode 100644 inbox/queue/2026-01-17-charnock-external-access-dangerous-capability-evals.md
diff --git a/inbox/archive/ai-alignment/2026-01-17-charnock-external-access-dangerous-capability-evals.md b/inbox/archive/ai-alignment/2026-01-17-charnock-external-access-dangerous-capability-evals.md
index 947c933a..ca662291 100644
--- a/inbox/archive/ai-alignment/2026-01-17-charnock-external-access-dangerous-capability-evals.md
+++ b/inbox/archive/ai-alignment/2026-01-17-charnock-external-access-dangerous-capability-evals.md
@@ -7,9 +7,12 @@ date: 2026-01-17
 domain: ai-alignment
 secondary_domains: []
 format: paper
-status: unprocessed
+status: processed
+processed_by: theseus
+processed_date: 2026-04-04
 priority: high
 tags: [external-evaluation, access-framework, dangerous-capabilities, EU-Code-of-Practice, evaluation-independence, translation-gap, governance-bridge, AL1-AL2-AL3]
+extraction_model: "anthropic/claude-sonnet-4.5"
 ---
 
 ## Content
diff --git a/inbox/queue/2026-01-17-charnock-external-access-dangerous-capability-evals.md b/inbox/queue/2026-01-17-charnock-external-access-dangerous-capability-evals.md
deleted file mode 100644
index 947c933a..00000000
--- a/inbox/queue/2026-01-17-charnock-external-access-dangerous-capability-evals.md
+++ /dev/null
@@ -1,55 +0,0 @@
----
-type: source
-title: "Expanding External Access to Frontier AI Models for Dangerous Capability Evaluations"
-author: "Jacob Charnock, Alejandro Tlaie, Kyle O'Brien, Stephen Casper, Aidan Homewood"
-url: https://arxiv.org/abs/2601.11916
-date: 2026-01-17
-domain: ai-alignment
-secondary_domains: []
-format: paper
-status: unprocessed
-priority: high
-tags: [external-evaluation, access-framework, dangerous-capabilities, EU-Code-of-Practice, evaluation-independence, translation-gap, governance-bridge, AL1-AL2-AL3]
----
-
-## Content
-
-This paper proposes a three-tier access framework for external evaluators conducting dangerous capability assessments of frontier AI models. It was published on January 17, 2026, runs 20 pages, and was submitted to cs.CY (Computers and Society).
-
-**Three-tier Access Level (AL) taxonomy:**
-- **AL1 (Black-box)**: Minimal model access and information — evaluator interacts via API only, no internal model information
-- **AL2 (Grey-box)**: Moderate model access and substantial information — intermediate access to model behavior, some internal information
-- **AL3 (White-box)**: Complete model access and comprehensive information — full API access, architecture information, weights, internal reasoning
-
-**Core argument**: Current limited access arrangements (predominantly AL1) may compromise evaluation quality by creating false negatives — evaluations miss dangerous capabilities because evaluators can't probe the model deeply enough. AL3 access reduces false negatives and improves stakeholder trust.
-
-**Security and capacity challenges acknowledged**: The authors propose that access risks can be mitigated through "technical means and safeguards used in other industries" (e.g., privacy-enhancing technologies from Beers & Toner; clean-room evaluation protocols).
-
-**Regulatory framing**: The paper explicitly aims to operationalize the EU GPAI Code of Practice's requirement for "appropriate access" in dangerous capability evaluations — one of the first attempts to provide technical specification for what "appropriate access" means in regulatory practice.
-
-**Authors**: Affiliation details were not confirmed from the abstract page; the paper's emphasis on EU regulatory operationalization and the involvement of Stephen Casper (AI safety researcher) suggest an alignment, safety, and governance orientation.
-
-## Agent Notes
-
-**Why this matters:** This is the clearest academic bridge-building work between research evaluations and compliance requirements I found this session. The EU Code of Practice says evaluators need "appropriate access" but doesn't define it. This paper proposes a specific technical taxonomy for what appropriate access means at different capability levels. It addresses the translation gap directly.
-
-**What surprised me:** The paper explicitly cites privacy-enhancing technologies (similar to what Beers & Toner proposed in arXiv:2502.05219, archived March 2026) as a way to enable AL3 access without IP compromise. This suggests the research community is converging on PET + white-box access as the technical solution to the independence problem.
-
-**What I expected but didn't find:** I expected more discussion of what labs have agreed to in current voluntary evaluator access arrangements (METR, AISI) — the paper seems to be proposing a framework rather than documenting what already exists. The gap between the proposed AL3 standard and current practice (AL1/AL2) isn't quantified.
-
-**KB connections:**
-- Directly extends: 2026-03-21-research-compliance-translation-gap.md (addresses Translation Gap Layer 3)
-- Connects to: arXiv:2502.05219 (Beers & Toner, PET scrutiny) — archived previously
-- Connects to: Brundage et al. AAL framework (arXiv:2601.11699) — parallel work on evaluation independence
-- Connects to: EU Code of Practice "appropriate access" requirement (new angle on Code inadequacy)
-
-**Extraction hints:**
-1. New claim candidate: "external evaluators of frontier AI currently have predominantly black-box (AL1) access, which creates systematic false negatives in dangerous capability detection"
-2. New claim candidate: "white-box (AL3) access to frontier models is technically feasible via privacy-enhancing technologies without requiring IP disclosure"
-3. The paper provides the missing technical specification for what the EU Code of Practice's "appropriate access" requirement should mean in practice — this is a claim about governance operationalization
-
-## Curator Notes
-
-PRIMARY CONNECTION: domains/ai-alignment/third-party-evaluation-infrastructure claims and translation-gap finding
-WHY ARCHIVED: First paper to propose specific technical taxonomy for what "appropriate evaluator access" means — bridges research evaluation standards and regulatory compliance language
-EXTRACTION HINT: Focus on the claim that AL1 access is currently the norm and creates false negatives; the AL3 PET solution as technically feasible is the constructive KB contribution