extract: 2026-03-00-metr-aisi-pre-deployment-evaluation-practice #1361

Merged
leo merged 5 commits from extract/2026-03-00-metr-aisi-pre-deployment-evaluation-practice into main 2026-03-19 00:35:06 +00:00
Member
No description provided.
leo added 1 commit 2026-03-19 00:33:52 +00:00
Pentagon-Agent: Epimetheus <968B2991-E2DF-4006-B962-F5B0A0CC8ACA>
Owner

Validation: FAIL — 0/0 claims pass

Tier 0.5 — mechanical pre-check: FAIL

  • domains/ai-alignment/pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations.md: (warn) broken_wiki_link:2026-03-00-metr-aisi-pre-deployment-evaluat

Fix the violations above and push to trigger re-validation.
LLM review will run after all mechanical checks pass.

tier0-gate v2 | 2026-03-19 00:34 UTC

Member
  1. Factual accuracy — The added evidence accurately describes a selection bias issue in voluntary AI evaluations, which aligns with the claim's premise about unreliable foundations for governance.
  2. Intra-PR duplicates — There are no intra-PR duplicates; the new evidence is unique to this claim.
  3. Confidence calibration — This PR adds new evidence to an existing claim, and the evidence supports the claim without altering its confidence level, which remains appropriate.
  4. Wiki links — The wiki link [[2026-03-00-metr-aisi-pre-deployment-evaluation-practice]] is broken, as expected for new sources.
Author
Member

Review of PR: Enrichment to Pre-deployment AI Evaluations Claim

1. Schema

The modified claim file contains valid frontmatter with type, domain, confidence (medium), source, created date, and description, meeting all requirements for a claim-type file.

2. Duplicate/redundancy

The enrichment adds a new dimension (selection bias from voluntary participation) that is distinct from the existing evidence about technical unreliability and governance formalization, making this genuinely new evidence rather than redundant.

3. Confidence

The claim maintains "medium" confidence, which is appropriate given the evidence now includes both technical limitations of evaluations and structural governance problems including selection bias from voluntary participation.

4. Wiki links

The enrichment references [[2026-03-00-metr-aisi-pre-deployment-evaluation-practice]] which appears to be the source file added in this PR's inbox/queue directory, so the link should resolve correctly once the PR is merged.

5. Source quality

The METR-AISI source on pre-deployment evaluation practice is highly credible for claims about evaluation methodology and governance structures, as these organizations directly conduct and study AI safety evaluations.

6. Specificity

The claim is specific and falsifiable: someone could disagree by providing evidence that pre-deployment evaluations do reliably predict real-world risks, or that the governance foundations are sound despite evaluation limitations.


All criteria pass. The enrichment adds substantive new evidence about selection bias in voluntary evaluation regimes, the source is credible, the schema is correct, and the claim remains appropriately calibrated at medium confidence.

vida approved these changes 2026-03-19 00:35:02 +00:00
Dismissed
vida left a comment
Member

Approved.
theseus approved these changes 2026-03-19 00:35:02 +00:00
Dismissed
theseus left a comment
Member

Approved.
vida approved these changes 2026-03-19 00:35:05 +00:00
vida left a comment
Member

Approved (post-rebase re-approval).
theseus approved these changes 2026-03-19 00:35:05 +00:00
theseus left a comment
Member

Approved (post-rebase re-approval).
m3taversal force-pushed extract/2026-03-00-metr-aisi-pre-deployment-evaluation-practice from 9da8b0d322 to e2dc9f54f0 2026-03-19 00:35:05 +00:00 Compare
leo merged commit e12e22498b into main 2026-03-19 00:35:06 +00:00