Pre-registration and selective disclosure for AI evaluations.
An external reviewer — a peer reviewer, a second-party auditor, a journalist with a tip, a future reader in 2030 — will want byte-identical evidence of what your evaluation was set up to measure, what came out, and that the design existed before the data did. Satsignal anchors that evidence to a public chain, so the reviewer can verify it hasn’t been edited — without trusting your dashboard, your collaborator, or us.
Selection, exposure, and the long tail of unpublished runs.
Every published evaluation result has to defend against three structural critiques: you tweaked the design after seeing the data, you can’t share the underlying transcripts without leaking your test set, and you only published the runs that looked good. Each maps to a primitive that’s live on the API today; each verifies independently in any browser against any public block explorer.
Pre-register the design before the run
Hash the rubric, prompt, decoding parameters, model config, scoring function, and test-set identifier. Anchor the snapshot with category policy_snapshot before any data is generated. The chain timestamp proves the design existed before the run, so a later reviewer can rule out post-hoc tweaking of the committed parts.
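What the snapshot commits to fits in a few lines of stdlib Python. A minimal sketch, assuming plain sha256 over raw file bytes and over compact key-sorted JSON for structured components; the shipped policy_snapshot.py helper is the reference implementation, and its canonicalization rules govern.

import hashlib, json

def hash_file(path):
    # Assumption: file components are hashed as raw bytes.
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def hash_json(obj):
    # Assumption: structured components are canonicalized as compact,
    # key-sorted JSON before hashing.
    canon = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canon.encode()).hexdigest()

snapshot = {
    "system_policy_hash": hash_file("rubric.md"),
    "user_instruction_hash": hash_file("instruction.txt"),
    "budget_limits_hash": hash_json({"max_calls": 500}),
    # decoding and model-config hashes follow the same pattern
}
print(hash_json(snapshot))  # the sha256 the pre-registration anchor commits to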
Disclose one transcript without leaking the rest
Roll all transcripts, scored rows, or per-prompt outputs into one Merkle-batched evidence_bundle of up to 10,000 items per receipt. Hand a single item to a reviewer with its inclusion path; the other 9,999 stay sealed. Use the same shape via merkle-row-sealed-v1 when individual rows are low-entropy and would otherwise be guessable from their hash.
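The reviewer-side check is small enough to sketch. A minimal example, assuming sha256 leaves and sha256 over concatenated children; the exact leaf and node encodings are defined by the MBNT spec, and the merkle-row-sealed-v1 leaf here is schematic only.

import hashlib

def sha256(b):
    return hashlib.sha256(b).digest()

def verify_inclusion(leaf_bytes, path, root):
    # path: (side, sibling_digest) pairs from the leaf up to the root.
    node = sha256(leaf_bytes)
    for side, sibling in path:
        node = sha256(sibling + node) if side == "left" else sha256(node + sibling)
    return node == root

def sealed_leaf(row_bytes, salt):
    # Schematic merkle-row-sealed-v1 leaf: a per-row salt keeps a
    # low-entropy row from being brute-forced out of its hash.
    return sha256(salt + row_bytes)

Re-deriving the root from one transcript plus its path proves inclusion; it reveals nothing about the other leaves.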
Surface the unpublished runs at the matter level
All anchors a lab makes under one matter slug are listed at GET /api/v1/matters/<slug>/anchors, soft-deletes included. A reviewer who sees five published receipts in a matter and one hundred anchors in the listing can ask the obvious follow-up question. There is an honest scoping note in the disclosures below.
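A reviewer can make that comparison mechanically. A sketch, assuming the endpoint returns JSON with an anchors array and a deleted_at marker on soft-deleted rows; the real response shape is defined by the API reference, not here.

import json, urllib.request

req = urllib.request.Request(
    "https://app.satsignal.cloud/api/v1/matters/acme-evals-2026-q2/anchors",
    headers={"Authorization": "Bearer sk_..."},
)
with urllib.request.urlopen(req) as resp:
    anchors = json.load(resp)["anchors"]  # field name assumed

deleted = [a for a in anchors if a.get("deleted_at")]  # field name assumed
print(len(anchors), "anchors listed,", len(deleted), "soft-deleted")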
What an outside reader of a 2030 paper actually needs.
Reviewers and replicators arrive at different times, with different access. Two properties matter across all three primitives above — one about who can verify, one about when.
Who can verify: anyone. The .mbnt bundle plus a public block explorer is enough; the in-browser verifier at proof.satsignal.cloud works as a convenience but is not load-bearing. The verification recipe is documented in the public spec and is reproducible in any language; the cold-start auditor walked the protocol from spec alone, in Go, without our helpers, at the end of April.
When: at any time, including after us. Verification needs the .mbnt bundle, the original payload (or the leaf to be verified), the chain transaction, and the MBNT format spec. The bundle is small (a few KB even at the 10,000-leaf limit), verifies locally without any Satsignal service in the loop, and is well-suited to journal supplementary materials, Zenodo, OSF, or arXiv attachments. The MBNT wire format is published at /spec-mbnt; a verifier could be reimplemented from the spec long after this site has been turned off, against any public BSV block explorer that survives.
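To make that concrete, the spec-only check has this shape: re-hash the payload locally, then confirm the digest appears in the anchoring transaction. A sketch only; the explorer URL is a placeholder for any public BSV explorer that serves raw transaction hex, and the authoritative recipe, including the Merkle walk for batched leaves, is in /spec-mbnt.

import hashlib, urllib.request

def local_digest(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def digest_on_chain(txid, digest_hex):
    # Placeholder endpoint: substitute any explorer's raw-tx route.
    url = f"https://your-explorer.example/tx/{txid}/hex"
    raw_hex = urllib.request.urlopen(url).read().decode()
    return digest_hex in raw_hex  # the committed hash sits in OP_RETURN data

txid = "..."  # read from the .mbnt bundle
print(digest_on_chain(txid, local_digest("preregister.json")))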
Pre-register an evaluation, then anchor the results.
The opening move: hash the components that define the evaluation, build a snapshot, and anchor its sha256 with category: "policy_snapshot" before you run anything. Then run the eval, batch the per-prompt results, and anchor the manifest. Two anchors, one second apart on chain, with the design provably preceding the data. The policy_snapshot.py helper is stdlib-only; no SDK to install.
curl -O https://satsignal.cloud/policy_snapshot.py
# Step 1. Pre-registration. Hash the five components that pin down
# the evaluation: rubric, instruction template, decoding/tools,
# budget caps, model config (incl. test-set hash if applicable).
RUB=$(python3 policy_snapshot.py hash-component --file rubric.md | jq -r .sha256_hex)
INS=$(python3 policy_snapshot.py hash-component --file instruction.txt | jq -r .sha256_hex)
DEC=$(python3 policy_snapshot.py hash-component --json-file decoding.json | jq -r .sha256_hex)
BUD=$(python3 policy_snapshot.py hash-component --json-string '{"max_calls":500}' | jq -r .sha256_hex)
MOD=$(python3 policy_snapshot.py hash-component --json-file model_cfg.json | jq -r .sha256_hex)
python3 policy_snapshot.py build \
  --agent-name eval-2026-q2 \
  --agent-version v1 \
  --system-policy-hash $RUB \
  --user-instruction-hash $INS \
  --tool-permissions-hash $DEC \
  --budget-limits-hash $BUD \
  --model-config-hash $MOD \
  --out preregister.json
# Anchor the design BEFORE running anything. Use a stable matter
# slug for the project so all sibling anchors list together.
SHA=$(jq -r .anchor.sha256_hex preregister.json)
SIZE=$(jq -r .anchor.file_size preregister.json)
curl -H "Authorization: Bearer sk_..." \
  -H "Content-Type: application/json" \
  -d "{\"matter_slug\":\"acme-evals-2026-q2\",\"sha256_hex\":\"$SHA\", \
      \"file_size\":$SIZE,\"category\":\"policy_snapshot\", \
      \"label\":\"pre-registration $(date -u +%FT%TZ)\"}" \
  https://app.satsignal.cloud/api/v1/anchors
# Step 2. Run the eval. Score each row. Build a manifest of the
# per-prompt outputs (or scored transcripts) and anchor the root.
# See /uses.html#manifest for the manifest body shape.
# Step 3. Reviewer side, later: verify any one component without
# seeing the others. Hand the reviewer rubric.md plus
# preregister.json; the rest of the design stays sealed.
python3 policy_snapshot.py verify \
  --snapshot preregister.json \
  --system-policy-file rubric.md
# {"verified": true, "matched": ["system_policy_hash"]}
Honest limits of what the chain anchor proves.
Satsignal is not a benchmark, a peer-review service, or a safety-institute endorsement. Specifically:
- Pre-registration via a single matter is defeatable. A lab can pre-register a hundred candidate designs across a hundred separate matter slugs and only publish the matching one; the matter-level listing endpoint above surfaces siblings within a matter, not across matters. Defending against cross-matter selection requires a community norm — for example, publishing the project’s matter slug at design time — not a cryptographic primitive. We don’t have a fix for this and won’t pretend to.
- An anchor proves a design existed at a moment, not that it was followed. The chain timestamp confirms the snapshot’s sha256 was committed before the run; it does not confirm the agent ran under that policy, that the test set was untouched, or that the scoring code matched the rubric. Those need their own evidence.
- Selective disclosure does not validate the held-back rows. A reviewer who sees one transcript verified against the manifest root learns nothing about the other 9,999 transcripts — including whether they exist, whether they were scored consistently, or whether some were dropped pre-anchor. Replication remains the authority.
- Satsignal is not endorsed by, affiliated with, or recognized under any public AI safety institute, standards body, or government evaluation programme. This page describes a workflow that the cryptographic primitives support; it makes no claim that any specific institution accepts the resulting receipts.
- The receipt is not the artifact. The .mbnt bundle and the chain transaction prove a hash existed at a moment. The artifact — the rubric, the transcripts, the model config — is yours to archive (Zenodo, OSF, arXiv supplementary, your institutional repository). A 2030 reader needs both.
What Satsignal supplies is one verifiable property in your stack: a third party can re-hash the payload, walk the Merkle path if needed, and check the on-chain transaction in any block explorer, without trusting Satsignal, your platform, or your collaborator. That property is useful in methods sections, supplementary materials, replication protocols, and external audit packets. It is not a substitute for the rest of the evidence: the archived artifacts, proof the policy was followed, or replication.