Demo

A real, on-chain agent evaluation.

Before the eval ran, the grader’s policy was anchored. After, a manifest of the verdicts was anchored. Both receipts are co-attested by Bitcoin SV miners. Download either, drop it into /verify, and confirm the chain order in your browser.

API examples below are against https://app.satsignal.cloud; we elide the host in body copy.

01 The story

A grader agent scored five student answers.

The grader’s rubric, instruction, tool permissions, budget, and model config were hashed and anchored as a policy_snapshot receipt before any grading began. Then five short math answers were scored, and the five verdicts were Merkle-batched into a single evidence_bundle receipt. The miner-accepted timestamps are one second apart (17:45:23 and 17:45:24 UTC), so the policy provably predates the verdicts.

Why this matters. A reviewer holding either receipt can prove, without trusting Satsignal:

the grader was running under THIS rubric / instruction / tools / budget / model config when it produced the verdicts;
any single verdict belongs to the same batch as the others, and matches the on-chain Merkle root;
neither receipt has been edited or back-dated, because both are signed by an independent miner and confirmed in a block.
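The ordering claim reduces to a timestamp comparison. As a minimal sketch, the snippet below parses the two miner-accepted times shown on the receipts on this page and checks that the policy came first. The variable names and the string format are illustrative assumptions, not part of any Satsignal tooling.

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%d %H:%M:%S"


def parse_utc(text: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS' string as a UTC timestamp."""
    return datetime.strptime(text, FMT).replace(tzinfo=timezone.utc)


# Miner-accepted times copied from the two receipts on this page.
policy_accepted = parse_utc("2026-05-07 17:45:23")
manifest_accepted = parse_utc("2026-05-07 17:45:24")

gap = manifest_accepted - policy_accepted
policy_first = policy_accepted < manifest_accepted
print(f"policy first: {policy_first}, gap: {gap.total_seconds():.0f}s")
# prints "policy first: True, gap: 1s"
```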

02 Step 1 — Policy snapshot

The grader’s policy, hashed and anchored.

Five components were hashed independently, then assembled into a snapshot. The snapshot itself was canonicalized + sha256’d; that hash was POSTed to /api/v1/anchors with category: "policy_snapshot".

system_policy_hash	b47192dd…2346cdd4	sha256 of the rubric text
user_instruction_hash	2b3d30bb…79c5bc99	“Grade the following five student submissions…”
tool_permissions_hash	4f53cda1…1202b945	empty list (no tools)
budget_limits_hash	0f89a3ee…b6b64e52	{max_items: 5, max_seconds: 60}
model_config_hash	25591aba…cc79c21f	claude-opus-4-7, temperature 0.0
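The hash-then-assemble step can be sketched in a few lines of stdlib Python. This is a sketch under stated assumptions, not the code that produced the receipt: the canonicalization rule (sorted keys, compact separators), hashing strings as raw UTF-8, the `model` and `temperature` field names, and the `sha256_hex` request field are all guesses made for illustration. The rubric and instruction texts are placeholders, so the digests will not match the ones listed above.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Deterministic JSON: sorted keys, no extra whitespace.
    ASSUMPTION: the real canonicalization rules may differ."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def hash_component(value) -> str:
    """Hash text as raw UTF-8 and structured values as canonical JSON (assumption)."""
    if isinstance(value, str):
        return sha256_hex(value.encode("utf-8"))
    return sha256_hex(canonical_bytes(value))


# Placeholder component values; only the budget and model settings come from this page.
components = {
    "system_policy": "PLACEHOLDER rubric text",
    "user_instruction": "PLACEHOLDER instruction text",
    "tool_permissions": [],
    "budget_limits": {"max_items": 5, "max_seconds": 60},
    "model_config": {"model": "claude-opus-4-7", "temperature": 0.0},
}

# Each component is hashed on its own, then the five hashes form the snapshot.
snapshot = {f"{name}_hash": hash_component(value) for name, value in components.items()}

# The snapshot is canonicalized and hashed; this digest is what gets anchored.
anchor_sha256 = sha256_hex(canonical_bytes(snapshot))

# Hypothetical request body for POST /api/v1/anchors (field name is an assumption).
request_body = {"category": "policy_snapshot", "sha256_hex": anchor_sha256}
print(json.dumps(request_body, indent=2))
```

Because only hashes enter the snapshot, the anchored digest commits to every component without publishing any of them.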

BSV mainnet · policy_snapshot · Co-attested by gorillapool

Policy snapshot receipt

Snapshot taken: 2026-05-07 17:45:22 UTC
Anchor SHA-256: 5a7b55c6…68bc98ef
Bundle ID: a723cec3ac9a44b1
Transaction: 41003a19…071817a4
Miner accepted: 2026-05-07 17:45:23 UTC (gorillapool, SEEN_ON_NETWORK)

Download policy bundle (.mbnt) · View tx on WhatsOnChain

03 Step 2 — Eval run

Five short math answers, graded deterministically.

The grader scored five student answers against the rubric. The grading function is hardcoded so the demo is hermetic and reproducible — replace it with a real LLM call to adapt to your own eval.

Item	Expected	Student	Verdict	Score
`Q1`	12	12	correct	1
`Q2`	24	21	incorrect	0
`Q3`	7	7	correct	1
`Q4`	84	12 * 7 = 84	correct	1
`Q5`	10	9	incorrect	0

Aggregate: 3/5 correct.
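A hardcoded grader that reproduces the table above might look like the following. The matching rule (compare the last integer in the student's answer with the expected value) is an assumption chosen so that the Q4 answer "12 * 7 = 84" counts as correct; the demo's actual grading function may use a different rule.

```python
import re

# (item, expected, student answer) rows copied from the table above.
ITEMS = [
    ("Q1", 12, "12"),
    ("Q2", 24, "21"),
    ("Q3", 7, "7"),
    ("Q4", 84, "12 * 7 = 84"),
    ("Q5", 10, "9"),
]


def grade(expected: int, student: str) -> dict:
    """Score 1 when the last integer in the answer equals the expected value.
    ASSUMPTION: this rule is illustrative, not the demo's real rubric."""
    numbers = re.findall(r"-?\d+", student)
    ok = bool(numbers) and int(numbers[-1]) == expected
    return {"verdict": "correct" if ok else "incorrect", "score": 1 if ok else 0}


verdicts = [
    {"item": item, "expected": expected, "student": student, **grade(expected, student)}
    for item, expected, student in ITEMS
]
total = sum(v["score"] for v in verdicts)
print(f"Aggregate: {total}/{len(verdicts)} correct.")
# prints "Aggregate: 3/5 correct."
```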

04 Step 3 — Result manifest

Five verdicts, one on-chain anchor.

Each verdict was canonicalized + sha256’d. Those five hashes (with their item labels) were sent to the API; the server then hashed each {label, sha256_hex} pair into a Merkle leaf, combined the leaves into a tree, and anchored the root as a single evidence_bundle receipt. Any one verdict can later be revealed (with its inclusion path) without revealing the other four.

Item	Verdict SHA-256 (sha256 of the canonical verdict JSON)
`Q1`	`10343a87…aa921669`
`Q2`	`f68246b5…4b56894c`
`Q3`	`fbdb4ed5…53f8620c`
`Q4`	`e864923c…fb276519`
`Q5`	`dbdd7436…d10593ce`
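The batching and selective-reveal idea can be sketched as below. Several details are assumptions made for illustration: the leaf encoding (canonical JSON of the `{label, sha256_hex}` pair), the pairing rule for odd levels (duplicate the last node), and the internal-node hash (SHA-256 over the two concatenated child digests). The server's actual tree construction may differ. The verdict hashes here are placeholders, because the page shows only truncated digests.

```python
import hashlib
import json


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(label: str, sha256_hex: str) -> bytes:
    """Hash one {label, sha256_hex} pair into a leaf (encoding is an assumption)."""
    pair = {"label": label, "sha256_hex": sha256_hex}
    return sha256(json.dumps(pair, sort_keys=True, separators=(",", ":")).encode("utf-8"))


def build_levels(leaves):
    """Return every tree level, leaves first and the single root last."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        level = levels[-1]
        if len(level) % 2:  # ASSUMPTION: duplicate the last node on odd levels
            level = level + [level[-1]]
        levels.append([sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)])
    return levels


def inclusion_path(levels, index):
    """Collect (sibling, sibling_is_right) pairs from the leaf up to the root."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling > index))
        index //= 2
    return path


def root_from_path(leaf, path):
    """Recompute the root from one leaf and its inclusion path."""
    node = leaf
    for sibling, sibling_is_right in path:
        node = sha256(node + sibling) if sibling_is_right else sha256(sibling + node)
    return node


labels = ["Q1", "Q2", "Q3", "Q4", "Q5"]
# Placeholder verdict digests, not the ones listed in the table above.
verdict_hashes = {
    q: hashlib.sha256(f"placeholder verdict {q}".encode("utf-8")).hexdigest()
    for q in labels
}
leaves = [leaf_hash(q, verdict_hashes[q]) for q in labels]
levels = build_levels(leaves)
root = levels[-1][0]

# Reveal only Q4: its leaf plus three sibling digests is enough to reach the root.
path_q4 = inclusion_path(levels, labels.index("Q4"))
print("Q4 proves inclusion:", root_from_path(leaves[3], path_q4) == root)
```

The inclusion path contains only sibling digests, so revealing Q4 discloses nothing about the content of the other four verdicts.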

BSV mainnet · evidence_bundle · Co-attested by gorillapool · 5-leaf manifest

Result manifest receipt

Anchored: 2026-05-07 17:45:24 UTC
Merkle root: 84edc887…1e12b4f3
Bundle ID: 2985f00ae0da4ac9
Transaction: c18b671f…74399c53
Leaf count: 5
Miner accepted: 2026-05-07 17:45:24 UTC (gorillapool, SEEN_ON_NETWORK)

Download manifest bundle (.mbnt) · View tx on WhatsOnChain

05 Step 4 — Verify

Confirm the chain order in your browser.

Drop either bundle into the verifier. The three pills populate from the receipt + a public block-explorer lookup — no Satsignal API call at verify time.

Committed

Miner-signed acceptance.

The receipt’s acceptance block carries gorillapool’s signed accept-time and status. The pill reads “by gorillapool at <timestamp>”.

Confirmed

WhatsOnChain says so.

The verifier fetches the tx from a public explorer and compares the OP_RETURN payload against the canonical doc hash. The pill reads “On chain in tx <prefix>…<suffix>”.

Verified

Manifest reconstruction.

For the manifest receipt, the verifier walks every leaf hash, reproduces the Merkle root, and compares it to subject.root. For the policy snapshot, drop in the snapshot JSON to confirm that its sha256 matches the anchor.
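Both checks are small enough to sketch offline. The receipt layout below is hypothetical: only `subject.root` is named on this page, while the `leaves` key, the hex encoding, and the odd-level pairing rule (duplicate the last node) are assumptions. The real .mbnt format and the verifier's tree construction may differ.

```python
import hashlib


def merkle_root(leaves):
    """Fold leaf digests into a root (pairing rule is an assumption)."""
    if not leaves:
        raise ValueError("a manifest needs at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def verify_manifest(receipt: dict) -> bool:
    """Recompute the root from every leaf and compare it to subject.root."""
    leaves = [bytes.fromhex(h) for h in receipt["leaves"]]
    return merkle_root(leaves).hex() == receipt["subject"]["root"]


def verify_policy(snapshot_json: bytes, anchor_sha256: str) -> bool:
    """Check that the dropped-in snapshot file hashes to the anchored digest."""
    return hashlib.sha256(snapshot_json).hexdigest() == anchor_sha256


# Hypothetical five-leaf receipt built from placeholder digests.
leaf_hexes = [hashlib.sha256(f"leaf {i}".encode("utf-8")).hexdigest() for i in range(5)]
receipt = {
    "leaves": leaf_hexes,
    "subject": {"root": merkle_root([bytes.fromhex(h) for h in leaf_hexes]).hex()},
}
print("manifest verifies:", verify_manifest(receipt))

# Editing any single leaf changes the recomputed root, so verification fails.
tampered = {**receipt, "leaves": ["00" * 32] + leaf_hexes[1:]}
print("tampered verifies:", verify_manifest(tampered))
```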

Open the verifier → then drop policy.mbnt or manifest.mbnt

06 Next

Build this in your own integration.

The two helpers below produced the receipts on this page. Plain Python, stdlib only, no Satsignal SDK. Drop them into an agent runtime, a CI step, or a one-off shell.

example_agent_snapshot.py

A ~30-line agent that hashes its five policy components, optionally POSTs the snapshot to /api/v1/anchors when SATSIGNAL_API_KEY is set, then takes a deterministic action. Replace the decide() stub with a real agent loop and you have the same policy-anchored receipt this demo opens with.

Source →
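The shape of such an agent can be sketched as follows. This is not the linked source file. The request body field `sha256_hex`, the Bearer authorization header, and the JSON response handling are assumptions about the API, and `run_agent`, `maybe_anchor`, and `build_anchor_request` are hypothetical names. Only the endpoint path, the category value, the SATSIGNAL_API_KEY variable, and the decide() stub come from this page.

```python
import hashlib
import json
import os
import urllib.request

API_BASE = "https://app.satsignal.cloud"  # host named at the top of this page
ANCHORS_PATH = "/api/v1/anchors"


def build_anchor_request(digest_hex: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the anchor request.
    ASSUMPTION: the body field name and Bearer auth are illustrative guesses."""
    body = json.dumps({"category": "policy_snapshot", "sha256_hex": digest_hex})
    return urllib.request.Request(
        API_BASE + ANCHORS_PATH,
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def maybe_anchor(digest_hex: str):
    """POST the digest only when SATSIGNAL_API_KEY is set; otherwise stay offline."""
    api_key = os.environ.get("SATSIGNAL_API_KEY")
    if not api_key:
        return None
    request = build_anchor_request(digest_hex, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def decide(task: str) -> str:
    """Deterministic stub; replace with a real agent loop."""
    return f"echo:{task}"


def run_agent(task: str, snapshot: dict) -> dict:
    """Hash the policy snapshot, optionally anchor it, then act."""
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest_hex = hashlib.sha256(canonical).hexdigest()
    receipt = maybe_anchor(digest_hex)
    return {"policy_sha256": digest_hex, "receipt": receipt, "action": decide(task)}
```

Calling `run_agent("grade Q1", snapshot)` with SATSIGNAL_API_KEY unset skips the network call and returns `receipt: None`, which keeps the agent usable in offline tests.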

policy_snapshot.py

Stdlib helper for the policy_snapshot primitive, with CLI subcommands hash-component, build, and verify. Selective-disclosure verification (one component at a time) works out of the box.

Source → · Quickstart →
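Selective disclosure follows from the snapshot holding only hashes: a reviewer who is shown one component can check it against the snapshot without seeing the other four. The sketch below illustrates the idea only. It is not the helper's API; `hash_component` and `verify_component` are hypothetical function names, the hashing convention (raw UTF-8 for text, canonical JSON otherwise) is an assumption, and the component values are placeholders.

```python
import hashlib
import json


def hash_component(value) -> str:
    """Hash text as raw UTF-8 and structured values as canonical JSON (assumption)."""
    if isinstance(value, str):
        data = value.encode("utf-8")
    else:
        data = json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def verify_component(snapshot: dict, name: str, revealed_value) -> bool:
    """Check one revealed component against its hash in the snapshot."""
    expected = snapshot.get(f"{name}_hash")
    return expected is not None and expected == hash_component(revealed_value)


# The snapshot holder publishes only the five hashes (placeholder inputs here).
snapshot = {
    "system_policy_hash": hash_component("PLACEHOLDER rubric text"),
    "user_instruction_hash": hash_component("PLACEHOLDER instruction text"),
    "tool_permissions_hash": hash_component([]),
    "budget_limits_hash": hash_component({"max_items": 5, "max_seconds": 60}),
    "model_config_hash": hash_component({"model": "claude-opus-4-7", "temperature": 0.0}),
}

# Later, only the budget is revealed; the rubric and instruction stay private.
budget_ok = verify_component(snapshot, "budget_limits", {"max_seconds": 60, "max_items": 5})
budget_edited = verify_component(snapshot, "budget_limits", {"max_items": 50, "max_seconds": 60})
print("budget matches:", budget_ok, "| edited budget matches:", budget_edited)
# prints "budget matches: True | edited budget matches: False"
```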

The full helper catalog — including commit_reveal.py for commit-then-reveal flows and merkle_row.py for row-level disclosure — is at docs — helpers.