Demo

A real, on-chain agent evaluation.

Before the eval ran, the grader’s policy was anchored. After, a manifest of the verdicts was anchored. Both receipts are co-attested by Bitcoin SV miners. Download either, drop it into /verify, and confirm the chain order in your browser.

API examples below are against https://app.satsignal.cloud; we elide the host in body copy.

01 The story

A grader agent scored five student answers.

The grader’s rubric, instruction, tool permissions, budget, and model config were hashed and anchored as a policy_snapshot receipt before any grading began. Then five short math answers were scored, and the five verdicts were Merkle-batched into a single evidence_bundle receipt. The miner-acceptance timestamps are one second apart (17:45:23 and 17:45:24 UTC), so the policy provably predates the verdicts.

Why this matters. A reviewer holding either receipt can prove, without trusting Satsignal:
  • the grader was running under THIS rubric / instruction / tools / budget / model config when it produced the verdicts;
  • any single verdict belongs to the same batch as the others, and matches the on-chain Merkle root;
  • neither receipt has been edited or back-dated, because both are signed by an independent miner and confirmed in a block.
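
The ordering claim can be checked with plain timestamp arithmetic. The sketch below hardcodes the two miner-accepted times shown on this page; a real check would read them from the downloaded receipts, whose field names are not shown here, so parse_utc and the variable names are illustrative only.

```python
from datetime import datetime, timezone


def parse_utc(stamp: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS UTC' string as printed on the receipts."""
    naive = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S UTC")
    return naive.replace(tzinfo=timezone.utc)


# Miner-accepted times copied from the two receipts on this page.
policy_accepted = parse_utc("2026-05-07 17:45:23 UTC")
manifest_accepted = parse_utc("2026-05-07 17:45:24 UTC")

gap_seconds = (manifest_accepted - policy_accepted).total_seconds()
policy_first = policy_accepted < manifest_accepted
print(f"policy first: {policy_first}, gap: {gap_seconds:.0f}s")
```

The comparison only means something because both times come from miner-signed acceptance rather than from the submitter's own clock.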

02 Step 1: Policy snapshot

The grader’s policy, hashed and anchored.

Five components were hashed independently, then assembled into a snapshot. The snapshot itself was canonicalized + sha256’d; that hash was POSTed to /api/v1/anchors with category: "policy_snapshot".

system_policy_hash
b47192dd…2346cdd4 (sha256 of the rubric text)
user_instruction_hash
2b3d30bb…79c5bc99 (“Grade the following five student submissions…”)
tool_permissions_hash
4f53cda1…1202b945 (empty list, no tools)
budget_limits_hash
0f89a3ee…b6b64e52 ({max_items: 5, max_seconds: 60})
model_config_hash
25591aba…cc79c21f (claude-opus-4-7, temperature 0.0)

BSV mainnet · policy_snapshot · Co-attested by gorillapool
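
A minimal sketch of the hash-then-assemble step, using only the standard library. The canonical form used here (sorted keys, compact separators, UTF-8) and the input values are assumptions, so the digests it produces will not match the ones listed on this page; build_snapshot and canonical_json are illustrative names, not the helper's actual API.

```python
import hashlib
import json


def canonical_json(value) -> bytes:
    """Assumed canonical form: sorted keys, compact separators, UTF-8."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_snapshot(rubric, instruction, tools, budget, model_config) -> dict:
    """Hash each of the five components independently, then assemble them."""
    return {
        "system_policy_hash": sha256_hex(rubric.encode("utf-8")),
        "user_instruction_hash": sha256_hex(instruction.encode("utf-8")),
        "tool_permissions_hash": sha256_hex(canonical_json(tools)),
        "budget_limits_hash": sha256_hex(canonical_json(budget)),
        "model_config_hash": sha256_hex(canonical_json(model_config)),
    }


# Placeholder inputs; the real rubric and instruction text are not shown here.
snapshot = build_snapshot(
    rubric="placeholder rubric text",
    instruction="placeholder grading instruction",
    tools=[],
    budget={"max_items": 5, "max_seconds": 60},
    model_config={"model": "claude-opus-4-7", "temperature": 0.0},
)

# The hash of the canonicalized snapshot is what would be anchored.
anchor_sha256 = sha256_hex(canonical_json(snapshot))
print(anchor_sha256)
```

Because only the five digests enter the snapshot, the rubric and instruction text never have to leave the machine that produced them.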

Policy snapshot receipt

Snapshot taken
2026-05-07 17:45:22 UTC
Anchor SHA-256
5a7b55c6…68bc98ef
Bundle ID
a723cec3ac9a44b1
Transaction
41003a19…071817a4
Miner accepted
2026-05-07 17:45:23 UTC (gorillapool, SEEN_ON_NETWORK)

03 Step 2: Eval run

Five short math answers, graded deterministically.

The grader scored five student answers against the rubric. The grading function is hardcoded so the demo is hermetic and reproducible — replace it with a real LLM call to adapt to your own eval.

Item  Expected  Student       Verdict    Score
Q1    12        12            correct    1
Q2    24        21            incorrect  0
Q3    7         7             correct    1
Q4    84        12 * 7 = 84   correct    1
Q5    10        9             incorrect  0

Aggregate: 3/5 correct.
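
One way a hardcoded grader like this could look. The comparison rule (take the last integer in the student's answer and compare it with the expected value) is an assumption that happens to reproduce the table above, including Q4; the demo's actual grading function is not shown on this page.

```python
import re

# (item, expected, student answer) rows from the table above.
ITEMS = [
    ("Q1", "12", "12"),
    ("Q2", "24", "21"),
    ("Q3", "7", "7"),
    ("Q4", "84", "12 * 7 = 84"),
    ("Q5", "10", "9"),
]


def final_number(answer: str):
    """Return the last integer that appears in the answer, or None."""
    numbers = re.findall(r"-?\d+", answer)
    return numbers[-1] if numbers else None


def grade(item: str, expected: str, student: str) -> dict:
    """Deterministic verdict: no model call, no randomness."""
    correct = final_number(student) == expected
    return {
        "item": item,
        "verdict": "correct" if correct else "incorrect",
        "score": int(correct),
    }


verdicts = [grade(*row) for row in ITEMS]
total = sum(v["score"] for v in verdicts)
print(f"Aggregate: {total}/{len(verdicts)} correct")
```

Swapping grade() for a real LLM call changes the verdicts but not the anchoring flow around them.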

04 Step 3: Result manifest

Five verdicts, one on-chain anchor.

Each verdict was canonicalized + sha256’d. Those five hashes (with their item labels) were sent to the API; the server then hashed each {label, sha256_hex} pair into a Merkle leaf, combined the leaves into a tree, and anchored the root as a single evidence_bundle receipt. Any one verdict can later be revealed (with its inclusion path) without revealing the other four.

Item  Verdict SHA-256 (sha256 of the canonical verdict JSON)
Q1    10343a87…aa921669
Q2    f68246b5…4b56894c
Q3    fbdb4ed5…53f8620c
Q4    e864923c…fb276519
Q5    dbdd7436…d10593ce

BSV mainnet · evidence_bundle · Co-attested by gorillapool · 5-leaf manifest
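
The batching step can be sketched as follows. The leaf encoding (sha256 of the compact, sorted-key JSON of the {label, sha256_hex} pair), the pairing rule for an odd node (paired with itself), and the verdict JSON shape are all assumptions; the server's exact scheme is not specified on this page, so the root computed here will not match the anchored one.

```python
import hashlib
import json


def canonical_json(value) -> bytes:
    """Assumed canonical form: sorted keys, compact separators, UTF-8."""
    return json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(label: str, digest_hex: str) -> bytes:
    """Assumed leaf encoding: hash of the canonical {label, sha256_hex} pair."""
    return sha256(canonical_json({"label": label, "sha256_hex": digest_hex}))


def merkle_root(leaves: list) -> bytes:
    """Hash pairs level by level; an odd node is paired with itself (assumed)."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Illustrative verdict objects; the real verdict JSON layout is not shown.
verdicts = {
    "Q1": {"verdict": "correct", "score": 1},
    "Q2": {"verdict": "incorrect", "score": 0},
    "Q3": {"verdict": "correct", "score": 1},
    "Q4": {"verdict": "correct", "score": 1},
    "Q5": {"verdict": "incorrect", "score": 0},
}

# Client side: hash each verdict. Server side: turn each pair into a leaf.
pairs = [(label, sha256(canonical_json(v)).hex()) for label, v in verdicts.items()]
leaves = [leaf_hash(label, digest) for label, digest in pairs]
root_hex = merkle_root(leaves).hex()
print(root_hex)
```

Including the label in the leaf keeps two items with identical verdict JSON (Q1 and Q3 here) from collapsing into the same leaf.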

Result manifest receipt

Anchored
2026-05-07 17:45:24 UTC
Merkle root
84edc887…1e12b4f3
Bundle ID
2985f00ae0da4ac9
Transaction
c18b671f…74399c53
Leaf count
5
Miner accepted
2026-05-07 17:45:24 UTC (gorillapool, SEEN_ON_NETWORK)

05 Step 4: Verify

Confirm the chain order in your browser.

Drop either bundle into the verifier. The three pills populate from the receipt plus a public block-explorer lookup; no Satsignal API call is made at verify time.

Committed

Miner-signed acceptance.

The receipt’s acceptance block carries gorillapool’s signed accept-time and status. The pill renders “by gorillapool at <timestamp>”.

Confirmed

WhatsOnChain says so.

The verifier fetches the tx from a public explorer and compares the OP_RETURN payload against the canonical doc hash. The pill renders “On chain in tx <prefix>…<suffix>”.
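
The payload comparison can be approximated offline once the output script has been fetched. The sketch below parses standard Bitcoin push opcodes and checks whether the expected hash appears in any push. How Satsignal lays out its OP_RETURN payload is not described on this page, so matching on "raw 32 bytes or ASCII hex somewhere in a push" is an assumption, and the script in the example is synthetic.

```python
import hashlib


def script_pushes(script: bytes) -> list:
    """Return the data pushes in a Bitcoin script, skipping other opcodes."""
    pushes, i = [], 0
    while i < len(script):
        op = script[i]
        i += 1
        if 0x01 <= op <= 0x4B:          # direct push of `op` bytes
            size = op
        elif op == 0x4C:                # OP_PUSHDATA1
            size = script[i]
            i += 1
        elif op == 0x4D:                # OP_PUSHDATA2 (little-endian length)
            size = int.from_bytes(script[i:i + 2], "little")
            i += 2
        elif op == 0x4E:                # OP_PUSHDATA4 (little-endian length)
            size = int.from_bytes(script[i:i + 4], "little")
            i += 4
        else:                           # OP_FALSE (0x00), OP_RETURN (0x6a), ...
            continue
        if i + size > len(script):
            raise ValueError("truncated push")
        pushes.append(script[i:i + size])
        i += size
    return pushes


def payload_commits_to(script_hex: str, expected_sha256_hex: str) -> bool:
    """True if an OP_RETURN script carries the hash (assumed payload layout)."""
    script = bytes.fromhex(script_hex)
    if not (script[:1] == b"\x6a" or script[:2] == b"\x00\x6a"):
        return False
    raw = bytes.fromhex(expected_sha256_hex)
    as_text = expected_sha256_hex.encode("ascii")
    return any(raw in push or as_text in push for push in script_pushes(script))


# Synthetic example: OP_FALSE OP_RETURN <"TEST"> <32-byte hash>.
doc_hash = hashlib.sha256(b"example canonical doc").hexdigest()
script_hex = "006a" + "04" + b"TEST".hex() + "20" + doc_hash
print(payload_commits_to(script_hex, doc_hash))
```

Fetching the transaction itself is left out on purpose: any public explorer can supply the script hex, and the check above does not depend on which one is used.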

Verified

Manifest reconstruction.

For the manifest receipt, the verifier walks every leaf hash and reproduces the Merkle root, then compares to subject.root. For the policy snapshot, drop the snapshot JSON to confirm its sha256 matches the anchor.
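
A sketch of root reconstruction and single-leaf inclusion checking. It assumes sha256 over concatenated child hashes and duplicates the last node on odd levels; the verifier's actual tree rules and the receipt's field layout are not shown on this page, so the leaves here are placeholders and every function name is illustrative.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def padded(level: list) -> list:
    """Duplicate the last node when a level has an odd count (assumed rule)."""
    return level + [level[-1]] if len(level) % 2 else level


def build_levels(leaves: list) -> list:
    """Return every tree level, leaves first, root level last."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        level = padded(levels[-1])
        levels.append([sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)])
    return levels


def inclusion_path(leaves: list, index: int) -> list:
    """Sibling hashes (with their side) from the leaf up to the root."""
    path = []
    for level in build_levels(leaves)[:-1]:
        level = padded(level)
        sibling = index ^ 1
        path.append((level[sibling], "left" if sibling < index else "right"))
        index //= 2
    return path


def verify_inclusion(leaf: bytes, path: list, root: bytes) -> bool:
    """Recompute the root from one leaf and its path, then compare."""
    node = leaf
    for sibling, side in path:
        node = sha256(sibling + node) if side == "left" else sha256(node + sibling)
    return node == root


# Placeholder leaves standing in for the five manifest leaves.
leaves = [sha256(f"Q{n}".encode("utf-8")) for n in range(1, 6)]
root = build_levels(leaves)[-1][0]          # full reconstruction from all leaves
path_q5 = inclusion_path(leaves, 4)         # reveal only Q5
print(verify_inclusion(leaves[4], path_q5, root))
```

Full reconstruction needs all five leaf hashes, while the inclusion path needs only the siblings along one branch, which is what lets a single verdict be revealed without the other four.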

06 Next

Build this in your own integration.

The two helpers below produced the receipts on this page. Plain Python, stdlib only, no Satsignal SDK. Drop them into an agent runtime, a CI step, or a one-off shell.

example_agent_snapshot.py

A ~30-line agent that hashes its five policy components, optionally POSTs the snapshot to /api/v1/anchors when SATSIGNAL_API_KEY is set, then takes a deterministic action. Replace the decide() stub with a real agent loop and you have the same policy-anchored receipt this demo opens with.
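
The optional-anchoring pattern described above might look like the sketch below. Only the host, the /api/v1/anchors path, the category value, and the SATSIGNAL_API_KEY variable come from this page; the request body field name (sha256_hex), the bearer-token header, and the JSON response are assumptions, and this is not the contents of example_agent_snapshot.py.

```python
import hashlib
import json
import os
import urllib.request

API_URL = "https://app.satsignal.cloud/api/v1/anchors"


def build_anchor_request(snapshot_sha256_hex: str, api_key: str) -> urllib.request.Request:
    """Build the POST. Body field names and the auth header are assumptions."""
    body = json.dumps(
        {"sha256_hex": snapshot_sha256_hex, "category": "policy_snapshot"}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )


def maybe_anchor(snapshot_sha256_hex: str, env=os.environ):
    """Anchor only when SATSIGNAL_API_KEY is set; otherwise stay offline."""
    api_key = env.get("SATSIGNAL_API_KEY")
    if not api_key:
        return None
    request = build_anchor_request(snapshot_sha256_hex, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)  # assumes a JSON response body


def decide(observation: str) -> str:
    """Deterministic stub; replace with a real agent loop."""
    return "noop"


# Placeholder snapshot hash; a real run would hash the canonical snapshot.
snapshot_hash = hashlib.sha256(b"placeholder canonical snapshot").hexdigest()
receipt = maybe_anchor(snapshot_hash, env={})  # empty env keeps this offline
action = decide("example observation")
print(receipt, action)
```

Anchoring before decide() runs is the point of the pattern: the policy hash is committed before the agent takes any action.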

policy_snapshot.py

Stdlib helper for the policy_snapshot primitive. CLI subcommands: hash-component, build, and verify. Selective-disclosure verification (one component at a time) is supported.
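
Selective disclosure follows from the snapshot holding only per-component hashes: revealing one component lets a reviewer check it without seeing the rest. The sketch below illustrates the idea with a hypothetical verify_component function and placeholder values; it is not the interface of policy_snapshot.py, and hashing the raw UTF-8 bytes of a component is an assumption.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_component(snapshot: dict, name: str, revealed: bytes) -> bool:
    """Check one revealed component against its hash in the snapshot."""
    expected = snapshot.get(name)
    return expected is not None and expected == sha256_hex(revealed)


# Placeholder snapshot: the reviewer sees five hashes, not the contents.
rubric_text = b"placeholder rubric text"
snapshot = {
    "system_policy_hash": sha256_hex(rubric_text),
    "user_instruction_hash": sha256_hex(b"undisclosed instruction"),
    "tool_permissions_hash": sha256_hex(b"[]"),
    "budget_limits_hash": sha256_hex(b"undisclosed budget"),
    "model_config_hash": sha256_hex(b"undisclosed model config"),
}

# Only the rubric is revealed; the other four components stay private.
print(verify_component(snapshot, "system_policy_hash", rubric_text))
print(verify_component(snapshot, "system_policy_hash", b"edited rubric"))
```

The reviewer would still need to confirm that the snapshot as a whole hashes to the anchored value before trusting any single component check.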

The full helper catalog, including commit_reveal.py for commit-then-reveal flows and merkle_row.py for row-level disclosure, is in the docs under helpers.