Attestation protocol.

Wire format, canonical JSON, Merkle construction, signature scheme, on-chain submission. Everything needed to build a compatible attestor or verifier.

1. Run lifecycle

INIT → EXECUTE → COMMIT → PROVE → SUBMIT → VERIFIED
  ↓       ↓         ↓        ↓        ↓
run.json run.json  run.json run.json run.json
+seed    +tx       +commit  +proof   +verif

2. Canonical JSON

All hashes are computed over the canonical JSON encoding of the object: UTF-8, keys sorted, no insignificant whitespace, no trailing newline. Use canonicaljson-spec.

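import hashlib
import json

def canon(obj):
    # Sorted keys, compact separators, raw UTF-8 (no \uXXXX escapes), no trailing newline.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")

digest = hashlib.sha256(canon(run)).hexdigest()  # run: the parsed run.json object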
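As a quick illustration of why canonicalization matters, the sketch below hashes two logically identical objects that were built with different key orders and contain a non-ASCII character. The sample objects and their field names are made up for illustration; only the encoding parameters come from the definition above.

```python
import hashlib
import json

def canon(obj):
    # Same parameters as the canonical encoding described above.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")

# Hypothetical objects: same content, different insertion order.
a = {"model": "example-model", "score": 912345, "note": "σ"}
b = {"note": "σ", "score": 912345, "model": "example-model"}

canon_a = canon(a)
same_digest = hashlib.sha256(canon_a).hexdigest() == hashlib.sha256(canon(b)).hexdigest()
print(canon_a.decode("utf-8"))
print(same_digest)
```

Because keys are sorted before serialization, both objects produce the same bytes and therefore the same digest.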

3. Merkle tree over transcripts

Each (prompt, response, judgement) tuple is a leaf. Leaves are hashed as sha256(0x00 || canon(tuple)); internal nodes are hashed as sha256(0x01 || left || right). A layer with an odd number of nodes duplicates its last node.

leaf_i  = sha256(0x00 || canon({ "i": i, "prompt": ..., "response": ..., "judge": ... }))
node    = sha256(0x01 || left || right)
root    = sha256(0x01 || ... top of tree ...)
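The construction above can be sketched in Python as follows. This is a minimal illustration, not the reference implementation. The function names are hypothetical, and two behaviours are assumptions the text does not settle: a single-leaf tree is treated as an odd layer (the leaf is paired with itself), and an empty transcript set is rejected.

```python
import hashlib
import json

def canon(obj):
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")

def leaf_hash(i, prompt, response, judge):
    # Leaf: sha256(0x00 || canon(tuple))
    body = canon({"i": i, "prompt": prompt, "response": response, "judge": judge})
    return hashlib.sha256(b"\x00" + body).digest()

def node_hash(left, right):
    # Internal node: sha256(0x01 || left || right)
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_root(leaves):
    if not leaves:
        raise ValueError("empty transcript set: the root is not defined here")
    layer = list(leaves)  # copy so the caller's list is not modified
    while True:
        if len(layer) % 2 == 1:
            layer.append(layer[-1])  # odd layer: duplicate the last node
        layer = [node_hash(layer[k], layer[k + 1]) for k in range(0, len(layer), 2)]
        if len(layer) == 1:
            return layer[0]

# Hypothetical transcripts, for illustration only.
leaves = [leaf_hash(i, f"prompt {i}", f"response {i}", "pass") for i in range(3)]
root = merkle_root(leaves)
print(root.hex())
```

The 0x00 and 0x01 prefixes keep leaf hashes and internal-node hashes in separate domains, so a leaf can never be reinterpreted as an internal node.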

4. Commitment

The commitment is the public input to the ZK proof. It binds the score to the dataset, the methodology, and the transcripts.

commitment = sha256(
  datasetHash ||
  methodologyHash ||
  transcriptMerkleRoot ||
  u64_be(score_fixed_point)  // score × 1e6, to avoid floats
)
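A minimal Python sketch of the commitment computation follows. It assumes the three hashes are concatenated as raw 32-byte SHA-256 digests (not hex strings), which the text does not state explicitly. The helper names are hypothetical. Since no rounding rule is given for scores with more than six decimal places, the sketch rejects them rather than guessing.

```python
import hashlib
import struct
from decimal import Decimal

def score_fixed_point(score):
    # score × 1e6 as an integer; parse from a string so no float is ever involved.
    scaled = Decimal(score) * 1_000_000
    if scaled != scaled.to_integral_value():
        raise ValueError("more than 6 decimal places; rounding rule is not specified")
    return int(scaled)

def commitment(dataset_hash, methodology_hash, transcript_root, score_fp):
    for h in (dataset_hash, methodology_hash, transcript_root):
        if len(h) != 32:
            raise ValueError("expected a raw 32-byte digest")
    # u64_be(score_fixed_point): 8 bytes, big-endian, unsigned.
    preimage = dataset_hash + methodology_hash + transcript_root + struct.pack(">Q", score_fp)
    return hashlib.sha256(preimage).digest()

# Placeholder inputs, for illustration only.
d = hashlib.sha256(b"example dataset").digest()
m = hashlib.sha256(b"example methodology").digest()
t = hashlib.sha256(b"example transcript root").digest()
c = commitment(d, m, t, score_fixed_point("0.912345"))
print(c.hex())
```

The preimage is always 104 bytes (3 × 32 + 8), so there is no ambiguity about where one field ends and the next begins.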

5. Signature

Attestors sign the commitment with Ed25519.

sig = ed25519_sign(attestor_sk, commitment)
// 64-byte signature; publish hex-encoded in run.attestorSignature

6. Proof systems

The ZK proof asserts: "given dataset D and methodology M, applying the pinned scoring function to the committed transcripts yields score S". Supported systems:

7. Aligned Layer submission

Proofs are submitted via the Aligned SDK. The batcher aggregates proofs and submits a Merkle root of the batched proofs to the ServiceManager contract on Ethereum L1.

from aligned_sdk import AlignedClient

client = AlignedClient(network="ethereum")
batch = client.submit_proof(
  proof_bytes=proof,
  public_input=commitment,
  proving_system="sp1",
  verifier_identifier="benchlist-v1"
)
# batch.id is the credential we show on the listing

8. Verification

Anyone can verify:

  1. Fetch the run from Benchlist or directly from /api/runs/:id.json
  2. Compute the commitment from datasetHash, methodologyHash, transcriptMerkleRoot, and score
  3. Query Aligned's BatchVerifier for the batch; it returns the list of verified public inputs
  4. Assert your commitment is in the list
  5. Optionally: rerun the benchmark locally, verify the Merkle root, verify the score
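Steps 2 and 4 can be sketched as below. This is an illustration under several assumptions: the run object stores the three hashes as hex strings under the names used in step 2, `score` is already the fixed-point integer (score × 1e6), and the list of verified public inputs has been fetched separately and is hex-encoded. The BatchVerifier query itself is not shown because its interface is not described here. The function names are hypothetical.

```python
import hashlib
import struct

HASH_FIELDS = ("datasetHash", "methodologyHash", "transcriptMerkleRoot")

def recompute_commitment(run):
    # Step 2: rebuild the commitment from the published run fields.
    parts = [bytes.fromhex(run[name]) for name in HASH_FIELDS]
    score_bytes = struct.pack(">Q", run["score"])  # assumed fixed-point integer
    return hashlib.sha256(b"".join(parts) + score_bytes).digest()

def is_attested(run, verified_public_inputs):
    # Step 4: the recomputed commitment must appear among the verified inputs.
    target = recompute_commitment(run)
    return any(bytes.fromhex(item) == target for item in verified_public_inputs)

# Placeholder run and batch contents, for illustration only.
example_run = {
    "datasetHash": hashlib.sha256(b"example dataset").hexdigest(),
    "methodologyHash": hashlib.sha256(b"example methodology").hexdigest(),
    "transcriptMerkleRoot": hashlib.sha256(b"example root").hexdigest(),
    "score": 912345,
}
batch_inputs = [recompute_commitment(example_run).hex(), "00" * 32]
ok = is_attested(example_run, batch_inputs)
tampered = dict(example_run, score=999999)
print(ok, is_attested(tampered, batch_inputs))
```

Changing any of the four fields changes the commitment, so a run whose published score differs from the proven one fails the membership check.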

9. Replay requirements

Every run's replay.command MUST produce a score within the reported σ on the reference hardware. A dispute succeeds by showing that a run violates this requirement.
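The check a disputing party would apply can be sketched as a single comparison. This assumes "within the reported σ" means an absolute difference of at most σ (inclusive), with all three values in the same fixed-point units (score × 1e6). The function name and sample numbers are hypothetical.

```python
def replay_within_sigma(reported, sigma, replayed):
    # All arguments are integers in fixed-point units (score × 1e6).
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    return abs(replayed - reported) <= sigma

# Illustrative values only: reported 0.912345 with σ = 0.0015.
print(replay_within_sigma(912345, 1500, 911000))  # difference 1345
print(replay_within_sigma(912345, 1500, 910000))  # difference 2345
```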

10. Versioning

Breaking changes to the wire format require a new top-level spec_version in run.json. Old versions remain verifiable forever.
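One way a verifier might honour this rule is to keep one handler per spec_version and never remove old ones. The sketch below is an assumption about structure, not part of the protocol: the registry, the decorator, and the integer version value are hypothetical, and the handler body is a placeholder.

```python
SUPPORTED = {}

def handles(spec_version):
    # Register a verification routine for one spec_version.
    def decorator(fn):
        SUPPORTED[spec_version] = fn
        return fn
    return decorator

@handles(1)
def verify_v1(run):
    # Placeholder: a real handler would apply the version 1 rules.
    return "checked under spec_version 1"

def verify(run):
    version = run.get("spec_version")
    try:
        handler = SUPPORTED[version]
    except KeyError:
        raise ValueError(f"unsupported spec_version: {version!r}") from None
    return handler(run)

print(verify({"spec_version": 1}))
```

A new wire format adds a new handler alongside the existing ones, so runs produced under older versions keep verifying with the rules they were created under.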

Reference implementation

All of the above, working, at github.com/benchlist/runner. MIT, 2100 lines of Rust + 400 lines of Python glue.