Spec v1.0

Attestation protocol.

Wire format, canonical JSON, Merkle construction, signature scheme, on-chain submission. Everything needed to build a compatible attestor or verifier.

1. Run lifecycle

INIT → EXECUTE → COMMIT → PROVE → SUBMIT → VERIFIED
  ↓       ↓         ↓        ↓        ↓
run.json run.json  run.json run.json run.json
+seed    +tx       +commit  +proof   +verif

2. Canonical JSON

All hashing is over the canonical JSON: UTF-8, keys sorted, no whitespace, no trailing newline. Use canonicaljson-spec.

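import hashlib
import json

def canon(obj):
    # Sorted keys, no whitespace, raw UTF-8 (no \u escapes), no trailing newline.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")

digest = hashlib.sha256(canon(run)).hexdigest()  # run is the parsed run.json object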
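As a quick illustration of what canonicalisation buys, the sketch below shows that two objects differing only in key order encode to the same bytes, and that non-ASCII text is emitted as raw UTF-8 rather than \u escapes. It repeats canon so that it runs on its own; the sample objects are made up for the example.

```python
import hashlib
import json

def canon(obj):
    # Same canonical form as above: sorted keys, compact separators, raw UTF-8.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")

# Hypothetical sample objects: same content, different key order.
a = {"score": 873100, "model": "café"}
b = {"model": "café", "score": 873100}

canon_a = canon(a)
canon_b = canon(b)

# Key order in the source object does not affect the encoded bytes...
same_bytes = canon_a == canon_b
# ...so the digests agree as well.
same_digest = hashlib.sha256(canon_a).hexdigest() == hashlib.sha256(canon_b).hexdigest()

print(canon_a)  # b'{"model":"caf\xc3\xa9","score":873100}'
```

Note that json.dumps does not by itself normalise number formatting or Unicode composition, so producers should avoid floats in hashed objects where they can; the fixed-point score in the commitment follows the same reasoning.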

3. Merkle tree over transcripts

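Each (prompt, response, judgement) tuple is a leaf. Leaves are hashed as sha256(0x00 || canon(tuple)) and internal nodes as sha256(0x01 || left || right); the distinct prefix bytes keep a leaf hash from being mistaken for an internal node. A layer with an odd number of nodes duplicates its last node.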

leaf_i  = sha256(0x00 || canon({ "i": i, "prompt": ..., "response": ..., "judge": ... }))
node    = sha256(0x01 || left || right)
root    = sha256(0x01 || ... top of tree ...)
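The construction above can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: the function names are hypothetical, the judge value is assumed to be JSON-serialisable, and two edge cases the text leaves open are filled with assumptions (an empty transcript set is rejected, and a single leaf is paired with itself so the root is always an internal node).

```python
import hashlib
import json

def canon(obj):
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")

def leaf_hash(i, prompt, response, judge):
    # sha256(0x00 || canon({i, prompt, response, judge}))
    payload = canon({"i": i, "prompt": prompt, "response": response, "judge": judge})
    return hashlib.sha256(b"\x00" + payload).digest()

def node_hash(left, right):
    # sha256(0x01 || left || right)
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_root(leaves):
    if not leaves:
        # Assumption: the spec does not define a root for zero leaves.
        raise ValueError("cannot build a Merkle tree with no leaves")
    layer = list(leaves)
    while True:
        if len(layer) % 2 == 1:
            layer.append(layer[-1])  # odd layer: duplicate the last node
        layer = [node_hash(layer[k], layer[k + 1]) for k in range(0, len(layer), 2)]
        if len(layer) == 1:
            return layer[0]

# Usage with three made-up transcripts.
leaves = [leaf_hash(i, f"prompt {i}", f"response {i}", "pass") for i in range(3)]
root = merkle_root(leaves)
print(root.hex())
```

With three leaves, the third is duplicated in the first layer, so the root is the hash of node(leaf0, leaf1) and node(leaf2, leaf2).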

4. Commitment

The commitment is the public input to the ZK proof. It binds the score to the dataset, the methodology, and the transcripts.

commitment = sha256(
  datasetHash ||
  methodologyHash ||
  transcriptMerkleRoot ||
  u64_be(score_fixed_point)  // score × 1e6, to avoid floats
)
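A minimal Python sketch of the commitment follows. It assumes the three hashes are concatenated as raw 32-byte SHA-256 digests (not hex strings) and that the score is rounded half-to-even when converted to fixed point; the spec above states neither, and the helper names are hypothetical.

```python
import hashlib
from decimal import Decimal, ROUND_HALF_EVEN

SCALE = 10**6  # score × 1e6, as in the formula above

def to_fixed_point(score):
    # Takes the score as a decimal string to avoid binary float error.
    # Assumption: round half to even; the spec does not name a rounding mode.
    return int((Decimal(score) * SCALE).to_integral_value(rounding=ROUND_HALF_EVEN))

def u64_be(n):
    # 8-byte big-endian unsigned; raises OverflowError if n is negative or too large.
    return n.to_bytes(8, "big")

def commitment(dataset_hash, methodology_hash, transcript_root, score_fixed_point):
    for h in (dataset_hash, methodology_hash, transcript_root):
        if len(h) != 32:
            raise ValueError("expected a raw 32-byte digest")
    preimage = dataset_hash + methodology_hash + transcript_root + u64_be(score_fixed_point)
    return hashlib.sha256(preimage).digest()

# Usage with placeholder digests and a made-up score.
fixed = to_fixed_point("0.8731")
c = commitment(b"\x11" * 32, b"\x22" * 32, b"\x33" * 32, fixed)
print(fixed, c.hex())
```

Because all three hashes have a fixed length and the score is a fixed-width integer, the concatenation is unambiguous without separators.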

5. Signature

Attestors sign the commitment with Ed25519.

sig = ed25519_sign(attestor_sk, commitment)
// 64-byte signature; publish hex-encoded in run.attestorSignature

6. Proof systems

The ZK proof asserts: "given dataset D and methodology M, applying the pinned scoring function to the committed transcripts yields score S". Supported systems:

7. Aligned Layer submission

Proofs are submitted via the Aligned SDK. The batcher aggregates proofs and submits a Merkle root of the batched proofs to the ServiceManager contract on Ethereum L1.

from aligned_sdk import AlignedClient

client = AlignedClient(network="ethereum")
batch = client.submit_proof(
  proof_bytes=proof,
  public_input=commitment,
  proving_system="sp1",
  verifier_identifier="benchlist-v1"
)
# batch.id is the credential we show on the listing

8. Verification

Anyone can verify:

  1. Fetch the run from Benchlist or directly from /api/runs/:id.json
  2. Compute the commitment from datasetHash, methodologyHash, transcriptMerkleRoot, and score
  3. Query Aligned's BatchVerifier for the batch — it returns the list of verified public inputs
  4. Assert your commitment is in the list
  5. Optionally: rerun the benchmark locally, verify the Merkle root, verify the score
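Steps 2 and 4 above can be sketched offline as shown below. The sketch makes several assumptions that are not confirmed by this spec: that run.json stores the three hashes as hex strings, that the score is stored under a "score" field with at most six decimal places, and that the list returned by BatchVerifier has already been fetched and decoded to raw bytes. No network call is made, and the function names are hypothetical.

```python
import hashlib
from decimal import Decimal

def recompute_commitment(run):
    # Step 2: rebuild the commitment from the fields of the fetched run.
    parts = [bytes.fromhex(run[key])
             for key in ("datasetHash", "methodologyHash", "transcriptMerkleRoot")]
    scaled = Decimal(str(run["score"])) * 10**6
    if scaled != scaled.to_integral_value():
        raise ValueError("score has more than six decimal places")
    score_fixed_point = int(scaled)
    return hashlib.sha256(b"".join(parts) + score_fixed_point.to_bytes(8, "big")).digest()

def is_attested(run, verified_public_inputs):
    # Step 4: the recomputed commitment must appear among the verified public inputs.
    return recompute_commitment(run) in set(verified_public_inputs)

# Usage with a made-up run and a stand-in for the BatchVerifier response.
run = {
    "datasetHash": "11" * 32,
    "methodologyHash": "22" * 32,
    "transcriptMerkleRoot": "33" * 32,
    "score": "0.8731",
}
verified = [recompute_commitment(run), b"\x00" * 32]
tampered = dict(run, score="0.9731")

print(is_attested(run, verified))       # True
print(is_attested(tampered, verified))  # False
```

Changing any bound field, such as the score in the tampered copy, changes the commitment, so the membership check fails.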

9. Replay requirements

Every run's replay.command MUST produce a score within the reported σ of the published score when run on the reference hardware. A dispute succeeds by demonstrating a replay that violates this requirement.
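The replay check itself is a simple tolerance comparison, sketched below. The function name is hypothetical, the bound is assumed to be inclusive, and scores are passed as Decimal values so that the comparison is not affected by binary float rounding.

```python
from decimal import Decimal

def replay_within_sigma(reported_score, reported_sigma, replayed_score):
    # Assumption: a replay exactly σ away from the reported score still passes.
    return abs(replayed_score - reported_score) <= reported_sigma

# Usage with made-up numbers.
reported = Decimal("0.8731")
sigma = Decimal("0.0040")

ok = replay_within_sigma(reported, sigma, Decimal("0.8760"))        # off by 0.0029
disputed = replay_within_sigma(reported, sigma, Decimal("0.8800"))  # off by 0.0069

print(ok, disputed)  # True False
```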

10. Versioning

Breaking changes to the wire format require a new top-level spec_version in run.json. Old versions remain verifiable forever.

Reference implementation

All of the above, working, at github.com/benchlist/runner. MIT, 2100 lines of Rust + 400 lines of Python glue.