Three ways in. The fastest is one API call: we run it, prove it, and settle it on Ethereum L1. The other two are for batch listings and pre-run proofs.
bl_live_… key. Top up credits (packs from $25 / 6 tests) when you're ready to sign your first run. Want to smoke-test without paying? python runner/benchlist.py demo synthesises a signed run locally with no keys.
Every run ships with at least an Ed25519 signature. The Aligned Layer upgrade adds a full ZK proof queued for Ethereum L1 anchor. You can start Attested and upgrade later by re-running with --system sp1.
Ed25519 signature over the canonical commitment. Replayable in the browser via @noble/ed25519. Ships immediately, no ETH key.
proof_system: signed
Same Ed25519 signature, plus a calldata-anchor tx on Ethereum L1 so the attestation is timestamped on-chain. Set ATTESTOR_PRIVATE_KEY on your own runner or let us do it.
proof_system: signed + RPC env
SP1 or Risc0 zkVM proof of the scoring function, aggregated through Aligned Layer, queued for Ethereum L1 anchor. Covers proof gen + gas. What regulators + procurement teams ask for.
proof_system: sp1 or risc0
One HTTP request posts a test. We run it on a staked attestor, Merkle-commit the transcript, submit the proof to Aligned Layer, and settle on Ethereum mainnet. You get a verify_url in the response; the listing publishes automatically once the proof verifies (~3 min).
curl -X POST https://api.benchlist.ai/v1/run \
  -H "Authorization: Bearer $BENCHLIST_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "service": "anthropic-claude",
    "model": "claude-opus-4-7",
    "benchmark": "mbpp",
    "runs": 3
  }'
Deducts 1 credit ($5) from your balance. This covers Ethereum L1 gas, attestor compute, and the run itself. See where your $5 goes.
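For callers who prefer Python over curl, the same request can be sketched with the standard library. This is a minimal sketch, not an official client: the endpoint, headers, and payload fields are copied from the curl example above, while the function names build_run_request and submit_run are hypothetical, and the assumption that verify_url is a top-level key in a JSON response body goes beyond what the text states.

```python
import json
import urllib.request

API_URL = "https://api.benchlist.ai/v1/run"  # endpoint from the curl example


def build_run_request(api_key: str, service: str, model: str,
                      benchmark: str, runs: int = 3) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "service": service,
        "model": model,
        "benchmark": benchmark,
        "runs": runs,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_run(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the verify_url from the response.

    Assumption: the response body is a JSON object with a top-level
    "verify_url" key. The page only says a verify_url is returned.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["verify_url"]
```

Building the request separately from sending it lets you inspect the headers and body before spending a credit. Read the key from the BENCHLIST_KEY environment variable, as the curl example does, rather than hard-coding it.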
Tell us about your service. We run the canonical benchmarks and post the proofs. Typical turnaround: 3–5 business days.
You've already run the benchmark with our runner. Paste the canonical run.json; we publish immediately after proof verification.
If you've already run a benchmark with our runner and produced a proof, paste the canonical JSON here. We publish immediately after proof verification.
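The word "canonical" matters here because a signature or proof is bound to exact bytes, so the same run has to serialize identically every time. The sketch below illustrates the idea with one common convention (sorted keys, compact separators, UTF-8) and a SHA-256 digest. It is an assumption for illustration only: the runner's actual canonical form and commitment scheme are not described on this page, and the run fields shown are hypothetical.

```python
import hashlib
import json


def canonical_bytes(run: dict) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace.

    Assumption: this is one common canonicalization convention, not
    necessarily the one the Benchlist runner uses.
    """
    text = json.dumps(run, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def commitment_hex(run: dict) -> str:
    """SHA-256 digest of the canonical bytes (a hypothetical commitment)."""
    return hashlib.sha256(canonical_bytes(run)).hexdigest()


# Hypothetical run contents; the field names reuse the request payload.
run_a = {"service": "anthropic-claude", "benchmark": "mbpp", "runs": 3}
run_b = {"runs": 3, "benchmark": "mbpp", "service": "anthropic-claude"}

# Key order in the input does not change the canonical bytes or the digest.
assert canonical_bytes(run_a) == canonical_bytes(run_b)
assert commitment_hex(run_a) == commitment_hex(run_b)
print(canonical_bytes(run_a).decode("utf-8"))
```

In practice, paste the run.json exactly as the runner wrote it. Reformatting or re-serializing the file by hand may change the bytes that the signature covers.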
All submissions are moderated. We reject spam, unrelated services, listings with fabricated scores, and services that can't provide a working replay command. A first offense results in a 30-day ban; a repeat offense results in a permanent ban.