Attested Run id run-554f64031405

claude-haiku-4.5

on commonsenseqa · anthropic-claude · n=8 ⚠ statistically thin · Sun, 26 Apr 2026 07:50:41 GMT
87.5
±22.4 · 95% CI [52.9, 97.8]
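The interval above can be reproduced, under assumptions, with a Wilson score interval. The page does not name the method it uses; the sketch below assumes that a score of 87.5 on 8 problems means 7 correct answers and that a standard 95% Wilson interval applies. The function name `wilson_interval` is illustrative, not taken from the site.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion, returned as fractions in [0, 1]."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Assumption: 87.5 on 8 problems corresponds to 7 correct answers.
low, high = wilson_interval(7, 8)
half_width = (high - low) / 2
print(f"95% CI [{100 * low:.1f}, {100 * high:.1f}], half-width ±{100 * half_width:.1f}")
```

With n=8 the interval spans roughly 45 percentage points, which is why the run is flagged as statistically thin.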
claude-haiku-4.5 scored 87.5 on commonsenseqa across 8 problems. The transcript is committed to Merkle root sha256:a14ea371edafd91… and signed by attestor benchlist-vercel-inline-0 with Ed25519 signature 4d9c7c2311504f5f740d68…. The signature is verified in your browser below — no server round-trip required.
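To illustrate what committing a transcript to a Merkle root means, here is a minimal sketch. The page does not document its tree construction, so everything here is an assumption: leaves are hashed with SHA-256, pairs are concatenated and hashed again, and the last node is duplicated on odd-sized levels. The function `merkle_root` and the sample records are hypothetical, and this sketch is not expected to reproduce the root shown on this page.

```python
import hashlib


def merkle_root(leaves: list[bytes]) -> str:
    """Return a 'sha256:<hex>' Merkle root over the given leaf records."""
    if not leaves:
        raise ValueError("need at least one leaf")
    # Hash every leaf record once to form the bottom level.
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: duplicate the last node when a level has odd size.
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return "sha256:" + level[0].hex()


# Hypothetical transcript records, one per problem.
records = [f"problem-{i}: placeholder transcript".encode() for i in range(8)]
root = merkle_root(records)

# Changing any single record changes the root.
tampered = list(records)
tampered[3] = b"problem-3: edited transcript"
tampered_root = merkle_root(tampered)
print(root != tampered_root)
```

Because the root depends on every leaf, a signature over the root covers the whole transcript without signing each record separately.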
Dataset hash: sha256:729b5c0850ac5be6b8cfbedf4d36938249bb7c0d9e9c980260037391414dd520
Methodology hash: sha256:8d0b3e04740ec4f11b5e3eebe6601688d47de4830e84c15a2da6c3925212fadf
Merkle root: sha256:a14ea371edafd9100bcd4a3b9f4200a52a289bb7b32142e04db931df6c6d12de
Attestor pubkey: cb6e95d0f7b402e254f491b57767df3a3a93ae92f1faee3a02aa52e728f5cd11
Signature: 4d9c7c2311504f5f740d6831a9e56f47cc5cd29c5a58cd394fb4d0e9f5794943d3d59df74e7786e86c50c045333346710dc3057133d399362e302ccb9e1bf80e
Runner: benchlist-vercel-inline@1.0.0
Started: 2026-04-26T07:43:28.839Z
Finished: 2026-04-26T07:50:41.517Z
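Before attempting signature verification, it can help to confirm that the listed fields have the expected encodings. The sketch below uses only the Python standard library and checks byte lengths: 32 bytes for a SHA-256 digest, 32 bytes for an Ed25519 public key, and 64 bytes for an Ed25519 signature. It does not verify the signature itself, since the page does not state the exact message that was signed. The helper `decode_hex_field` is a hypothetical name.

```python
def decode_hex_field(value: str, expected_len: int) -> bytes:
    """Decode a hex field, with an optional 'sha256:' prefix, and check its byte length."""
    raw = bytes.fromhex(value.removeprefix("sha256:"))
    if len(raw) != expected_len:
        raise ValueError(f"expected {expected_len} bytes, got {len(raw)}")
    return raw


# Values copied from the attestation fields listed above.
merkle_root_field = "sha256:a14ea371edafd9100bcd4a3b9f4200a52a289bb7b32142e04db931df6c6d12de"
pubkey_field = "cb6e95d0f7b402e254f491b57767df3a3a93ae92f1faee3a02aa52e728f5cd11"
signature_field = (
    "4d9c7c2311504f5f740d6831a9e56f47cc5cd29c5a58cd394fb4d0e9f5794943"
    "d3d59df74e7786e86c50c045333346710dc3057133d399362e302ccb9e1bf80e"
)

merkle_root_bytes = decode_hex_field(merkle_root_field, 32)  # SHA-256 digest
pubkey_bytes = decode_hex_field(pubkey_field, 32)            # Ed25519 public key
signature_bytes = decode_hex_field(signature_field, 64)      # Ed25519 signature

print(len(merkle_root_bytes), len(pubkey_bytes), len(signature_bytes))
```

A full check would pass these bytes to an Ed25519 verifier together with the signed message, whose exact format would have to come from the run's raw JSON or the site's documentation.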
Not yet anchored on-chain. Anyone can anchor for ~$0.01–$1.40 in gas → /anchor?run=run-554f64031405