Attested Run · id run-7c78931c0662

claude-sonnet-4.5

on commonsenseqa · anthropic-claude · n=5 ⚠ statistically thin · Sun, 26 Apr 2026 08:04:24 GMT
60.0
±32.6 · 95% CI [23.1, 88.2]
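The reported interval is not centered on 60.0, which is what a Wilson score interval would produce. As a minimal sketch, assuming the score is percent accuracy with 3 of 5 problems correct (an inference from 60.0 and n=5, not something the run record states), the calculation below gives a 95% interval and half-width that can be compared with the figures above. The function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z defaults to the 95% level)."""
    p = successes / n
    denom = 1 + z * z / n
    # The Wilson center is pulled toward 0.5, so it differs from the raw proportion p.
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Assumption: 3 correct answers out of 5 problems.
low, high = wilson_interval(3, 5)
half_width = (high - low) / 2
print(f"95% CI [{100 * low:.1f}, {100 * high:.1f}], half-width {100 * half_width:.1f}")
```

With only five problems the interval spans most of the plausible range, which is why the run is flagged as statistically thin.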
claude-sonnet-4.5 scored 60.0 on commonsenseqa across 5 problems. The transcript is committed to Merkle root sha256:7cdfb32ce44d403… and signed by attestor benchlist-vercel-inline-0 with Ed25519 signature 45b6b4f8aa07d9c454dc00…. The signature is verified client-side in the browser, with no server round-trip required.
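To illustrate what committing a transcript to a Merkle root means, here is a minimal sketch of one common construction. The leaf encoding, the pairing rule, and the duplication of an odd last node are all assumptions made for illustration. The run record does not describe the actual scheme, so this code is not expected to reproduce the root shown here. The name `merkle_root` and the sample transcript bytes are hypothetical.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> str:
    """Hash each leaf, then hash adjacent pairs upward until one node remains.

    Assumptions (not taken from the run record): leaves are raw transcript
    bytes, parents are sha256(left + right), and an odd last node is paired
    with a copy of itself.
    """
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return "sha256:" + level[0].hex()

# Hypothetical per-problem transcript entries, one per problem in an n=5 run.
transcripts = [f"problem-{i}: prompt and response".encode() for i in range(5)]
root = merkle_root(transcripts)
print(root)
```

Under a scheme like this, changing any single transcript entry changes the root, so a signature over the root covers the whole transcript.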
Dataset hash: sha256:729b5c0850ac5be6b8cfbedf4d36938249bb7c0d9e9c980260037391414dd520
Methodology hash: sha256:8d0b3e04740ec4f11b5e3eebe6601688d47de4830e84c15a2da6c3925212fadf
Merkle root: sha256:7cdfb32ce44d403caa8f8e38d6c072a0559c5ec0fb512093555994e9e779e822
Attestor pubkey: cb6e95d0f7b402e254f491b57767df3a3a93ae92f1faee3a02aa52e728f5cd11
Signature: 45b6b4f8aa07d9c454dc0047d43ec47c191ce105b5a85f1884a9cb2cd123f391d4a64d1ffc836a96cc26d2aab7a216a1918f7b728ed2de2af37dbd78a3189102
Runner: benchlist-vercel-inline@1.0.0
Started: 2026-04-26T08:01:34.894Z
Finished: 2026-04-26T08:04:24.459Z
Not yet anchored on-chain. Anyone can anchor this run for roughly $0.01–$1.40 in gas at /anchor?run=run-7c78931c0662.