Attested · Run id: run-gsm8k-290568d1f2

claude-sonnet-4-5-20250929

on gsm8k · anthropic-claude · n=20 ⚠ statistically thin · Fri, 24 Apr 2026 16:42:43 GMT
94.1
±11.8 · 95% CI [75.2, 98.8]
claude-sonnet-4-5-20250929 scored 94.1 on gsm8k across 20 problems. The transcript is committed to Merkle root sha256:471f008f8116bc3… and signed by attestor benchlist-runner-0 with Ed25519 signature ce52785a7713bb66cec09d…. The signature is verified in your browser, with no server round-trip required.
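To illustrate what "committed to a Merkle root" means, here is a minimal sketch of a SHA-256 Merkle root over a list of transcript records. The page does not document the runner's leaf encoding or tree layout, so every detail here is an assumption: records are treated as raw bytes, each leaf is the SHA-256 of one record, and an odd node at any level is paired with a copy of itself. The function name `merkle_root` is hypothetical and this sketch is not expected to reproduce the root shown for this run.

```python
import hashlib


def _sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(records: list[bytes]) -> str:
    """Compute a Merkle root over records (assumed layout, see lead-in).

    Leaves are SHA-256 digests of each record. Each parent is the SHA-256
    of its two children concatenated. When a level has an odd number of
    nodes, the last node is duplicated (an assumption, not the runner's
    documented behaviour).
    """
    if not records:
        raise ValueError("need at least one record")
    level = [_sha256(r) for r in records]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pair the odd node with itself
        level = [
            _sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    # Match the "sha256:<hex>" presentation used on this page.
    return "sha256:" + level[0].hex()


# Hypothetical records; the real transcript format is not shown on the page.
example = [b'{"problem": 1, "correct": true}', b'{"problem": 2, "correct": true}']
print(merkle_root(example))
```

Under this layout, changing any single record changes the root, which is why signing the root alone is enough to commit to the whole transcript.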
Dataset hash: sha256:09a35a0a0a48f13840457c82e2c2da6a7884ec21b51154139867843c2e4da5c7
Methodology hash: sha256:3a6a6b8897f86450fae044fb6291eecbc48fbc9d92905718366b16b895ee773e
Merkle root: sha256:471f008f8116bc3b8c852edea65aff4b630f205aef4c35bde29d3401dfc48177
Attestor pubkey: f38712fae5f11a2fc2fe3f7541264f04cd90974affdf1cce05163ecdaf35d457
Signature: ce52785a7713bb66cec09d340473e09c3ba43b6960d9ece2f235bf1c97df92a34f80d3f7e0563c6b285c7a7f2e8dae9823b417187cd01bfc6300642848620c06
Runner: benchlist-runner@1.0.0
Started: 2026-04-24T16:40:43Z
Finished: 2026-04-24T16:42:43Z
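Before attempting any signature check, it can help to confirm that the hex fields above have the sizes their algorithms require: SHA-256 digests and Ed25519 public keys are 32 bytes, and Ed25519 signatures are 64 bytes. The sketch below does only that. The field keys (`merkle_root`, `attestor_pubkey`, and so on) and the helper `decode_field` are hypothetical names chosen for illustration, since the raw JSON schema is not shown here. It does not verify the signature itself, which would also require knowing the exact bytes the attestor signed.

```python
# Expected decoded lengths in bytes. The key names are hypothetical;
# the sizes are fixed by SHA-256 and Ed25519.
FIELD_BYTES = {
    "dataset_hash": 32,
    "methodology_hash": 32,
    "merkle_root": 32,
    "attestor_pubkey": 32,
    "signature": 64,
}


def decode_field(name: str, value: str) -> bytes:
    """Strip an optional 'sha256:' prefix, hex-decode, and check the length.

    Raises ValueError if the value is not valid hex or has the wrong size.
    """
    prefix = "sha256:"
    if value.startswith(prefix):
        value = value[len(prefix):]
    raw = bytes.fromhex(value)
    expected = FIELD_BYTES[name]
    if len(raw) != expected:
        raise ValueError(f"{name}: expected {expected} bytes, got {len(raw)}")
    return raw


# Values copied from the fields listed above.
root = decode_field(
    "merkle_root",
    "sha256:471f008f8116bc3b8c852edea65aff4b630f205aef4c35bde29d3401dfc48177",
)
pubkey = decode_field(
    "attestor_pubkey",
    "f38712fae5f11a2fc2fe3f7541264f04cd90974affdf1cce05163ecdaf35d457",
)
```

A malformed or truncated field fails at this step with a clear error instead of surfacing later as an unexplained verification failure.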
Not yet anchored on-chain. Anyone can anchor for ~$0.01–$1.40 in gas → /anchor?run=run-gsm8k-290568d1f2