Attested · Run id: run-humaneval-e342703942

gpt-5-4-mini

on humaneval · openai-chatgpt · n=20 ⚠ statistically thin · Fri, 24 Apr 2026 18:42:43 GMT
87.8
±14.5 · 95% CI [67.2, 96.2]
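The page does not say how the interval is computed. One reading that reproduces the published figures is a Wilson score interval, treating the score as a proportion (0.878) over n=20. Under that assumption the ±14.5 is the interval's half-width, and the interval is centered on the Wilson midpoint (about 81.7) rather than on 87.8, which is why it looks asymmetric around the score. The function name below is illustrative, not taken from the runner.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(p_hat: float, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Return the (lower, upper) Wilson score interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Assumption: the reported score 87.8 is used directly as the proportion.
low, high = wilson_interval(0.878, 20)
print(f"95% CI [{low * 100:.1f}, {high * 100:.1f}]")   # 95% CI [67.2, 96.2]
print(f"half-width {(high - low) / 2 * 100:.1f}")      # half-width 14.5
```

With only 20 problems the interval spans almost 30 points, which is what the "statistically thin" warning reflects.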
gpt-5-4-mini scored 87.8 on humaneval across 20 problems. The transcript is committed to the Merkle root sha256:a86b16f17926ac9… and signed by attestor benchlist-runner-0 with Ed25519 signature d160721731a3acc4f9af67…. The signature can be verified client-side, in the browser, with no server round-trip.
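To illustrate what committing a transcript to a Merkle root means, here is a minimal sketch. The page does not specify how leaves are encoded, how odd levels are handled, or whether domain-separation prefixes are used, so every such choice below is an assumption, and the records are made up. It will not reproduce the root shown on this page.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> str:
    """Hypothetical scheme: hash each leaf, then hash adjacent pairs upward."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed rule: duplicate the last node
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return "sha256:" + level[0].hex()


# Made-up transcript records, one per problem; not the real run data.
records = [f"problem-{i}:pass".encode() for i in range(20)]
root = merkle_root(records)

# Changing any single record changes the root, which is what makes the
# root usable as a commitment to the whole transcript.
tampered = list(records)
tampered[7] = b"problem-7:fail"
assert merkle_root(tampered) != root
```

Signing only the root is then enough to attest to every record, since no record can change without changing the signed value.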
Dataset hash: sha256:3e1eb278fb45e71a150b896866387eae8c5bf42c0618c1a543fd5bb03cd3edaf
Methodology hash: sha256:ce08d4538a6db120acb39fa9c8102cd61be51040a3cc735f9be0952fe445a1db
Merkle root: sha256:a86b16f17926ac9fa2f82a68cb6a0cc9f5803fa261b794a55e72296789faea34
Attestor pubkey: f38712fae5f11a2fc2fe3f7541264f04cd90974affdf1cce05163ecdaf35d457
Signature: d160721731a3acc4f9af67c0489f31e4f7236783265d1691922c3463d62567b221e22ff46067803776d96c544eac4dafe5ded4d64f48513f6741c45858b50c01
Runner: benchlist-runner@1.0.0
Started: 2026-04-24T18:40:43Z
Finished: 2026-04-24T18:42:43Z
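Before attempting a full Ed25519 verification, a reader can sanity-check that the published fields have the right shapes: SHA-256 digests are 32 bytes, Ed25519 public keys are 32 bytes, and Ed25519 signatures are 64 bytes. The sketch below does only that, using the values listed above. The dictionary keys are illustrative names, not the schema of the raw JSON. It does not verify the signature, which would need an Ed25519 library and the exact signed bytes, and the page does not state what those bytes are.

```python
FIELDS = {
    "dataset_hash": "sha256:3e1eb278fb45e71a150b896866387eae8c5bf42c0618c1a543fd5bb03cd3edaf",
    "methodology_hash": "sha256:ce08d4538a6db120acb39fa9c8102cd61be51040a3cc735f9be0952fe445a1db",
    "merkle_root": "sha256:a86b16f17926ac9fa2f82a68cb6a0cc9f5803fa261b794a55e72296789faea34",
    "attestor_pubkey": "f38712fae5f11a2fc2fe3f7541264f04cd90974affdf1cce05163ecdaf35d457",
    "signature": "d160721731a3acc4f9af67c0489f31e4f7236783265d1691922c3463d62567b221e22ff46067803776d96c544eac4dafe5ded4d64f48513f6741c45858b50c01",
}

# Byte lengths fixed by SHA-256 and Ed25519.
EXPECTED_LENGTHS = {
    "dataset_hash": 32,
    "methodology_hash": 32,
    "merkle_root": 32,
    "attestor_pubkey": 32,
    "signature": 64,
}


def decode_hex(value: str) -> bytes:
    """Strip an optional 'sha256:' prefix and decode the hex payload."""
    return bytes.fromhex(value.removeprefix("sha256:"))


for name, size in EXPECTED_LENGTHS.items():
    actual = len(decode_hex(FIELDS[name]))
    assert actual == size, f"{name}: expected {size} bytes, got {actual}"

# The truncated values quoted in the summary should be prefixes of the full ones.
assert FIELDS["merkle_root"].startswith("sha256:a86b16f17926ac9")
assert FIELDS["signature"].startswith("d160721731a3acc4f9af67")
```

A field that fails these checks was mangled in transit or copied incorrectly, so there is no point attempting cryptographic verification on it.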
Not yet anchored on-chain. Anyone can anchor this run for roughly $0.01–$1.40 in gas at /anchor?run=run-humaneval-e342703942