Attested run · ID run-99412c69b7dc

claude-haiku-4.5

on hellaswag · anthropic-claude · n=8 ⚠ statistically thin · Sun, 26 Apr 2026 07:52:00 GMT
100.0
±16.2 · 95% CI [67.6, 100.0]
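The reported interval is consistent with a Wilson score interval for 8 correct answers out of 8, with ±16.2 being half the interval width. That is an inference from the numbers, not something the page states, so the sketch below is one plausible reconstruction rather than the runner's actual method. The function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z defaults to 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


# Assumption: all 8 problems were answered correctly (score 100.0, n=8).
low, high = wilson_interval(8, 8)
print(round(100 * low, 1), round(100 * high, 1))  # 67.6 100.0
print(round(50 * (high - low), 1))                # 16.2 (half-width, in points)
```

With only 8 problems, even a perfect score leaves a lower bound near 67.6, which is why the run is flagged as statistically thin.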
claude-haiku-4.5 scored 100.0 on hellaswag across 8 problems. The transcript is committed to Merkle root sha256:dd5f6d663d8dcae… and signed by attestor benchlist-vercel-inline-0 with Ed25519 signature 7d26da4d17e9c8c3b5a9e5…. The signature is verified in your browser; no server round-trip is required.
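To illustrate what committing a transcript to a Merkle root means, here is a minimal sketch of a SHA-256 Merkle tree over per-problem transcript entries. The leaf encoding, the `0x00`/`0x01` domain-separation prefixes, and the duplicate-last-node rule for odd levels are all assumptions; the page does not describe the attestor's actual construction, so this code is not expected to reproduce the root shown here.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> str:
    """Return a 'sha256:<hex>' root over the given leaf payloads.

    Hypothetical scheme: leaves are hashed with a 0x00 prefix, interior
    nodes with a 0x01 prefix, and an odd level duplicates its last node.
    """
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [sha256(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [
            sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return "sha256:" + level[0].hex()


# Hypothetical transcript entries, one per problem (n=8).
entries = [f"problem-{i}: prompt, completion, verdict".encode() for i in range(8)]
root = merkle_root(entries)
```

Under any scheme of this kind, changing a single transcript entry changes the root, so a signature over the root covers every entry.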
Dataset hash: sha256:b967f14e9705f2c1512bfecbc280340660ac60811aca2cd09789d654cb44b3ee
Methodology hash: sha256:2725c767f087367a0bbb3d937db51573191931b9f2e7a805d74297244330c18f
Merkle root: sha256:dd5f6d663d8dcaed58270faafcf0e397def41f4aacab6d18920d0fe2f747e351
Attestor pubkey: cb6e95d0f7b402e254f491b57767df3a3a93ae92f1faee3a02aa52e728f5cd11
Signature: 7d26da4d17e9c8c3b5a9e59a6444ca5b649c256fa8c0ec7ca0cc23ebbf49e248ed057b4c0c49952e2bc2061fa82aee2ee8de0bcdf57e029c1aa10d02f166480d
Runner: benchlist-vercel-inline@1.0.0
Started: 2026-04-26T07:43:31.439Z
Finished: 2026-04-26T07:52:00.617Z
Not yet anchored on-chain. Anyone can anchor this run for roughly $0.01–$1.40 in gas at /anchor?run=run-99412c69b7dc