vLLM (self-hosted)
UC Berkeley, Anyscale · self-hosted · est. 2023
High-throughput open-source inference engine. Supports PagedAttention, plus AWQ and FP8 quantization.
Attested runs: 0 (on this provider)
Models served: 0
Benchmarks covered: 0
Avg drift: 0.0pp vs canonical (n=0)
Default quant precision: FP16
Attested models on this provider
No attestations yet.
No models served by this provider have been attested.
Provider Verified · $499/mo
Are you vLLM (self-hosted)? Subscribe and own this page.
Unlimited multi-model attestations across your hosted catalog, drift alerts to Slack or a webhook, a customer-facing badge widget, and this vllm-self page populated daily.