
vLLM (self-hosted)

UC Berkeley, Anyscale · self-hosted · est. 2023

High-throughput open-source inference engine. Uses PagedAttention and supports AWQ and FP8 quantization.

self-hosted open-source
Attested runs: 0 (on this provider)
Models served: 0
Benchmarks covered: 0
Avg drift: 0.0pp vs canonical (n=0)
Default quant: FP16 precision

Attested models on this provider

No attestations yet.

No models served by this provider have been attested.

