vLLM (self-hosted)

UC Berkeley, Anyscale · self-hosted · est. 2023

High-throughput open-source inference engine built on PagedAttention, with support for AWQ and FP8 quantization.

self-hosted open-source


Attested runs on this provider

Models served

Benchmarks covered

Avg drift: 0.0pp vs canonical (n=0)

Default quant: FP16 precision

Attested models on this provider

No attestations yet.

No models served by this provider have been attested.

Provider Verified · $499/mo

Are you vLLM (self-hosted)? Subscribe and own this page.

Unlimited multi-model attestations across your hosted catalog, drift alerts to Slack or a webhook, a customer-facing badge widget, and this vllm-self page populated daily.

Or start a 30-day pilot, or email us.