Happy that InferenceMAX is here because it signals a milestone for vLLM's SOTA performance on NVIDIA Blackwell! 🥳
It has been a pleasure to deeply collaborate with @nvidia in @vllm_project, and we have much more to do
Read about the work we did here:
Today we are launching InferenceMAX!
We have support from NVIDIA, AMD, OpenAI, Microsoft, PyTorch, SGLang, vLLM, Oracle, CoreWeave, TogetherAI, Nebius, Crusoe, HPE, SuperMicro, Dell
It runs every day on the latest software (vLLM, SGLang, etc.) across hundreds of GPUs, $10Ms of