Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA | Launch from Hugging Face (Beta)

Filters

  • Free Endpoint (77)
  • Partner Endpoint (47)
  • Download Available (107)
  • Drug Discovery (13)
  • Image-to-Text (10)
  • Retrieval Augmented Generation (9)
  • Speech-to-Text (9)
  • Code Generation (8)
  • Deepinfra (35)
  • OpenRouter (27)
  • Together AI (25)
  • GMI Cloud (15)
  • Bitdeer (12)
  • NVIDIA (75)
  • Meta (11)
  • Google (6)
  • Mistral AI (6)
  • Qwen (5)
  • B200 (19)
  • H100 80GB HBM3 (19)
  • H200 (18)
  • L40S (16)
  • A100 SXM4 80GB (14)

140 models

    NVIDIA
    Downloadable

    nemotron-ocr-v2

    Nemotron OCR v2 is a state-of-the-art multilingual text recognition model designed for robust end-to-end optical character recognition (OCR) on complex real-world images.
    Table Extraction
    151
    1d
    Minimaxai
    Free Endpoint

    minimax-m3

    MiniMax M3 Preview is a multimodal MoE vision-language model with strong reasoning, coding, and tool-calling capabilities.
    coding
    4M
    14d
    Google
    Downloadable, Free Endpoint

    diffusiongemma-26b-a4b-it

    Diffusion-based 26B-parameter LLM that enables parallel token generation for real-time text applications.
    diffusion-llm
    2M
    15d
    NVIDIA
    Downloadable, Free Endpoint

    nemotron-3-ultra-550b-a55b

    Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more
    Agent
    8M
    22d
    Resemble.AI
    Downloadable

    chatterbox-multilingual-tts

    Natural and expressive voices in 23 languages. For voice agents and brand ambassadors.
    TTS
    7K
    22d
    NVIDIA
    Downloadable, Free Endpoint

    nemotron-3.5-content-safety

    Multilingual, multimodal model for detecting unsafe and toxic content.
    llm safety
    1M
    23d
    NVIDIA
    Free Endpoint

    cosmos3-nano

    Generates physics-aware videos from text prompts or an image prompt for physical AI development.
    autonomous vehicles
    2K
    25d
    NVIDIA
    Downloadable, Free Endpoint

    cosmos3-nano-reasoner

    Vision language model that excels in understanding the physical world using structured reasoning on videos or images.
    video understanding
    2K
    25d
    Stepfun-ai
    Downloadable, Free Endpoint

    step-3.7-flash

    A sparse MoE multimodal reasoning model suited to enterprise, agentic, and coding tasks.
    Coding
    4M
    28d
    Moonshotai
    Downloadable, Free Endpoint

    kimi-k2.6

    1T multimodal MoE for long-horizon coding, agentic tool use, and image/video understanding.
    Multimodal
    7M
    1mo
    Qwen
    Downloadable

    qwen-image

    Qwen-Image is a text-to-image foundation model with advanced multilingual text rendering.
    Text-to-Image
    1mo
    Qwen
    Downloadable

    qwen-image-edit

    Qwen-Image-Edit is an image editing model with multilingual text editing and strong subject consistency.
    Text-to-Image
    1mo
    Mistral AI
    Downloadable, Free Endpoint

    mistral-medium-3.5-128b

    A high-performing model for text generation, coding, and agentic use cases.
    coding
    4M
    1mo
    NVIDIA
    Downloadable, Free Endpoint

    nemotron-3-nano-omni-30b-a3b-reasoning

    Nemotron 3 Nano Omni is an omni-modal reasoning model that understands images, video, speech, and text.
    Image-to-Text
    8M
    1mo
    DeepSeek AI
    Downloadable, Free Endpoint

    deepseek-v4-flash

    DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.
    MoE
    15M
    2mo
    DeepSeek AI
    Downloadable, Free Endpoint

    deepseek-v4-pro

    DeepSeek V4 scales to 1M-token context windows with an efficient MoE architecture for coding tasks.
    MoE
    8M
    2mo
    Z.ai
    Downloadable, Free Endpoint

    glm-5.1

    GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.
    Agentic AI
    28M
    2mo
    NVIDIA
    Downloadable

    Relighting

    Re-illuminate people in video to match target lighting from a 360 HDRI environment map.
    HDRI
    227
    2mo
    NVIDIA
    Free Endpoint

    nemotron-3-content-safety

    Multilingual, multimodal model for detecting unsafe and toxic content.
    llm safety
    230K
    2mo
    NVIDIA
    Downloadable, Free Endpoint

    synthetic-video-detector

    NVIDIA Synthetic Video Detector is an AI-powered microservice for detecting AI-generated (synthetic) videos.
    broadcast
    90K
    2mo
    NVIDIA
    Downloadable, Free Endpoint

    Active Speaker Detection

    Detect and track speaker identities across video frames.
    broadcast
    473
    2mo
    NVIDIA
    Downloadable

    LipSync

    Generative lip dubbing that syncs lips in a video to input audio.
    broadcast
    2mo
    NVIDIA
    Downloadable, Free Endpoint

    ising-calibration-1-35b-a3b

    Open VLM for quantum computer calibration chart understanding across a range of qubit modalities.
    Quantum
    332K
    2mo
    Minimaxai
    Downloadable, Free Endpoint

    minimax-m2.7

    MiniMax M2.7 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.
    reasoning
    14M
    2mo