Confidential Inference Cloud

Private AI you can cryptographically trust.

Run frontier open models inside hardware-secured enclaves. Every request is attested end to end, so you can prove your data was never seen instead of just hoping it wasn't.

Intel TDX + NVIDIA GPU TEE · Open source & auditable · OpenAI-compatible API

Trusted by teams building privacy-first AI

Member of the Confidential Computing Consortium · Part of the NVIDIA Inception Program

How it works

Trust that’s verified, not assumed

Most “private” AI providers ask you to take their word for it. NEAR AI Cloud gives you a cryptographic proof on every single request.

[Diagram: enclave://near-ai-cloud, verifying. Inside the enclave: your model + prompts, running with memory encrypted. Outside: Host OS, GPU operator, NEAR AI. Status: generating attestation.]

01

You encrypt the request

Prompts are sent over TLS and only decrypted inside the enclave. Keys never touch our infrastructure.

02

It runs inside a TEE

Inference executes in an Intel TDX + NVIDIA confidential-GPU enclave. Memory is sealed from the host and operators.

03

The hardware signs a proof

The TEE produces a cryptographic attestation quote binding the exact model and code that ran.

04

You verify the result

Check the attestation yourself, in code or with our open verifier. Trust the math, not a promise.
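
As a rough illustration of step 04, the check below compares the measurement carried in an attestation quote with the value you expect for the code and model you asked for, and confirms the quote is tied to your request through a nonce. The quote layout, the field names (`measurement`, `nonce`), the hashing scheme, and both helper functions are hypothetical and are not NEAR AI Cloud's actual format. A real verifier must also validate the hardware vendor's signature chain over the quote, which this sketch leaves out.

```python
import hashlib
import hmac
import secrets


def expected_measurement(code_image: bytes, model_weights: bytes) -> str:
    """Hash the code and weights you expect the enclave to run (illustrative scheme)."""
    digest = hashlib.sha256()
    digest.update(hashlib.sha256(code_image).digest())
    digest.update(hashlib.sha256(model_weights).digest())
    return digest.hexdigest()


def check_quote(quote: dict, expected: str, nonce: str) -> bool:
    """Return True only if the quote matches the expected measurement and nonce.

    Hypothetical quote layout: {"measurement": <hex str>, "nonce": <hex str>}.
    Signature-chain validation is deliberately omitted from this sketch.
    """
    measurement_ok = hmac.compare_digest(quote.get("measurement", ""), expected)
    nonce_ok = hmac.compare_digest(quote.get("nonce", ""), nonce)
    return measurement_ok and nonce_ok


# Usage with stand-in data (no real enclave involved).
nonce = secrets.token_hex(16)
expected = expected_measurement(b"runtime-image", b"model-weights")
good_quote = {"measurement": expected, "nonce": nonce}
tampered_quote = {
    "measurement": expected_measurement(b"other-image", b"model-weights"),
    "nonce": nonce,
}

print(check_quote(good_quote, expected, nonce))      # True
print(check_quote(tampered_quote, expected, nonce))  # False
```

The nonce check is there so that an old quote cannot be replayed against a new request; the measurement check is what binds the result to specific code and weights.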

Why teams trust it

Privacy guaranteed by hardware, proven by math

Confidential GPU compute

Models run on NVIDIA confidential computing inside Intel TDX. Data stays encrypted in use, not just at rest and in transit.

Attestation on every call

Each response carries a verifiable hardware quote binding the exact code and weights that served it.

Open source & auditable

The runtime, verifier, and enclave images are open. Reproduce the build and check it yourself.

OpenAI-compatible API

Drop-in replacement for the OpenAI SDK. Swap the base URL and your stack keeps working.

No vendor lock-in

Open weights and open models. Bring your own, or pick from a growing catalog, portable by design.

You hold the keys

Client-held encryption keys. We never see plaintext, and there is no operator backdoor, by construction.

47 models available
16 confidential, in TEEs
100% requests attested
0 operator access, by design

Why NEAR AI Cloud

Privacy of on-prem.
Power of cloud.

NEAR AI Cloud runs models inside secure hardware enclaves: the privacy of running locally with the scale and convenience of the cloud.

OpenAI: GPT models, closed | On-Prem: Any open-source or custom model | NEAR AI: Any open-source or custom model

Data privacy
Provable zero data retention
Confidential compute (TEE)
Hardware-signed attestation

Features
Cloud convenience
Setup cost: Low | High | Low
Complexity: Low | High | Low
Scalability: Good | Poor | Good
No vendor lock-in
Zero trust
Private observability

Models

Open models, running confidentially

A curated catalog of open-weight models, each served inside an attested enclave, with live pricing from the NEAR AI Cloud API.

Model | Context | Price / 1M tokens (in / out) | License
DeepSeek V4 Flash (DeepSeek-V4-Flash) | 1M | $0.17 / $0.35 | Open
deepseek-v3.2 | 125K | $1.10 / $1.10 | Open
Gemma 4 31B Instruct (gemma-4-31B-it) | 256K | $0.13 / $0.40 | Open
GLM 5.1 (GLM-5.1-FP8) | 198K | $0.85 / $3.30 | Open
GLM 5.2 (glm-5.2) | 488K | $1.40 / $4.40 | Open
glm-5 | 125K | $1.05 / $2.81 | Open
GPT OSS 120B (gpt-oss-120b) | 128K | $0.15 / $0.55 | Open
Kimi K2.6 (kimi-k2.6) | 256K | $0.81 / $3.85 | Open
kimi-k2.5 | 125K | $0.48 / $2.20 | Open
minimax-m2.5 | 125K | $0.17 / $1.32 | Open
Qwen 3.6 27B FP8 (Qwen3.6-27B-FP8) | 256K | $0.33 / $3.25 | Open
Qwen 3.6 35B A3B FP8 (Qwen3.6-35B-A3B-FP8) | 256K | $0.17 / $1.10 | Open
qwen3-32b | 125K | $0.11 / $0.46 | Open
Qwen3-VL-30B-A3B-Instruct | 16K | $0.15 / $0.55 | Open
Qwen3.5 122B A10B (Qwen3.5-122B-A10B) | 256K | $0.40 / $3.20 | Open
qwen3.5-397b-a17b | 125K | $0.50 / $3.30 | Open

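
Per-1M-token pricing turns into a per-request cost with simple arithmetic: tokens times price, divided by one million, summed over input and output. The sketch below shows that calculation with the gpt-oss-120b prices from the catalog above; `request_cost` is a hypothetical helper, and the token counts are made-up example values.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: str, output_price: str) -> Decimal:
    """Cost in USD given per-1M-token prices (strings keep Decimal exact)."""
    cost_in = Decimal(input_tokens) * Decimal(input_price) / TOKENS_PER_MILLION
    cost_out = Decimal(output_tokens) * Decimal(output_price) / TOKENS_PER_MILLION
    return cost_in + cost_out


# gpt-oss-120b in the catalog above: $0.15 in / $0.55 out per 1M tokens.
cost = request_cost(input_tokens=2_000, output_tokens=500,
                    input_price="0.15", output_price="0.55")
print(f"${cost:.6f}")  # $0.000575
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many calls.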
Quickstart

If you’ve used OpenAI, you’re done in 30 seconds

No new SDK to learn. Change the base URL, keep your code, and gain end-to-end attestation for free.

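# pip install openai
import os

from openai import OpenAI

# Point the standard OpenAI client at the NEAR AI Cloud endpoint.
client = OpenAI(
    base_url="https://cloud-api.near.ai/v1",
    api_key=os.environ["NEAR_AI_KEY"],
)

resp = client.chat.completions.create(
    model="Qwen/Qwen3.5-122B-A10B",
    messages=[{"role": "user", "content": "Hi"}],
)
print(resp.choices[0].message.content)

# Verify that the response ran in a real enclave. The stock openai response
# object has no verify_attestation() method, so run this check with the
# NEAR AI verifier rather than calling it on resp.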
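
Because the API is described as OpenAI-compatible, the same call can also be sketched at the HTTP level, which is useful when you are not using the SDK. The example below only builds the request and does not send it. The `/chat/completions` path and the bearer-token header follow the OpenAI convention and are assumed, not confirmed, for NEAR AI Cloud; `build_chat_request` and the `sk-example` key are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://cloud-api.near.ai/v1"  # base URL from the quickstart


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",  # assumed path, per OpenAI convention
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("sk-example", "Qwen/Qwen3.5-122B-A10B", "Hi")
print(req.get_method(), req.full_url)
# To send it, pass req to urllib.request.urlopen() with a real key.
```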
Pricing

Confidential compute, pay-as-you-go

Pay only for the tokens you use. See live per-model pricing in the catalog above. Enterprise plans add reserved capacity and private models.

Pay-as-you-go

Most popular
Per token · no minimums

Pay only for the tokens you use, on every open model, with attestation on every request.

  • Per-token pricing on all open models
  • Hardware-signed attestation on every request
  • OpenAI-compatible, no new SDK
  • No subscriptions or commitments
Get an API key

Enterprise

Custom

Reserved capacity, private models, and procurement on your terms.

  • Reserved GPU capacity
  • Custom & private models
  • Volume pricing & SLAs
  • On-prem / VPC deployment options
  • Solutions engineering
Contact sales
FAQ

Questions, answered

A Trusted Execution Environment is a hardware-isolated region of a CPU/GPU where code and data are encrypted even while running. It means your prompts and the model can be processed without the host machine, the operator, or NEAR AI ever being able to read them.
Most providers promise privacy in their terms of service. NEAR AI Cloud gives you a cryptographic attestation, a hardware-signed proof of exactly what code ran and that it ran inside a sealed enclave. You verify it yourself; you don’t take our word for it.
No code changes are needed. The API is OpenAI-compatible: point your existing OpenAI SDK at our base URL, add your key, and you gain attestation as a bonus that you can optionally verify.
The stack is open source. The enclave runtime, the attestation verifier, and the model images are open and reproducible, so you can audit the full path your data takes.
A growing catalog of open-weight models such as Qwen, DeepSeek, GLM, and Kimi, all served confidentially. Bring-your-own and private models are available on Enterprise plans.

Start building on AI you can prove.

Get an API key in minutes. Keep your OpenAI code. Add verifiable privacy for free.

$ curl https://cloud-api.near.ai/v1/model/list