Confidential Inference Cloud

Private AI you can cryptographically trust.

Run frontier open models inside hardware-secured enclaves. Every request is attested end to end, so you can prove your data was never seen rather than just hope it wasn't.

Start building · Read the docs

Intel TDX + NVIDIA GPU TEE · Open source & auditable · OpenAI-compatible API


Trusted by teams building privacy-first AI

Venice · Government of Bermuda · Abound

Member of the Confidential Computing Consortium · Part of the NVIDIA Inception Program

How it works

Trust that’s verified, not assumed

Most “private” AI providers ask you to take their word for it. NEAR AI Cloud gives you a mathematical proof on every single request.

[Diagram: enclave://near-ai-cloud. Your model and prompts run with memory encrypted, sealed off from the host OS, the GPU operator, and NEAR AI, while the enclave generates an attestation.]

You encrypt the request

Prompts are sent over TLS and only decrypted inside the enclave. Keys never touch our infrastructure.

It runs inside a TEE

Inference executes in an Intel TDX + NVIDIA confidential-GPU enclave. Memory is sealed from the host and operators.

The hardware signs a proof

The TEE produces a cryptographic attestation quote binding the exact model and code that ran.

You verify the result

Check the attestation yourself, in code or with our open verifier. Trust the math, not a promise.
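The four steps above can be sketched as a toy model in Python. This is only an illustration of the idea of a signed report that binds code and weights to a request. It assumes a shared HMAC key as a stand-in for the hardware signing key (a real TEE quote is signed with an asymmetric key whose certificate chains to the chip vendor), and the nonce, the function names, and the report fields are hypothetical, not the NEAR AI Cloud quote format.

```python
import hashlib
import hmac
import json
import secrets

# ASSUMPTION: a shared secret stands in for the hardware root of trust.
# Real quotes use asymmetric signatures and a vendor certificate chain.
HARDWARE_KEY = secrets.token_bytes(32)


def measure(blob: bytes) -> str:
    """Return a hex SHA-256 digest, standing in for a TEE measurement."""
    return hashlib.sha256(blob).hexdigest()


def make_quote(code: bytes, weights: bytes, nonce: str) -> dict:
    """Enclave side: bind code, weights, and the caller's nonce into a signed report."""
    report = {"code": measure(code), "weights": measure(weights), "nonce": nonce}
    payload = json.dumps(report, sort_keys=True).encode()
    signature = hmac.new(HARDWARE_KEY, payload, hashlib.sha256).hexdigest()
    return {"report": report, "signature": signature}


def verify_quote(quote: dict, expected_code: str, expected_weights: str, nonce: str) -> bool:
    """Client side: check the signature first, then the measurements and nonce."""
    payload = json.dumps(quote["report"], sort_keys=True).encode()
    expected_sig = hmac.new(HARDWARE_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected_sig, quote["signature"]):
        return False  # report was altered or not signed by the trusted key
    report = quote["report"]
    return (
        report["code"] == expected_code
        and report["weights"] == expected_weights
        and report["nonce"] == nonce
    )


if __name__ == "__main__":
    code, weights = b"inference-server-build", b"model-weights"
    nonce = secrets.token_hex(16)  # fresh per request, so old quotes cannot be replayed
    quote = make_quote(code, weights, nonce)
    print(verify_quote(quote, measure(code), measure(weights), nonce))
```

The verifier rejects a quote if the signature does not match, if either measurement differs from the value the client expects, or if the nonce belongs to a different request.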

Why teams trust it

Privacy guaranteed by hardware, proven by math

Confidential GPU compute

Models run on NVIDIA confidential computing inside Intel TDX. Data stays encrypted in use, not just at rest and in transit.

Attestation on every call

Each response carries a verifiable hardware quote binding the exact code and weights that served it.

Open source & auditable

The runtime, verifier, and enclave images are open. Reproduce the build and check it yourself.

OpenAI-compatible API

Drop-in replacement for the OpenAI SDK. Swap the base URL and your stack keeps working.

No vendor lock-in

Open weights and open models. Bring your own, or pick from a growing catalog. Portable by design.

You hold the keys

Client-held encryption keys. We never see plaintext, and there is no operator backdoor, by construction.
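As a rough illustration of "reproduce the build and check it yourself", the sketch below hashes a locally built artifact and compares it with a published digest. The file path and the idea that a SHA-256 hex digest is what gets published are assumptions for the example; the actual artifact names and verification procedure are whatever the project's repository documents.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, published_hex: str) -> bool:
    """Compare the local build's digest with a published hex digest."""
    return hmac.compare_digest(sha256_file(path), published_hex.strip().lower())


if __name__ == "__main__":
    # Hypothetical file written here only so the example runs end to end.
    image = Path("enclave-image.bin")
    image.write_bytes(b"locally reproduced build output")
    published = hashlib.sha256(b"locally reproduced build output").hexdigest()
    print(matches_published_digest(image, published))
    image.unlink()
```

A match only shows that the local build equals the published one; tying that digest to what actually served a request is the job of the attestation quote.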

models available

confidential, in TEEs

100% requests attested

operator access, by design

Why NEAR AI Cloud

Privacy of on-prem.
Power of cloud.

NEAR AI Cloud runs models inside secure hardware enclaves: the privacy of running locally, with the scale and convenience of the cloud.

OpenAI (GPT models, closed) | On-Prem (Any open-source or custom model) | NEAR AI (Any open-source or custom model)

Data privacy

Provable zero data retention

Confidential compute (TEE)

Hardware-signed attestation

Features

Cloud convenience

Setup cost | Low | High | Low

Complexity | Low | High | Low

Scalability | Good | Poor | Good

No vendor lock-in

Zero trust

Private observability

Models

Open models, running confidentially

A curated catalog of open-weight models, each served inside an attested enclave, with live pricing from the NEAR AI Cloud API.

Model | Model ID | Context | Price / 1M tokens (in / out) | License
DeepSeek V4 Flash | DeepSeek-V4-Flash | 1M | $0.17 / $0.35 | Open
deepseek-v3.2 | deepseek-v3.2 | 125K | $1.10 / $1.10 | Open
Gemma 4 31B Instruct | gemma-4-31B-it | 256K | $0.13 / $0.40 | Open
GLM 5.1 | GLM-5.1-FP8 | 198K | $0.85 / $3.30 | Open
GLM 5.2 | glm-5.2 | 488K | $1.40 / $4.40 | Open
glm-5 | glm-5 | 125K | $1.05 / $2.81 | Open
GPT OSS 120B | gpt-oss-120b | 128K | $0.15 / $0.55 | Open
Kimi K2.6 | kimi-k2.6 | 256K | $0.81 / $3.85 | Open
kimi-k2.5 | kimi-k2.5 | 125K | $0.48 / $2.20 | Open
minimax-m2.5 | minimax-m2.5 | 125K | $0.17 / $1.32 | Open
Qwen 3.6 27B FP8 | Qwen3.6-27B-FP8 | 256K | $0.33 / $3.25 | Open
Qwen 3.6 35B A3B FP8 | Qwen3.6-35B-A3B-FP8 | 256K | $0.17 / $1.10 | Open
qwen3-32b | qwen3-32b | 125K | $0.11 / $0.46 | Open
Qwen3-VL-30B-A3B-Instruct | Qwen3-VL-30B-A3B-Instruct | 16K | $0.15 / $0.55 | Open
Qwen3.5 122B A10B | Qwen3.5-122B-A10B | 256K | $0.40 / $3.20 | Open
qwen3.5-397b-a17b | qwen3.5-397b-a17b | 125K | $0.50 / $3.30 | Open
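Since prices are quoted per 1M tokens with separate input and output rates, the cost of a request is a short calculation. The sketch below copies three rows from the catalog above as of this page; live prices may differ, and the `Price` class and `estimate_cost` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Copied from the catalog on this page; treat as a snapshot, not live pricing.
PRICES = {
    "DeepSeek-V4-Flash": Price(0.17, 0.35),
    "gpt-oss-120b": Price(0.15, 0.55),
    "Qwen3.5-122B-A10B": Price(0.40, 3.20),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    price = PRICES[model]
    total = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 2,000 prompt tokens and 500 completion tokens on Qwen3.5-122B-A10B.
    print(f"${estimate_cost('Qwen3.5-122B-A10B', 2_000, 500):.4f}")
```

For that example the input side costs 2,000 × $0.40 / 1M = $0.0008 and the output side 500 × $3.20 / 1M = $0.0016, so output tokens dominate even though there are fewer of them.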

Quickstart

If you’ve used OpenAI, you’re done in 30 seconds

No new SDK to learn. Change the base URL, keep your code, and gain end-to-end attestation for free.

Get your API key · View on GitHub

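# pip install openai
import os

from openai import OpenAI

# Point the standard OpenAI client at the NEAR AI Cloud endpoint.
client = OpenAI(
    base_url="https://cloud-api.near.ai/v1",
    api_key=os.environ["NEAR_AI_KEY"],
)

resp = client.chat.completions.create(
    model="Qwen/Qwen3.5-122B-A10B",
    messages=[{"role": "user", "content": "Hi"}],
)

# Verify the response ran in a real enclave.
# Illustrative only: the response object returned by the stock openai SDK
# has no verify_attestation() method, so check the attestation with the
# open verifier described in the docs.
# resp.verify_attestation()  # -> True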
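Because the API is described as OpenAI-compatible, the same call can also be written as a plain HTTPS request with only the standard library. This sketch assumes the endpoint follows the OpenAI chat-completions convention (a POST to `/chat/completions` under the base URL with a Bearer token); the helper names are hypothetical, and the exact response shape should be checked against the docs.

```python
import json
import urllib.request

BASE_URL = "https://cloud-api.near.ai/v1"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and parse the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


if __name__ == "__main__":
    import os

    req = build_chat_request(os.environ["NEAR_AI_KEY"], "Qwen/Qwen3.5-122B-A10B", "Hi")
    print(send(req))
```

Keeping request construction separate from sending makes it easy to inspect exactly what leaves the client before any network call happens.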

Pricing

Confidential compute, pay-as-you-go

Pay only for the tokens you use; see live per-model pricing in the catalog above. Enterprise plans add reserved capacity and private models.

Pay-as-you-go

Enterprise

Custom

Reserved capacity, private models, and procurement on your terms.

Reserved GPU capacity
Custom & private models
Volume pricing & SLAs
On-prem / VPC deployment options
Solutions engineering

Contact sales

FAQ

Questions, answered

A Trusted Execution Environment is a hardware-isolated region of a CPU/GPU where code and data are encrypted even while running. It means your prompts and the model can be processed without the host machine, the operator, or NEAR AI ever being able to read them.

Most providers promise privacy in their terms of service. NEAR AI Cloud gives you a cryptographic attestation, a hardware-signed proof of exactly what code ran and that it ran inside a sealed enclave. You verify it yourself; you don’t take our word for it.

No. The API is OpenAI-compatible. Point your existing OpenAI SDK at our base URL, add your key, and you get attestation as a bonus you can optionally verify.

Yes. The enclave runtime, the attestation verifier, and the model images are open and reproducible, so you can audit the full path your data takes.

A growing catalog of open-weight models such as Qwen, DeepSeek, GLM, Kimi, and Gemma, all served confidentially. Bring-your-own and private models are available on Enterprise plans.

Start building on AI you can prove.

Get an API key in minutes. Keep your OpenAI code. Add verifiable privacy for free.

$ curl https://cloud-api.near.ai/v1/model/list

Start building · Read the docs
