Confidential AI You Can Verify

Every response runs inside a TEE and ships with an attestation report and a signed receipt that bind your request and response to the attested workload. Evidence you can verify yourself.

Get API Key | Try the Chat

The Confidential AI Stack

Run every request through an ACI gateway, bind the channel to attested keys, verify the upstream, and keep signed receipts for each response.

ACI Gateway

A confidential gateway that emits evidence artifacts: nonce-bound attestation reports, workload keysets, and user-visible receipt IDs.
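As a minimal sketch of what "nonce-bound" means in practice: the client generates a fresh random nonce, sends it with the attestation request, and rejects any report that does not echo it back. The report shape (a dict with a `nonce` key) and the function names here are assumptions for illustration, not RedPill's actual API, and a real verifier would also validate the hardware quote itself.

```python
import hmac
import secrets


def new_nonce() -> str:
    """Return a fresh 32-byte nonce as hex; generate one per attestation request."""
    return secrets.token_hex(32)


def nonce_matches(report: dict, expected_nonce: str) -> bool:
    """Check that a (hypothetical) report dict echoes the nonce we sent.

    Assumes the report exposes the nonce under a "nonce" key; the real
    field name and encoding may differ.
    """
    echoed = report.get("nonce")
    if not isinstance(echoed, str):
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(echoed.encode(), expected_nonce.encode())


nonce = new_nonce()
fresh_report = {"nonce": nonce}            # stand-in for a report fetched just now
replayed_report = {"nonce": new_nonce()}   # stand-in for an older, replayed report
print(nonce_matches(fresh_report, nonce))
print(nonce_matches(replayed_report, nonce))
```

Because the nonce is unpredictable and used once, a report captured earlier cannot be replayed as proof of the workload's current state.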

Channel Binding

Downstream TLS SPKI or ACI E2EE public keys are published in the attested keyset, so clients can reject mismatched connections.
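The rejection step can be sketched as a pin comparison: hash the SubjectPublicKeyInfo (SPKI) seen on the connection and accept it only if that hash appears in the attested keyset. The pin encoding below (base64 of a SHA-256 digest, in the style of RFC 7469) and the function names are assumptions; the real keyset format may differ, and extracting the SPKI bytes from a live TLS session is left out.

```python
import base64
import hashlib
import hmac


def spki_pin(spki_der: bytes) -> str:
    """Base64-encoded SHA-256 digest of a DER-encoded SubjectPublicKeyInfo."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def channel_is_bound(spki_der: bytes, attested_pins: list[str]) -> bool:
    """True if the key observed on the connection matches any attested pin."""
    observed = spki_pin(spki_der).encode()
    return any(hmac.compare_digest(observed, pin.encode()) for pin in attested_pins)


# Placeholder bytes stand in for real DER-encoded public keys.
attested_key = b"example-spki-der-bytes"
pins_from_keyset = [spki_pin(attested_key)]

print(channel_is_bound(attested_key, pins_from_keyset))
print(channel_is_bound(b"some-other-key", pins_from_keyset))
```

Pinning the public key rather than the whole certificate means a certificate renewal that keeps the same key pair does not break the binding.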

Receipt Evidence

Signed receipts bind request hash, selected route, upstream verification result, response wire hash, and attested audit session.
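A minimal sketch of checking the hash bindings on the client side: recompute digests over the exact bytes sent and received, then compare them with the values in the receipt. The field names (`request_hash`, `response_wire_hash`) and the choice of hex-encoded SHA-256 are assumptions for illustration; verifying the receipt's signature against the attested key is a separate step not shown here.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def receipt_binds(receipt: dict, request_body: bytes, response_body: bytes) -> bool:
    """True if both (hypothetical) hash fields match the bytes we actually exchanged."""
    checks = (
        ("request_hash", sha256_hex(request_body)),
        ("response_wire_hash", sha256_hex(response_body)),
    )
    return all(
        isinstance(receipt.get(field), str)
        and hmac.compare_digest(receipt[field].encode(), expected.encode())
        for field, expected in checks
    )


request_body = b'{"model":"openai/gpt-oss-120b","messages":[]}'
response_body = b'{"id":"example","choices":[]}'
receipt = {
    "request_hash": sha256_hex(request_body),
    "response_wire_hash": sha256_hex(response_body),
}

print(receipt_binds(receipt, request_body, response_body))
print(receipt_binds(receipt, request_body, response_body + b" "))
```

The hashes must be taken over the bytes as they travelled on the wire; re-serializing parsed JSON can change whitespace or key order and produce a different digest.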

Verified Upstreams

Provider adapters verify Phala, Tinfoil, NEAR AI, Chutes, and ACI/DCAP targets, then enforce TLS-SPKI or provider E2EE bindings before forwarding.

Up and running in minutes

YOUR SDK.
VERIFIABLE AI.

Keep the OpenAI client you already use. Change the base URL, choose a confidential model, then verify the gateway attestation, the response receipt, and the attested session.

View Full Docs

chat.py

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.redpill.ai/v1",
    api_key=os.environ["REDPILL_API_KEY"],
)
response = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Hello, confidential world"}
    ],
)
print(response.id)
print(response.choices[0].message.content)
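To get from the chat.py example to the verification step, the client needs the receipt ID for each response. The sketch below assumes the ID arrives as an `x-receipt-id` HTTP response header (the page names `x-receipt-id` but does not say where it appears) and uses the OpenAI SDK's `with_raw_response` accessor to reach the headers. The helper names are hypothetical.

```python
import os
from typing import Mapping, Optional


def receipt_id_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Case-insensitive lookup of the x-receipt-id header (assumed location)."""
    for name, value in headers.items():
        if name.lower() == "x-receipt-id":
            return value
    return None


def chat_with_receipt(prompt: str):
    """Send one chat request and return (text, receipt_id).

    Needs network access, the openai package, and REDPILL_API_KEY set.
    """
    from openai import OpenAI  # imported lazily so the helper above works without the SDK

    client = OpenAI(
        base_url="https://api.redpill.ai/v1",
        api_key=os.environ["REDPILL_API_KEY"],
    )
    raw = client.chat.completions.with_raw_response.create(
        model="openai/gpt-oss-120b",
        messages=[{"role": "user", "content": prompt}],
    )
    completion = raw.parse()  # the usual ChatCompletion object
    return completion.choices[0].message.content, receipt_id_from_headers(raw.headers)


# Offline demonstration of the header lookup with a stand-in header dict.
print(receipt_id_from_headers({"X-Receipt-Id": "example-id"}))
```

Calling `chat_with_receipt("Hello, confidential world")` would return the reply text together with the receipt ID, or `None` for the ID if the header is absent.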

Built for How You Work

Build confidential products, protect regulated workflows, and keep private chat available as a proof-of-product.

One API, 200+ Models

Build privacy-first applications without juggling provider credentials. Switch in minutes: just change the base URL.

Full API surface: chat, embeddings, audio, images, files, batches, and models
E2EE routing: never downgrades from a TEE backend to a non-TEE backend
Virtual keys: per-key budgets, RPM/TPM limits, and routing strategy

Get API Key

Ship Faster, Stay Compliant

Building healthcare, legal, or fintech apps? RedPill handles confidential inference, attestation, and E2EE routing so you can focus on your product.

Attestation APIs for nonce-bound hardware reports
Open source gateway and verifier
Verifier CLI for independent proof checks

View Documentation

Explore AI Models

From private models in GPU TEE to all your favorites.

Z.ai: GLM 5.2

New | GPU TEE

GLM-5.2 is Z.ai's flagship model for the era of long-horizon tasks. With a truly usable 1M-token context window, it can handle project-level engineering context and execute long-running tasks more reliably. Served as a text-only TEE deployment via Phala.

by Phala|1M context|$1.40/M input|$4.40/M output|$0.70/M cache read
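As a worked example of the per-million-token rates listed above, the sketch below estimates the cost of a single request. It assumes cached tokens are a subset of the input tokens and are billed at the cache-read rate instead of the input rate; the page does not spell out the billing rule, so treat this as an illustration rather than the actual formula.

```python
from decimal import Decimal

# Listed GLM 5.2 rates, in USD per million tokens.
PRICE_PER_M = {
    "input": Decimal("1.40"),
    "output": Decimal("4.40"),
    "cache_read": Decimal("0.70"),
}


def request_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> Decimal:
    """Estimated USD cost of one request under the assumed billing rule."""
    uncached = input_tokens - cached_tokens
    if uncached < 0:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    total = (
        uncached * PRICE_PER_M["input"]
        + cached_tokens * PRICE_PER_M["cache_read"]
        + output_tokens * PRICE_PER_M["output"]
    )
    return total / Decimal(1_000_000)


# 200K input tokens (150K of them cache hits) and 10K output tokens.
print(request_cost(200_000, 150_000, 10_000))
```

Under these assumptions the request costs $0.07 for uncached input, $0.105 for cache reads, and $0.044 for output, or $0.219 in total.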

Intel TDX | NVIDIA CC

Phala: Gemma-4 26B-A4B Uncensored (Heretic)

New | GPU TEE

Uncensored "Heretic" variant of google/gemma-4-26B-A4B-it created using Heretic v1.2.0 with the Arbitrary-Rank Ablation (ARA) method and row-norm preservation. Refusals drop from 100/100 to 11/100 with KL divergence 0.0499 vs the base model. The base Gemma 4 26B A4B is a Mixture-of-Experts model with 25.2B total / 3.8B active parameters (8 active / 128 total experts) and a 30-layer transformer with hybrid local sliding (1024) + global attention, supporting a 256K context window. It is natively multimodal (text + images, variable aspect ratios) and strong on coding, reasoning, and function calling, with native system prompt support across 35+ languages. Served on Phala in a TDX-attested H200 enclave with end-to-end ECDSA response signing; vLLM-compatible FP8-Static quantization by cloud19 (router excluded from quantization).

by Phala|66K context|$0.15/M input|$0.70/M output

Intel TDX | NVIDIA CC

Phala: Qwen3.6 35B-A3B Uncensored (Aggressive)

New | GPU TEE

Uncensored "Aggressive" variant of Qwen3.6-35B-A3B from Alibaba's Qwen team. The fine-tune by HauhauCS removes refusal behaviors (0/465 refusals) without modifying datasets or core capabilities. The base architecture is a 35B-parameter Mixture-of-Experts model with 256 experts routing 8 per token (~3B active params), 40 layers, and a hybrid linear+full-softmax attention mechanism (3:1 ratio). It supports a native 262K context and is natively multimodal across text, images, and video. Served on Phala in a TDX-attested H200 enclave with end-to-end ECDSA response signing; FP8 quantization by lamianlbe.

by Phala|131K context|$0.30/M input|$1.50/M output

Intel TDX | NVIDIA CC

Qwen: Qwen3.5-27B

GPU TEE

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities are comparable to those of the Qwen3.5-122B-A10B.

by Phala|262K context|$0.30/M input|$2.40/M output

Intel TDX | NVIDIA CC

Z.ai: GLM 4.7 Flash

GPU TEE

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning, and tool collaboration, and has achieved leading performance among open-source models of the same size on several current public benchmark leaderboards.

by Phala|203K context|$0.10/M input|$0.43/M output

Intel TDX | NVIDIA CC | BETA

Deprecated

Phala: Venice Uncensored 24B

GPU TEE

Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai. This model is designed as an “uncensored” instruct-tuned LLM, preserving user control over alignment, system prompts, and behavior. Intended for advanced and unrestricted use cases, Venice Uncensored emphasizes steerability and transparent behavior, removing default safety and alignment layers typically found in mainstream assistant models.

by Phala|33K context|$0.20/M input|$0.90/M output

Intel TDX | NVIDIA CC

View All Models

Confidential AI Models

No memory. No traces. The model knows nothing about you.

Confidential AI OFF - showing data exposure risks

Confidential AI ON - showing secure, encrypted processing

Trust-Me AI vs Prove-It AI

Privacy policies ask for trust. Confidential AI gives developers encryption, attestation reports, signed receipts, and audit sessions.

OpenAI, Anthropic, Google, xAI, Oracle

Trust-Me AI

Privacy depends on policy language
Closed workloads, limited external proof
No nonce-bound attestation endpoint
No signed receipt or audit session

Phala, Tinfoil, NEAR AI, Chutes

Prove-It AI

Confidential GPU-TEE model routing
Cryptographic proof of privacy
x-receipt-id and attested audit sessions
Open-source gateway and verifier

Verify It

Trusted by Industry Leaders

From AI infrastructure providers to law firms and security companies. RedPill powers privacy-first AI for organizations that can't compromise on confidentiality.

Try RedPill Free

Our Partners & Integrations

Cameron, Director of NEAR AI

"RedPill's confidential computing approach aligns perfectly with our vision for decentralized AI. Their TEE infrastructure sets a new standard for privacy-preserving inference at scale."

Elizabeth Leon Gonzalez, Milligan, Beswick, Levine & Knox LLP

"Attorney-client privilege is non-negotiable. RedPill is the only AI platform our firm trusts for case research and contract analysis. The cryptographic guarantees give us confidence no other tool can."

Vin Sharma, Founder at Vijil AI

"As a security-focused AI company, we evaluated every private AI solution on the market. RedPill's end-to-end encryption and TEE architecture passed our most rigorous security audits."

Chris Were, CEO at Blue Nexus AI

"We built Blue Nexus on the belief that personal AI should be truly personal. RedPill's infrastructure lets us deliver that promise: your AI assistant that never shares your secrets."

Start with Confidential AI

Developer-first plans for ACI, enterprises, and private chat users.

API First

Developer

Build with confidential GPU models, smart routing, and a drop-in API. Pay with card or crypto.

Pay as you go

200+ models through one endpoint
17 GPU-TEE models across 4 networks
Virtual keys with budget and rate limits
Attestation reports, receipts, and audit sessions
Stripe and Coinbase Commerce payments

Get API Key

Enterprise

For teams needing SSO, admin controls, audit workflows, and dedicated confidential inference support.

Custom Pricing

Everything in Developer
Enterprise virtual-key policies
Compliance support and audit evidence
Dedicated support and invoicing
Deployment architecture review

Talk to Sales

Pro

For professionals who want a private chat experience backed by confidential AI infrastructure.

$35/month (billed annually)

Confidential chat access
Broad model catalog
Encrypted conversation workflows
File upload in the chat app
Verification UX in product

Get Pro

Free

Try the chat product and see confidential model access before adding payment.

Get Started

Private chat access
Limited daily usage
E2EE-supported workflows
Web access

Try the Chat

Ready to Build AI People Trust?

Schedule a demo to see how RedPill can secure your AI use cases.

Schedule a Demo | View Pricing

Frequently Asked Questions

Practical details for building with verifiable confidential inference

Enterprise Ready

Complete Compliance & Security Readiness

RedPill meets the highest security and compliance standards for regulated industries. Our TEE-based architecture ensures your data stays protected while meeting enterprise requirements.

SOC 2 Type II Certified

Annual audits verify our security controls for data protection, availability, and confidentiality meet enterprise standards.

HIPAA Compliant

Healthcare organizations can safely use RedPill with protected health information. BAAs available for enterprise customers.

GDPR & ISO 27001

Full compliance with EU data protection regulations and internationally recognized information security management.

Get an API key in 2 minutes

Start with the OpenAI SDK, choose a confidential model, and add verifier checks when you need audit-grade proof.

Get API Key

The confidential AI cloud: verifiable inference with attestation reports, signed receipts, audit sessions, and E2EE paths.

Frequently Asked Questions

How do I integrate RedPill into my app?

How do I verify a response?

What does the signed receipt prove?

What does ACI E2EE add beyond TLS?

Can I set budgets and rate limits per API key?

How does pricing work for cached tokens?

Can I pay with crypto?
