REST API and webhook service for extracting text, metadata, tables, and code intelligence from documents. Wraps the Kreuzberg extraction core with multi-tenant project isolation, presigned uploads, signed webhook delivery, and Stripe-metered billing.
10,000 free pages on signup at kreuzberg.dev/login.
The kreuzberg-cloud plugin is available via the kreuzberg-dev/plugins marketplace.
/plugin marketplace add kreuzberg-dev/plugins
/plugin install kreuzberg-cloud@kreuzberg
v0.1.0 ships skills only; MCP tool support arrives in v0.2.0. Works with Claude Code, Codex, Cursor, Gemini CLI, Factory Droid, GitHub Copilot CLI, and opencode. See the marketplace README for harness-specific install instructions.
- REST API with presigned uploads, polling, and bulk job submission.
- RAG: document collections, ingest pipelines, hybrid retrieval with reranking, and embedding management.
- Signed webhook delivery (HMAC-SHA256) for completion and failure events.
- Multi-tenant project isolation enforced at the PostgreSQL row level.
- Code-aware extraction across 306 programming languages via tree-sitter.
- BUSL-1.1 source available; self-host on Kubernetes via Helm or use the managed service.
curl -X POST https://api.kreuzberg.dev/v1/extract \
-H "Authorization: Bearer kz_..." \
-F "file=@document.pdf" \
-F 'webhook={"url":""}'pip install kreuzberg-cloud-sdk # Python
pnpm add @kreuzberg/cloud # TypeScript
go get github.com/kreuzberg-dev/kreuzberg-cloud-sdk/go # Go
pub add kreuzberg_cloud_sdk # DartAPI reference and language guides: docs.kreuzberg.cloud.
Kreuzberg Cloud is available as a managed service at kreuzberg.dev/login, and as open-source software under the BUSL-1.1 license for on-premise and bring-your-own-cloud deployments. This repository contains everything needed to run Kreuzberg Cloud in your infrastructure: Helm charts, microservices, database schema, and observability dashboards. See docs/self-hosting.md for prerequisites, quick-start deployment instructions, authentication options, and customization checkpoints.
| Service | Purpose |
|---|---|
| api | Public REST API — job submission, polling, results, usage, thin RAG publisher |
| backend | Management API — projects, members, API keys, webhooks |
| billing | Stripe integration — metered billing and quota enforcement |
| rag | RAG ingest + retrieve consumer; owns reranker ONNX runtime and LLM egress |
| worker | Document processing via Kreuzberg; scales to zero with KEDA |
| webhook | Signed HTTP delivery of completion/failure events |
Details: docs/concepts/architecture.md. OpenAPI spec: services/api/spec/openapi.json. Security policy: SECURITY.md.
- Kreuzberg — document intelligence: text, tables, metadata from 91+ formats with optional OCR. Self-host counterpart to Kreuzberg Cloud.
- kreuzcrawl — web crawling and scraping with HTML→Markdown and headless-Chrome fallback.
- html-to-markdown — fast, lossless HTML→Markdown engine.
- liter-llm — universal LLM API client with native bindings for 14 languages and 143 providers.
- tree-sitter-language-pack — tree-sitter grammars and code-intelligence primitives.
- Discord — community, roadmap, announcements.
Business Source License 1.1. The source is available for review and non-production use. Production use requires a commercial license or use of the Kreuzberg-operated managed service.