THE CODEBASE INTELLIGENCE LAYER FOR THE AI ERA · OPEN SOURCE · HOSTED

Context your AI agent can use.
Signals your team can trust.

Index any repo once. Your AI agents get architecture, ownership, and decisions they can actually query. Your team gets the health, risk, and provenance to trust what those agents ship. Open source and self-hostable.

See repowise dogfooded on itself → · View all repos →

2.3k

GitHub stars

2.1k

Repos indexed

9

MCP tools

<30s

Incremental update

-96% tokens to load context

2,391 vs 64,039 on the same task. ~27x fewer. Answer quality at parity.

2.3x more defects caught per review

vs a leading commercial tool under the same review budget, same 2,770 files.

0.74 cross-project ROC AUC

The only code-health score validated against real defects. Up to 0.90 per repo.

Measured on real repositories. Every number is reproducible on your own codebase.

WHO IT'S FOR

One index. Every team it serves.

repowise indexes your repo once and serves the agents writing your code and the humans accountable for it. Find the path built for you.

developers

Give Claude Code, Cursor, and any MCP client a queryable model of your repo.


team leads

Flag the risky PRs, the hotspots, and the hidden coupling, on every pull request.


engineering leaders

See how much of your code AI wrote, whether it is healthy, and who owns it.


security

CVE triage that knows whether you actually call the vulnerable code.


ONE INDEX, TWO AUDIENCES

Built for the agents writing your code,
and the humans accountable for it.

repowise indexes your repo once. The same intelligence makes your agents smarter and your code trustworthy.

FOR YOUR AI AGENTS

Context they can use

AI context (MCP)

Nine task-shaped tools. Curated answers, not raw file dumps.

Auto wiki

A documented codebase that rebuilds itself on every commit.

Architecture (C4)

System context to components, from the real dependency graph.

Decisions

The why behind the code, mined from eight sources.

FOR YOUR TEAM

Signals you can trust

Code health

A 1 to 10 score per file, proven to predict real bugs.

Change risk

Know which PRs will break things before you merge.

Agent provenance

How much of your code AI wrote, and whether it is healthy.

Git intelligence

Hotspots, ownership, hidden coupling, and bus factor.

01 HOW IT WORKS

One engine, three interfaces

Install once. Choose the interface that fits your workflow — or use all three. They share the same data, the same intelligence, the same stores.

GETTING STARTED

CLI

For the solo developer

GETTING STARTED

$ pip install repowise

$ repowise init .

$ repowise update

$ repowise serve

MCP Server

For AI-native workflows

GETTING STARTED

$ repowise mcp

get_overview()

get_context(["src/auth"])

get_risk(["payments.py"])

Web UI

For the whole team

GETTING STARTED

$ repowise serve

localhost:7337/wiki

localhost:7337/graph

localhost:7337/hotspots

02 INTELLIGENCE LAYERS

Most tools answer one question.
repowise answers five.

Graph structure, code health, git history, generated documentation, and architectural decisions — five layers that compound into genuine understanding.

GRAPH

Every dependency, ranked and traced

Tree-sitter ASTs across 10+ languages → directed dependency graph
PageRank and betweenness centrality surface critical symbols
Edge types: imports, calls, inherits, implements, co-changes
Scales to 30K+ nodes with automatic SQLite-backed graph
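
As a rough illustration of the ranking step, the sketch below runs PageRank by power iteration over a handful of import edges. It uses only the Python standard library. The `pagerank` helper and the edge list are hypothetical and are not repowise's implementation or API.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over (importer, imported) dependency edges."""
    nodes = sorted({name for edge in edges for name in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    rank = {name: 1.0 / count for name in nodes}
    for _ in range(iterations):
        # Rank held by files with no outgoing edges is spread evenly.
        dangling = sum(rank[name] for name in nodes if not out_links[name])
        new_rank = {
            name: (1.0 - damping) / count + damping * dangling / count
            for name in nodes
        }
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical import edges, read as "importer -> imported".
edges = [
    ("api/routes.py", "core/models.py"),
    ("api/routes.py", "auth/middleware.py"),
    ("auth/middleware.py", "core/models.py"),
    ("payments/processor.py", "core/models.py"),
]
ranks = pagerank(edges)
most_central = max(ranks, key=ranks.get)
print(most_central)
```

Files that many others depend on accumulate rank, which is why a shared module such as `core/models.py` surfaces first in this toy graph.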

CODE HEALTH

It knows which file breaks next — before it does

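State-of-the-art accuracy: a cross-project ROC AUC of about 0.74 at calling which files are headed for a bug. Given one file that later had a defect and one that did not, the score ranks the defective file as riskier roughly 74% of the time. On the same code and the same real defects, it matches or beats the best commercial tools and published academic models.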
One 1–10 score from 25 deterministic signals: tangled complexity, hidden coupling, missing tests, runaway churn, fragile ownership. No LLM, no cloud — under 30 seconds on a 3,000-file repo.
The weights are learned from a real defect corpus, not hand-tuned — so it out-predicts “what changed recently” and “what broke before” by 10+ points, and matches published academic defect models on benchmarks it never saw.
Ranks what to fix first by impact-for-effort, and alerts you the moment a file's health starts sliding.

payments/processor.py ⚠ LIKELY TO BREAK NEXT

3.1 / 10 health

brain method · untested hotspot · churn 96%ile · 3 owners

auth/middleware.py 4.6

api/routes.py 7.2

core/models.py 8.9

~0.74 cross-project ROC AUC at calling the file a bug lands in
validated on 21 real projects across 9 languages

GIT

History that writes the documentation

Hotspot detection — top 25% churn + complexity files flagged
Co-change partners: files that change together without imports
Ownership from git blame — primary owner + top 3 contributors
Significant commits filtered into generation prompts
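
The hotspot and co-change rules above can be sketched in a few lines. This is a minimal illustration, assuming "top 25% churn + complexity" means a file sits in the top quartile of both metrics, and that a co-change partner is any pair of files committed together repeatedly with no import edge between them. The function names, thresholds, and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def top_quartile(metric):
    """Names whose value falls in the top 25% of the mapping (at least one)."""
    ranked = sorted(metric, key=metric.get, reverse=True)
    return set(ranked[: max(1, len(ranked) // 4)])


def hotspots(churn, complexity):
    """Files in the top quartile for both churn and complexity."""
    return top_quartile(churn) & top_quartile(complexity)


def co_change_partners(commits, import_edges, min_count=2):
    """Pairs that change together at least min_count times with no import edge."""
    counts = Counter()
    for files in commits:
        for pair in combinations(sorted(set(files)), 2):
            counts[pair] += 1
    return {
        pair: n
        for pair, n in counts.items()
        if n >= min_count
        and pair not in import_edges
        and pair[::-1] not in import_edges
    }


# Hypothetical per-file metrics and commit history.
churn = {"graph.py": 21, "orchestrator.py": 9, "cli.py": 3, "utils.py": 2}
complexity = {"graph.py": 40, "orchestrator.py": 18, "cli.py": 12, "utils.py": 5}
commits = [
    ["graph.py", "orchestrator.py"],
    ["graph.py", "orchestrator.py", "cli.py"],
    ["cli.py", "utils.py"],
    ["cli.py", "utils.py"],
]
import_edges = {("cli.py", "utils.py")}  # cli.py imports utils.py

flagged = hotspots(churn, complexity)
hidden = co_change_partners(commits, import_edges)
print(flagged, hidden)
```

Here `cli.py` and `utils.py` also change together, but the import edge explains that coupling, so only the `graph.py` and `orchestrator.py` pair is reported as hidden.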

DOCS

Wiki pages that stay fresh

9-level hierarchical generation: symbols → files → modules → repo
Confidence scoring with git-informed decay — stale pages auto-regenerate
RAG context via LanceDB or pgvector — each page knows its imports
Resumable, crash-safe, idempotent — checkpoint after every page
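
One way to picture git-informed decay is a confidence value that halves after a fixed number of commits touch the files a page documents. The sketch below is an assumption for illustration only: the half-life, the threshold, and both function names are hypothetical, and the actual decay rule is not described on this page.

```python
def page_confidence(initial, commits_since_generation, half_life_commits=10):
    """Exponential decay: confidence halves every half_life_commits commits."""
    return initial * 0.5 ** (commits_since_generation / half_life_commits)


def needs_regeneration(confidence, threshold=0.5):
    """A page is treated as stale once confidence drops below the threshold."""
    return confidence < threshold


fresh = page_confidence(0.9, commits_since_generation=0)
aged = page_confidence(0.9, commits_since_generation=10)
print(fresh, needs_regeneration(fresh))
print(aged, needs_regeneration(aged))
```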

DECISIONS

The why behind your architecture

4 capture sources: inline markers, git archaeology, README mining, CLI
Staleness tracking — decisions age when governed files get commits
get_why() searches decisions before you change anything
Health dashboard: stale decisions, ungoverned hotspots, proposed reviews

THE AI-DEBT RADAR

AI writes half your code.
Can you trust it?

repowise attributes commits to the agents that wrote them, then shows which of that code is a low-health hotspot owned by a single person. From your git history alone. No IDE plugins, no developer surveillance. It is the one view no other tool puts on one screen.

See agent provenance · For engineering leaders

~42% (industry)

of code is AI-written industry-wide in 2026

~1.7x (industry)

more issues in AI code than human code

0.74

ROC AUC: the health score that flags the risky parts

04 ON EVERY PR

Code health intelligence on every PR

The Repowise PR Bot is a GitHub App that posts one deterministic comment per pull request — hotspots, hidden coupling, declining health, dead code. Zero LLM calls. Green PRs stay silent. Free for public/OSS repos; private repos require the Pro plan.

One comment per PR. Edited in place on re-pushes.
Silence rule. Stays quiet unless health degrades, a hotspot is touched, a co-change partner is missing, or dead code shifts.
Zero LLM cost. Pure tree-sitter, NetworkX, the 25-biomarker scorer.
Free forever for OSS. Private repos unlock with the Pro plan.

See the bot · Install on GitHub

repowise-bot · commented on PR #215

⚠️ Health: 7.0 → 6.8 (-0.2)

graph.py 3.3 → 2.2 ▼ -1.1

untested hotspot, brain method, nested complexity

🔥 Hotspot touched

graph.py — 21 commits/90d, 13 dependents
primary owner: Raghav (62%)

🔗 Hidden coupling

graph.py co-changes with orchestrator.py (8×), not in this PR.

📊 Full report · Star Repowise

03 MCP INTEGRATION

9 tools your AI agent already knows how to call

get_overview(): Architecture summary, module map, entry points, tech stack.
get_answer(): One-call RAG Q&A. Retrieves over the wiki, gates on confidence, returns a cited 2–5 sentence answer.
get_context(): Docs, ownership, history, decisions, freshness for files, modules, or symbols. Pass multiple targets in one call.
get_symbol(): Raw source bytes for one indexed symbol with exact line bounds. Cheaper and safer than Read + offset math.
search_codebase(): Semantic search over the full wiki using LanceDB or pgvector. Natural language queries.
get_risk(): Hotspot score, dependents, co-change partners, risk summary. Also returns top 5 global hotspots.
get_why(): Three modes: natural language search over decisions, path-based lookup, or health dashboard.
get_dead_code(): Unreachable files, unused exports, zombie packages, sorted by confidence and cleanup impact.
get_health(): Per-file health scores from 25 deterministic biomarkers, the worst files, and ranked refactoring targets.

SUPPORTED EDITORS

Claude Code · Cursor · Cline · Any MCP client

04 EDITOR INTELLIGENCE

CLAUDE.MD

CLAUDE.md that writes itself

Architecture overview from the real dependency graph
Hotspot warnings with churn metrics and owners
Key design decisions and architectural constraints
Dead code summary with confidence scores
Entry points, build commands, and tech stack
Also generates cursor.md — same data, different format

CLAUDE.md

repowise update

✦ AUTO-GENERATED BY REPOWISE

# CLAUDE.md

## Architecture
Monorepo · Python + Next.js
3 packages · 0 circular deps

## Hotspots
graph-flow.tsx 99th %ile
init_cmd.py 97th %ile

## Commands
Build: npm run build
Test: npm run test

05 HOW WE COMPARE

The full picture, side by side

Auto-generated docs, git intelligence, decision records, and MCP tools — one package
Open-source (AGPL-3.0) and fully self-hostable
17/17 features vs 4-6/17 for any single competitor

Feature	repowise	Google Code Wiki	DeepWiki	CodeScene	Sourcegraph
Self-hostable OSS	✓	—	—	—	—
Works with private repos	✓	—	✓	✓	✓
Auto-generated wiki (LLM)	✓	✓	✓	—	—
Git intelligence (hotspots / ownership / co-changes)	✓	—	—	✓	—
Dead code detection	✓	—	—	—	—
Architectural decision records	✓	—	—	—	—
MCP server for AI agents	✓	—	—	—	—
Semantic search	✓	✓	✓	—	✓
Doc freshness / confidence scoring	✓	—	—	—	—
CLAUDE.md auto-generation	✓	—	—	—	—
Codebase chat (agentic)	✓	✓	✓	—	—
Dependency graph visualization	✓	✓	✓	✓	✓
PR review bot (code-health comments)	✓	—	—	✓	—
Code-health score (per file)	✓	—	—	✓	—
AI code provenance (agent attribution)	✓	—	—	—	—
Provider choice (4 LLM providers)	✓	—	—	—	—
Privacy (code never leaves your infra)	✓	—	—	✓	✓

repowise: 17/17 · Google Code Wiki: 4/17 · DeepWiki: 5/17 · CodeScene: 6/17 · Sourcegraph: 4/17

Self-assessed against publicly documented features as of May 2026. Vendor capabilities change — please verify before committing to any tool.

See full side-by-side comparisons→

BUILT TO BE TRUSTED

No black box. No hand-waving.

Open source

AGPL-3.0. Every heuristic, biomarker, and scoring rule is public and inspectable.

Self-hosted, private

Run it on your own infrastructure. Bring your own API key or go fully offline. Code never leaves your machine.

Deterministic

Zero LLM in scoring. The same input produces the same output, every time. EU AI Act high-risk obligations do not apply.

Reproducible

Every benchmark on this site is measured on real repositories and reproducible on your own.

FREQUENTLY ASKED

Questions, answered

What is repowise?

repowise is a codebase intelligence layer for the AI era. It indexes your repo once and serves both your AI coding agents (an architecture-aware wiki, dependency graph, decisions, and nine MCP tools) and the humans accountable for the code (a defect-validated code-health score, change risk, git intelligence, and agent provenance). It is open source and self-hostable.

Is repowise free and open source?

Yes. The core engine is open source under AGPL-3.0 and runs 100% locally: pip install repowise, bring your own API key, or run fully offline with a local model. There are paid hosted tiers for teams that want zero-ops hosting, private-repo PR comments, and managed re-indexing.

Which AI agents and editors does it work with?

repowise exposes your codebase over the Model Context Protocol, so it works with Claude Code, Cursor, Cline, and Codex, plus any other MCP-compatible client. One index serves every agent.

How much does repowise reduce my agent's token usage?

On paired benchmarks on real repositories, loading context through repowise used 96% fewer tokens (2,391 vs 64,039, about 27 times fewer), with 89% fewer file reads and 70% fewer tool calls, at answer quality on par with raw file exploration.
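
The two headline figures follow from the same pair of token counts. A short check of the arithmetic, using only the numbers quoted above:

```python
baseline_tokens = 64_039   # raw file exploration
repowise_tokens = 2_391    # context loaded through repowise

reduction = 1 - repowise_tokens / baseline_tokens
ratio = baseline_tokens / repowise_tokens

print(f"{reduction:.1%} fewer tokens, {ratio:.1f}x fewer")
```

Rounded, that gives the 96% reduction and the roughly 27x ratio cited in the answer.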

Does my source code leave my machine?

No. repowise is self-hosted with zero telemetry. Source is processed transiently and never persisted, and you can bring your own LLM key or run fully offline via a local model. What is stored is the graph, non-reversible embeddings, generated wiki pages, and git metadata.

Does the code-health score actually predict bugs?

Yes, and it is validated. Across 21 repos and 9 languages the cross-project ROC AUC is 0.74 (up to 0.90 per repo), and ranking by repowise health surfaces 2.3x the defects of a leading commercial tool under the same review budget. Every heuristic is open source so you can reproduce it on your own repo.
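
For readers who want to check a figure like this on their own repo, ROC AUC can be computed directly from scores and defect labels: it is the share of (defective, clean) file pairs in which the defective file is ranked as riskier. The sketch below is a minimal standard-library version. The health values are the four shown on this page, the defect labels are hypothetical, and `10 - health` is an assumed way to turn a health score into a risk score.

```python
def roc_auc(risk_scores, labels):
    """Share of (defective, clean) pairs where the defective file scores higher.

    Ties count as half a win. Labels are 1 for defective, 0 for clean.
    """
    positives = [s for s, y in zip(risk_scores, labels) if y == 1]
    negatives = [s for s, y in zip(risk_scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one defective and one clean file")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


health = [3.1, 4.6, 7.2, 8.9]   # 1 to 10, lower means less healthy
had_defect = [1, 0, 1, 0]       # hypothetical labels for illustration
risk = [10 - h for h in health]  # assumed conversion: low health = high risk

auc = roc_auc(risk, had_defect)
print(auc)
```

An AUC of 0.5 is no better than chance and 1.0 is a perfect ranking, so 0.74 means the defective file is ranked as riskier in about three of every four such pairs.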

Which languages are supported?

Fifteen languages across the headline tiers, with full pipeline depth for Python, TypeScript, JavaScript, Java, Kotlin, Go, Rust, C++, and C#, including framework-aware route-to-handler edges for the major web frameworks.

How does repowise stay up to date as I keep committing?

It updates incrementally. A post-commit hook, file watcher, or webhook re-indexes only what changed, typically a handful of wiki pages in seconds, and every MCP response carries a staleness envelope that warns when the index has diverged from HEAD.
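
A staleness envelope can be pictured as a small wrapper that compares the commit the index was built from with the current HEAD. The sketch below is illustrative only: the field names, the `with_staleness` helper, and the commit hashes are hypothetical and do not describe repowise's actual response format.

```python
def with_staleness(result, indexed_commit, head_commit, commits_behind=0):
    """Attach a staleness envelope to a tool response (hypothetical fields)."""
    stale = indexed_commit != head_commit
    envelope = {
        "indexed_commit": indexed_commit,
        "head_commit": head_commit,
        "stale": stale,
    }
    if stale:
        # Only warn when the index has diverged from HEAD.
        envelope["warning"] = f"index is {commits_behind} commit(s) behind HEAD"
    return {"result": result, "staleness": envelope}


fresh = with_staleness({"summary": "..."}, "a1b2c3d", "a1b2c3d")
behind = with_staleness({"summary": "..."}, "a1b2c3d", "e4f5a6b", commits_behind=2)
print(fresh["staleness"])
print(behind["staleness"])
```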

06 FROM THE BLOG

Guides, comparisons, and deep dives

engineering · 11 min read

Is AI-written code buggier than human code? We blamed 112,000 commits to find out

We git-blamed 112,382 commits across 28 repos to test whether AI-agent code introduces more bugs than human code. After controlling for size, it doesn't, and its lines last longer.

engineering · 14 min read

Does our code-health score actually predict bugs? A leakage-free benchmark

We scored 21 repos six months before their bugs landed to test whether a deterministic code-health score predicts defects. AUC 0.737, and the honest caveats.

engineering · 12 min read

Process metrics beat structural metrics for predicting defects

Complexity and code smells are the metrics everyone reaches for. Across 25 biomarkers and 21 repos, the strongest defect predictors were evolutionary, not structural. Here's the data.

View all posts→

07 GET IN TOUCH

Three paths to codebase intelligence

Self-host — free, forever
- pip install repowise — your machine, your server, your CI
- AGPL-3.0 · full feature set · code never leaves your infra
Hosted SaaS — live now
- Managed indexing · team workspaces · semantic chat
- Pro at $15/mo with LLM credits included · Sign up free →
Enterprise
- On-prem · SSO · role-based access · dedicated support · SLAs
