Faultlines documentation
One scan maps every feature from git history, scores it, and serves it to your team and your AI agent. Here's how to install it, run it, wire it into your editor, and read every number it emits.
Getting started
Faultlines is two pieces that share one artifact. The engine scans a repo and writes a feature-map JSON; the MCP server serves that JSON to your AI agent. Install both, scan once, point your editor at the MCP, and your agent stops guessing about your code.
Install the MCP
The MCP server is now a standalone, engine-free package — install it on its own when all you need is the agent toolkit:
pip install faultlines-mcp
Want the scanner too? The engine ships the MCP as an extra, so a single install gives you both the CLI and the server:
pip install 'faultlines[mcp]'
Run a scan
Point the engine at any local repo. The full flag set emits coverage, flow participants, and classifications alongside the feature map:
faultlines analyze ~/my-project --llm --flows --symbols --trace-flows
The result lands at ~/.faultline/feature-map-<slug>.json, the same artifact the MCP reads. After every push, an incremental scan reruns in about 5 seconds for cents in LLM cost, so your agent's context never goes stale (PR branches get their own current scan, separate from main).
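To check which scan artifacts exist on a machine, a small helper can list them. This is a minimal sketch that relies only on the file location and naming pattern stated above; the helper names are hypothetical, and because the JSON schema is not described here, the second function only reports top-level keys rather than assuming any field names.

```python
import json
from pathlib import Path


def find_feature_maps(base_dir: Path) -> list[Path]:
    """Return feature-map-*.json files in base_dir, most recently modified first."""
    candidates = base_dir.glob("feature-map-*.json")
    return sorted(candidates, key=lambda p: p.stat().st_mtime, reverse=True)


def top_level_keys(path: Path) -> list[str]:
    """Load one feature map and list its top-level keys without assuming a schema."""
    data = json.loads(path.read_text(encoding="utf-8"))
    return sorted(data) if isinstance(data, dict) else []
```

Calling find_feature_maps(Path.home() / ".faultline") returns an empty list when no scan has run yet.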
Connect your agent
Any MCP-compatible client works — Cursor, Claude Code, Cline, Aider,
Continue. Drop the server into the client's MCP config and restart it.
For Cursor that's ~/.cursor/mcp.json:
{
  "mcpServers": {
    "faultlines": {
      "command": "faultlines-mcp"
    }
  }
}
Then ask your agent “what features touch checkout?” or “what's the regression risk of this diff?” and it answers from the feature map instead of grepping blind.
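If the client config already lists other MCP servers, the entry should be merged in rather than overwriting the file. The following is a sketch under that assumption; the function names are hypothetical, and the only config shape it relies on is the mcpServers object shown above.

```python
import json
from pathlib import Path


def add_mcp_server(config: dict, name: str, command: str) -> dict:
    """Return a copy of config with one MCP server entry added or replaced."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"command": command}
    return {**config, "mcpServers": servers}


def update_config_file(path: Path, name: str = "faultlines",
                       command: str = "faultlines-mcp") -> None:
    """Merge the server entry into the JSON file at path, creating it if missing."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    merged = add_mcp_server(existing, name, command)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

For Cursor, the call would be update_config_file(Path.home() / ".cursor" / "mcp.json"), followed by a client restart.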
Deployment modes
The tool API is identical across three deployment modes — only the data path changes. Pick once at the org level; switch any time as your compliance bar shifts.
The MCP server runs in our cloud; your agent calls it over HTTPS with an org-scoped token. Lowest setup friction, and the only mode that works with hosted agents like Claude.ai web. Sentry + PostHog joined server-side.
The server runs as a local process on the developer's machine. It pulls encrypted scans from us, decrypts with your org key in memory, and serves the agent rich data we never see in plaintext. The org key never leaves the machine.
The entire stack — engine, dashboard, MCP — deploys as a Docker image inside your infrastructure. Agents call your internal endpoint; faultlines.dev only sees a license-check ping. Air-gapped and BYO-LLM-key supported.
Features vs Flows
Faultlines describes a codebase with two primitives. Every metric on this page is computed against one or both of them.
A feature is a capability of the product. Faultlines emits two grains. An engineering feature is a code-grounded module or package boundary (“auth-middleware”, “webhook-processor”, “qr-code-generator”): what an engineer would name from the repo's structure. A product feature is the customer-facing capability those roll up into (“Multi-factor authentication”, “Stripe Checkout”): what a PM would name from the marketing site.
A flow is a complete user-facing action threaded through the code, such as “Create Project”, “Reset Password”, or “Cancel Subscription”. Each flow has an entry-point file and line, its contributing files, and is attached to one or more features. A feature is what the code is; a flow is what a user does with it.
Metrics
One scan produces all of these. Each is explained the same way: what it means, why it matters (with the gradation thresholds), and how it's computed.
Health score
0–100 · per feature & flow
What it means: A single composite score that rolls bug-fix ratio, churn, and recency-weighted regression density into one number. Recent bug fixes count 2× because last-quarter pain predicts next-quarter pain.
Why it matters: It's the one number an EM can sort a backlog by. ≥70 is healthy (quietly works; deprioritise), 50–70 is caution (watch list), below 50 is firefighting (actively bleeding velocity). Features below 50 sit in the top decile of bug-fix density across every feature we've ever observed.
How it's computed: Empirically calibrated against ~2500 dev features across 17 trained repos. Pure git history: no LLM, no test runs.
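The gradation thresholds can be applied mechanically when sorting a backlog. This sketch only maps an already-computed score to the three bands named above; it does not reproduce the calibrated scoring itself, and the function name is hypothetical.

```python
def health_band(score: float) -> str:
    """Map a 0-100 health score to the documented band."""
    if not 0 <= score <= 100:
        raise ValueError(f"health score out of range: {score}")
    if score >= 70:
        return "healthy"
    if score >= 50:
        return "caution"
    return "firefighting"


def worst_first(scores: dict[str, float]) -> list[tuple[str, float, str]]:
    """Sort features by ascending health so firefighting items come first."""
    ordered = sorted(scores.items(), key=lambda item: item[1])
    return [(name, score, health_band(score)) for name, score in ordered]
```

A score of exactly 70 is treated as healthy and exactly 50 as caution, following the "≥70" and "below 50" wording.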
Bug-fix ratio
0–1 · rolling 100 commits
What it means: The fraction of recent commits on a feature classified as fixes, matched by commit-message regex (fix:, bug, hotfix, regression).
Why it matters: Under 0.15 is normal maintenance noise; 0.15–0.30 means something is structurally off; above 0.30 means most commits here are firefighting. Industry-typical SaaS runs 8–15%, so anything sustained above that is structural, not transient.
How it's computed: Pure git over the rolling window of recent commits per feature. No LLM.
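The ratio itself is simple to sketch. The exact regex the engine uses is not given, so the pattern below is an assumption built from the four listed keywords, and the function name is hypothetical. Messages are expected newest first, already scoped to one feature.

```python
import re

# Assumed pattern: built from the keywords listed above, not the engine's actual regex.
FIX_PATTERN = re.compile(r"\bfix:|\bbug\b|\bhotfix\b|\bregression\b", re.IGNORECASE)


def bug_fix_ratio(messages: list[str], window: int = 100) -> float:
    """Fraction of the most recent `window` commit messages that look like fixes."""
    recent = messages[:window]
    if not recent:
        return 0.0
    fixes = sum(1 for message in recent if FIX_PATTERN.search(message))
    return fixes / len(recent)
```

With this pattern a conventional-commit subject such as "fix(auth): ..." would not match, which shows how sensitive the metric is to the regex chosen.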
Churn
commits / 100 · per feature
What it means: How much engineering attention a feature absorbs, expressed as the number of commits touching that feature per 100 commits of repo history.
Why it matters: Under 2 is cold (stable or dead code), 2–8 is normal, above 8 is a hotspot, a disproportionate cost centre. The top 10% of features absorb 40%+ of commits, the classic Pareto curve. High churn + low health = split; high churn + high health = freeze the API and protect with contract tests.
How it's computed: Scale-invariant percentile cut across every corpus repo, so the thresholds hold regardless of repo size.
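The churn bands and the churn-versus-health rule of thumb can be combined into one triage helper. This is a sketch: the function names are hypothetical, and the cut-offs for "low" and "high" health (below 50 and at least 70) are borrowed from the health score bands as an assumption, since the rule above does not state them.

```python
def churn_per_100(feature_commits: int, total_commits: int) -> float:
    """Commits touching the feature per 100 commits of repo history."""
    if total_commits <= 0:
        raise ValueError("total_commits must be positive")
    return 100 * feature_commits / total_commits


def churn_band(churn: float) -> str:
    """Under 2 is cold, 2-8 is normal, above 8 is a hotspot."""
    if churn < 2:
        return "cold"
    if churn <= 8:
        return "normal"
    return "hotspot"


def triage(churn: float, health: float) -> str:
    """Apply the high-churn rule; health cut-offs are assumed from the health bands."""
    if churn_band(churn) != "hotspot":
        return "no action"
    if health < 50:
        return "split"
    if health >= 70:
        return "freeze API, add contract tests"
    return "watch"
```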
Impact score
0–100 · blast radius
What it means: How many other features depend on this one via imports, shared files, and co-change. Touching high-impact code ripples downstream.
Why it matters: ≥70 is critical (a change here ripples across the product), 40–70 is connective (touches multiple flows), under 40 is a safe-to-refactor leaf. Impact ≥70 + low coverage is the case for mandatory senior review; wire it to CODEOWNERS.
How it's computed: Graph-centrality percentiles across the feature dependency graph. The top quartile is the "Stripe webhook handler" shape: touched by everything.
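The senior-review rule can be expressed as a filter over already-scored features. In this sketch the record shape and function names are hypothetical, and "low coverage" is assumed to mean a coverage score under 40 (the "uncovered" band), which the rule above does not spell out.

```python
def impact_band(impact: float) -> str:
    """Map a 0-100 impact score to the documented band."""
    if impact >= 70:
        return "critical"
    if impact >= 40:
        return "connective"
    return "leaf"


def needs_senior_review(features: list[dict], coverage_floor: float = 40) -> list[str]:
    """Names of features with critical impact and coverage below the assumed floor."""
    return [
        f["name"]
        for f in features
        if impact_band(f["impact"]) == "critical" and f["coverage"] < coverage_floor
    ]
```

The resulting names are the candidates for a CODEOWNERS rule; mapping a feature to its paths would use the feature's file list.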
Coverage
0–100 · behavioral + lcov merge
What it means: Per-feature / per-flow test reach. It answers "is this feature tested?", not "did some line execute?". Most OSS repos ship no lcov report, so we estimate behavioral coverage from git history with zero setup, then merge real lcov line coverage on top if you upload it.
Why it matters: ≥70 is covered (regressions surface in CI), 40–70 is partial (happy path only; edge cases ship to prod), under 40 is uncovered. Features below 40 correlate with 3× higher post-deploy incident rate. Don't refactor a feature below 40; write tests first.
How it's computed: A 7-signal composite with zero LLM calls: co_change (0.35) + bug_fix_test (0.25) + freshness (0.15) + density (0.10) + co_author (0.10) + ci_workflow (0.05) + commit_msg (0.05). Each signal and the composite are clamped to [0, 1]. Every number carries a confidence band: high (5+ signals fired or lcov merged), medium (3–4), or low (≤2), so a noisy read can't masquerade as ground truth.
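The weighted sum and confidence bands can be written out directly. The weights below are the ones listed above; note that they add up to 1.05, so the final clamp does real work when every signal is saturated. How each signal is derived from git history is not covered here, and two details are assumptions of this sketch: a signal counts as "fired" when its value is above zero, and the [0, 1] composite is scaled by 100 to match the 0–100 range.

```python
WEIGHTS = {
    "co_change": 0.35,
    "bug_fix_test": 0.25,
    "freshness": 0.15,
    "density": 0.10,
    "co_author": 0.10,
    "ci_workflow": 0.05,
    "commit_msg": 0.05,
}


def clamp01(value: float) -> float:
    """Clamp a value to the closed interval [0, 1]."""
    return max(0.0, min(1.0, value))


def behavioral_coverage(signals: dict[str, float]) -> float:
    """Weighted sum of clamped signals, itself clamped to [0, 1]."""
    total = sum(w * clamp01(signals.get(name, 0.0)) for name, w in WEIGHTS.items())
    return clamp01(total)


def confidence_band(signals: dict[str, float], lcov_merged: bool = False) -> str:
    """high: 5+ signals fired or lcov merged; medium: 3-4; low: 2 or fewer."""
    fired = sum(1 for name in WEIGHTS if signals.get(name, 0.0) > 0)  # assumed rule
    if lcov_merged or fired >= 5:
        return "high"
    if fired >= 3:
        return "medium"
    return "low"


def coverage_score(signals: dict[str, float]) -> float:
    """Scale the composite to 0-100 (assumed mapping)."""
    return 100 * behavioral_coverage(signals)
```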
Ownership / bus factor
distinct authors · 90d
What it means: The count of distinct authors who touched a feature in the last 90 days. Bus factor 1 means one person knows this code: if they leave, you bleed.
Why it matters: ≥3 is resilient (safe to lose any single author), 2 is fragile (pair-program the next change), 1 is critical (single point of failure). Sub-3 ownership correlates with 2–4 week onboarding delays per incident on the affected feature, and "8 critical features owned by one person" is the conversation that gets headcount approved.
How it's computed: Derived from git blame / author history over the trailing 90 days, scoped per feature. No LLM.
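Counting distinct authors over a trailing window is straightforward once commits are scoped to a feature. This sketch takes (author, timestamp) pairs as input; how they are extracted from git is left out, the function names are hypothetical, and the label for zero recent authors is an assumption because the bands above start at 1.

```python
from datetime import datetime, timedelta


def bus_factor(commits: list[tuple[str, datetime]], now: datetime,
               window_days: int = 90) -> int:
    """Number of distinct authors with a commit inside the trailing window."""
    cutoff = now - timedelta(days=window_days)
    return len({author for author, when in commits if when >= cutoff})


def ownership_band(authors: int) -> str:
    """3 or more is resilient, 2 is fragile, 1 is critical."""
    if authors >= 3:
        return "resilient"
    if authors == 2:
        return "fragile"
    if authors == 1:
        return "critical"
    return "no recent authors"  # assumed label, not a documented band
```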
MCP tools
The MCP server registers 13 tools. They read precomputed scan fields — the engine does the work at scan time, the server just serves it. The same 13 names exist in every deployment mode.
list_features: Every feature with name, display name, path count, health, and coverage.
find_feature: Fuzzy-match a single feature by name or alias; returns its paths, flows, health, coverage, impact, and description.
get_feature_files: The file paths that make up a feature.
get_flow_files: The file paths that make up one flow inside a feature.
get_repo_summary: Repo-level counts (features, flows, files), top hotspots, average coverage, and the scan timestamp.
get_hotspots: The highest bug-fix-ratio / churn files or features (limit defaults to 5).
get_feature_owners: Owners and bus factor for a feature.
analyze_change_impact: Blast radius for a set of changed files: which features and flows they touch, co-changed-but-missing files, and recommendations. Engine-free path overlap.
get_regression_risk: A low / medium / high / critical risk level from bug-fix ratio weighted by path overlap of the touched features.
find_symbols_in_flow: Precise functions and classes per file for a flow, with line ranges and deeplinks (falls back to file paths when symbols are absent).
find_symbols_for_feature: The same symbol-level detail, aggregated across all of a feature's flows.
get_feature_errors: Production errors mapped to a feature via Sentry (hosted); returns a graceful unavailable result on local MCP.
get_feature_pageviews: PostHog traffic and usage for a feature (hosted); graceful unavailable result on local MCP.
Integrations
Optional integrations enrich every feature with how it actually behaves in production. Read-only tokens, aggregate data, no SDK changes.
Sentry joins production errors to features. We see per-feature error counts over 24h and 14d, plus a regression flag that fires when the 7-day rate is ≥ 2× the prior 7 days. Zero SDK changes: we read stack-frame filenames and the release commit SHA and join through the scan's path index. Counts and event IDs only, never bodies, never PII.
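The regression flag reduces to a comparison of two 7-day windows. In this sketch the function name is hypothetical, and the handling of a zero baseline is an assumption: taken literally, "≥ 2× the prior 7 days" would fire on zero errors after zero errors, so the sketch also requires at least one error in the current window.

```python
def is_regression(last_7d_errors: int, prior_7d_errors: int) -> bool:
    """True when the current 7-day count is at least double the prior 7-day count."""
    if last_7d_errors < 0 or prior_7d_errors < 0:
        raise ValueError("error counts must be non-negative")
    # Assumed guard: a window with no errors never counts as a regression.
    return last_7d_errors > 0 and last_7d_errors >= 2 * prior_7d_errors
```

Because both windows are the same length, comparing raw counts is equivalent to comparing rates.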
PostHog joins product usage to features. Page views and URL-tagged events map to features via the same route extractors that build the feature map, giving per-feature traffic share and a 14-day event count. Aggregate only: no per-user data, no PII. One env var enables it: NEXT_PUBLIC_POSTHOG_APP_VERSION=$VERCEL_GIT_COMMIT_SHA.
Pushes the signal where the team already is. A weekly digest summarises top-risk features, coverage gaps, hotspots, and runtime regressions; an instant alert fires when a 🔴 high-risk PR lands. Wire the Sentry regression flag to the digest to auto-tag the on-call.
Security & privacy
Three privacy modes (Standard, Private, Sovereign), per-org KMS envelope encryption, and customer-managed keys. In Private and Sovereign modes feature names and paths are encrypted, and your source code is never written to our disk in plaintext. Runtime integrations stay aggregate — counts and event IDs, never request bodies or PII.
Read the full security & privacy page