Code Indexing MCP Server

Rust 100%


README.md

code-index

Fast structural code index for AI coding agents, exposed over MCP.

Five languages indexed today: Rust, Python, TypeScript / JavaScript, C#, PHP. Live updates via filesystem watcher. Symbols, references, imports, full-text search. Local SQLite, no network, no service.

Quick start

Grab a release archive from git.h-dv.de/h-dv/code-index/releases (linux-x86_64, linux-aarch64, windows-x86_64; bundles the three binaries plus a per-platform README and .code-index.toml.example), or build from source:

# build
cargo build --release

# initialise a project (one-off)
./target/release/code-index init
./target/release/code-index index

# health check
./target/release/code-index doctor

Wire into Claude Code (or any MCP client):

// claude_desktop_config.json or equivalent
{
  "mcpServers": {
    "code-index": {
      "command": "/path/to/code-index-mcp",
      "args": ["--root", "/path/to/your/project"]
    }
  }
}

That's it. The MCP server auto-spawns the daemon on first connect, and initialize returns in milliseconds regardless of project size (see I015). The daemon runs its initial reconciliation in the background and stays alive until it has been idle for 30 minutes (configurable) or your editor session ends. All ten MCP tools, four resources, and two prompts are available immediately. Calls made during the cold-start window either succeed, return state: "reconciling", or return a structured warming_up error that the agent retries automatically.
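For clients that do not retry on their own, the cold-start contract above could be handled roughly as follows. This is a minimal sketch, not code-index's client code: the `Outcome` enum, `call_with_retry`, and the doubling backoff are all hypothetical, and it assumes a `reconciling` response is usable (if possibly incomplete) while `warming_up` means "try again shortly".

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical model of the three cold-start outcomes described above.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome<T> {
    /// The call succeeded against a fully reconciled index.
    Ready(T),
    /// The call succeeded, but the response carried state: "reconciling".
    Reconciling(T),
    /// The structured warming_up error: nothing usable yet.
    WarmingUp,
}

/// Retry `call` while it reports warming_up, doubling the delay each time.
/// Returns `None` if every attempt was still warming up.
pub fn call_with_retry<T>(
    mut call: impl FnMut() -> Outcome<T>,
    max_attempts: u32,
    base_delay: Duration,
) -> Option<T> {
    let mut delay = base_delay;
    for attempt in 1..=max_attempts {
        match call() {
            // Assumption: a reconciling result is good enough to hand back.
            Outcome::Ready(v) | Outcome::Reconciling(v) => return Some(v),
            // Still warming up and attempts remain: back off, then retry.
            Outcome::WarmingUp if attempt < max_attempts => {
                sleep(delay);
                delay = delay.saturating_mul(2);
            }
            // Last attempt was still warming up: fall through and give up.
            Outcome::WarmingUp => {}
        }
    }
    None
}
```

A caller would wrap the actual tool invocation in the closure and pick `max_attempts` and `base_delay` to suit the project size.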

Project-scoped `.mcp.json`: gotcha

If you drop a per-project .mcp.json (instead of the user-scope claude_desktop_config.json shown above), Claude Code will not register the server retroactively in an already-running session. Symptom: code-index tools don't appear in the agent's tool list, and the agent silently does without them. Two paths out:

1. Restart Claude Code in the project dir and approve the prompt. When a project-scoped .mcp.json is detected for the first time, Claude Code asks the user to approve each declared server. Decline once and it stays declined.
2. Or pin code-index at user scope so it's available everywhere without per-project approval:
```
claude mcp add --scope user code-index code-index-mcp
claude mcp list   # verify
```

code-index doctor includes an mcp registration check that surfaces the most common cause ("project hasn't been opened in Claude Code yet") with a pointer to claude mcp list.

What the agent gets

- Handles, not blobs. Symbol responses carry IDs, locations, and signatures, never embedded source. The agent calls read_code only when it actually needs bytes.
- Paginated, token-bounded responses. Lists return {results, total, next_cursor}; read_code is hard-capped at 4000 tokens with a truncated: true flag on overflow.
- Structured errors. Errors arrive as {error, query?, did_you_mean[], hint?}, not as free-text strings.
- Live freshness. Edits, creates, renames, and deletes in your editor flow into the DB within a few hundred milliseconds. Atomic saves are coalesced.
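To make the paginated shape concrete, here is a small sketch of a client draining a list response page by page. The field names `results`, `total`, and `next_cursor` come from the list above; everything else is an assumption made for illustration: the `Page` struct, the `collect_all` helper, a string-typed cursor, and a missing `next_cursor` marking the last page.

```rust
/// Hypothetical in-memory mirror of a {results, total, next_cursor} response.
#[derive(Debug, Clone)]
pub struct Page<T> {
    pub results: Vec<T>,
    pub total: usize,
    /// Assumption: absent on the last page.
    pub next_cursor: Option<String>,
}

/// Follow `next_cursor` until it runs out or `max_pages` is reached.
/// `fetch` stands in for whatever issues the actual tool call.
pub fn collect_all<T>(
    mut fetch: impl FnMut(Option<&str>) -> Page<T>,
    max_pages: usize,
) -> Vec<T> {
    let mut out = Vec::new();
    let mut cursor: Option<String> = None;
    for _ in 0..max_pages {
        let page = fetch(cursor.as_deref());
        out.extend(page.results);
        cursor = page.next_cursor;
        if cursor.is_none() {
            break; // no further pages
        }
    }
    out
}
```

The `max_pages` bound is a design choice for agents: it keeps a runaway listing from consuming the whole token budget, which is the same concern the token cap on read_code addresses.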

The five languages

Language	Extensions	Plugin
Rust	`.rs`	tree-sitter-rust
Python	`.py`, `.pyi`, `.pyw`	tree-sitter-python
TypeScript / JavaScript	`.ts .tsx .js .jsx .mjs .cjs`	tree-sitter-typescript / -javascript
C#	`.cs` (skips `.Designer.cs`, `.g.cs`)	tree-sitter-c-sharp
PHP	`.php .php3 .php4 .php5 .phtml`	tree-sitter-php

Default-skipped directories: node_modules, vendor, __pycache__, .venv, .tox, dist, build, .next, .code-index, plus target/ (Rust roots only) and bin/ and obj/ (.NET roots only). Override via .code-index-ignore (same syntax as .gitignore).
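The extension table can be read as a simple lookup. The sketch below is illustrative only and is not taken from the code-index source: `Language` and `language_for` are hypothetical names, and matching extensions case-insensitively is an assumption the table does not state.

```rust
use std::path::Path;

/// The five indexed languages from the table above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Language {
    Rust,
    Python,
    TypeScriptJavaScript,
    CSharp,
    Php,
}

/// Map a file path to its language, or `None` if the file is not indexed.
pub fn language_for(path: &Path) -> Option<Language> {
    let name = path.file_name()?.to_str()?;
    // Assumption: extensions are compared case-insensitively.
    let lower = name.to_ascii_lowercase();
    // Generated C# files are skipped, per the table.
    if lower.ends_with(".designer.cs") || lower.ends_with(".g.cs") {
        return None;
    }
    let ext = Path::new(&lower).extension()?.to_str()?;
    match ext {
        "rs" => Some(Language::Rust),
        "py" | "pyi" | "pyw" => Some(Language::Python),
        "ts" | "tsx" | "js" | "jsx" | "mjs" | "cjs" => Some(Language::TypeScriptJavaScript),
        "cs" => Some(Language::CSharp),
        "php" | "php3" | "php4" | "php5" | "phtml" => Some(Language::Php),
        _ => None,
    }
}
```

Directory skipping (node_modules, target/, and so on) is a separate concern and is left out of this sketch.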

Binaries

Binary	Purpose
`code-index`	CLI for one-off indexing, watching, doctor
`code-index-daemon`	Long-lived watcher + RPC server
`code-index-mcp`	MCP stdio bridge for AI agents

You normally only invoke code-index-mcp; it spawns the daemon for you.

CLI reference

code-index init                           # write a default .code-index.toml
code-index index                          # one-shot index of the current project
code-index watch                          # initial index + live updates (foreground)
code-index doctor                         # diagnose: DB, schema, daemon, inotify, plugins
code-index link add <name> <path>         # add a [[links]] entry (workspace links, I011)
code-index link list [--json]             # list configured links + live daemon status
code-index link remove <name>             # remove a [[links]] entry

Add --root <path> to operate on a project other than the current dir.

The link subcommands manage the [[links]] array in .code-index.toml without you having to hand-edit TOML. Comments and whitespace in the file are preserved; writes are atomic; validation matches the MCP runtime exactly (reserved name, duplicates, self-cycle, link-link path collisions all caught at write time).
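The four write-time checks can be sketched as a single pass over the configured links. This is a hedged illustration, not the validator that ships with code-index: `LinkError` and `validate_links` are hypothetical, the reserved name is assumed to be `primary` (the tag the primary project carries in search results), and paths are assumed to be already resolved to absolute form.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Hypothetical error type, one variant per check named above.
#[derive(Debug, PartialEq, Eq)]
pub enum LinkError {
    ReservedName(String),
    DuplicateName(String),
    SelfCycle(String),
    PathCollision(String),
}

/// Validate (name, resolved path) pairs against the primary project root.
pub fn validate_links(
    primary_root: &Path,
    links: &[(String, PathBuf)],
) -> Result<(), LinkError> {
    let mut names = HashSet::new();
    let mut paths = HashSet::new();
    for (name, path) in links {
        // Assumption: "primary" is the reserved routing key.
        if name.as_str() == "primary" {
            return Err(LinkError::ReservedName(name.clone()));
        }
        if !names.insert(name.as_str()) {
            return Err(LinkError::DuplicateName(name.clone()));
        }
        // A link that points back at the primary root is a self-cycle.
        if path.as_path() == primary_root {
            return Err(LinkError::SelfCycle(name.clone()));
        }
        // Two links resolving to the same directory collide.
        if !paths.insert(path.as_path()) {
            return Err(LinkError::PathCollision(name.clone()));
        }
    }
    Ok(())
}
```

Running the same function from both the CLI and the server is one way to keep the two in agreement, which is the property the paragraph above describes.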

Auto-detected project root

If you don't pass --root, code-index walks up from your CWD looking for:

1. .code-index.toml: innermost wins. Drop one in any directory to pin that as the root regardless of any marker above it. code-index init creates one for you.
2. .git/: innermost wins. Stops the walk: a workspace at repo/ with sub-crates at repo/crates/foo/ is correctly detected as repo/ even when you run code-index from inside a sub-crate. A stray ~/.git (dotfiles repo) won't trip detection because a closer .git/ always wins.
3. Outermost Cargo.toml / package.json / pyproject.toml / composer.json / *.csproj / *.sln: only when no .git/ is found anywhere up the chain. Catches non-git workspaces.

The walk is bounded at $HOME so a marker far above your home directory can't poison detection.
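The detection rules can be sketched as a walk over the ancestor chain. This is an approximation written for illustration, not the actual implementation: `detect_root` and `has_manifest` are hypothetical names, the three rules are assumed to apply in the listed order of priority, the walk is assumed to include `$HOME` itself, and `start` is expected to be an absolute path.

```rust
use std::fs;
use std::path::{Path, PathBuf};

const MANIFESTS: [&str; 4] = ["Cargo.toml", "package.json", "pyproject.toml", "composer.json"];

/// True if `dir` holds one of the fixed-name manifests or a *.csproj / *.sln file.
fn has_manifest(dir: &Path) -> bool {
    if MANIFESTS.iter().any(|m| dir.join(m).is_file()) {
        return true;
    }
    // The .NET markers are globs, so they need a directory scan.
    fs::read_dir(dir)
        .map(|entries| {
            entries.flatten().any(|e| {
                matches!(
                    e.path().extension().and_then(|x| x.to_str()),
                    Some("csproj") | Some("sln")
                )
            })
        })
        .unwrap_or(false)
}

/// Walk up from `start`, stopping at `home` (inclusive) when it is on the chain.
pub fn detect_root(start: &Path, home: Option<&Path>) -> Option<PathBuf> {
    // Candidate directories, innermost first.
    let mut chain: Vec<&Path> = Vec::new();
    for dir in start.ancestors() {
        chain.push(dir);
        if Some(dir) == home {
            break;
        }
    }
    // Rule 1: innermost .code-index.toml.
    if let Some(d) = chain.iter().find(|d| d.join(".code-index.toml").is_file()) {
        return Some(d.to_path_buf());
    }
    // Rule 2: innermost .git/ directory.
    if let Some(d) = chain.iter().find(|d| d.join(".git").is_dir()) {
        return Some(d.to_path_buf());
    }
    // Rule 3: outermost manifest, reached only when no .git/ was found.
    chain.iter().rev().find(|d| has_manifest(d)).map(|d| d.to_path_buf())
}
```

Collecting the chain first keeps each rule a one-line search, at the cost of touching the filesystem up to three times per directory.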

Workspace links — querying multiple projects from one MCP entry

Some setups need one Claude / Cursor / Cline session to query two indices at once: a customer install that extends a base product, a fork that needs to compare against upstream, or an app that sits on a vendored library.

Drop a [workspace] + [[links]] block into the primary project's .code-index.toml:

[workspace]
name = "h-dv"

[[links]]
name         = "timeline"                       # routing key for tool calls
path         = "../../code/timeline/16.0/source"  # absolute or relative to this file
relationship = "base"                           # base | dependency | fork_source | sibling
description  = "Timeline 16.0 — base product"

That's it. Your existing MCP client config keeps one entry pointing at the primary root; the server attaches a daemon for each link at startup. Tools route across projects three different ways:

// (a) Name/phrase searches FAN OUT by default — primary + every
//     available link. Each hit carries a `project` tag.
search_symbols { "query": "TimelineService" }
// → [{ name: "TimelineService", project: "primary", … },
//    { name: "TimelineService", project: "timeline", … }]

// Pin a single project by passing its name:
search_symbols { "query": "TimelineService", "project": "timeline" }

// (b) Path-shaped tools (file_outline, read_code, get_dependencies)
//     AUTO-ROUTE: an absolute path picks the owning project by
//     longest root-prefix; relative paths default to primary.
file_outline { "path": "/abs/path/under/timeline/foo.cs" }   // → timeline
file_outline { "path": "src/foo.rs" }                         // → primary

// (c) Symbol-id tools (get_symbol, find_references, find_callers,
//     find_callees) are project-local: pass the `project` from
//     the row that produced the id.
find_references { "symbol_id": 42, "project": "timeline" }

project_overview on primary lists every linked project with its index health, and the MCP info.instructions block names each project + canonical root on initialize — so a fresh agent discovers the topology without any extra calls.

What it does: fan-out, path auto-routing, and isolation, plus structured errors when a project name is unknown (project_not_found with did_you_mean), when a project is configured but unavailable (project_not_available: the daemon failed to spawn or the path was unreachable), or when a path lives outside every indexed root (path_outside_known_roots with the known roots in did_you_mean).
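Path auto-routing by longest root-prefix can be sketched in a few lines. The names here (`Route`, `route_path`) are hypothetical and the sketch skips canonicalisation, so it assumes both the query path and the roots are already in canonical form.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical routing result for a path-shaped tool call.
#[derive(Debug, PartialEq, Eq)]
pub enum Route<'a> {
    /// Name of the project that owns the path.
    Project(&'a str),
    /// Mirrors the path_outside_known_roots error described above.
    PathOutsideKnownRoots,
}

/// Pick the owning project: relative paths go to primary, absolute paths
/// go to the project whose root is the longest prefix of the path.
pub fn route_path<'a>(path: &Path, roots: &'a [(String, PathBuf)]) -> Route<'a> {
    if path.is_relative() {
        return Route::Project("primary");
    }
    roots
        .iter()
        // Path::starts_with compares whole components, not raw bytes.
        .filter(|(_, root)| path.starts_with(root))
        // The deepest matching root wins.
        .max_by_key(|(_, root)| root.components().count())
        .map(|(name, _)| Route::Project(name.as_str()))
        .unwrap_or(Route::PathOutsideKnownRoots)
}
```

Because `Path::starts_with` matches component by component, a root of `/code/timeline` does not claim `/code/timeline-old/foo.cs`, which a plain string prefix test would get wrong.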

What it doesn't do (yet): cross-project ref resolution (find_callers on a Timeline symbol won't surface h-dv callers). Phase 2 adds find_base_definition / find_overrides / find_call_into_base via qualified_name joins over ATTACH DATABASE.

See _prdoc/missions/I011-workspace-links.md for the full design.

Architecture

See ARCHITECTURE.md for the design rationale and the diagrams. See CONTRIBUTING.md to build, test, and add a language plugin. Specs and adaptation reports live under _prdoc/.

Status

Current release: v0.3.0 (changelog).

Milestone	Scope	Shipped in
M1	Workspace, schema, Rust plugin, CLI	pre-0.1.0
M2	Parse pool, writer thread, layered change detection	pre-0.1.0
I003	C#, Python, TS/JS, PHP plugins	pre-0.1.0
M4	MCP server: 10 tools, 4 resources, 2 prompts	0.1.0
I004	notify watcher	0.1.0
I005	Daemon, RPC, lockfile, auto-spawn	0.1.0
M5	doctor, docs, metrics	0.1.0
I011	Workspace links, multi-project routing	0.2.0
I013	CLI link management (`link add|list|remove`)	0.2.0
I015	Non-blocking startup, async daemon attach, `warming_up`	0.3.0

License

Apache-2.0 OR MIT