Code Indexing MCP Server
Dirk Hoyer 3e7461c6f8
chore: gitignore .mcp.json, keep .mcp.json.example
.mcp.json is a per-developer MCP-client config file. Some contributors
add MCP servers that require API tokens (e.g. forgejo-mcp), and those
end up as plaintext secrets inside the JSON. Move .mcp.json out of git
and keep the tracked content as .mcp.json.example so new contributors
still get the code-index wiring template:

    cp .mcp.json.example .mcp.json

Rotate any token that previously appeared in a tracked .mcp.json.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 07:19:35 +02:00
.forgejo/workflows ci: switch to secrets.SNOTR_FORGEJO_TOKEN 2026-05-12 09:58:18 +02:00
_prdoc release: 0.3.0 — non-blocking warmup startup (I015) 2026-05-12 09:08:59 +02:00
crates tests: wait for OnceCell fill inside initialize() helper 2026-05-12 10:04:00 +02:00
tests/fixtures workspace: [[links]]-driven multi-project routing + Windows lockfile fix 2026-04-30 23:18:55 +02:00
.gitignore chore: gitignore .mcp.json, keep .mcp.json.example 2026-05-13 07:19:35 +02:00
.mcp.json.example chore: gitignore .mcp.json, keep .mcp.json.example 2026-05-13 07:19:35 +02:00
ARCHITECTURE.md docs: update README + ARCHITECTURE for v0.3.0 / I015 2026-05-13 07:13:12 +02:00
Cargo.lock release: 0.3.0 — non-blocking warmup startup (I015) 2026-05-12 09:08:59 +02:00
Cargo.toml release: 0.3.0 — non-blocking warmup startup (I015) 2026-05-12 09:08:59 +02:00
CONTRIBUTING.md M5: doctor command, README, ARCHITECTURE, CONTRIBUTING, closure record 2026-04-28 21:51:09 +02:00
README.md docs: update README + ARCHITECTURE for v0.3.0 / I015 2026-05-13 07:13:12 +02:00

code-index

Fast structural code index for AI coding agents, exposed over MCP.

Five languages indexed today: Rust, Python, TypeScript / JavaScript, C#, PHP. Live updates via filesystem watcher. Symbols, references, imports, full-text search. Local SQLite, no network, no service.

Quick start

Grab a release archive from git.h-dv.de/h-dv/code-index/releases (linux-x86_64, linux-aarch64, windows-x86_64; bundles the three binaries plus a per-platform README and .code-index.toml.example), or build from source:

# build
cargo build --release

# initialise a project (one-off)
./target/release/code-index init
./target/release/code-index index

# health check
./target/release/code-index doctor

Wire into Claude Code (or any MCP client):

// claude_desktop_config.json or equivalent
{
  "mcpServers": {
    "code-index": {
      "command": "/path/to/code-index-mcp",
      "args": ["--root", "/path/to/your/project"]
    }
  }
}

That's it. The MCP server auto-spawns the daemon on first connect, and initialize returns in milliseconds (see I015) regardless of project size. The daemon runs its initial reconciliation in the background and stays alive until it has been idle for 30 minutes (configurable) or your editor session ends. All ten MCP tools, four resources, and two prompts are available immediately: calls made during the cold-start window either succeed, return state: "reconciling", or return a structured warming_up error that the agent retries automatically.
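For clients that do not retry on their own, the cold-start handling described above could look roughly like the following. This is a minimal client-side sketch, not code from this repository: `call_tool` is a hypothetical callable that performs one tool call and returns the decoded response, the assumption that the `error` field carries the string `warming_up` is inferred from the structured-error shape, and the backoff values are arbitrary.

```python
import time
from typing import Any, Callable, Dict


def call_with_warmup_retry(
    call_tool: Callable[[], Dict[str, Any]],
    max_attempts: int = 5,
    base_delay_s: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Repeat a tool call while the server answers with `warming_up`.

    `call_tool` is a hypothetical zero-argument callable (not part of
    code-index) that returns the decoded response as a dict.
    """
    response: Dict[str, Any] = {}
    for attempt in range(max_attempts):
        response = call_tool()
        # Assumption: the structured error names its kind in the `error` field.
        if response.get("error") != "warming_up":
            return response  # success, state "reconciling", or a different error
        if attempt < max_attempts - 1:
            # Exponential backoff; the policy is an assumption of this sketch.
            sleep(base_delay_s * (2 ** attempt))
    return response  # still warming up after max_attempts tries
```

A response with state: "reconciling" is returned as-is, since it already carries usable (if possibly incomplete) results.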

Project-scoped .mcp.json: gotcha

If you drop a per-project .mcp.json (instead of the user-scope claude_desktop_config.json shown above), Claude Code will not register the server retroactively in an already-running session. Symptom: code-index tools don't appear in the agent's tool list, and the agent silently does without them. Two paths out:

  1. Restart Claude Code in the project dir and approve the prompt. When a project-scoped .mcp.json is detected for the first time, Claude Code asks the user to approve each declared server. Decline once and it stays declined.
  2. Or pin code-index at user scope so it's available everywhere without per-project approval:
    claude mcp add code-index code-index-mcp
    claude mcp list   # verify
    

code-index doctor includes an mcp registration check that surfaces the most common cause ("project hasn't been opened in Claude Code yet") with a pointer to claude mcp list.

What the agent gets

  • Handles, not blobs. Symbol responses carry IDs, locations, and signatures — never embedded source. The agent calls read_code only when it actually needs bytes.
  • Paginated, token-bounded responses. Lists return {results, total, next_cursor}; read_code is hard-capped at 4000 tokens with a truncated: true flag on overflow.
  • Structured errors. {error, query?, did_you_mean[], hint?} — not free-text strings.
  • Live freshness. Edits, creates, renames, and deletes in your editor flow into the DB within a few hundred milliseconds. Atomic saves are coalesced.
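As an illustration of the paginated response shape, a client could drain a list tool as sketched below. The sketch assumes that `next_cursor` is null or absent on the last page and that the cursor is passed back on the next request; `fetch_page` is a hypothetical callable standing in for the actual tool call.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_results(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
    max_pages: int = 100,
) -> Iterator[Any]:
    """Yield every item from a {results, total, next_cursor} style listing.

    `fetch_page(cursor)` is a hypothetical callable that performs one tool
    call; pass None for the first page.
    """
    cursor: Optional[str] = None
    for _ in range(max_pages):  # hard stop so a bad cursor cannot loop forever
        page = fetch_page(cursor)
        yield from page.get("results", [])
        cursor = page.get("next_cursor")
        if not cursor:  # assumption: a missing or null cursor means "last page"
            return
```

Because the function is a generator, an agent can stop consuming as soon as it has found what it needs, which keeps token use bounded.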

The five languages

| Language | Extensions | Plugin |
| --- | --- | --- |
| Rust | .rs | tree-sitter-rust |
| Python | .py, .pyi, .pyw | tree-sitter-python |
| TypeScript / JavaScript | .ts .tsx .js .jsx .mjs .cjs | tree-sitter-typescript / -javascript |
| C# | .cs (skips *.Designer.cs, *.g.cs) | tree-sitter-c-sharp |
| PHP | .php .php3 .php4 .php5 .phtml | tree-sitter-php |

Default-skipped directories: node_modules, vendor, __pycache__, .venv, .tox, dist, build, .next, .code-index, plus target/ (Rust roots only) and bin/ and obj/ (.NET roots only). Override via .code-index-ignore (same syntax as .gitignore).
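The default skip rule can be pictured as a simple check on directory names. The sketch below is illustrative only and makes two assumptions: that a skipped name matches at any depth below the root, and that the caller already knows whether the root is a Rust or .NET root (the `root_kinds` argument and its "rust" / "dotnet" labels are hypothetical). It does not model .code-index-ignore overrides.

```python
from pathlib import PurePosixPath
from typing import AbstractSet

ALWAYS_SKIPPED = frozenset({
    "node_modules", "vendor", "__pycache__", ".venv", ".tox",
    "dist", "build", ".next", ".code-index",
})
# Directory name -> kind of root in which it is skipped (labels are hypothetical).
CONDITIONALLY_SKIPPED = {"target": "rust", "bin": "dotnet", "obj": "dotnet"}


def is_default_skipped(rel_path: str, root_kinds: AbstractSet[str]) -> bool:
    """True if any directory component of a root-relative file path is skipped."""
    directories = PurePosixPath(rel_path).parts[:-1]  # drop the file name
    for name in directories:
        if name in ALWAYS_SKIPPED:
            return True
        if CONDITIONALLY_SKIPPED.get(name) in root_kinds:
            return True
    return False
```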

Binaries

| Binary | Purpose |
| --- | --- |
| code-index | CLI for one-off indexing, watching, doctor |
| code-index-daemon | Long-lived watcher + RPC server |
| code-index-mcp | MCP stdio bridge for AI agents |

You normally only invoke code-index-mcp; it spawns the daemon for you.

CLI reference

code-index init                           # write a default .code-index.toml
code-index index                          # one-shot index of the current project
code-index watch                          # initial index + live updates (foreground)
code-index doctor                         # diagnose: DB, schema, daemon, inotify, plugins
code-index link add <name> <path>         # add a [[links]] entry (workspace links, I011)
code-index link list [--json]             # list configured links + live daemon status
code-index link remove <name>             # remove a [[links]] entry

Add --root <path> to operate on a project other than the current dir.

The link subcommands manage the [[links]] array in .code-index.toml without you having to hand-edit TOML. Comments and whitespace in the file are preserved; writes are atomic; validation matches the MCP runtime exactly (reserved name, duplicates, self-cycle, link-link path collisions all caught at write time).
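The four write-time checks named above can be sketched as follows. This is not the project's validator: the problem labels are hypothetical, and treating "primary" as the reserved name is an assumption based on the `project` tag used for the primary index.

```python
from pathlib import Path
from typing import Dict, List

RESERVED_NAMES = {"primary"}  # assumption: the tag used for the primary project


def validate_new_link(
    name: str,
    path: str,
    primary_root: Path,
    existing: Dict[str, Path],
) -> List[str]:
    """Return a list of problems (empty if the link is acceptable).

    `existing` maps already configured link names to their paths. The
    problem labels are hypothetical, not code-index error codes.
    """
    problems: List[str] = []
    # Relative paths are taken relative to the primary root; joining with an
    # absolute `path` simply yields that absolute path.
    target = (primary_root / path).resolve()
    if name in RESERVED_NAMES:
        problems.append("reserved_name")
    if name in existing:
        problems.append("duplicate_name")
    if target == primary_root.resolve():
        problems.append("self_cycle")
    if any(target == other.resolve() for other in existing.values()):
        problems.append("path_collision")
    return problems
```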

Auto-detected project root

If you don't pass --root, code-index walks up from your CWD looking for:

  1. .code-index.toml — innermost wins. Drop one in any directory to pin that as the root regardless of any marker above it. code-index init creates one for you.
  2. .git/ — innermost wins. Stops the walk: a workspace at repo/ with sub-crates at repo/crates/foo/ is correctly detected as repo/ even when you run code-index from inside a sub-crate. A stray ~/.git (dotfiles repo) won't trip detection because a closer .git/ always wins.
  3. Outermost Cargo.toml / package.json / pyproject.toml / composer.json / *.csproj / *.sln — only when no .git/ is found anywhere up the chain. Catches non-git workspaces.

The walk is bounded at $HOME so a marker far above your home directory can't poison detection.
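One way to read the three rules is as three passes over the same chain of directories. The sketch below follows that reading; it is an illustration, not the project's implementation, and it assumes that $HOME itself is still inspected, that a working directory outside $HOME is walked up to the filesystem root, and that "no root found" is reported as None.

```python
from pathlib import Path
from typing import List, Optional

MANIFEST_NAMES = ("Cargo.toml", "package.json", "pyproject.toml", "composer.json")
MANIFEST_GLOBS = ("*.csproj", "*.sln")


def _chain(start: Path, home: Path) -> List[Path]:
    """Directories from `start` upward, innermost first, ending at `home`."""
    chain: List[Path] = []
    for directory in (start, *start.parents):
        chain.append(directory)
        if directory == home:  # bound the walk at $HOME (inclusive, assumed)
            break
    return chain


def _has_manifest(directory: Path) -> bool:
    if any((directory / name).is_file() for name in MANIFEST_NAMES):
        return True
    return any(any(directory.glob(pattern)) for pattern in MANIFEST_GLOBS)


def detect_root(cwd: Path, home: Path) -> Optional[Path]:
    chain = _chain(cwd.resolve(), home.resolve())
    for directory in chain:  # 1. innermost .code-index.toml
        if (directory / ".code-index.toml").is_file():
            return directory
    for directory in chain:  # 2. innermost .git/
        if (directory / ".git").is_dir():
            return directory
    # 3. outermost manifest, reached only when no .git/ was found
    with_manifest = [d for d in chain if _has_manifest(d)]
    return with_manifest[-1] if with_manifest else None
```

With a layout like repo/.git plus repo/crates/foo/Cargo.toml, calling detect_root from repo/crates/foo yields repo; adding a .code-index.toml inside repo/crates/foo pins the root there instead.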

Workspace links — querying multiple projects from one MCP entry

A customer install that extends a base product, a fork that needs to compare against upstream, an app that sits on a vendored library: these are common patterns where one Claude / Cursor / Cline session needs to query two indices at once.

Drop a [workspace] + [[links]] block into the primary project's .code-index.toml:

[workspace]
name = "h-dv"

[[links]]
name         = "timeline"                       # routing key for tool calls
path         = "../../code/timeline/16.0/source"  # absolute or relative to this file
relationship = "base"                           # base | dependency | fork_source | sibling
description  = "Timeline 16.0 — base product"

That's it. Your existing MCP client config keeps one entry pointing at the primary root; the server attaches a daemon for each link at startup. Tools route across projects in three different ways:

// (a) Name/phrase searches FAN OUT by default — primary + every
//     available link. Each hit carries a `project` tag.
search_symbols { "query": "TimelineService" }
// → [{ name: "TimelineService", project: "primary", … },
//    { name: "TimelineService", project: "timeline", … }]

// Pin a single project by passing its name:
search_symbols { "query": "TimelineService", "project": "timeline" }

// (b) Path-shaped tools (file_outline, read_code, get_dependencies)
//     AUTO-ROUTE: an absolute path picks the owning project by
//     longest root-prefix; relative paths default to primary.
file_outline { "path": "/abs/path/under/timeline/foo.cs" }   // → timeline
file_outline { "path": "src/foo.rs" }                         // → primary

// (c) Symbol-id tools (get_symbol, find_references, find_callers,
//     find_callees) are project-local: pass the `project` from
//     the row that produced the id.
find_references { "symbol_id": 42, "project": "timeline" }

project_overview on primary lists every linked project with its index health, and the MCP info.instructions block names each project + canonical root on initialize — so a fresh agent discovers the topology without any extra calls.

What it does: fan-out, path auto-routing, and isolation, plus structured errors when a project name is unknown (project_not_found with did_you_mean), when a project is configured but unavailable (project_not_available: the daemon failed to spawn or the path was unreachable), or when a path lives outside every indexed root (path_outside_known_roots with the known roots in did_you_mean).
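The path auto-routing rule (longest root-prefix for absolute paths, primary for relative ones) can be illustrated with a few lines of Python. This is a sketch under assumptions, not the server's code: the `roots` mapping of project name to canonical root is a hypothetical input, POSIX-style paths are assumed, and the returned dicts only mimic the documented error shape.

```python
from pathlib import PurePosixPath
from typing import Any, Dict


def route_path(path: str, roots: Dict[str, str]) -> Dict[str, Any]:
    """Pick the project that owns `path`.

    `roots` maps project name -> absolute canonical root (hypothetical input).
    """
    candidate = PurePosixPath(path)
    if not candidate.is_absolute():
        return {"project": "primary"}  # relative paths default to primary
    ancestors = {candidate, *candidate.parents}
    owners = [
        (name, PurePosixPath(root))
        for name, root in roots.items()
        if PurePosixPath(root) in ancestors
    ]
    if not owners:
        return {
            "error": "path_outside_known_roots",
            "query": path,
            "did_you_mean": sorted(roots.values()),
        }
    # The longest root (most path components) wins, so a link nested inside
    # the primary root still owns its own files.
    name, _ = max(owners, key=lambda item: len(item[1].parts))
    return {"project": name}
```

Comparing whole path components rather than raw string prefixes avoids treating /work/h-dv-old as if it lived under /work/h-dv.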

What it doesn't do (yet): cross-project ref resolution (find_callers on a Timeline symbol won't surface h-dv callers). Phase 2 adds find_base_definition / find_overrides / find_call_into_base via qualified_name joins over ATTACH DATABASE.

See _prdoc/missions/I011-workspace-links.md for the full design.

Architecture

See ARCHITECTURE.md for the design rationale and the diagrams. See CONTRIBUTING.md to build, test, and add a language plugin. Specs and adaptation reports live under _prdoc/.

Status

Current release: v0.3.0 (changelog).

| Milestone | Scope | Shipped in |
| --- | --- | --- |
| M1 | Workspace, schema, Rust plugin, CLI | pre-0.1.0 |
| M2 | Parse pool, writer thread, layered change detection | pre-0.1.0 |
| I003 | C#, Python, TS/JS, PHP plugins | pre-0.1.0 |
| M4 | MCP server: 10 tools, 4 resources, 2 prompts | 0.1.0 |
| I004 | notify watcher | 0.1.0 |
| I005 | Daemon, RPC, lockfile, auto-spawn | 0.1.0 |
| M5 | doctor, docs, metrics | 0.1.0 |
| I011 | Workspace links, multi-project routing | 0.2.0 |
| I013 | CLI link management (link add\|list\|remove) | 0.2.0 |
| I015 | Non-blocking startup, async daemon attach, warming_up | 0.3.0 |

License

Apache-2.0 OR MIT