Code Indexing MCP Server

code-index

Fast structural code index for AI coding agents, exposed over MCP.

Six languages indexed today: Rust, Python, TypeScript / JavaScript, C#, PHP, Ruby. Live updates via filesystem watcher. Symbols, references, imports, full-text search. Local SQLite, no network, no service.

Quick start

Grab a release archive from git.h-dv.de/h-dv/code-index/releases (linux-x86_64, linux-aarch64, windows-x86_64; bundles the three binaries plus a per-platform README and .code-index.toml.example), or build from source:

# build
cargo build --release

# initialise a project (one-off)
./target/release/code-index init
./target/release/code-index index

# health check
./target/release/code-index doctor

Wire into Claude Code (or any MCP client):

// claude_desktop_config.json or equivalent
{
  "mcpServers": {
    "code-index": {
      "command": "/path/to/code-index-mcp",
      "args": ["--root", "/path/to/your/project"]
    }
  }
}

That's it. The MCP server auto-spawns the daemon on first connect, and initialize returns in milliseconds (see I015) regardless of project size. The daemon runs its initial reconciliation in the background and stays alive until it has been idle for 30 minutes (configurable) or your editor session ends. All ten MCP tools, four resources, and two prompts are available immediately: calls made during the cold-start window either succeed, return state: "reconciling", or return a structured warming_up error that the agent retries automatically.
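For clients that do not retry on their own, the cold-start contract described above could be handled with a small loop. This is only a sketch: `call_tool` is a hypothetical callable standing in for whatever your MCP client exposes, and the assumption that the error arrives as a dict with `"error": "warming_up"` is inferred from the description, not taken from the server's schema.

```python
import time
from typing import Any, Callable, Dict


def call_with_warmup_retry(
    call_tool: Callable[[], Dict[str, Any]],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Retry one tool call while the daemon reports it is still warming up.

    `call_tool` is a hypothetical zero-argument callable that performs a
    single MCP tool call and returns the decoded response as a dict.
    """
    response: Dict[str, Any] = {}
    for attempt in range(max_attempts):
        response = call_tool()
        # Assumption: a cold-start rejection looks like {"error": "warming_up", ...}.
        if response.get("error") != "warming_up":
            return response  # success, or a response with state "reconciling"
        if attempt < max_attempts - 1:
            # Exponential backoff between attempts: base, 2*base, 4*base, ...
            sleep(base_delay * (2 ** attempt))
    return response  # still warming up; the caller decides what to do next
```

Injecting `sleep` keeps the helper easy to exercise without real delays.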

Project-scoped .mcp.json: gotcha

If you drop a per-project .mcp.json (instead of the user-scope claude_desktop_config.json shown above), Claude Code will not register the server retroactively in an already-running session. Symptom: code-index tools don't appear in the agent's tool list, and the agent silently does without them. Two paths out:

  1. Restart Claude Code in the project dir and approve the prompt. When a project-scoped .mcp.json is detected for the first time, Claude Code asks the user to approve each declared server. Decline once and it stays declined.
  2. Or pin code-index at user scope so it's available everywhere without per-project approval:
    claude mcp add --scope user code-index code-index-mcp
    claude mcp list   # verify
    

code-index doctor includes an mcp registration check that surfaces the most common cause ("project hasn't been opened in Claude Code yet") with a pointer to claude mcp list.

What the agent gets

  • Handles, not blobs. Symbol responses carry IDs, locations, and signatures — never embedded source. The agent calls read_code only when it actually needs bytes.
  • Paginated, token-bounded responses. Lists return {results, total, next_cursor}; read_code is hard-capped at 4000 tokens with a truncated: true flag on overflow.
  • Structured errors. {error, query?, did_you_mean[], hint?} — not free-text strings.
  • Live freshness. Edits, creates, renames, and deletes in your editor flow into the DB within a few hundred milliseconds. Atomic saves are coalesced.
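As an illustration of the paginated shape listed above, a client could drain a list tool by following next_cursor. This is a sketch under assumptions: `fetch_page` is a hypothetical callable wrapping one tool call, and the last page is assumed to carry a missing or null next_cursor, which the README does not spell out.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_results(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
    max_pages: int = 100,
) -> Iterator[Any]:
    """Yield every item from a {results, total, next_cursor} style listing.

    `fetch_page(cursor)` is a hypothetical callable: it performs one tool
    call (cursor=None for the first page) and returns the decoded response.
    """
    cursor: Optional[str] = None
    for _ in range(max_pages):  # hard stop so a buggy cursor cannot loop forever
        page = fetch_page(cursor)
        yield from page.get("results", [])
        cursor = page.get("next_cursor")
        if not cursor:  # assumption: no cursor means this was the last page
            return
```

Because it is a generator, an agent can stop consuming as soon as it has enough hits, which keeps token usage bounded.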

The six languages

| Language | Extensions | Plugin |
| --- | --- | --- |
| Rust | .rs | tree-sitter-rust |
| Python | .py, .pyi, .pyw | tree-sitter-python |
| TypeScript / JavaScript | .ts .tsx .js .jsx .mjs .cjs | tree-sitter-typescript / -javascript |
| C# | .cs (skips *.Designer.cs, *.g.cs) | tree-sitter-c-sharp |
| PHP | .php .php3 .php4 .php5 .phtml | tree-sitter-php |
| Ruby | .rb .rake .gemspec, Rakefile, Gemfile | tree-sitter-ruby |

Ruby extraction is Rails-aware: association macros (has_many, belongs_to, has_one, has_and_belongs_to_many) emit a type reference to the associated model class (honoring class_name: and singularizing plural names), and mixins (include/extend/prepend) reference the mixed-in module.
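To make the association rule concrete, here is a rough sketch of how a macro call might map to a referenced class name. It is not the plugin's implementation: the function names are hypothetical and the singularizer is deliberately naive (real Rails inflection has many more rules and irregular forms).

```python
from typing import Optional

PLURAL_MACROS = {"has_many", "has_and_belongs_to_many"}
SINGULAR_MACROS = {"belongs_to", "has_one"}


def singularize(word: str) -> str:
    """Very naive English singularizer, for illustration only."""
    if word.endswith("ies") and len(word) > 3:
        return word[:-3] + "y"          # categories -> category
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]                # line_items -> line_item
    return word


def camelize(snake: str) -> str:
    """line_item -> LineItem"""
    return "".join(part.capitalize() for part in snake.split("_") if part)


def association_target(
    macro: str, name: str, class_name: Optional[str] = None
) -> Optional[str]:
    """Return the model class an association refers to, or None if `macro`
    is not one of the four association macros."""
    if macro not in PLURAL_MACROS | SINGULAR_MACROS:
        return None
    if class_name:                      # an explicit class_name: always wins
        return class_name
    base = singularize(name) if macro in PLURAL_MACROS else name
    return camelize(base)
```

For example, under these assumptions `has_many :line_items` resolves to `LineItem`, while `has_many :comments, class_name: "Blog::Comment"` resolves to `Blog::Comment`.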

Default-skipped directories: node_modules, vendor, __pycache__, .venv, .tox, dist, build, .next, .code-index, plus target/ (Rust roots only) and bin/ and obj/ (.NET roots only). Override via .code-index-ignore (same syntax as .gitignore).
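The default skip list can be read as a simple directory-component filter. The sketch below assumes the names match at any depth and takes the root kind as explicit flags; how code-index actually detects a Rust or .NET root, and how .code-index-ignore overrides are merged, is not modeled here.

```python
from pathlib import PurePosixPath

ALWAYS_SKIPPED = {
    "node_modules", "vendor", "__pycache__", ".venv", ".tox",
    "dist", "build", ".next", ".code-index",
}


def is_default_skipped(
    rel_path: str, rust_root: bool = False, dotnet_root: bool = False
) -> bool:
    """True if any directory component of `rel_path` is on the skip list.

    `rust_root` / `dotnet_root` are hypothetical flags for this sketch.
    """
    skipped = set(ALWAYS_SKIPPED)
    if rust_root:
        skipped.add("target")
    if dotnet_root:
        skipped.update({"bin", "obj"})
    # Only directory components count; the final component is the file name.
    dirs = PurePosixPath(rel_path).parts[:-1]
    return any(part in skipped for part in dirs)
```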

Binaries

| Binary | Purpose |
| --- | --- |
| code-index | CLI for one-off indexing, watching, doctor |
| code-index-daemon | Long-lived watcher + RPC server |
| code-index-mcp | MCP stdio bridge for AI agents |

You normally only invoke code-index-mcp; it spawns the daemon for you.

CLI reference

code-index init                           # write a default .code-index.toml
code-index index                          # one-shot index of the current project
code-index watch                          # initial index + live updates (foreground)
code-index doctor                         # diagnose: DB, schema, daemon, inotify, plugins
code-index link add <name> <path>         # add a [[links]] entry (workspace links, I011)
code-index link list [--json]             # list configured links + live daemon status
code-index link remove <name>             # remove a [[links]] entry

Add --root <path> to operate on a project other than the current dir.

The link subcommands manage the [[links]] array in .code-index.toml without you having to hand-edit TOML. Comments and whitespace in the file are preserved; writes are atomic; validation matches the MCP runtime exactly (reserved name, duplicates, self-cycle, link-link path collisions all caught at write time).
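The write-time checks named above could look roughly like the following. Everything specific here is an assumption made for illustration: the reserved name is guessed to be "primary" (the tag the primary project carries in search results), the error codes are made up, and relative paths are resolved against the primary root, where .code-index.toml lives.

```python
from pathlib import Path
from typing import Dict, Optional

RESERVED_NAMES = {"primary"}  # assumption: the README does not list reserved names


def validate_link(
    name: str, path: str, primary_root: str, existing: Dict[str, str]
) -> Optional[str]:
    """Return a (hypothetical) error code, or None if the link is acceptable.

    `existing` maps already-configured link names to their absolute paths.
    """
    if name in RESERVED_NAMES:
        return "reserved_name"
    if name in existing:
        return "duplicate_name"
    root = Path(primary_root).resolve()
    target = (root / path).resolve()  # an absolute `path` replaces `root`
    if target == root:
        return "self_cycle"           # a link pointing back at the primary
    for other_path in existing.values():
        if Path(other_path).resolve() == target:
            return "path_collision"   # two links sharing one directory
    return None
```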

Auto-detected project root

If you don't pass --root, code-index walks up from your CWD looking for:

  1. .code-index.toml — innermost wins. Drop one in any directory to pin that as the root regardless of any marker above it. code-index init creates one for you.
  2. .git/ — innermost wins. Stops the walk: a workspace at repo/ with sub-crates at repo/crates/foo/ is correctly detected as repo/ even when you run code-index from inside a sub-crate. A stray ~/.git (dotfiles repo) won't trip detection because a closer .git/ always wins.
  3. Outermost Cargo.toml / package.json / pyproject.toml / composer.json / *.csproj / *.sln — only when no .git/ is found anywhere up the chain. Catches non-git workspaces.

The walk is bounded at $HOME so a marker far above your home directory can't poison detection.
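The three rules above can be sketched as a tiered walk. This is one reading of the README, not the actual implementation: it assumes the tiers are tried in the listed order, that the home directory itself is still inspected, and that a start directory outside home walks to the filesystem root. The function names are hypothetical.

```python
from pathlib import Path
from typing import List, Optional

MANIFESTS = ("Cargo.toml", "package.json", "pyproject.toml", "composer.json")
MANIFEST_GLOBS = ("*.csproj", "*.sln")


def ancestors(start: Path, home: Path) -> List[Path]:
    """Directories from `start` upward, innermost first, stopping at `home`."""
    start, home = start.resolve(), home.resolve()
    chain = []
    for d in (start, *start.parents):
        chain.append(d)
        if d == home:  # assumption: home is inspected, nothing above it is
            break
    return chain


def has_manifest(d: Path) -> bool:
    if any((d / m).is_file() for m in MANIFESTS):
        return True
    return any(next(d.glob(g), None) is not None for g in MANIFEST_GLOBS)


def detect_root(start: Path, home: Path) -> Optional[Path]:
    chain = ancestors(start, home)
    for d in chain:                      # 1. innermost .code-index.toml
        if (d / ".code-index.toml").is_file():
            return d
    for d in chain:                      # 2. innermost .git
        if (d / ".git").exists():
            return d
    for d in reversed(chain):            # 3. outermost manifest, no .git found
        if has_manifest(d):
            return d
    return None
```

With `repo/.git` present, running this from `repo/crates/foo` returns `repo`, matching the sub-crate example in rule 2.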

Workspace links — querying multiple projects from one MCP entry

A customer install that extends a base product, a fork that needs to compare against upstream, an app that sits on a vendored library — common patterns where one Claude / Cursor / Cline session needs to query two indices at once.

Drop a [workspace] + [[links]] block into the primary project's .code-index.toml:

[workspace]
name = "h-dv"

[[links]]
name         = "timeline"                       # routing key for tool calls
path         = "../../code/timeline/16.0/source"  # absolute or relative to this file
relationship = "base"                           # base | dependency | fork_source | sibling
description  = "Timeline 16.0 — base product"

That's it. Your existing MCP client config keeps one entry pointing at the primary root; the server attaches a daemon for each link at startup. Tools route across projects three different ways:

// (a) Name/phrase searches FAN OUT by default — primary + every
//     available link. Each hit carries a `project` tag.
search_symbols { "query": "TimelineService" }
// → [{ name: "TimelineService", project: "primary", … },
//    { name: "TimelineService", project: "timeline", … }]

// Pin a single project by passing its name:
search_symbols { "query": "TimelineService", "project": "timeline" }

// (b) Path-shaped tools (file_outline, read_code, get_dependencies)
//     AUTO-ROUTE: an absolute path picks the owning project by
//     longest root-prefix; relative paths default to primary.
file_outline { "path": "/abs/path/under/timeline/foo.cs" }   // → timeline
file_outline { "path": "src/foo.rs" }                         // → primary

// (c) Symbol-id tools (get_symbol, find_references, find_callers,
//     find_callees) are project-local: pass the `project` from
//     the row that produced the id.
find_references { "symbol_id": 42, "project": "timeline" }

project_overview on primary lists every linked project with its index health, and the MCP info.instructions block names each project + canonical root on initialize — so a fresh agent discovers the topology without any extra calls.

What it does: fan-out, path auto-routing, and isolation, plus structured errors when a project name is unknown (project_not_found with did_you_mean), when a project is configured but unavailable (project_not_available: the daemon failed to spawn or the path was unreachable), or when a path lies outside every indexed root (path_outside_known_roots with the known roots in did_you_mean).
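The path auto-routing rule (longest root-prefix for absolute paths, primary for relative ones) can be sketched as below. The function name and the success shape `{"project": ...}` are hypothetical; only the error fields follow the structured-error format described in this README. Matching is done per path component, so `/work/h-dv-old` is not mistaken for a child of `/work/h-dv`.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def route_path(
    path: str, roots: Dict[str, str], primary: str = "primary"
) -> Dict[str, object]:
    """Pick the owning project for `path`.

    `roots` maps project name -> absolute canonical root (POSIX style).
    """
    p = PurePosixPath(path)
    if not p.is_absolute():
        return {"project": primary}      # relative paths default to primary
    best: Optional[str] = None
    best_depth = -1
    for name, root in roots.items():
        root_path = PurePosixPath(root)
        if p == root_path or root_path in p.parents:
            depth = len(root_path.parts)
            if depth > best_depth:       # the deepest (longest) root wins
                best, best_depth = name, depth
    if best is None:
        return {
            "error": "path_outside_known_roots",
            "query": path,
            "did_you_mean": sorted(roots.values()),
        }
    return {"project": best}
```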

What it doesn't do (yet): cross-project ref resolution (find_callers on a Timeline symbol won't surface h-dv callers). Phase 2 adds find_base_definition / find_overrides / find_call_into_base via qualified_name joins over ATTACH DATABASE.

See _prdoc/missions/I011-workspace-links.md for the full design.

Architecture

See ARCHITECTURE.md for the design rationale and the diagrams. See CONTRIBUTING.md to build, test, and add a language plugin. Specs and adaptation reports live under _prdoc/.

Status

Current release: v0.5.0 (changelog).

| Milestone | Scope | Shipped in |
| --- | --- | --- |
| M1 | Workspace, schema, Rust plugin, CLI | pre-0.1.0 |
| M2 | Parse pool, writer thread, layered change detection | pre-0.1.0 |
| I003 | C#, Python, TS/JS, PHP plugins | pre-0.1.0 |
| M4 | MCP server: 10 tools, 4 resources, 2 prompts | 0.1.0 |
| I004 | notify watcher | 0.1.0 |
| I005 | Daemon, RPC, lockfile, auto-spawn | 0.1.0 |
| M5 | doctor, docs, metrics | 0.1.0 |
| I011 | Workspace links, multi-project routing | 0.2.0 |
| I013 | CLI link management (link add / list / remove) | 0.2.0 |
| I015 | Non-blocking startup, async daemon attach, warming_up | 0.3.0 |
| I014 | Periodic safety-net reconcile (heals missed watcher events) | 0.3.1 |
| I016 | Ruby plugin (Rails-aware) + parser recursion-depth guards | 0.4.0 |
| I016b | Recursion guards extended to py/cs/ts type-expression helpers | 0.4.1 |
| I017 | CI green + release pipeline fixed; multi-platform binaries (linux x86_64/musl/aarch64, windows) | 0.4.2 |
| I018 | Self-healing daemon reconnect + root-identity handshake + reconnect tracing (issue #7) | 0.5.0 |

Support this project

If code-index MCP saves you time, consider a donation:

PayPal

One-off: paypal.me/hdvde

Please add code-index MCP as the payment reference — helps with bookkeeping. Donations are voluntary; no rewards, support guarantees, or feature promises attached.

License

Apache-2.0