fix(daemon): self-healing reconnect + root-identity handshake (issue #7) #8
Reference
h-dv/code-index!8
Summary
Fixes #7. A workspace-links user reported the primary MCP server "attaching to a linked project's daemon", with every tool call then failing with `-32603 sending <op>` until the MCP server was restarted.

A deep multi-agent review against current code found that the headline was a misread: per-root lockfile discovery is structurally incapable of attaching to a link's port, and the attach log carried no project tag, so the link's own attach line read identically to the primary's. But two real, confirmed defects underlie the symptom.
Root causes
- `RpcIndex` never reconnects. It cached a single `TcpStream` with no health re-check, so once its daemon died, was killed, or restarted onto a new port, every subsequent call wrote to a dead socket forever. This is the actual cause of the reported outage.
- `Stats` and the lockfile payload carried no `root`, so attach/reconnect could in principle latch onto any code-index daemon on a given port.

Fixes (all wire-compatible: new fields are `Option` + `skip_if_none`)
- `RpcIndex::connect_with_reconnect(root, port)`: on a connection-level failure, drop the dead channel, re-read the lockfile, reconnect to the current port, verify the daemon serves `root`, and retry once. Bare `connect(port)` keeps the old no-reconnect behavior (tests, lifetime owners). Retry is connection-errors-only (never an `RpcError` from dispatch) and capped at one attempt.
- `root` in the stats response (stamped by the daemon dispatch layer), verified on attach (`main.rs`) and reconnect (`rpc_index.rs`). A daemon that omits `root` (pre-0.5.0) is accepted (unverifiable, not wrong), so there is no respawn churn against old daemons.
- `root` in the lockfile payload; `Lockfile::read` rejects a payload describing a different project (aliased/symlinked `.code-index/`).
- `warn` before mapping the ~19 MCP tool-handler internal errors (was bare `e.to_string()`, nothing logged); tag attach logs with `root` (the missing project tag that drove the misdiagnosis); raise the daemon transport-failure log to `warn`.
- `resolve_links` warns when a link is nested under the primary root.

Compatibility
Every wire change is backward/forward compatible: new MCP + old daemon (no `root` → accepted), old MCP + new daemon (extra field ignored by serde). Rejection fires only on a confirmed mismatch, never on an absent field, so a new server won't needlessly respawn healthy old daemons.

Tests
- `reconnect_aware_client_survives_daemon_restart`: kill daemon under a live connection, bring a fresh one up, assert the same client heals (this reproduced the bug).
- `plain_connect_does_not_reconnect`: guards the opt-in boundary.
- `reconnect_rejects_a_foreign_daemon`: stale lockfile points reconnect at a different live daemon; must reject.
- `Stats`/`Lockfile` wire-compat round-trips + root-mismatch unit tests.

Full workspace suite green (270+ tests), clippy clean, rustfmt clean. Targets release v0.5.0.
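For reviewers who want the reconnect policy in one place, here is a minimal, std-only sketch of the behavior described under Fixes: retry once on a connection-level failure, never retry a dispatch error, accept a daemon that reports no `root`, and reject a confirmed mismatch. It is not the code in this PR. `CallError`, `Transport`, `FakeDaemon`, `root_matches`, and `call_with_reconnect` are hypothetical names invented for illustration, and the fake transport stands in for the real `TcpStream` and lockfile handling.

```rust
/// Hypothetical error type for this sketch (not the PR's types).
#[derive(Debug, PartialEq)]
enum CallError {
    /// Transport-level failure (dead socket, reset): eligible for one retry.
    Connection(String),
    /// The daemon answered with an error: never retried.
    Rpc(String),
    /// The daemon found after reconnecting serves a different project.
    ForeignDaemon { expected: String, found: String },
}

/// Hypothetical transport abstraction so the sketch runs without sockets.
trait Transport {
    /// Stand-in for "re-read the lockfile and reconnect". Returns the root
    /// the daemon reports, or `None` for a daemon that omits the field.
    fn connect(&mut self) -> Result<Option<String>, CallError>;
    fn send(&mut self, op: &str) -> Result<String, CallError>;
}

/// An absent root is unverifiable, not wrong; only a confirmed mismatch fails.
fn root_matches(expected: &str, reported: Option<&str>) -> bool {
    match reported {
        None => true,
        Some(found) => found == expected,
    }
}

/// Send `op`; on a connection-level failure, reconnect, verify the root,
/// and retry exactly once. Any other result is returned unchanged.
fn call_with_reconnect<T: Transport>(
    t: &mut T,
    root: &str,
    op: &str,
) -> Result<String, CallError> {
    match t.send(op) {
        Err(CallError::Connection(_)) => {
            let reported = t.connect()?;
            if !root_matches(root, reported.as_deref()) {
                return Err(CallError::ForeignDaemon {
                    expected: root.to_string(),
                    found: reported.unwrap_or_default(),
                });
            }
            t.send(op) // the single retry; a second failure propagates
        }
        other => other,
    }
}

/// In-memory fake: `alive` models whether the cached channel still works.
struct FakeDaemon {
    alive: bool,
    reports_root: Option<String>,
    connects: u32,
}

impl Transport for FakeDaemon {
    fn connect(&mut self) -> Result<Option<String>, CallError> {
        self.connects += 1;
        self.alive = true;
        Ok(self.reports_root.clone())
    }

    fn send(&mut self, op: &str) -> Result<String, CallError> {
        if !self.alive {
            return Err(CallError::Connection("broken pipe".to_string()));
        }
        if op == "bad" {
            return Err(CallError::Rpc("unknown op".to_string()));
        }
        Ok(format!("ok:{op}"))
    }
}

fn main() {
    // The daemon was restarted under a live client: the first send fails,
    // the wrapper reconnects, verifies the root, and the retry succeeds.
    let mut t = FakeDaemon {
        alive: false,
        reports_root: Some("/proj/a".to_string()),
        connects: 0,
    };
    let out = call_with_reconnect(&mut t, "/proj/a", "stats");
    println!("{out:?}, reconnects = {}", t.connects);
}
```

Keeping the retry outside the transport means the opt-in boundary stays explicit: a caller that uses `send` directly gets the old no-reconnect behavior, which mirrors the split between `connect(port)` and `connect_with_reconnect(root, port)` described above.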
Closes #7.
🤖 Generated with Claude Code