Replication: empty limited_get_log_entries() workaround hides storage corruption and can stall catch-up indefinitely #3
Labels
No labels
Kind/Breaking
Kind/Bug
Kind/Documentation
Kind/Enhancement
Kind/Feature
Kind/Security
Kind/Testing
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Reviewed
Confirmed
Reviewed
Duplicate
Reviewed
Invalid
Reviewed
Won't Fix
Status
Abandoned
Status
Blocked
Status
Need More Info
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
h-dv/openraft#3
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
ReplicationCore::send_log_entries()treats an emptylimited_get_log_entries(start, end)result for a non-empty range as a heartbeat plus 10ms sleep. This avoids the old panic, but it also hides aRaftLogReadercontract violation and can stall replication forever if the storage layer has a hole or a buggy reader.The current behavior turns a correctness/storage problem into slow liveness degradation with only a warning.
Code pointers
openraft/src/replication/mod.rs: whenstart != endandlogs.is_empty(), the code logs a warning, sleeps 10ms, and returnsLogIdRange::new(rng.prev, rng.prev)as if it were a heartbeat.openraft/src/storage/v2.rs:RaftLogStoragerequires no holes and serialized writes.RaftLogReader::limited_get_log_entries()says it must not return empty for non-empty input.SqliteLogStoredoes not overridelimited_get_log_entries()and itstry_get_log_entries()can return empty for a bounded non-empty range if the first requested key is missing.Why this matters
A leader trying to catch up a follower can repeatedly request a range that the log reader should provide. If storage returns empty, OpenRaft now sends a heartbeat-sized AppendEntries and sleeps, never advancing the follower and never surfacing a fatal storage error. This is especially risky because IXT's fork documentation explicitly says the branch was backported as a panic workaround.
Expected behavior
For production builds, a non-empty requested log range returning empty should be treated as a storage/contract error, not as successful heartbeat progress. It should be visible in metrics and fail loudly enough that the node/operator can repair storage.
Suggested fixes
ReplicationError::StorageErroror a new explicit contract-violation error.limited_get_log_entries()for present, partially missing, purged, and hole ranges.Suggested regression test
Use a test
RaftLogReaderwhoselimited_get_log_entries(1, 2)returnsOk(vec![]); assert replication reports a storage/contract error instead of retrying forever or reporting heartbeat-like progress.Fixed on
ixt-stable(CI green, run #2179, 14/14)Implemented the "treat empty-for-non-empty-range as a storage/contract error" behavior, plus the suggested feature gate and conformance suite.
3d6da188— emptylimited_get_log_entries()→ loudStorageErrorReplicationCore::send_log_entries(). A non-empty requested range that returns empty is now aReplicationError::StorageErrorby default — the node fails loudly instead of silently stalling catch-up.tolerate-empty-log-range(OFF by default, and in no CI feature combo — so CI never compiles it; verify manually if you enable it).Suite::limited_get_log_entriesstorage-conformance test (openraft/src/testing/suite.rs) to exercise present / partially-missing ranges, so a non-conforming reader is caught by the conformance suite.Why flipping the default to FATAL is safe (verified in code, not assumed):
ProgressEntry::next_sendclampsstartto the purge boundary and routes below-purge reads to snapshot, andtry_purge_logpostpones purging in-flight ranges. So an empty read for a non-empty range is a genuineRaftLogReadercontract violation, never a benign purge race.0789a571— follow-up: vote-gate theStorageError(regression fix)Flipping the default surfaced that
Response::StorageErrorwas the onlyNotify::Networkarm inraft_core.rsnot gated on vote freshness. On a leadership change with an uncommitted-log revert (normal underloosen-follower-log-revert), an in-flight replication stream from the old epoch reads the truncated range and itsStorageErrorwould have FATAL'd the now-healthy ex-leader.StorageErrorwithsender_voteand discard it in RaftCore unless the vote matches (mirrors the existingHigherVotegating).tests/.../replication/t70_stale_session_storage_error.rs, backed by a newBlockOperation::LimitedGetLogEntriesmemstore delay hook — proven to fail without the gate and pass with it.⚠️ Operational caveat for downstream stores (the one thing outside this repo)
The default now goes FATAL on a genuine contract violation. IXT's
SqliteLogStore(separate repo) is the reason the old workaround existed — itstry_get_log_entries()could return empty for a bounded non-empty range. Before pinning this rev, that store must either (a) uphold the contract — ideally overridelimited_get_log_entries(), or at minimum never return empty for a range overlapping present entries (the extended conformance test will catch it), or (b) run withtolerate-empty-log-rangeas a documented break-glass during transition.Closing as fixed. The
SqliteLogStoredecision above is tracked downstream.