Hive Hive
Sign in

fix(kura): serialize segment ring state mutations

GitHub issue · Closed

Metadata
Source
tuist/tuist #11249
Updated
Jun 24, 2026
Domains
Kura
Details

Problem

active_segment (rotation, under segment_write_lock) and evict_segment (previously unlocked) both perform read-modify-write cycles on the segment ring state: clone the latest state, mutate the clone, save it. With no common lock, a stale clone can overwrite a concurrent save, either resurrecting an evicted segment in the ring or orphaning a freshly rotated one.

Today this race is latent, not live. Eviction and rotation overlap routinely on a busy node — eviction runs outside the write lock, and its RocksDB index scan plus 512 MB unlink take long enough for the next rotation to start — but during that overlap only the rotation ever writes state, because of an accidental invariant chain:

  1. push_new removes the evicted segment from the ring before the producing rotation saves,
  2. so by the time evict_segment runs, its conditional save (if state.remove_segment(...)) is always a no-op in the rotation-produced flow,
  3. so every save that actually executes is a rotation save, serialized by segment_write_lock.

None of that is written down or enforced — review flagged this as a live P1 lost-update race precisely because a careful reader cannot tell the code is safe. And the chain breaks the moment anything evicts a segment that is still in the ring. With disk-sized rings (#11221, kura-cas-capacity-from-disk), pressure-eviction of ring members is the natural next caller, and at that point the interleavings become real:

  • An eviction save landing inside rotation’s clone-to-save window gets overwritten by the rotation’s stale clone, resurrecting the evicted segment: file unlinked and manifests deleted, but the ring and capacity accounting count a phantom segment, pushing the next pressure pass to evict a live segment early — dangling REAPI action-cache references, the #11221 failure class back as an intermittent race.
  • The mirror ordering orphans the freshly rotated segment from the ring: its artifacts still serve via manifests, but it never ages out (a permanent segment-sized disk leak), and active() reverts to the already-full previous segment, forcing an immediate spurious re-rotation.

Fix

Every segment-state read-modify-write now funnels through a new mutate_segment_state helper guarded by a dedicated segment_state_lock: acquire the lock, copy the latest cached snapshot, apply the mutation, persist only when the state actually changed. Both rotation and eviction call it. This replaces a global, fragile correctness argument (“audit every caller and the ring semantics”) with a local, enforced one (“all RMW goes through the helper; the helper holds the lock”) — before the capacity work invalidates the old argument.

Design notes:

  • A dedicated mutex was chosen over reusing segment_write_lock because that lock is held across whole segment appends; evictions are awaited on the persist hot path and would otherwise queue behind unrelated bulk writes. Lock ordering stays an acyclic hierarchy (segment_refresh_locksegment_write_locksegment_state_lock), and the state lock is a leaf with a synchronous-only critical section, so it cannot participate in a deadlock cycle.
  • active_segment’s rotate decision still reads a snapshot taken before the state lock is acquired. That stays valid because eviction — the only other mutator — never removes the active segment (documented in a code comment). This is also a standing constraint on any future pressure-evictor: it must only target non-active generations.

Validation

The two new multi-threaded tests simulate exactly the future caller that activates the race — they operate on ring members directly instead of rotation overflow (today’s only production flow, where the race cannot fire):

  • 16 concurrent mutate_segment_state pushes — every pushed segment survives in the final state.
  • 16 concurrent evict_segment calls against a pre-seeded ring — every segment is gone from the final state, none resurrected by a stale save.

Both pass locally (cargo test do_not_lose).

Stacked on #11247; only the top commit is new here.


Compound Engineering v2.60.0 🤖 Generated with Claude Fable 5 via Claude Code

Comments

No GitHub comments yet.