Hive
perf(kura): cache the parsed segment ring state in memory
GitHub issue · Closed
Performance follow-up to #11222. That PR grows the CAS segment ring from a fixed 5 segments to a disk-derived count (~300 segments on a 200 Gi volume, up to 16,384 at the ceiling) — which turns a previously invisible inefficiency into a per-request tax that scales with ring size.
The problem
Every serving-path generation check went through load_segment_state: a RocksDB point-get of the entire ring state blob plus a full serde_json parse (one String allocation per segment), followed by a linear scan in generation_of. Call sites:
- prepare_artifact_for_serving: every serve of a segment-backed artifact
- maybe_refresh_manifest: pre-check plus re-check under the refresh lock, so a request that triggers a refresh parsed the state up to 3×
- active_segment: every artifact write
- snapshot(): the metrics loop
The state only changes on rotation (once per ~512 MiB ingested); between rotations every parse reconstructs an identical value. And there is no cross-process freshness to buy: the data-dir writer lock guarantees a single writer process, and every mutation already funnels through save_segment_state on the same Store.
Cost by ring size (~75–80 bytes of JSON per segment reference):
| Ring | State blob | Per-parse | Per-request impact |
|---|---|---|---|
| 5 segments (legacy) | ~400 B | ~1 µs | noise |
| ~300 segments (#11222 on a 200 Gi PVC) | ~24 KB | ~50–100 µs + 300 allocs | 1–3 parses per request; ~5–15% of a core at 1k req/s |
| 16,384 segments (MAX_DESIRED_SEGMENTS) | ~1.3 MB | 2–4 ms + 16k allocs | would dominate serve latency |
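The table figures follow from simple arithmetic. The sketch below is a rough cross-check, not code from the change: it assumes 80 bytes per segment reference (the upper end of the estimate above), and the helper names `estimated_blob_bytes` and `core_fraction` are hypothetical.

```rust
/// Assumed JSON bytes per segment reference (the issue cites ~75–80).
const BYTES_PER_SEGMENT_REF: u64 = 80;

/// Rough size in bytes of the persisted ring state blob for `segments` entries.
fn estimated_blob_bytes(segments: u64) -> u64 {
    segments * BYTES_PER_SEGMENT_REF
}

/// Fraction of one core spent parsing, given parses per request,
/// the cost of one parse in microseconds, and the request rate.
fn core_fraction(parses_per_request: u32, parse_micros: f64, requests_per_sec: f64) -> f64 {
    f64::from(parses_per_request) * parse_micros * requests_per_sec / 1_000_000.0
}
```

Under these assumptions, one to three 50 µs parses per request at 1,000 req/s work out to 5% to 15% of a core, which matches the ~300-segment row.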
What changed
The parsed state is now cached in memory as a SegmentStateSnapshot — the SegmentState plus a HashMap<segment_id, SegmentGeneration> index:
- Store::open seeds the snapshot with one parse at startup (load_segment_state_from_db is now only called there).
- save_segment_state keeps writing RocksDB exactly as before (durability unchanged) and then swaps the snapshot, so the cache cannot be forgotten by a mutation path; both mutators (active_segment rotation, evict_segment’s defensive removal) already funnel through it.
- segment_generation becomes an Arc clone plus a hash lookup: no RocksDB read, no parse, no linear scan.
- active_segment reads the snapshot and only clones the state when actually rotating, so the per-write cost drops to a pointer clone too.
- The generation index is rebuilt once per save: O(ring) once per ~512 MiB ingested instead of O(ring) per request.
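The shape described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the field layout of SegmentState, the variant names of SegmentGeneration, the use of String segment ids, and the `current` helper are all assumptions made for the example, and the RocksDB write is only marked by a comment.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-ins; the real types are not shown in this issue.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SegmentGeneration { Active, Recent, Old }

#[derive(Clone, Debug, Default)]
struct SegmentState {
    segments: Vec<(String, SegmentGeneration)>, // (segment_id, generation)
}

/// The parsed state plus an id -> generation index built once per save.
struct SegmentStateSnapshot {
    state: SegmentState,
    generations: HashMap<String, SegmentGeneration>,
}

impl SegmentStateSnapshot {
    fn build(state: SegmentState) -> Self {
        // O(ring) work, paid once per save rather than once per request.
        let generations = state.segments.iter().map(|(id, g)| (id.clone(), *g)).collect();
        Self { state, generations }
    }
}

struct Store {
    snapshot: Mutex<Arc<SegmentStateSnapshot>>,
}

impl Store {
    /// Seeds the snapshot once (the real code parses the RocksDB blob here).
    fn open(initial: SegmentState) -> Self {
        Store { snapshot: Mutex::new(Arc::new(SegmentStateSnapshot::build(initial))) }
    }

    fn save_segment_state(&self, state: SegmentState) {
        // The durable RocksDB write would happen here, before the swap.
        let next = Arc::new(SegmentStateSnapshot::build(state));
        *self.snapshot.lock().unwrap() = next;
    }

    /// Holds the mutex only long enough to clone the Arc.
    fn current(&self) -> Arc<SegmentStateSnapshot> {
        let guard = self.snapshot.lock().unwrap();
        Arc::clone(&*guard)
    }

    /// Arc clone plus hash lookup: no database read, no parse, no scan.
    fn segment_generation(&self, id: &str) -> Option<SegmentGeneration> {
        let snap = self.current();
        snap.generations.get(id).copied()
    }
}
```

Because readers take an Arc to an immutable snapshot, a lookup never observes a half-built index, and a rotation costs one index rebuild regardless of request rate.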
SegmentState::generation_of (the linear scan) is removed: the snapshot index replaces it at its only remaining call site, and leaving the now-unused function in place would trip clippy’s dead-code deny. Note for #11246: its orphan sweep calls generation_of; whichever PR merges second resolves the one-liner to snapshot.generations.contains_key(...).
Deliberately not changed: the persisted format. The JSON blob shape is what a rolled-back binary reads (downgrade safety), and with a single parse per process lifetime plus a ~0.25% write amplification per rotation even at the maximum ring, there is no remaining reason to split it into per-segment keys.
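The write-amplification figure can be reproduced with a short calculation. The sketch below assumes 80 bytes of JSON per segment reference and one full blob rewrite per 512 MiB ingested; the function name `write_amplification_percent` is hypothetical.

```rust
/// Bytes ingested between rotations (512 MiB).
const ROTATION_BYTES: f64 = 512.0 * 1024.0 * 1024.0;
/// Assumed JSON bytes per segment reference.
const BYTES_PER_SEGMENT_REF: f64 = 80.0;

/// Extra bytes written for the state blob per rotation, as a percentage
/// of the data ingested during that rotation.
fn write_amplification_percent(segments: u32) -> f64 {
    let blob_bytes = f64::from(segments) * BYTES_PER_SEGMENT_REF;
    blob_bytes / ROTATION_BYTES * 100.0
}
```

At the 16,384-segment ceiling this gives roughly 0.244%, in line with the ~0.25% quoted above; at ~300 segments it is under 0.005%.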
Concurrency notes
Mutations were already effectively serialized: rotation runs under segment_write_lock, and evict_segment’s save is a near-no-op in practice because push_new removes ring-evicted segments from the state before eviction runs. The cache swap happens inside save_segment_state, after the RocksDB write succeeds, so readers see either the previous or the new fully-built snapshot — never a partial one. Readers hold the mutex only long enough to clone an Arc.
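The ordering matters: the snapshot is swapped only after the durable write succeeds, so a failed write leaves the cache on the previous state. A generic sketch of that write-then-swap pattern follows. The `Cache` type and its `persist` closure are illustrative assumptions standing in for the store and its RocksDB write; they are not names from the codebase.

```rust
use std::sync::{Arc, Mutex};

/// Holds the current immutable snapshot behind a short-lived lock.
struct Cache<T> {
    current: Mutex<Arc<T>>,
}

impl<T> Cache<T> {
    fn new(initial: T) -> Self {
        Cache { current: Mutex::new(Arc::new(initial)) }
    }

    /// Readers hold the mutex only long enough to clone the Arc.
    fn read(&self) -> Arc<T> {
        let guard = self.current.lock().unwrap();
        Arc::clone(&*guard)
    }

    /// Runs the durable write first; swaps the snapshot only if it succeeded.
    fn save<E>(&self, next: T, persist: impl FnOnce(&T) -> Result<(), E>) -> Result<(), E> {
        persist(&next)?; // on error, return early and keep the old snapshot
        let next = Arc::new(next); // fully built before it becomes visible
        *self.current.lock().unwrap() = next;
        Ok(())
    }
}
```

A reader that cloned the Arc before a save keeps a consistent view of the old state until it drops its handle; later readers see the new snapshot in full.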
Validation
- New tests: segment_generation_tracks_saved_state (cache coherence through save_segment_state for all three generations + miss), evicting_a_segment_updates_the_cached_generation (the defensive eviction branch updates the cache), segment_state_snapshot_survives_reopen (a second Store::open on the same data dir seeds the snapshot from RocksDB).
- The pre-existing test that calls save_segment_state directly to fabricate an old-generation segment keeps passing unchanged, confirming the refresh path reads through the cache.
- cargo test --lib: 236 passed, 0 failed; cargo clippy --all-targets -- -D warnings: clean; cargo fmt: no changes.
🤖 Generated with Claude Code