
fix(kura): key ByteStream CAS blobs with the blob/ prefix so FindMissingBlobs sees them

GitHub issue · Closed

Metadata
Source: tuist/tuist #11141
Updated: Jun 24, 2026
Domains: Kura

Details

What changed

parse_blob_resource_name (the shared parser for ByteStream Write and Read resource names) now builds its CAS storage key via blob_key(...), which yields blob/{hash}/{size}. This matches the digest-based CAS operations and replaces the prefix-less {hash}/{size}.
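
As a rough illustration of the change, the sketch below derives a prefixed CAS key from a ByteStream resource name. It is not Kura's actual parser: the function name cas_key_from_resource_name is hypothetical, and it assumes the usual REAPI layout in which the hash and size follow a literal blobs segment ({instance}/blobs/{hash}/{size} for reads, {instance}/uploads/{uuid}/blobs/{hash}/{size} for writes). Compressed-blob names and trailing metadata segments are ignored here.

```rust
/// Hypothetical stand-in for the shared parser: extracts `{hash}/{size}`
/// from a ByteStream resource name and returns the prefixed CAS key.
/// Assumes the hash and size directly follow a literal `blobs` segment.
fn cas_key_from_resource_name(resource_name: &str) -> Option<String> {
    let segments: Vec<&str> = resource_name.split('/').collect();

    // Locate the `blobs` marker; works for both read and write names.
    let idx = segments.iter().position(|s| *s == "blobs")?;

    let hash = segments.get(idx + 1)?;
    if hash.is_empty() {
        return None;
    }

    // The size must be a valid non-negative integer.
    let size: u64 = segments.get(idx + 2)?.parse().ok()?;

    // Post-fix behaviour: prepend the `blob/` prefix so the key matches
    // the one used by the digest-based CAS operations.
    Some(format!("blob/{hash}/{size}"))
}
```

Under these assumptions, a write name such as main/uploads/u-1/blobs/abc123/42 and a read name such as main/blobs/abc123/42 both map to blob/abc123/42, which is the same key the digest-based operations use for that digest.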

Root cause

ByteStream-uploaded CAS blobs were stored under {hash}/{size}, while FindMissingBlobs, BatchUpdateBlobs, and BatchReadBlobs look blobs up under blob_key(&digest_key(..)) = blob/{hash}/{size}. The two key namespaces never matched, so a blob uploaded via ByteStream was invisible to FindMissingBlobs. ByteStream Write and Read share the parser, so blob round-trips still worked (and the existing concurrency test only ever read blobs back), which is why the mismatch went unnoticed.
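
The mismatch can be reproduced in miniature with an in-memory map standing in for the CAS store. This is a sketch under stated assumptions: blob_key and digest_key are named in the description above, but their signatures here are guesses, and find_missing is a hypothetical simplification of the FindMissingBlobs lookup.

```rust
use std::collections::HashMap;

/// Assumed shape of the digest key: `{hash}/{size}`.
fn digest_key(hash: &str, size: u64) -> String {
    format!("{hash}/{size}")
}

/// Assumed shape of the CAS key: `blob/{hash}/{size}`.
fn blob_key(digest_key: &str) -> String {
    format!("blob/{digest_key}")
}

/// Hypothetical simplification of FindMissingBlobs: returns the digest keys
/// that are absent from the store when looked up under the `blob/` prefix.
fn find_missing(store: &HashMap<String, Vec<u8>>, digests: &[(&str, u64)]) -> Vec<String> {
    digests
        .iter()
        .map(|(hash, size)| digest_key(hash, *size))
        .filter(|key| !store.contains_key(&blob_key(key)))
        .collect()
}
```

Inserting a blob under digest_key("abc123", 42) alone (the pre-fix ByteStream behaviour) leaves find_missing reporting abc123/42 as missing, even though the bytes are in the map. Inserting it under blob_key(&digest_key("abc123", 42)) makes the missing list empty.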

User/developer impact

REAPI clients (e.g. Bazel) call FindMissingBlobs to verify that a cached action’s output blobs are present before trusting the action result. When ByteStream-uploaded outputs are reported missing, the client discards the cache hit and re-executes the action. This made Kura ineffective as a remote cache for any workload whose outputs are uploaded via ByteStream (large blobs). For Bazel, actions whose outputs went up via ByteStream were never cached, while small outputs (uploaded via BatchUpdateBlobs, which already used the correct key) cached fine. Symptom: the OCI image build stayed at ~28–32 min when warm, vs ~3 min with a local disk cache.

How it was found

Two consecutive Bazel builds were run against a Kura node with --remote_grpc_log, and the digests sent in build 1 (Write/UpdateActionResult) were diffed against those requested or reported missing in build 2 (FindMissingBlobs/GetActionResult). 1719 of the 1720 blobs missing in build 2 had been sent by build 1, with zero hash changes: uploaded blobs were not being found, and non-determinism was not a factor. A gRPC-level in-process test then reproduced it (a ByteStream-uploaded blob reported missing by FindMissingBlobs, even in the same process), and the source diff confirmed the key-prefix mismatch.
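
The core of that comparison is a set intersection. The sketch below assumes the digests have already been extracted from the two gRPC logs into sets of {hash}/{size} strings; the log parsing itself is not shown, and the function name uploaded_but_missing is hypothetical.

```rust
use std::collections::HashSet;

/// Returns, in sorted order, the digests that build 1 uploaded and that
/// build 2 nevertheless saw reported as missing.
fn uploaded_but_missing<'a>(
    uploaded_in_build_1: &HashSet<&'a str>,
    missing_in_build_2: &HashSet<&'a str>,
) -> Vec<&'a str> {
    let mut overlap: Vec<&str> = missing_in_build_2
        .intersection(uploaded_in_build_1)
        .copied()
        .collect();
    // Sort for stable, diff-friendly output.
    overlap.sort_unstable();
    overlap
}
```

A large overlap relative to the size of the missing set points at a storage or lookup problem, since changed hashes would not appear in the first build's uploads at all.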

Why this fix

The blob/ prefix (via blob_key()) is the established CAS key convention used by every digest-based path; the ByteStream parser was the lone outlier. Adding the prefix in the parser unifies all CAS paths (Write, Read, FindMissingBlobs, BatchUpdateBlobs, BatchReadBlobs) on one key, rather than special-casing the lookups.

Migration note

Blobs written by a pre-fix node are stored under the old prefix-less key and become orphaned after upgrade. This is harmless: they were already unreachable via FindMissingBlobs/BatchReadBlobs, so nothing relied on them, and they age out via normal segment eviction.

Validation

  • New gRPC-level regression test bytestream_uploaded_blob_is_visible_to_find_missing_blobs: drives the real serve() handlers via ByteStreamClient.write + ContentAddressableStorageClient.find_missing_blobs; fails on main (blob reported missing), passes with the fix.
  • Updated the two parse_*_resource_name unit tests that had asserted the prefix-less key.
  • cargo test (reapi suite) and cargo clippy --all-targets -- -D warnings green.


Comments
tuist-atlas[bot] Jun 9, 2026

The fix for keying ByteStream CAS blobs with the blob/ prefix is now available in kura@0.7.3. Update to ghcr.io/tuist/kura:0.7.3 to get this change.