Hive
fix(kura): key ByteStream CAS blobs with the blob/ prefix so FindMissingBlobs sees them
GitHub issue · Closed
What changed
parse_blob_resource_name (the shared parser for ByteStream Write and Read resource names) now builds its CAS storage key via blob_key(...) — blob/{hash}/{size} — matching the digest-based CAS operations, instead of the prefix-less {hash}/{size}.
Root cause
ByteStream-uploaded CAS blobs were stored under {hash}/{size}, while FindMissingBlobs, BatchUpdateBlobs, and BatchReadBlobs look blobs up under blob_key(&digest_key(..)) = blob/{hash}/{size}. The two key namespaces never matched, so a blob uploaded via ByteStream was invisible to FindMissingBlobs. ByteStream Write and Read share the parser, so a blob written via ByteStream could still be read back via ByteStream (and the existing concurrency test only ever read blobs back), which is why the mismatch went unnoticed.
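The mismatch can be reproduced in a few lines with an in-memory map standing in for the CAS store. This is a minimal sketch, not Kura's code: the helper signatures for digest_key and blob_key are assumptions based on the key shapes quoted above, and bytestream_write_legacy, bytestream_write_fixed, and is_missing are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Digest-derived key `{hash}/{size}` (assumed shape of digest_key's output).
fn digest_key(hash: &str, size: u64) -> String {
    format!("{hash}/{size}")
}

/// CAS storage key `blob/{hash}/{size}` (assumed signature).
fn blob_key(key: &str) -> String {
    format!("blob/{key}")
}

/// Pre-fix ByteStream behaviour (hypothetical helper): store under the prefix-less key.
fn bytestream_write_legacy(store: &mut HashMap<String, Vec<u8>>, hash: &str, size: u64, data: &[u8]) {
    store.insert(digest_key(hash, size), data.to_vec());
}

/// Post-fix ByteStream behaviour (hypothetical helper): store under the shared CAS key.
fn bytestream_write_fixed(store: &mut HashMap<String, Vec<u8>>, hash: &str, size: u64, data: &[u8]) {
    store.insert(blob_key(&digest_key(hash, size)), data.to_vec());
}

/// FindMissingBlobs-style lookup: true when the blob is absent under the CAS key.
fn is_missing(store: &HashMap<String, Vec<u8>>, hash: &str, size: u64) -> bool {
    !store.contains_key(&blob_key(&digest_key(hash, size)))
}
```

With this model, a blob written through bytestream_write_legacy is present in the map yet is_missing still returns true for it, while the same blob written through bytestream_write_fixed is found.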
User/developer impact
REAPI clients (e.g. Bazel) call FindMissingBlobs to verify a cached action’s output blobs are present before trusting the action result. With ByteStream-uploaded outputs reported missing, the client discards the cache hit and re-executes the action. This made Kura ineffective as a remote cache for any workload whose outputs are uploaded via ByteStream (large blobs): for Bazel, actions whose outputs went up via ByteStream were never cached, while small outputs (uploaded via BatchUpdateBlobs, which already used the correct key) cached fine. Symptom: the OCI image build stayed ~28–32 min warm vs ~3 min with a local disk cache.
How it was found
Two consecutive Bazel builds were run against a Kura node with --remote_grpc_log, and the digests sent in build 1 (Write/UpdateActionResult) were diffed against those requested or reported missing in build 2 (FindMissingBlobs/GetActionResult). Of build 2’s 1720 missing blobs, 1719 had been sent by build 1 with unchanged hashes, so uploaded blobs were not being found and non-determinism was ruled out. A gRPC-level in-process test then reproduced it (a ByteStream-uploaded blob reported missing by FindMissingBlobs, even in the same process), and the source diff confirmed the key-prefix mismatch.
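The digest comparison itself is a set intersection. The sketch below assumes the digests have already been extracted from each build's gRPC log into {hash}/{size} strings; that extraction step is not shown, and uploaded_but_missing is a hypothetical helper name, not part of any tool used in the investigation.

```rust
use std::collections::BTreeSet;

/// Digests that build 1 sent and build 2 still reported missing.
/// A non-empty result with unchanged hashes points at the cache, not at
/// non-deterministic actions.
fn uploaded_but_missing<'a>(
    sent_in_build_1: &'a BTreeSet<String>,
    missing_in_build_2: &'a BTreeSet<String>,
) -> Vec<&'a String> {
    // BTreeSet::intersection yields the common elements in sorted order.
    missing_in_build_2.intersection(sent_in_build_1).collect()
}
```

Comparing the length of the returned list with the size of the missing set gives the kind of ratio reported above.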
Why this fix
The blob/ prefix (via blob_key()) is the established CAS key convention used by every digest-based path; the ByteStream parser was the lone outlier. Adding the prefix in the parser unifies all CAS paths (Write, Read, FindMissingBlobs, BatchUpdate/ReadBlobs) on one key, rather than special-casing the lookups.
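As an illustration of the approach, here is a simplified stand-in for a shared resource-name parser that routes its output through blob_key. It is a sketch under stated assumptions, not the actual Kura parser: the return type, the helper signatures, and the handling of only the uncompressed blobs/{hash}/{size} form (optionally preceded by an instance name and uploads/{uuid}) are all assumptions.

```rust
/// Digest-derived key `{hash}/{size}` (assumed shape).
fn digest_key(hash: &str, size: u64) -> String {
    format!("{hash}/{size}")
}

/// CAS storage key `blob/{hash}/{size}` (assumed signature).
fn blob_key(key: &str) -> String {
    format!("blob/{key}")
}

/// Simplified parser for ByteStream resource names such as
/// `{instance}/uploads/{uuid}/blobs/{hash}/{size}` (Write) and
/// `{instance}/blobs/{hash}/{size}` (Read).
/// Returns the CAS storage key and the declared size, or None if malformed.
fn parse_blob_resource_name(resource_name: &str) -> Option<(String, u64)> {
    let parts: Vec<&str> = resource_name.split('/').collect();
    // The hash and size are the two segments after the `blobs` marker.
    let idx = parts.iter().position(|p| *p == "blobs")?;
    let hash = *parts.get(idx + 1)?;
    let size: u64 = parts.get(idx + 2)?.parse().ok()?;
    if hash.is_empty() {
        return None;
    }
    // Build the key through blob_key so Write and Read agree with the
    // digest-based lookups instead of using the bare `{hash}/{size}`.
    Some((blob_key(&digest_key(hash, size)), size))
}
```

Because both Write and Read go through the same function, changing the key in one place moves both onto the shared namespace, which is the point of fixing the parser rather than the lookups.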
Migration note
Blobs written by a pre-fix node are stored under the old prefix-less key and become orphaned after upgrade. This is harmless: they were already unreachable via FindMissingBlobs/BatchReadBlobs, so nothing relied on them, and they age out via normal segment eviction.
Validation
- New gRPC-level regression test bytestream_uploaded_blob_is_visible_to_find_missing_blobs: drives the real serve() handlers via ByteStreamClient.write + ContentAddressableStorageClient.find_missing_blobs; fails on main (blob reported missing), passes with the fix.
- Updated the two parse_*_resource_name unit tests that had asserted the prefix-less key.
- cargo test (reapi suite) and cargo clippy --all-targets -- -D warnings green.