kura: REAPI ByteStream uploads fail with fd_pool_exhausted under bursty concurrent uploads (Bazel remote cache)
GitHub issue · Closed
Summary
When Kura is used as a Bazel/Buck2 remote cache, ByteStream/Write uploads intermittently fail under bursty concurrency with:
INTERNAL: failed to persist CAS blob: fd_pool_exhausted:
timed out waiting 5s for file descriptor permit during open_file
A failed upload means the client can’t write that action’s result, so the action is never cached and re-executes on every build.
Where it bites
Cargo build scripts emit large directory outputs: librocksdb-sys produces ~339 .o files, which Bazel uploads concurrently via ByteStream. Each write needs FD-pool permits (temp-file create + segment-append open). Under such a burst the pool (auto-derived from RLIMIT_NOFILE) is exhausted, and instead of queuing the write until a permit frees up, Kura fails it once the 5 s permit timeout expires.
File-output actions (single blob) never hit this, so it’s invisible until a directory-output workload (rocksdb) shows up.
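To make the failure mode concrete, here is a toy model of a burst hitting a fixed permit pool. It is only a sketch: it assumes each write holds its permits for a fixed time and that writes are served in waves, and the pool size of 64 and the 2 s write duration are hypothetical numbers, not measurements from Kura. The two permits per write mirror the temp-file create and segment-append open mentioned above, assuming both are held at once.

```python
def failed_writes(burst: int, pool_permits: int, permits_per_write: int,
                  write_seconds: int, timeout_seconds: int) -> int:
    """Count writes in a burst that time out waiting for FD permits (toy model)."""
    # Number of writes that can hold their permits at the same time.
    concurrent = pool_permits // permits_per_write
    if concurrent == 0:
        return burst  # no write can ever obtain its permits
    # Waves start at 0, write_seconds, 2 * write_seconds, ...
    # A wave is served only if it starts no later than the permit timeout.
    waves_in_time = timeout_seconds // write_seconds + 1
    return max(0, burst - concurrent * waves_in_time)


# Hypothetical small pool: 32 concurrent writes, 3 waves fit into 5 s.
small_pool = failed_writes(burst=339, pool_permits=64, permits_per_write=2,
                           write_seconds=2, timeout_seconds=5)
# Pool sized as in the mitigation: the whole burst fits in the first wave.
large_pool = failed_writes(burst=339, pool_permits=4096, permits_per_write=2,
                           write_seconds=2, timeout_seconds=5)
print(small_pool, large_pool)
```

A single-blob action is the `burst=1` case, which never fails in this model unless the pool is smaller than one write's permit count. That matches why file-output actions do not surface the problem.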
Evidence (rocksdb build-script round-trip against a Kura cache)
| | default pool | KURA_FILE_DESCRIPTOR_POOL_SIZE=4096 + --ulimit nofile=16384 |
|---|---|---|
| ByteStream/Write failures (build #1) | ~50 / 799 (fd_pool_exhausted) | 0 / 799 |
| GetActionResult on rebuild (build #2) | rocksdb = “action result not found” → recompile | 145/145 hits, ~5 s, no recompile |
So it’s purely an FD-pool capacity/behavior issue — bumping the pool + ulimit resolves it.
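When applying this mitigation, it helps to confirm that the configured pool actually fits under the process's open-file limit. The sketch below does that arithmetic. The `reserved_fds` headroom (for sockets, logs, and other non-pool descriptors) is a hypothetical figure, and Kura's own auto-derivation rule is not shown here because the issue does not state it. `check_process` inspects the limit of the process running the script, so it needs to run in the same shell or container as the cache node.

```python
import resource


def pool_fits_limit(pool_size: int, nofile_soft_limit: int, reserved_fds: int) -> bool:
    """True if the FD pool plus reserved headroom fits under the soft nofile limit."""
    return pool_size + reserved_fds <= nofile_soft_limit


def check_process(pool_size: int, reserved_fds: int = 1024) -> bool:
    """Check the pool size against this process's RLIMIT_NOFILE soft limit."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return True  # no limit configured, any pool size fits
    return pool_fits_limit(pool_size, soft, reserved_fds)


# The values from the table: a 4096-permit pool under nofile=16384.
print(pool_fits_limit(4096, 16384, reserved_fds=1024))
# A 4096-permit pool under a 4096 limit leaves no room for anything else.
print(pool_fits_limit(4096, 4096, reserved_fds=1024))
```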
Proposed fix (options)
- Write backpressure instead of hard-fail — when the FD pool is saturated, wait/queue (or return a retryable status) rather than failing the write after 5 s. This is the robust fix for any concurrent-upload client.
- Size the pool for client upload concurrency — raise the default / auto-derivation headroom, and/or document tuning for build-cache deployments.
- Reduce per-write FD usage on the ByteStream → segment path.
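The first option can be sketched as follows. This is a minimal illustration of queuing on a permit pool and mapping a missed deadline to a retryable status, not Kura's implementation: `FdPool`, `write_blob`, and `WriteResult` are hypothetical names, a write takes one permit for simplicity, and `UNAVAILABLE` is chosen on the assumption that the server wants clients to retry with backoff (gRPC documents that code for transient conditions).

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class WriteResult:
    ok: bool
    status: str  # gRPC-style status name


class FdPool:
    """Counting pool of FD permits; waiters queue on the semaphore."""

    def __init__(self, permits: int) -> None:
        self._sem = asyncio.Semaphore(permits)

    async def acquire(self, deadline_seconds: float) -> bool:
        # Wait in the queue up to the deadline instead of failing outright.
        try:
            await asyncio.wait_for(self._sem.acquire(), timeout=deadline_seconds)
            return True
        except asyncio.TimeoutError:
            return False

    def release(self) -> None:
        self._sem.release()


async def write_blob(pool: FdPool, persist: Callable[[], Awaitable[None]],
                     deadline_seconds: float) -> WriteResult:
    if not await pool.acquire(deadline_seconds):
        # Saturation is transient: report a retryable status, not INTERNAL.
        return WriteResult(False, "UNAVAILABLE")
    try:
        await persist()
        return WriteResult(True, "OK")
    finally:
        pool.release()


async def demo(permits: int, writes: int, persist_seconds: float,
               deadline_seconds: float) -> list:
    pool = FdPool(permits)  # created inside the running event loop

    async def persist() -> None:
        await asyncio.sleep(persist_seconds)  # stand-in for the disk write

    results = await asyncio.gather(
        *(write_blob(pool, persist, deadline_seconds) for _ in range(writes)))
    return [r.status for r in results]


# 20 writes share 2 permits; each waits its turn and all succeed.
print(asyncio.run(demo(permits=2, writes=20, persist_seconds=0.01,
                       deadline_seconds=5.0)))
```

With a generous deadline the burst drains through the pool in order. With a short one, the excess writes come back as `UNAVAILABLE` and the client can retry instead of losing the action result.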
Notes
- Distinct from (and in addition to) the ByteStream flush fix in #11129, which is necessary but not sufficient for these workloads.
- Currently mitigated only at the deployment level (local-ci sets the pool + ulimit on its cache node). Production Kura-as-remote-cache for Bazel/Buck2 needs a real answer here.