
feat(kura): accept HTTP/2 and serve hot artifacts via mmap

GitHub issue · Closed

Metadata
Source
tuist/tuist #11086
Updated
Jun 24, 2026
Domains
Kura
Details

What changed

Two things land in this PR for Kura’s public artifact path:

  1. HTTP/2 on the public servers. Both the public HTTP and HTTPS servers now negotiate HTTP/2 alongside HTTP/1 (128 max concurrent streams, 16 KiB frames, adaptive flow control, and h2 keep-alive ping/timeout to reap half-open connections). Clients can multiplex concurrent artifact downloads over long-lived connections.

  2. Zero-copy, page-cache-resident mmap artifact serving. File-backed artifact GETs now try to serve straight from an mmap-backed Bytes instead of copying through a streaming reader, falling back to the reader when mmap isn’t a win.

It also includes a small correctness fix that was the branch’s original scope: cache-miss 404s for key-value and artifact reads now return a JSON body ({"message": ...}) instead of an empty response, matching the rest of the API’s error shape.
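As a rough illustration of that error shape, the sketch below builds a {"message": ...} body using only the standard library. The function name not_found_body and the escaping rules are assumptions made for this example; the PR does not show how Kura serializes the body, and a real handler would more likely use a JSON library.

```rust
/// Hypothetical helper (not from the PR): builds the JSON body for a
/// cache-miss 404. Only the {"message": ...} shape comes from the source.
fn not_found_body(message: &str) -> String {
    let mut out = String::with_capacity(message.len() + 16);
    out.push_str("{\"message\":\"");
    for c in message.chars() {
        match c {
            // Characters that JSON requires to be escaped inside strings.
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Any other control character gets a \u escape.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push_str("\"}");
    out
}
```

The point of a non-empty body is that clients can parse every error response the same way, whether it is a cache miss or any other API error.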

Why, and why this design

Copying every artifact byte through a userspace streaming reader is wasteful for the common case of a cache mesh: popular artifacts are re-served constantly and are already hot in the page cache. mmap lets us serve those zero-copy.

The hazard is that mmap hides disk I/O from the async runtime: touching a non-resident page triggers a page fault, and the kernel services it with synchronous disk I/O on the tokio worker thread that is writing to the socket. Under many concurrent cold downloads, those faults would starve the worker pool, which is exactly what the existing reader path avoids with spawn_blocking. A naive “just mmap everything” version would trade a real correctness-under-load property for throughput.

So mmap serving is gated to only ever be a win:

  • Inline / oversized / under-pressure → reader path. Three cases skip mmap entirely: inline artifacts, anything larger than a bounded mmap serving pool (new, sized from host memory headroom in the memory controller), and any request served while memory pressure is not Normal.
  • Cold pages → reader path. Before serving, we check mincore and only map regions whose pages are already resident. A fully-resident region serves with zero faults; anything cold returns None and falls back to Store::read_artifact_bytes, which keeps the blocking read off the async workers. This targets the actual failure mode (aggregate worker fault-I/O), not just per-poll latency (already bounded by chunk-yielding), and keeps the zero-copy benefit exactly where mmap beats the reader.
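The gates above can be read as a single decision function. The following is a minimal sketch of that ordering, not Kura's actual code: the types Candidate, ServePath and MemoryPressure, the function choose_serve_path, and the pressure variants other than Normal are all hypothetical names introduced for illustration.

```rust
/// Hypothetical pressure levels; only `Normal` is named in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MemoryPressure {
    Normal,
    Elevated,
    Critical,
}

/// Which path serves the response body.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ServePath {
    Mmap,
    Reader,
}

/// Hypothetical summary of one file-backed artifact GET.
#[derive(Debug, Clone, Copy)]
struct Candidate {
    /// Inline artifacts never use mmap.
    inline: bool,
    /// Length of the artifact region in bytes.
    len: u64,
    /// True when a residency check found every page of the region in the page cache.
    fully_resident: bool,
}

/// Returns `Mmap` only when every gate passes; any failed gate falls
/// back to the reader path, so mmap is never worse than the reader.
fn choose_serve_path(c: &Candidate, pool_bytes: u64, pressure: MemoryPressure) -> ServePath {
    if c.inline {
        return ServePath::Reader;
    }
    if c.len > pool_bytes {
        // Larger than the bounded mmap serving pool.
        return ServePath::Reader;
    }
    if pressure != MemoryPressure::Normal {
        return ServePath::Reader;
    }
    if !c.fully_resident {
        // Cold pages would fault on an async worker thread.
        return ServePath::Reader;
    }
    ServePath::Mmap
}
```

Ordering the cheap checks first means the residency query is only needed for requests that could actually be mapped.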

Alternatives considered and rejected: a size cap only shrinks the volume of faulting traffic rather than eliminating worker faults; pre-faulting in spawn_blocking eagerly reads the whole artifact (worse time-to-first-byte) and reads data an early-disconnecting client never wants.

The mapping uses memmap2 (one documented unsafe map call plus a read-only mincore query) rather than hand-rolled libc, keeping the audit surface minimal.
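To show what a residency query has to compute, here is a standard-library-only sketch of the arithmetic around mincore. It assumes the Linux behavior where mincore fills one byte per page and the least significant bit of each byte marks a resident page. The helpers page_span and all_resident are hypothetical names, and the sketch deliberately leaves out the memmap2 and mincore calls themselves.

```rust
/// Hypothetical helper: for a region [offset, offset + len) inside a segment
/// file, returns the page-aligned start and the number of pages it spans.
/// Residency is tracked per page, so an unaligned region must be widened to
/// whole pages before it can be checked.
fn page_span(offset: u64, len: u64, page_size: u64) -> (u64, u64) {
    assert!(page_size.is_power_of_two());
    // Round the start down to a page boundary.
    let start = offset & !(page_size - 1);
    if len == 0 {
        return (start, 0);
    }
    // Exclusive end of the region; a real implementation should guard overflow.
    let end = offset + len;
    // Round the widened length up to whole pages.
    let pages = (end - start + page_size - 1) / page_size;
    (start, pages)
}

/// Hypothetical helper: interprets a mincore-style residency vector, one
/// byte per page, where bit 0 set means the page is resident.
fn all_resident(residency: &[u8]) -> bool {
    residency.iter().all(|b| b & 1 == 1)
}
```

For example, with 4 KiB pages a 4096-byte region at offset 5000 straddles two pages, so both must be resident before the region can be served without a fault.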

Safety invariant (load-bearing)

The mappings are sound only because Kura segment and blob files are append-only and reclaimed by unlink, never truncated in place. Unlink is safe (an active mapping pins the inode), but truncation would SIGBUS a live mapping. Page-cache eviction between the mincore check and a socket write is benign: it re-faults cleanly from the still-present backing file (correct bytes, a brief blocking re-read for that chunk). The only behavior that differs from the reader path is a genuine disk read error during a fault, which mmap surfaces as SIGBUS rather than as a recoverable error. That is inherent to serving from a mapping, and a crashed node is recoverable because peers keep serving and the artifact re-replicates.

This invariant is documented in src/mmap.rs and recorded as a rollout-safety constraint in kura/AGENTS.md, along with the broader rule that mesh changes must stay compatible across one version skew during rolling deploys.

Rollout safety

All three changes are safe under a rolling deploy with mixed-version pods:

  • HTTP/2 is additive and negotiated; old clients and old pods keep using HTTP/1.
  • mmap serving is node-local, produces identical bytes and headers, and degrades to the existing reader path on pressure, cold cache, or any error.
  • memmap2 is pure Rust with no new system requirements in the release image.

No on-disk format or replication wire change.

User / developer impact

  • Clients can multiplex artifact downloads over HTTP/2.
  • Hot artifacts serve faster (zero-copy) with less worker-pool contention under load.
  • Cache-miss responses now carry a JSON error body.
  • No API contract or storage format change.

Validation

  • cargo clippy --all-targets -- -D warnings: clean.
  • cargo fmt --check: clean.
  • Full lib suite: 203 passed.
  • New tests: residency-gated mapping, non-zero segment offsets, bytes_chunks multi-chunk reassembly, end-to-end serving through both the mmap and reader-fallback paths, JSON 404s across CAS/module/Gradle/key-value routes, and the public HTTP builder advertising both protocols.

🤖 Generated with Claude Code
