Hive Hive
Sign in

fix(infra): stream bootstrap binaries instead of copying them per roll

GitHub issue · Closed

Metadata
Source
tuist/tuist #11065
Updated
Jun 24, 2026
Domains
Compute
Details

What changed

RunCommandWithStdin in infra/macos-host-bootstrap now takes an io.Reader instead of a string. The four bootstrap binaries (tart-kubelet, tart, tailscale, node_exporter) are streamed to the host with bytes.NewReader over the operator’s already-resident slice instead of string(binary), which copied the whole binary on every call. String payloads (kubeconfig, launchd plist, auth keys, runner token) pass strings.NewReader.

No resource-limit change — the limit is left at the original 512Mi.

Why

The capi-scaleway-applesilicon manager was OOMKilling in production (Exit Code: 137), both replicas ping-ponging the leader lease and firing a Pod CrashLoop / Frequent Restarts alert. This removes the actual cause rather than giving it more headroom.

Root cause

It’s an OOM, not an application error — logs show a clean startup with no panic; the process is killed mid-reconcile.

The manager reads the four macOS bootstrap artifacts into memory once at startup and holds them resident for the process lifetime (~96MB total: tart-kubelet 47MB + tart 22MB + tailscale 14MB + node_exporter 13MB). That part is deliberate. The problem was the transfer path: each host install called RunCommandWithStdin(..., string(binary)), and the []bytestring conversion allocates a fresh copy of the whole binary on every call.

When the embedded tart-kubelet SHA changes — which is exactly what the capi-scaleway@0.7.2 deploy did, since #11044 shipped a new tart-kubelet binary — every ScalewayAppleSiliconMachine shows binary drift and re-rolls at once. With --machine-max-concurrent-reconciles=4, that’s up to four concurrent 47MB copies on top of the ~96MB resident set, and default GOGC pacing lets the heap overshoot before a collection runs:

~96MB resident
+ 4 × 47MB transient string copies (concurrent drift rolls) ≈ 188MB
+ GC headroom (GOGC=100 → heap grows ~2× live before collecting)
= peak past 512Mi during a fleet-wide roll

Already-Ready machines skip the full bootstrap, so this isn’t generic reconcile pressure — it’s specifically the binary-SHA drift roll fanning across the fleet, which recurs every time the embedded tart-kubelet changes.

Why this solution over raising the limit

The earlier revision of this PR bumped the limit to 1Gi + added GOMEMLIMIT. That treats the symptom. The transient copies are pure waste: bytes.NewReader streams straight from the resident slice with zero extra allocation, so the burst disappears entirely. bytes.Reader only reads and never mutates the backing array, so the concurrent rolls share the one resident slice safely. With the copies gone, the working set during a fleet-wide roll stays near the ~96MB baseline — comfortably inside the original 512Mi, no headroom bump needed.

Impact

No customer-facing outage during the incident — the Mac mini nodes kept running (independent kubelets, all Ready). But the fleet control plane was degraded: machine provisioning/recovery, fleet-spread auto-roll, and node-readiness reconciliation flapped. This fix lands in the manager image, so it takes effect once a new capi-scaleway release is built and deployed (not on a plain Helm value roll).

Validation

  • go build / go vet / go test pass for macos-host-bootstrap.
  • The cluster-api-provider-scaleway-applesilicon consumer still builds.

🤖 Generated with Claude Code

Comments
TA
tuist-atlas[bot] Jun 4, 2026

The fix to stream bootstrap binaries instead of copying them per roll is now available in capi-scaleway@0.8.0. This resolves the OOMKills (exit code 137) caused by transient string copies during fleet-wide rolls. Update to this version to eliminate the memory pressure without needing to raise resource limits.