feat(infra): ship the kura cache StorageClass with the chart

GitHub issue · Closed

Metadata
Source: tuist/tuist #11371
Updated: Jun 24, 2026
Domains: Kura
Details

Draft — opening for review of the approach before merge.

Problem

The scw-local-nvme StorageClass + local-path-provisioner that the kura runner-cache PVCs bind against was a manual, per-workload-cluster bootstrap (infra/k8s/mgmt/bootstrap/local-path-provisioner.yaml) with zero automation:

  • mgmt-cluster-apply.yml doesn’t apply it — that workflow targets the management cluster and only applies a fixed allowlist (infra/k8s/clusters/** + four specific mgmt files); infra/k8s/mgmt/bootstrap/** is excluded.
  • No other workflow applies it. The file’s header literally says “apply with kubectl apply -f …”.

So a cluster could enable the kura fleet but silently lack the StorageClass its cache PVCs need. That’s exactly what happened on production during the macOS-runner-cache cutover: the EM node joined and the kura-tuist-scw-fr-par KuraInstance + pod were created, but the pod hung Pending on an unbound scw-local-nvme PVC (FailedScheduling: pod has unbound immediate PersistentVolumeClaims. not found).

Change

Move the provisioner into the tuist chart as templates/kura-fleet-storage.yaml, gated on the same kuraFleet.enabled flag as the fleet itself, and delete the standalone bootstrap manifest. Now the StorageClass ships wherever the fleet that needs it ships — they can’t drift, and a fresh env can’t miss it.
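A minimal sketch of what gating the StorageClass on the fleet flag could look like in templates/kura-fleet-storage.yaml. Only the file name, the kuraFleet.enabled flag, and the scw-local-nvme name come from the description; the provisioner string, binding mode, and reclaim policy are assumptions based on upstream local-path-provisioner defaults and may differ from what the chart actually ships.

```yaml
{{- /* Rendered only when the fleet itself is enabled, so the two cannot drift. */}}
{{- if .Values.kuraFleet.enabled }}
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: scw-local-nvme                    # load-bearing name, must not change
provisioner: rancher.io/local-path        # assumption: upstream local-path-provisioner name
volumeBindingMode: WaitForFirstConsumer   # assumption: upstream default
reclaimPolicy: Delete                     # assumption: upstream default
{{- end }}
```

With the flag off, the whole block renders to nothing, which matches the "nothing when disabled" behavior the validation checks for.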

The StorageClass name scw-local-nvme is load-bearing (the scw-fr-par-runners kura region spec provisions PVCs with exactly that name) and is unchanged.
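To illustrate why the name is load-bearing, here is a sketch of a claim that would bind against this StorageClass. The claim name and requested size are hypothetical placeholders; the real PVCs come from the scw-fr-par-runners kura region spec and are not shown in this description.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kura-cache-example           # hypothetical name, for illustration only
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: scw-local-nvme   # must match the StorageClass name exactly
  resources:
    requests:
      storage: 100Gi                 # hypothetical size
```

If no StorageClass with that exact name exists in the cluster, a claim like this stays unbound and the pod that mounts it stays Pending, which is the failure described under Problem.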

One behavior change worth a look

The bootstrap manifest pinned the provisioner controller to the EM pool with a nodeSelector. I removed that in the chart version: the controller only watches PVCs and launches a helper Pod on the target node (which still tolerates the runner-cache taint), so it doesn’t need to live on the cache node. Pinning it would make the release’s helm --wait block until the controller is Available, which can’t happen while the EM node isn’t up yet (e.g. a cold deploy); that is a regression we don’t want. Flagging it explicitly since it diverges from the validated bootstrap.
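An abridged sketch of the unpinned controller Deployment, to make the difference concrete. The resource names, namespace, labels, and image repository follow upstream local-path-provisioner conventions and are assumptions here; the image tag is left out because the description does not state a version, and the container args and config volume are omitted for brevity.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: local-path-provisioner            # assumption: upstream default name
  namespace: local-path-storage           # assumption: upstream default namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: local-path-provisioner
  template:
    metadata:
      labels:
        app: local-path-provisioner
    spec:
      serviceAccountName: local-path-provisioner-service-account  # assumption: upstream name
      # Deliberately no nodeSelector: the controller can be scheduled on any
      # node, so it can become Available even when the EM node is not up yet.
      containers:
        - name: local-path-provisioner
          image: rancher/local-path-provisioner   # tag omitted; version not stated in the source
```

The trade-off is that the controller may now run on a general-purpose node, while the helper Pod it launches still has to run on the cache node and tolerate its taint.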

Validation

  • helm template renders all 7 resources (Namespace, ServiceAccount, ClusterRole, ClusterRoleBinding, Deployment, StorageClass, ConfigMap) when kuraFleet.enabled is true, and nothing when it is false.
  • Deployment renders with no nodeSelector (unpinned); StorageClass scw-local-nvme present.

Note

This doesn’t retro-fix the live clusters. Staging already has the bootstrap applied; production still needs the one-time kubectl apply (or this change merged and redeployed) to unstick the currently Pending cache pod.

🤖 Generated with Claude Code

Comments
tuist-atlas[bot] Jun 19, 2026

This change is now available in CAPI Scaleway 0.10.0. Update to capi-scaleway@0.10.0 to get the kura cache StorageClass shipped with the chart.