The fix from this pull request is now available in CAPI Scaleway 0.7.2. Update to capi-scaleway@0.7.2 to use it.
Hive
fix(infra): report container readiness for tart-kubelet VM pods
GitHub issue · Closed
What
When a Tart VM is up with an IP, tart-kubelet now synthesizes a containerStatus per spec container (Ready: true, Running, name/image from the spec, ContainerID: tart://<vm>), mirroring the Pod Ready condition it already sets. Implemented as a small pure helper in reconciler.go’s podStatus, unit-tested directly.
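For reviewers, the shape of that helper can be sketched roughly as below. This is a minimal illustration and not the code in reconciler.go: `ContainerSpec` and `ContainerStatus` are simplified, hypothetical stand-ins for the Kubernetes container and container-status types, the function name is assumed, and the sample container name and image are made up.

```go
package main

import "fmt"

// ContainerSpec is a simplified, hypothetical stand-in for a Pod spec container.
type ContainerSpec struct {
	Name  string
	Image string
}

// ContainerStatus is a simplified, hypothetical stand-in for a reported
// container status. Running stands in for "the running state is set".
type ContainerStatus struct {
	Name        string
	Image       string
	ContainerID string
	Ready       bool
	Started     bool
	Running     bool
}

// runningContainerStatuses synthesizes one ready, running status per spec
// container, all pointing at the single Tart VM that backs the Pod.
// It is pure: the output depends only on its arguments.
func runningContainerStatuses(containers []ContainerSpec, vmName string) []ContainerStatus {
	statuses := make([]ContainerStatus, 0, len(containers))
	for _, c := range containers {
		statuses = append(statuses, ContainerStatus{
			Name:        c.Name,
			Image:       c.Image,
			ContainerID: "tart://" + vmName,
			Ready:       true,
			Started:     true,
			Running:     true,
		})
	}
	return statuses
}

func main() {
	// Illustrative values only.
	spec := []ContainerSpec{{Name: "processor", Image: "example/processor:latest"}}
	for _, s := range runningContainerStatuses(spec, "vm-1") {
		fmt.Printf("%+v\n", s)
	}
}
```

Iterating the spec instead of hard-coding a single entry keeps the helper correct even if the 1:1 Pod-to-VM assumption is relaxed later.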
Why
tart-kubelet runs each Pod as a single Tart VM with no per-container CRI, so it never reports any containerStatuses to the API server. It does set the Pod Ready condition — which is what Deployments and helm --wait use for availability — but kubectl get pods counts ready containers, and with none reported, every healthy VM-backed Pod shows 0/N READY permanently.
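To make the 0/N symptom concrete, here is a simplified model of how a READY value of the form ready/total comes out when no statuses are reported. It is a sketch of the counting logic only, not kubectl's actual source; `ContainerStatus` and `readyColumn` are hypothetical names.

```go
package main

import "fmt"

// ContainerStatus is a simplified, hypothetical stand-in for a reported
// container status.
type ContainerStatus struct {
	Ready   bool
	Running bool
}

// readyColumn models the READY column: containers reported ready and running,
// over the number of containers declared in the Pod spec.
func readyColumn(specContainers int, statuses []ContainerStatus) string {
	ready := 0
	for _, s := range statuses {
		if s.Ready && s.Running {
			ready++
		}
	}
	return fmt.Sprintf("%d/%d", ready, specContainers)
}

func main() {
	// Before this change: no statuses reported, so a healthy Pod shows 0/1.
	fmt.Println(readyColumn(1, nil))
	// After this change: one synthesized ready status, so it shows 1/1.
	fmt.Println(readyColumn(1, []ContainerStatus{{Ready: true, Running: true}}))
}
```

The denominator comes from the spec and the numerator from the reported statuses, which is why an empty status list pins the column at zero regardless of the Pod Ready condition.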
That’s not just cosmetic in practice: the READY column is the first thing an operator reads, and a working xcresult-processor sitting at 0/1 for hours read as an outage during a real incident. The investigation chased a phantom DB problem and nearly scaled the Deployment to zero before confirming (via the Oban queue) that the processor was fine all along — the column was lying. This makes the column tell the truth for the whole macOS fleet (xcresult-processor and customer runner Pods).
Scope / safety
- Only the running branch (VM up + IP) is touched. Booting (Pending) and exited (Succeeded) Pods keep reporting no ready containers, which is accurate.
- No behavior change to availability: Deployments/helm --wait already used the Ready condition, which is unchanged. This only populates the per-container view kubectl/tooling read.
- No billing impact: the runners-controller’s pod-lifecycle billing reads state.terminated.finishedAt; the running branch sets state.Running only, and the Succeeded branch is untouched, so the existing fallback is preserved.
- Pod ↔ VM is 1:1 (multi-container Pods are rejected at admission), so this is effectively a single status, but the helper iterates the spec to stay correct if that ever changes.
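The scoping in the first bullet can be summarized as a small gate. This is an illustrative sketch only: the `VMState` type, its constants, and `readyContainers` are hypothetical and do not mirror the reconciler's real state handling.

```go
package main

import "fmt"

// VMState is a hypothetical, simplified view of the Tart VM lifecycle.
type VMState int

const (
	VMBooting VMState = iota // Pod reported as Pending
	VMRunning                // VM is up
	VMExited                 // Pod reported as Succeeded
)

// readyContainers returns how many containers are reported ready.
// Only the running branch (VM up and an IP assigned) reports any;
// booting and exited Pods keep reporting none.
func readyContainers(state VMState, hasIP bool, specContainers int) int {
	if state == VMRunning && hasIP {
		return specContainers
	}
	return 0
}

func main() {
	fmt.Println(readyContainers(VMBooting, false, 1))
	fmt.Println(readyContainers(VMRunning, true, 1))
	fmt.Println(readyContainers(VMExited, false, 1))
}
```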
Validation
go build ./..., go vet, gofmt, and go test ./internal/podagent/... all pass. New test TestRunningContainerStatusesReportsReady asserts Ready/Started/Running + name/image/ContainerID mirroring the spec and VM.
Opening as a draft for review.