The changes from this pull request are now available in Runners Controller 0.9.0.
You can update to the new Docker image:
ghcr.io/tuist/tuist-runners-controller:0.9.0
The first production RUNNER_VITALS data (from #11094, now live) revealed two gaps:
/proc/pressure/* doesn’t exist in the kata guest: the kata kernel ships CONFIG_PSI=y but boots with PSI disabled. PSI was the dedicated CPU/memory-starvation signal (the half of “lost communication” that isn’t OOM), so we were blind to it.

infra (runners-controller): set psi=1 on the kata guest kernel cmdline for Linux kata runner Pods, via the io.katacontainers.config.hypervisor.kernel_params pod annotation. It’s honored because the containerd kata runtime whitelists io.katacontainers.* annotations. macOS pods aren’t kata, so they don’t get it. Covered by tests (TestBuild_LinuxPodEnablesPSIViaKataAnnotation, TestBuild_MacOSPodHasNoKataKernelParamsAnnotation).

linux-runner-image (vitals.sh):
- cpu.busy.pct from /proc/stat deltas: a guest-wide CPU-utilization signal that works regardless of PSI, so the CPU-starvation dimension is covered even if psi=1 doesn’t take on some kernel.
- PSI fields are skipped when /proc/pressure is absent, so we never log the empty cpu.psi.some.avg10= noise seen in the first data.

psi=1: I can’t verify the kata kernel’s CONFIG_PSI offline, so shipping psi=1 alone risks a blind deploy-and-hope round-trip. The /proc/stat CPU% gives a reliable starvation signal independently; if psi=1 does take, PSI’s stall-time fields are a bonus on top.
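
For reviewers who want to check the CPU% math, here is a minimal sketch of a busy percentage computed from two /proc/stat samples. It is written in Go for illustration only; vitals.sh is a shell script and its exact arithmetic may differ. The names parseCPULine and busyPct are hypothetical, and counting iowait as idle time is an assumption, not something the script is confirmed to do.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cpuSample holds counters taken from the aggregate "cpu" line of /proc/stat.
type cpuSample struct {
	idle  uint64 // idle + iowait jiffies (assumption: iowait counts as idle)
	total uint64 // sum of user, nice, system, idle, iowait, irq, softirq, steal
}

// parseCPULine parses the aggregate "cpu ..." line (hypothetical helper).
func parseCPULine(line string) (cpuSample, error) {
	fields := strings.Fields(line)
	if len(fields) < 5 || fields[0] != "cpu" {
		return cpuSample{}, fmt.Errorf("not an aggregate cpu line: %q", line)
	}
	var s cpuSample
	for i, f := range fields[1:] {
		if i >= 8 {
			// guest and guest_nice are already included in user and nice.
			break
		}
		v, err := strconv.ParseUint(f, 10, 64)
		if err != nil {
			return cpuSample{}, err
		}
		s.total += v
		if i == 3 || i == 4 { // idle, iowait
			s.idle += v
		}
	}
	return s, nil
}

// busyPct returns the percentage of non-idle time between two samples.
// It returns 0 when the counters did not advance or look inconsistent.
func busyPct(prev, cur cpuSample) float64 {
	if cur.total <= prev.total || cur.idle < prev.idle {
		return 0
	}
	dTotal := cur.total - prev.total
	dIdle := cur.idle - prev.idle
	if dIdle > dTotal {
		return 0
	}
	return 100 * float64(dTotal-dIdle) / float64(dTotal)
}

func main() {
	// Two made-up samples: 200 jiffies elapsed, 120 of them idle or iowait.
	prev, _ := parseCPULine("cpu  100 0 50 800 50 0 0 0 0 0")
	cur, _ := parseCPULine("cpu  160 0 70 900 70 0 0 0 0 0")
	fmt.Printf("cpu.busy.pct=%.1f\n", busyPct(prev, cur)) // prints cpu.busy.pct=40.0
}
```

Because the value is a ratio of deltas, it does not depend on the kernel tick rate or on PSI being enabled, which is what makes it usable as the fallback signal.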
- go build / go vet / go test ./internal/podtemplate/... clean; new annotation tests pass.
- bash -n + shellcheck -S warning clean on vitals.sh; CPU% math spot-checked.

Both halves apply via the production server deployment (the runners-controller image pin + the runner image pin), same path that just landed #11094. Confirm in Grafana after rollout that cpu.busy.pct appears and the PSI fields populate (or are cleanly absent).
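
As a rough illustration of the behaviour the annotation tests check, the sketch below adds the kata kernel_params annotation for Linux pods only. It is not the podtemplate package's actual code: podAnnotations, its runnerOS argument, and the plain map standing in for pod metadata are all hypothetical. Only the annotation key and the psi=1 value come from this change.

```go
package main

import "fmt"

// Annotation key named in this change; the kata runtime reads it to extend
// the guest kernel command line.
const kataKernelParamsAnnotation = "io.katacontainers.config.hypervisor.kernel_params"

// podAnnotations is a hypothetical helper: it copies the base annotations and
// adds psi=1 for Linux (kata) runner pods only. macOS pods are left untouched.
func podAnnotations(runnerOS string, base map[string]string) map[string]string {
	out := make(map[string]string, len(base)+1)
	for k, v := range base {
		out[k] = v
	}
	if runnerOS == "linux" {
		out[kataKernelParamsAnnotation] = "psi=1"
	}
	return out
}

func main() {
	linux := podAnnotations("linux", nil)
	macos := podAnnotations("macos", nil)

	fmt.Println(linux[kataKernelParamsAnnotation]) // prints psi=1
	_, ok := macos[kataKernelParamsAnnotation]
	fmt.Println(ok) // prints false
}
```

This sketch overwrites any kernel_params value already present in the base annotations; a real builder that sets other kernel parameters would need to merge them instead.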