feat(infra): mirror runner Docker Hub pulls through mirror.gcr.io

GitHub issue · Closed

Source: tuist/tuist #11096
Updated: Jun 24, 2026
Domains: Compute

What changed

Two things, in two commits:

  1. The fix (feat(infra)): the Linux runner dind sidecar’s dockerd now launches with --registry-mirror=https://mirror.gcr.io, so Docker Hub (docker.io) image pulls route through Google’s public pull-through cache instead of hitting Docker Hub directly.
  2. Bootstrap unblocks (ci(infra), temporary): the runners-controller image build and the server + Kura controller image builds (production cascade + standalone staging deploy) move to ubuntu-latest.
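A minimal sketch of what the first commit amounts to, assuming the sidecar is assembled in Go. The Container struct, buildDindSidecar, and hasRegistryMirror are hypothetical stand-ins for the real pod-template types and helpers, which are not shown here; only the flag, the mirror URL, and the docker:28-dind image come from the description.

```go
package main

import (
	"fmt"
	"slices"
)

// registryMirrorURL is hardcoded, matching the choice described in this PR
// to not expose it as a Helm value.
const registryMirrorURL = "https://mirror.gcr.io"

// Container is a hypothetical, stripped-down stand-in for a Kubernetes
// container spec; the real controller presumably uses the Kubernetes API types.
type Container struct {
	Name       string
	Image      string
	Args       []string
	Privileged bool
}

// buildDindSidecar returns the privileged dind sidecar with the mirror flag.
// Arguments that start with "-" are assumed to be forwarded to dockerd by the
// image entrypoint.
func buildDindSidecar() Container {
	return Container{
		Name:       "dind",
		Image:      "docker:28-dind",
		Privileged: true,
		Args:       []string{"--registry-mirror=" + registryMirrorURL},
	}
}

// hasRegistryMirror reports whether the container passes the mirror flag.
func hasRegistryMirror(c Container) bool {
	return slices.Contains(c.Args, "--registry-mirror="+registryMirrorURL)
}

func main() {
	c := buildDindSidecar()
	fmt.Println(c.Image, c.Args, hasRegistryMirror(c))
}
```

A check like hasRegistryMirror is roughly the kind of assertion a pod-template unit test could make about the sidecar.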

Why

We recently enabled Docker support in the Tuist runners. Linux runner Pods run a privileged docker:dind sidecar, and every microVM on a bare-metal host NATs through that host’s single egress IP. Docker Hub rate-limits by IP (100 pulls / 6h unauthenticated), so the whole host shares one budget and CI jobs started failing with toomanyrequests:

buildx failed with: toomanyrequests: You have reached your unauthenticated pull rate limit.

The sidecar’s dockerd had no registry mirror and no authentication, so every docker pull / FROM went straight to registry-1.docker.io from the shared IP.
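To make the shared-budget problem concrete, here is a rough back-of-the-envelope sketch. The 100 pulls per 6h figure is the one quoted above; the microVM count and pulls-per-job values in main are hypothetical and only illustrate how quickly one IP’s budget is exhausted.

```go
package main

import "fmt"

// hubPullsPerWindow is the unauthenticated Docker Hub limit quoted above:
// 100 pulls per 6-hour window, counted per source IP.
const hubPullsPerWindow = 100

// perVMBudget returns how many pulls each microVM gets per window if the
// host's single-IP budget were split evenly across its microVMs.
func perVMBudget(microVMs int) int {
	if microVMs <= 0 {
		return 0
	}
	return hubPullsPerWindow / microVMs
}

// jobsUntilLimit returns how many jobs the whole host can run per window
// before the shared IP hits the limit, given an average pull count per job.
func jobsUntilLimit(pullsPerJob int) int {
	if pullsPerJob <= 0 {
		return 0
	}
	return hubPullsPerWindow / pullsPerJob
}

func main() {
	// Hypothetical numbers: 20 microVMs per host, 4 image pulls per job.
	fmt.Println(perVMBudget(20))   // 5
	fmt.Println(jobsUntilLimit(4)) // 25
}
```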

Why the fix takes this shape

mirror.gcr.io is the lowest-effort durable mitigation that needs no infrastructure and no secret: it’s a transparent pull-through cache for Docker Hub. Google’s mirror absorbs cache misses on its own backend, so the runner’s IP never contacts Docker Hub for docker.io images; dockerd falls back to Docker Hub directly only if the mirror itself is unreachable (fail-safe). The mirror URL is hardcoded rather than exposed as a Helm knob for two reasons. First, there’s no per-environment branch point today. Second, a new controller flag would risk version skew: if the chart passes a flag that the deployed controller binary doesn’t define, flag.Parse fails and calls os.Exit(2), and the controller crash-loops. The flag here goes to dockerd inside the docker:28-dind sidecar, which already supports --registry-mirror, so there’s no skew.
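The flag.Parse failure mode mentioned above can be reproduced with Go’s standard flag package. This is a sketch, not the controller’s actual flag set: parseControllerArgs and the metrics-addr flag are hypothetical. It uses flag.ContinueOnError so the error can be inspected; the default flag.CommandLine uses flag.ExitOnError, which calls os.Exit(2) on the same error.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseControllerArgs mimics an older controller binary that only defines the
// flags it knows about. "metrics-addr" is a hypothetical existing flag.
func parseControllerArgs(args []string) error {
	fs := flag.NewFlagSet("controller", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the demo output
	fs.String("metrics-addr", ":8080", "hypothetical existing flag")
	return fs.Parse(args)
}

func main() {
	// A flag the binary defines parses cleanly.
	fmt.Println(parseControllerArgs([]string{"--metrics-addr=:9090"}))

	// A flag the binary does not define is a parse error. With the default
	// ExitOnError policy this would terminate the process with status 2.
	fmt.Println(parseControllerArgs([]string{"--registry-mirror=https://mirror.gcr.io"}))
}
```

Passing the flag to dockerd in the sidecar sidesteps this, because the controller’s own flag set stays unchanged.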

Why the temporary ubuntu-latest moves

Chicken-and-egg: the fix only takes effect once a new controller image is built and the chart redeploys, but those builds run on the rate-limited tuist-linux fleet and pull base images from Docker Hub through the same un-mirrored dind, so they fail with the very error the change fixes. The production deploy’s Build server image job already hit this. ubuntu-latest pulls via GitHub’s rotating IP pool instead of the fleet’s shared egress IP. The ubuntu-latest runner has 4 vCPU / 16 GB (the same memory as tuist-linux-large), so the Elixir release compile won’t OOM.

Rollout

  1. Merge -> controller image builds on ubuntu-latest, lands in ghcr.
  2. Deploy (production cascade): Build server image now runs on ubuntu-latest, helm rolls the mirrored controller.
  3. Verify on a fresh runner pod: docker info in the dind sidecar lists mirror.gcr.io under Registry Mirrors and a docker pull succeeds. (Existing pods are unaffected; the mirror only applies to newly created pods.)
  4. Revert the three runs-on changes back to tuist-linux / tuist-linux-large once the mirror is fleet-wide. Each carries a TEMPORARY: comment; find them with git grep -n "TEMPORARY: " .github/workflows.
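The verification in step 3 could be scripted along these lines. This sketch assumes the mirror list is read from the daemon as JSON, for example with docker info --format '{{json .RegistryConfig.Mirrors}}' run inside the dind sidecar; mirrorConfigured is a hypothetical helper and the sample input in main is illustrative, not captured output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// mirrorConfigured reports whether wantHost appears among the registry
// mirrors in mirrorsJSON, which is expected to be a JSON array of URLs
// (or null when no mirror is configured).
func mirrorConfigured(mirrorsJSON []byte, wantHost string) (bool, error) {
	var mirrors []string
	if err := json.Unmarshal(mirrorsJSON, &mirrors); err != nil {
		return false, fmt.Errorf("parse mirrors: %w", err)
	}
	for _, m := range mirrors {
		// Compare hosts so a trailing slash or path does not matter.
		u, err := url.Parse(m)
		if err == nil && u.Host == wantHost {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Illustrative sample input for a pod created after the rollout.
	sample := []byte(`["https://mirror.gcr.io/"]`)
	ok, err := mirrorConfigured(sample, "mirror.gcr.io")
	fmt.Println(ok, err)
}
```

Running it against a pod created before the rollout should report false, since the mirror only applies to newly created pods.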

Scope / follow-ups (not in this PR)

  • The mirror covers the dind dockerd layer (the job’s own pulls) only. The host containerd pull of docker:28-dind and the runner image from Docker Hub is a separate dependency on the same shared IP, bounded by per-node image caching. A host-level containerd registry mirror closes it.
  • Durable fix is a self-hosted pull-through cache backed by our own object storage, removing the dependency on the Google-operated, unauthenticated mirror.gcr.io.
  • Watch ubuntu-latest root-disk headroom (~14 GB free) for the heavy server image build; if it ENOSPCs, add a disk-reclaim step or switch that one job to a buildkit-level mirror.

Validation

  • go build ./... and go test ./internal/podtemplate/ pass (new TestBuild_LinuxDindUsesRegistryMirror asserts the flag is wired into the sidecar).
  • Both deploy workflow YAMLs and the controller-image workflow YAML parse.
Comments
tuist-atlas[bot] Jun 5, 2026

The Docker Hub registry mirror fix (using mirror.gcr.io to avoid rate limits) is now available in runners-controller@0.8.0. Update to this version to enable the registry mirror in runner dind sidecars.