Summary
Redesigns the release pipeline so it never commits to main and so a fleet/runtime image (runner-image, linux-runner-image, controllers, Kura runtime, xcresult-processor) is built and deployed with the server in one pass — replacing the old “deploy server → release commits an image pin → server redeploys” double cascade.
Opened as a draft: this touches the most critical CI in the repo and can only be fully exercised once it is on main, so it needs careful review first.
Root cause (one diagnosis, both problems)
The deployed version of every image except the server itself lived as a committed pin in the managed Helm values (xcresultProcessor, capi/macosFleet, runnersController, kuraRuntime in values-managed-common.yaml; runnerImageSemver + runnersFleetLinux.shapeRunnerImage in the env overlays; the hetzner controller in the mgmt manifest). release.yml sed-rewrote those pins after building the images and pushed the bump to main. Because server-production-deployment.yml watched infra/helm/tuist/**, that bump commit re-triggered the whole canary → acceptance → production cascade.
So the committed pin was simultaneously (a) the commit to main and (b) the cause of the second deploy. The server image already avoided this — server.image.tag: "" resolved at deploy via --set from the commit SHA. This PR extends that model to every image.
While implementing, the committed pins turned out to have drifted stale from the released tags (runnersController pinned 0.11.0 but the latest tag is 0.12.0; kuraRuntime 0.7.1 vs 0.7.4; xcresultProcessor 0.12.3 vs 0.12.4; runnerImageSemver 0.3.1 vs 0.3.2). Resolving from the tag removes that drift class entirely. Consequence to expect: the first deploy after merge rolls these forward to the latest already-released versions.
What changed
Phase 1 — deploy-time tag resolution
- Managed values pins set to "". server-deployment.yml resolves each version from the latest <component>@<semver> git tag (--merged the deployed SHA, so re-promoting an old commit resolves versions as they were) and passes them via --set.
- release.yml tags every fleet/runtime image it builds (tag-infra-releases).
- The [Release] commit-and-push step and all sed-into-values pin-backs are removed.
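As a rough illustration of the deploy-time resolution described above, here is a minimal sketch that picks the highest <component>@<semver> version from a list of tag names. It is not the code in server-deployment.yml: the function name `latest_version`, the exact tag prefix, and the `DEPLOY_SHA` variable are assumptions for illustration, and `sort -V` assumes GNU coreutils or a recent BSD sort.

```shell
#!/bin/sh
# Hypothetical helper: read tag names on stdin, print the highest X.Y.Z
# version carried by a "<component>@X.Y.Z" tag. Prints nothing if no tag matches.
latest_version() {
  component="$1"
  # Keep only exact "<component>@X.Y.Z" tags, strip the prefix,
  # then version-sort and take the last (highest) entry.
  grep -E "^${component}@[0-9]+\.[0-9]+\.[0-9]+$" \
    | sed "s/^${component}@//" \
    | sort -V \
    | tail -n 1
}

# Possible use in a deploy step (DEPLOY_SHA is an assumed variable name):
#   version="$(git tag --merged "$DEPLOY_SHA" | latest_version xcresult-processor)"
# Restricting the candidates to tags merged into the deployed SHA is what makes
# re-promoting an old commit resolve the versions that existed at that commit.
```

The resolved value would then be handed to Helm with --set, in the same way server.image.tag is already resolved from the commit SHA.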
Phase 2 — Sparkle appcast off main
SUFeedURL repointed from raw.githubusercontent.com/.../main/app/appcast.xml to a stable GitHub Release asset (releases/download/appcast/appcast.xml); the app release publishes the signed feed there (make_latest: false). The feed is a published artifact, not committed source. (Has a one-time post-merge migration — see below.)
Phase 3 — versions stamped, not committed
- Each release job stamps its version into the build artifact before building (Constants.swift, Project.swift, Cargo.toml, mix.exs, Chart.yaml), so nothing needs committing back. The dead mise.toml/mise.lock mutation steps are removed.
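To make "stamped, not committed" concrete, here is a minimal sketch of stamping a version into a Cargo.toml in the working tree just before a build. The function name `stamp_cargo_version` and the awk approach are assumptions for illustration; the release jobs may stamp their files differently.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the first top-level `version = "..."` line of a
# Cargo.toml in place. The change lives only in the build workspace and is
# never committed back.
stamp_cargo_version() {
  file="$1"
  version="$2"
  tmp="$(mktemp)"
  awk -v v="$version" '
    # Replace only the first line that starts with "version =".
    !stamped && /^version[[:space:]]*=/ {
      print "version = \"" v "\""
      stamped = 1
      next
    }
    { print }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}
```

Because only lines that begin with `version` are considered, dependency entries such as `foo = { version = "1.0" }` are left untouched.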
Phase 4 — Renovate owns the in-source consumer pins
renovate.json: the tuist mise pin runs at any time (overriding a new weekly default for everything else) and auto-merges; the hetzner mgmt-manifest image and the dev.tuist Gradle plugin consumer pins (CLI Constants.swift, android settings.gradle.kts) are Renovate-tracked. These land via auto-merged PRs, not CI pushes — so they survive branch protection.
Deploy orchestration (how the pieces fit)
server-production-deployment.yml is the deploy. It triggers on push (server-relevant paths) and workflow_dispatch, both routed through a leading gate job: gate runs release:check and, on push, defers (skips the cascade) when a fleet/runtime image is releasing this push — otherwise it deploys now. It also resolves the deploy SHA and detects hotfix (commit subject on push, input on dispatch). Its own concurrency group keeps it off release.yml’s build queue and serializes cascades.
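The gate's decision can be sketched as a small function. This is a simplified model under stated assumptions, not the workflow's actual script: the name `gate_decision` and its two string arguments are hypothetical, and hotfix detection and deploy-SHA resolution are left out.

```shell
#!/bin/sh
# Hypothetical model of the gate job.
#   $1: triggering event ("push" or "workflow_dispatch")
#   $2: "true" if a fleet/runtime image is releasing on this push
# Prints "defer" when release.yml will dispatch the deploy later, else "deploy".
gate_decision() {
  event="$1"
  image_releasing="$2"
  if [ "$event" = "push" ] && [ "$image_releasing" = "true" ]; then
    # The deploy has to wait for the image build and tag, so skip the cascade here.
    echo "defer"
  else
    echo "deploy"
  fi
}
```

A dispatched run always deploys in this model, which matches the pairing described here: the push-triggered run defers, and the run dispatched by release.yml performs the one deploy.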
release.yml owns server/cache/kura/noora/helm/skills/gradle + the infra images, and dispatches server-production-deployment.yml (trigger-deploy) only when a fleet/runtime image released this run — that deploy must wait for the image build+tag, and gate defers to it. So no push deploys twice or is skipped.
cli-release.yml / app-release.yml — CLI and app build+publish in their own workflows with their own concurrency lanes, so a fleet-image deploy never queues behind a ~50-min CLI/app build. Each self-serializes its own component’s releases for version correctness.
notify-deploy-success.yml notifies Slack only when the run’s production deploy job (… / Deploy to tuist) actually succeeded, so a gate deferral (a green run that didn’t deploy) doesn’t false-fire.
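The "only notify when the deploy job actually succeeded" check can be sketched as a filter over a run's job results. The function name `deploy_succeeded` and the tab-separated input format are assumptions for illustration, as is the `gh` invocation in the comment; the real workflow may obtain job conclusions another way.

```shell
#!/bin/sh
# Hypothetical check: read "<job name><TAB><conclusion>" lines on stdin and
# succeed only if a job whose name ends in " / Deploy to tuist" concluded
# with "success". A green run that contains only the gate job fails the check.
deploy_succeeded() {
  awk -F '\t' '
    $1 ~ / \/ Deploy to tuist$/ && $2 == "success" { found = 1 }
    END { exit found ? 0 : 1 }
  '
}

# One assumed way to feed it (RUN_ID is a placeholder):
#   gh run view "$RUN_ID" --json jobs --jq '.jobs[] | [.name, .conclusion] | @tsv' \
#     | deploy_succeeded && echo "notify Slack"
```

Keying on the job's conclusion rather than the run's is what keeps a gate deferral from triggering a notification.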
Why dispatch, not a needs: edge: a deploy inside release.yml would either let cancel-in-progress SIGKILL an in-flight helm --atomic deploy, or (cancel off) serialize all builds behind every deploy. A separate dispatched run is immune to release-pipeline cancellation and preserves server-deployment.yml’s per-env serialization. release.yml itself is queue-not-cancel (serialize for version correctness, but never tear an in-flight release/deploy).
Post-merge steps (cannot be done in this PR)
1. Sparkle appcast cutover
The new-feed code is in this PR, but the migration only takes effect at the first app release after merge: installed apps have the old SUFeedURL baked in and keep polling raw.githubusercontent.com/.../main/app/appcast.xml until they update to a build carrying the new URL. In order:
- First app release after merge: app-release.yml builds the app (new SUFeedURL baked in) and publishes the signed feed to the appcast GitHub Release. The appcast release + feed don’t exist until this runs (the feed’s EdDSA signature is produced by that build).
- Bridge the old feed (mandatory, one time): that same cutover build must also be advertised in the old committed app/appcast.xml, or existing installs never see a new-URL build and are stranded. Generate + commit it once (same invocation the workflow uses, signed with the Sparkle private key from 1Password):
generate_appcast --link https://github.com/tuist/tuist/releases \
--download-url-prefix https://github.com/tuist/tuist/releases/download/<app@version>/Tuist.dmg \
-o app/appcast.xml app/build/artifacts --ed-key-file -
- Freeze: never update app/appcast.xml again. New releases publish only to the appcast release. An old install that checks later still finds the cutover build in the frozen feed → updates to it → follows the new feed thereafter (stepped update, no permanent stranding).
- Then protect main (the frozen file just sits there; no further CI commits to it).
2. Protect main
After confirming a real run no longer pushes to main, apply a ruleset on main — restrict creations/updates to PRs, block force pushes, require the conventional-pr + relevant checks. Tags bypass branch protection by design; Renovate/human PRs auto-merge through it. Repo setting, not code.
3. Renovate dry-run
renovate.json passes renovate-config-validator (schema). Before relying on auto-merge, confirm via a hosted Renovate dry-run that the mise manager resolves the tuist depName and the Gradle-plugin-portal datasource maps as configured.
Validation
- All the touched workflows (release.yml, cli-release.yml, app-release.yml, server-deployment.yml, server-production-deployment.yml, notify-deploy-success.yml) parse as YAML and pass actionlint (only pre-existing self-hosted-label/shellcheck noise).
- renovate.json validates with renovate-config-validator.
- All seven <component>@<semver> tags exist, so the first deploy resolves real versions.
- Not runnable end-to-end pre-merge: release.yml triggers on push to main, and the dispatched deploy uses main’s workflow version. This is inherent to release/deploy-workflow changes, and the main reason this is a draft.
🤖 Generated with Claude Code