
fix(cli): keep shard plan reference stable across CI re-runs

GitHub issue · Closed

Metadata
Source: tuist/tuist #11284
Updated: Jun 24, 2026
Domains: CLI
Details

What changed

Test sharding derives a shard plan reference for each CI run. On GitHub Actions the reference was built as github-{runId}-{attemptNumber} from GITHUB_RUN_ID and GITHUB_RUN_ATTEMPT. This change drops the attempt counter so the reference is just github-{runId}, which is stable across the build job, every shard execution job, and partial re-runs.
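As a rough illustration of the before and after, the derivation can be sketched in Python. This is not the Tuist CLI source (which is Swift): the function names `shard_reference_before` and `shard_reference_after` are hypothetical, and only the two environment variables and the two reference formats are taken from the description above.

```python
from typing import Mapping, Optional


def shard_reference_before(env: Mapping[str, str]) -> Optional[str]:
    """Old shape (hypothetical sketch): run ID plus attempt counter."""
    run_id = env.get("GITHUB_RUN_ID")
    attempt = env.get("GITHUB_RUN_ATTEMPT")
    if run_id is None or attempt is None:
        return None
    return f"github-{run_id}-{attempt}"


def shard_reference_after(env: Mapping[str, str]) -> Optional[str]:
    """New shape (hypothetical sketch): run ID only, attempt ignored."""
    run_id = env.get("GITHUB_RUN_ID")
    if run_id is None:
        return None
    return f"github-{run_id}"


# Example environment for a job that is on its second attempt.
rerun_env = {"GITHUB_RUN_ID": "123456789", "GITHUB_RUN_ATTEMPT": "2"}
print(shard_reference_before(rerun_env))  # github-123456789-2
print(shard_reference_after(rerun_env))   # github-123456789
```

With the old shape, the value changes whenever the attempt number changes; with the new shape, every job of the run derives the same string regardless of attempt.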

Why

Re-running only the failed jobs of a sharded workflow failed with:

The shard plan was not found.

Sharding has two phases:

  • Build-for-testing creates the shard plan on the server, keyed by the reference.
  • Shard execution (--without-building --shard-index N) looks the plan up by the same reference.

On GitHub’s “Re-run failed jobs”, the build phase is skipped (it reuses the successful attempt-1 output), so the plan stays registered as github-{runId}-1. The shard execution job, however, re-runs with GITHUB_RUN_ATTEMPT=2, so the CLI derived github-{runId}-2 and the server returned 404 because that plan was never created. The test-products archive still downloaded fine because it is keyed only by run_id, which is why only the plan lookup failed.

“Re-run all jobs” worked because the build phase re-runs and recreates the plan at attempt 2.
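The failure sequence can be reproduced with a small in-memory model. Everything in it is a stand-in: `FakePlanServer`, `old_reference`, and `simulate` are hypothetical names, the shard count is arbitrary, and the real server answers with an HTTP 404 rather than raising a Python exception.

```python
class PlanNotFound(Exception):
    """Stands in for the server's 404 response."""


class FakePlanServer:
    """In-memory stand-in for the server side of sharding (hypothetical)."""

    def __init__(self):
        self._plans = {}

    def create_plan(self, reference, shard_count):
        # Build-for-testing phase: register the plan under the reference.
        self._plans[reference] = shard_count

    def get_plan(self, reference):
        # Shard execution phase: look the plan up by the same reference.
        if reference not in self._plans:
            raise PlanNotFound("The shard plan was not found.")
        return self._plans[reference]


def old_reference(run_id, attempt):
    """Pre-fix reference format, including the attempt counter."""
    return f"github-{run_id}-{attempt}"


def simulate(rerun_build):
    """Model attempt 2 of a run; `rerun_build` is True for "Re-run all jobs"."""
    server = FakePlanServer()
    run_id = "123456789"
    # Attempt 1: the build job creates the plan.
    server.create_plan(old_reference(run_id, 1), shard_count=4)
    # Attempt 2: the build job only runs again on a full re-run.
    if rerun_build:
        server.create_plan(old_reference(run_id, 2), shard_count=4)
    # Attempt 2: the shard job derives its reference with attempt=2.
    try:
        server.get_plan(old_reference(run_id, 2))
        return "found"
    except PlanNotFound as error:
        return str(error)


print(simulate(rerun_build=False))  # The shard plan was not found.
print(simulate(rerun_build=True))   # found
```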

Root cause

Including the attempt counter in the reference breaks the invariant that the build job and its downstream shard jobs must agree on the reference. The intent — that the reference binds across all jobs of a run — is already documented for CircleCI, where we deliberately use the workflow-stable CIRCLE_WORKFLOW_ID. GITHUB_RUN_ID is GitHub’s equivalent stable-across-jobs identifier; appending the attempt number defeated it.

Why this solution

Removing the attempt counter is the minimal fix and is safe for the full-rerun path:

  • Re-run failed jobs (build skipped): execution resolves github-{runId} → the plan created by the original build. Fixed.
  • Re-run all jobs (build re-runs): a fresh plan is inserted under the same github-{runId} reference, and the server resolves a reference to the latest plan by inserted_at, so the new plan wins. Unchanged behavior.
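The "latest plan wins" behavior in the second case can be sketched as follows. This is an assumption-laden model, not the server's actual query: `ShardPlan`, its `label` field, and `resolve` are hypothetical, and the timestamps are made up for the example. Only the rule itself (resolve a reference to the plan with the latest inserted_at) comes from the text above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class ShardPlan:
    reference: str
    inserted_at: datetime
    label: str  # hypothetical field, used here only to tell plans apart


def resolve(plans: Iterable[ShardPlan], reference: str) -> Optional[ShardPlan]:
    """Return the most recently inserted plan for `reference`, if any."""
    matching = [plan for plan in plans if plan.reference == reference]
    if not matching:
        return None
    return max(matching, key=lambda plan: plan.inserted_at)


# Two builds of the same run register plans under the same reference.
plans = [
    ShardPlan("github-123456789",
              datetime(2026, 6, 15, 10, 0, tzinfo=timezone.utc),
              "original build"),
    ShardPlan("github-123456789",
              datetime(2026, 6, 15, 11, 0, tzinfo=timezone.utc),
              "full re-run build"),
]

print(resolve(plans, "github-123456789").label)  # full re-run build
```

If only the first plan exists (the "Re-run failed jobs" case), the same lookup returns the original build's plan, which is the behavior the fix relies on.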

No server change is required. It also matches the temporary workaround customers can apply today (TUIST_SHARD_REFERENCE: github-${{ github.run_id }}-1), which stays forward-compatible — once this ships, the explicit override can be dropped and the auto-derived reference behaves identically.
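To see why the workaround stays forward-compatible, here is a minimal sketch that assumes an explicitly set TUIST_SHARD_REFERENCE takes precedence over the auto-derived value. That precedence is inferred from the word "override" above; `effective_reference` and `job_env` are hypothetical helpers, not Tuist APIs.

```python
from typing import Dict, Mapping, Optional


def effective_reference(env: Mapping[str, str]) -> Optional[str]:
    """Assumed precedence: explicit override first, then the derived value."""
    override = env.get("TUIST_SHARD_REFERENCE")
    if override:
        return override
    run_id = env.get("GITHUB_RUN_ID")
    return f"github-{run_id}" if run_id else None


def job_env(attempt: int, with_workaround: bool) -> Dict[str, str]:
    """Build an example job environment for run 123456789."""
    env = {"GITHUB_RUN_ID": "123456789", "GITHUB_RUN_ATTEMPT": str(attempt)}
    if with_workaround:
        # What github-${{ github.run_id }}-1 expands to for this run.
        env["TUIST_SHARD_REFERENCE"] = "github-123456789-1"
    return env


# With the workaround, attempts 1 and 2 agree on the pinned reference.
print(effective_reference(job_env(1, with_workaround=True)))   # github-123456789-1
print(effective_reference(job_env(2, with_workaround=True)))   # github-123456789-1

# Without it, the post-fix derivation is also attempt-independent.
print(effective_reference(job_env(1, with_workaround=False)))  # github-123456789
print(effective_reference(job_env(2, with_workaround=False)))  # github-123456789
```

In both configurations the build job and the shard jobs compute the same string on every attempt. The literal value differs, but the agreement between jobs is what matters.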

attemptNumber is kept on CIInfo as legitimate CI metadata; only its use in shardReference was removed.

Validation

  • Added a regression test asserting the reference stays github-123456789 even with GITHUB_RUN_ATTEMPT=2.
  • TuistCITests/CIControllerTests — 12/12 pass against the generated project.
  • Lint clean for the changed files.
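
The regression test described in the first bullet lives in the Swift test suite. A Python analogue of the same assertion might look like the sketch below, where `shard_reference` is a hypothetical stand-in for the fixed derivation rather than the code under test in TuistCITests.

```python
def shard_reference(env):
    """Hypothetical stand-in for the fixed derivation: run ID only."""
    run_id = env.get("GITHUB_RUN_ID")
    return f"github-{run_id}" if run_id else None


def test_shard_reference_ignores_run_attempt():
    first = shard_reference(
        {"GITHUB_RUN_ID": "123456789", "GITHUB_RUN_ATTEMPT": "1"}
    )
    rerun = shard_reference(
        {"GITHUB_RUN_ID": "123456789", "GITHUB_RUN_ATTEMPT": "2"}
    )
    # The reference must not pick up the attempt counter on a re-run.
    assert rerun == "github-123456789"
    # Build (attempt 1) and re-run shard (attempt 2) must agree.
    assert first == rerun
```
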
Comments
tuist[bot] Jun 15, 2026

🛠️ Tuist Run Report 🛠️

Tests 🧪
Scheme                Status  Cache hit rate  Tests  Skipped  Ran   Commit
TuistAcceptanceTests          0 %             0      0        0     feefa0b77
TuistUnitTests                91 %            2971   5        2966  feefa0b77
Flaky Tests ⚠️
  • TuistUnitTests: 3 flaky tests
Test case                                        Module                     Suite
parseTestStatuses_returnsPassingModuleNames()    TuistXCResultServiceTests  XCResultServiceTests
parseTestStatuses_returnsCorrectStatuses()       TuistXCResultServiceTests  XCResultServiceTests
parseTestStatuses_extractsModuleAndSuiteNames()  TuistXCResultServiceTests  XCResultServiceTests
Builds 🔨
Scheme                Status  Duration  Commit
TuistAcceptanceTests          1m 4s     feefa0b77
TuistUnitTests                2m 29s    feefa0b77
tuist-atlas[bot] Jun 16, 2026

The fix from this pull request is now available in Tuist CLI 4.200.3. The shard plan reference is now stable across CI re-runs on GitHub Actions. Update to get this fix.