
iOS app bundle-size check flaps ±5 MB from binary-cache coverage (phantom regressions)

GitHub issue · Open

Metadata
Source: tuist/tuist #11488
Updated: Jun 25, 2026
Domains: Cache
Details

Summary

The iOS app bundle-size check is non-deterministic. It measures the Tuist.app .ipa with tuist inspect bundle in the Device Build job of app.yml and tracks the result on tuist.dev. For the same source, the reported install size swings between ~19.5 MB and ~25–28 MB across CI runs. This surfaces as phantom “+N MB” regressions on PRs that don’t touch the app at all, e.g. a CLI-only PR flagged “+5.3 MB (+27%)”.

Root cause

The swing is entirely the binary/module cache, not the source.

Since #11290 (library products became cacheable) and #11281 (XCTest / Swift Testing support frameworks became cacheable), the app links those newly-cacheable targets as prebuilt binaries when they’re warm in the cache, and compiles them from source when they’re not. Linking them as prebuilt static/dynamic-library binaries makes the app ~5 MB larger in total than compiling the same code in-graph. So the measured size depends on cache hit-coverage at build time, which varies run-to-run, rather than on anything in the diff.

This is a coverage problem, not a hash-collision bug: the content-hash cache key is deterministic (the hashing chain is sorted end-to-end — GraphContentHasher, TargetContentHasher, SettingsContentHasher, DependenciesContentHasher), and the EE TargetsToCacheBinariesGraphMapper matches artifacts by hash+name. The variance is in which of those targets happen to be warm.
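
To illustrate why a sorted hashing chain gives a deterministic key, here is a minimal Python sketch. It is not Tuist's implementation: `content_hash`, the input layout, and the choice of SHA-256 are all assumptions made for illustration. It only shows that sorting inputs before hashing makes the key independent of enumeration order.

```python
import hashlib


def content_hash(files: dict[str, bytes], dependency_hashes: list[str]) -> str:
    """Hypothetical target hash that is stable regardless of input ordering."""
    digest = hashlib.sha256()
    # Sort file paths so filesystem enumeration order cannot change the key.
    for path in sorted(files):
        digest.update(path.encode())
        digest.update(b"\0")
        digest.update(hashlib.sha256(files[path]).digest())
    # Sort dependency hashes for the same reason.
    for dep in sorted(dependency_hashes):
        digest.update(dep.encode())
    return digest.hexdigest()


a = content_hash({"A.swift": b"a", "B.swift": b"b"}, ["h2", "h1"])
b = content_hash({"B.swift": b"b", "A.swift": b"a"}, ["h1", "h2"])
changed = content_hash({"A.swift": b"a2", "B.swift": b"b"}, ["h1", "h2"])

print(a == b)        # True: same inputs, different order
print(a == changed)  # False: one file was edited
```

With a key of this kind, a miss means the artifact for that exact hash has not been warmed yet, not that the key drifted between runs.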

Evidence / investigation

  1. It’s the iOS app bundle, and the whole delta is the binary. A diff of the artifact trees of two bundles shows that Assets.car and every resource are byte-identical; only Tuist.app/Tuist (the Mach-O) changes size.

  2. The app doesn’t link the modules of the PRs being blamed. The gRPC/CAS/REAPI stack is only reachable via .cacheCommand/.bazelCommand, which the app doesn’t depend on — so those PRs cannot be the cause.

  3. Onset matches the caching batch. Bundle history was a stable ~19.5 MB for ~12 days, then started flapping with the 2026-06-16 batch (#11290 / #11281). (It is not the SwifterPM 0.8.12 bump #11308 — that diff is a determinism fix + cosmetic output change.)

  4. Source build is deterministic. Building the app twice locally from source (tuist generate TuistApp --no-binary-cache + xcodebuild -destination 'generic/platform=iOS') produced the same Tuist.app/Tuist size (37,055,664 B both times; identical __TEXT/__DATA).

  5. Re-running with cache on did NOT flap. Two back-to-back Device Build runs on the same commit both landed at ~24.85 MB (+35 KB) — i.e. they share the same warm cache state. The flap is warm-vs-cold, not per-run jitter.

  6. Cold-vs-warm CI proof. Temporarily adding --no-binary-cache to the Device Build’s generate step and pushing:

    Build   Cache   Install (bytes)           Download (bytes)
    Warm    on      24,883,855 (23.7 MiB)     17,302,170
    Cold    off     19,661,043 (18.7 MiB)     14,802,368
    Δ               +5,222,812 (~5.0 MiB)     +2,499,802

    The cold size lands on the historical low baseline, and the delta matches the reported “+5.3 MB / +2.6 MB” almost exactly.
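
The arithmetic behind the warm-vs-cold comparison can be re-checked with a short Python sketch. The byte counts are copied from the table above; the variable names are illustrative only.

```python
MIB = 1024 * 1024

warm = {"install": 24_883_855, "download": 17_302_170}  # binary cache on
cold = {"install": 19_661_043, "download": 14_802_368}  # --no-binary-cache

# Per-metric difference between the warm and cold builds, in bytes.
delta = {key: warm[key] - cold[key] for key in warm}

print(delta["install"], delta["download"])
# 5222812 2499802
print(round(delta["install"] / MIB, 2), round(delta["install"] / 1e6, 2))
# 4.98 5.22
print(round(100 * delta["install"] / cold["install"], 1))
# 26.6
```

The install delta is about 5.0 MiB (5.22 MB), or roughly 27% over the cold size, in line with the “+5.3 MB (+27%)” flagged on the CLI-only PR.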

Impact

  • Phantom “+N MB” regressions on unrelated PRs, which erodes trust in the bundle-size signal and can scare reviewers / authors.
  • The metric currently can’t distinguish a real size regression from cache-coverage noise.

Proposed fixes (increasing effort)

  1. Make the measured .ipa cache-invariant — build it with --no-binary-cache in the Device Build job. Cheapest, and proven reproducible. The measured number would track the true (cold) size.
  2. Guarantee deterministic full cache warming before the bundle build, so coverage is always 100% (consistent, but reports the inflated warm size).
  3. Make the cached library / test-support artifacts reproducible so partial coverage no longer changes the linked size — the deepest, real fix.

Option 1 is the quickest way to stop the phantom regressions while the underlying reproducibility is sorted out.

Comments
esnunes Jun 25, 2026

Update: the swing breaks down into two layers (~5 MB external + ~3 MB first-party)

The module-cache targets for three generations of the app (pulled from the dashboard data) pin down exactly what moves. The whole swing is the main Mach-O; assets/resources are byte-identical. A cache-hit (remote) target is linked as a prebuilt static framework; a cache-miss target is compiled from source and dead-stripped, so it links smaller. Bundle size therefore tracks how many targets were cache-warm at tuist generate time:

Coverage                        Install     What’s prebuilt
0 cached (--no-binary-cache)    19.66 MB    nothing (all source)
135 cached                      24.85 MB    external deps only
145 cached                      27.88 MB    external + first-party

~5 MB layer (19.66 → 24.85): the 135 third-party deps (ArgumentParser, SwiftSyntax, Nuke, …). These are stable (only change on dep bumps), so they’re almost always warm and contribute a near-constant +5 MB. They only go cold on a cache invalidation/regression.

~3 MB layer (24.85 → 27.88): exactly 10 first-party Tuist* modules: TuistCore, TuistServer, TuistAuthentication, TuistAutomation, TuistPreviews, TuistOnboarding, TuistProfile, TuistErrorHandling, TuistMenuBar, TuistXCActivityLog. This is the intermittent part. Their content hash changes on nearly every main commit (e.g. TuistCore’s buildable_folders subhash differed between the two generations, giving a different cache_hash, so one lookup was a miss and the other a remote hit), so whether a build links the prebuilt binary depends purely on whether cache warming already ran for that exact hash.
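
The two layers can be derived directly from the coverage table. The sketch below is a simplified model, not something Tuist computes: `expected_install_mb` is a hypothetical helper, and it assumes each layer is either fully warm or fully cold, ignoring partial coverage within a layer.

```python
# Install sizes (MB) keyed by cached-target count, taken from the table above.
install_mb = {0: 19.66, 135: 24.85, 145: 27.88}

external_layer = install_mb[135] - install_mb[0]       # third-party deps prebuilt
first_party_layer = install_mb[145] - install_mb[135]  # 10 Tuist* modules prebuilt


def expected_install_mb(external_warm: bool, first_party_warm: bool) -> float:
    """Hypothetical two-layer model of install size by cache state."""
    size = install_mb[0]
    if external_warm:
        size += external_layer
    if first_party_warm:
        size += first_party_layer
    return round(size, 2)


print(round(external_layer, 2), round(first_party_layer, 2))
# 5.19 3.03
print(expected_install_mb(True, False), expected_install_mb(True, True))
# 24.85 27.88
```

Under this model, a PR whose reported size jumps by about 3 MB with no app changes is most plausibly a first-party cache-state difference between the two builds being compared.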

Why caching inflates the binary

The inflation is not library-evolution overhead. BUILD_LIBRARY_FOR_DISTRIBUTION is only set for ProjectDescription/ProjectAutomation in Module.swift; TuistCore and the rest fall into the default branch with it off. The residual difference is inherent cross-module optimization loss: compiled with the app, a module is inlined/specialized across the boundary and the remainder dead-stripped; as a prebuilt static framework it’s compiled in isolation, so more of its code survives into the linked binary. That part is only recoverable by building from source, not by post-hoc dead-stripping (the static framework already goes through the app’s -dead_strip).

Implication for the fix

  • tuist generate --cache-profile only-external stops caching the volatile first-party modules. It removes the ±3 MB flap and keeps third-party build speed, and those first-party hashes barely hit the cache anyway (they churn every commit).
  • --no-binary-cache on the measured/shipped build yields the true ~19.66 MB size, deterministic but slower.

Net: ~5 MB is stable external-cache inflation, ~3 MB is the first-party-cache flap. Both are “prebuilt framework links larger than source,” not a real code regression, which is why unrelated PRs get blamed.