
feat(server): add account-scoped artifact retention jobs

GitHub issue · Closed

Metadata
Source: tuist/tuist #10983
Updated: Jun 24, 2026
Domains: Storage
Details

Resolves N/A

Add account-scoped object storage retention jobs so Tuist can reduce S3/Tigris usage without changing analytics metadata retention.

This PR adds application-level retention for tenant-owned binary artifacts in object storage:

  • Adds a plan-aware retention policy for previews, build archives, test attachments, shard bundles, and cache artifacts.
  • Adds a hosted-only daily scheduler that paginates accounts with follow-up scheduler jobs so retries resume at page boundaries.
  • Adds one idempotent deletion worker per DB-backed artifact family: previews, build archives, test attachments, and shard bundles.
  • Adds server-owned cache S3 retention workers for Xcode cache, Xcode module cache, and Gradle cache objects. These list S3 as the source of truth and delete only expired objects matching cache key shapes. Registry artifacts are excluded.
  • Keeps DB-backed artifact selection bounded by small batches per account.
  • Adds PostgreSQL/ClickHouse indexes and projections for the DB-backed retention access pattern, so workers avoid broad table scans as usage grows.
  • Uses S3 bulk deletion in parallel 1000-key chunks with bounded concurrency and explicit S3 response checks.

Buckets and artifacts

Tuist application bucket, or the account custom S3 bucket when fully configured
  • Configuration: global bucket from TUIST_S3_BUCKET_NAME / secrets through Environment.s3_bucket_name/0; custom account bucket through Account.s3_bucket_name
  • Artifacts stored: app preview bundles, preview icons, build archives, test run attachments, shard bundles
  • Object key families: <account>/<project>/previews/<app_build_id>.zip, <account>/<project>/previews/<app_build_id>.apk, <account>/<project>/previews/<preview_id>/icon.png, <account>/<project>/builds/<build_id>/build.zip, <account>/<project>/tests/runs/<test_run_id>/attachments/<attachment_id>/<file_name>, legacy <account>/<project>/tests/test-case-runs/<test_case_run_id>/attachments/<attachment_id>/<file_name>, <account_id>/<project_id>/shards/<shard_plan_id>/bundle.zip
  • Retention source: database inserted_at for the matching preview, build, test attachment, or shard plan row

Xcode cache bucket
  • Configuration: TUIST_CACHE_XCODE_S3_BUCKET_NAME, S3_XCODE_CACHE_BUCKET, or cache secrets; falls back to the shared cache bucket when no dedicated Xcode bucket is configured
  • Artifacts stored: Xcode compilation cache artifacts
  • Object key families: <account>/<project>/xcode/...
  • Retention source: S3 last_modified

Shared cache bucket
  • Configuration: TUIST_CACHE_S3_BUCKET_NAME, S3_BUCKET, or cache secrets
  • Artifacts stored: Xcode module cache artifacts and Gradle cache artifacts
  • Object key families: <account>/<project>/module/..., <account>/<project>/gradle/...
  • Retention source: S3 last_modified

Registry artifacts are not matched by the cache cleanup key filters and are not deleted by these workers.
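
As an illustration of the key-shape filtering described above, here is a minimal Python sketch. It is not the server's Elixir code: the regular expression and the name `cache_family` are assumptions made for this example, and the real key shapes may well be stricter.

```python
import re
from typing import Optional

# Hypothetical pattern for <account>/<project>/<family>/<rest>, where <family>
# is one of the three cache families named above. Any other key is not matched.
CACHE_KEY = re.compile(
    r"^(?P<account>[^/]+)/(?P<project>[^/]+)/(?P<family>xcode|module|gradle)/(?P<rest>.+)$"
)


def cache_family(key: str) -> Optional[str]:
    """Return the cache family of a key, or None when the key is not a cache artifact."""
    match = CACHE_KEY.match(key)
    return match.group("family") if match else None
```

A key that falls outside these three families returns None, so a worker built on such a filter would leave it alone.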

Retention policy

The active or trialing account subscription determines the plan. Accounts without an active subscription fall back to Air. Air and Open Source share the same windows.

Artifact family | Air / Open Source | Pro | Enterprise
Cache artifacts, including Xcode compilation cache, Xcode module cache, and Gradle cache | 14 days | 30 days | 90 days
App preview bundles and preview icons | 60 days | 180 days | 365 days
Build archives | 30 days | 90 days | 365 days
Test run attachments | 30 days | 90 days | 365 days
Shard bundles | 7 days | 14 days | 30 days
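
The policy above can be read as a lookup from (artifact family, plan) to a window in days, from which a cutoff timestamp follows. The Python sketch below only illustrates that reading. The day counts come from the table; the family and plan identifiers and the function name `retention_cutoff` are assumptions, not names taken from the PR.

```python
from datetime import datetime, timedelta
from typing import Optional

# Retention windows in days, copied from the policy table.
# The dictionary keys are hypothetical identifiers chosen for this sketch.
RETENTION_DAYS = {
    "cache": {"air": 14, "open_source": 14, "pro": 30, "enterprise": 90},
    "preview": {"air": 60, "open_source": 60, "pro": 180, "enterprise": 365},
    "build_archive": {"air": 30, "open_source": 30, "pro": 90, "enterprise": 365},
    "test_attachment": {"air": 30, "open_source": 30, "pro": 90, "enterprise": 365},
    "shard_bundle": {"air": 7, "open_source": 7, "pro": 14, "enterprise": 30},
}


def retention_cutoff(family: str, plan: Optional[str], now: datetime) -> datetime:
    """Return the timestamp before which artifacts of this family count as expired.

    A plan of None stands for an account without an active or trialing
    subscription, which falls back to Air.
    """
    days = RETENTION_DAYS[family][plan or "air"]
    return now - timedelta(days=days)
```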

Cleanup strategy

DB-backed artifacts are cleaned from the hosted-only Oban cron at 02:30 daily. ScheduleExpiredArtifactsWorker pages accounts by id with a default page size of 500, bulk inserts per-account deletion jobs, and schedules a continuation job when another account page remains. Each per-account worker selects one keyset-paginated batch ordered by (inserted_at, id), computes object keys, deletes the blobs, and self-enqueues the next cursor when the batch was full. This lets a run walk the backlog without repeatedly selecting the oldest expired rows.
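
The keyset pagination described above can be sketched as follows. This is an in-memory Python illustration, not the PR's Ecto query: the list of rows stands in for a query of the form `WHERE inserted_at < cutoff AND (inserted_at, id) > cursor ORDER BY inserted_at, id LIMIT batch_size`, and the names `Row`, `next_batch`, and `drain` are made up for the example.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional, Tuple


class Row(NamedTuple):
    inserted_at: datetime
    id: int


Cursor = Tuple[datetime, int]


def next_batch(
    rows: List[Row], cutoff: datetime, cursor: Optional[Cursor], batch_size: int
) -> Tuple[List[Row], Optional[Cursor]]:
    """Select one batch of expired rows ordered by (inserted_at, id).

    Returns the batch and the cursor for the follow-up job, or None as the
    cursor when the batch was not full and the chain can stop.
    """
    expired = sorted(r for r in rows if r.inserted_at < cutoff)
    if cursor is not None:
        expired = [r for r in expired if (r.inserted_at, r.id) > cursor]
    batch = expired[:batch_size]
    if batch and len(batch) == batch_size:
        return batch, (batch[-1].inserted_at, batch[-1].id)
    return batch, None


def drain(rows: List[Row], cutoff: datetime, batch_size: int) -> List[Row]:
    """Follow the cursor chain until a batch comes back short, like self-enqueued jobs."""
    seen: List[Row] = []
    cursor: Optional[Cursor] = None
    while True:
        batch, cursor = next_batch(rows, cutoff, cursor, batch_size)
        seen.extend(batch)
        if cursor is None:
            return seen
```

Because each batch starts strictly after the previous cursor, a chain visits every expired row once, including rows that share the same inserted_at.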

Cache artifacts are cleaned by hosted-only Oban crons at 03:00 for Xcode cache, 03:15 for Xcode module cache, and 03:30 for Gradle cache. Each worker lists one S3 page, filters by the expected key shape, batch-loads the matching account plans for that page, compares the object last_modified timestamp to the plan cutoff, deletes expired keys, and self-enqueues the next S3 continuation token when the bucket page is truncated. Objects whose account handle no longer resolves to a known account are skipped instead of defaulting to the shortest plan window.
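
One page of that cache cleanup might look like the Python sketch below, offered as an illustration under stated assumptions rather than the worker's actual code. The body field names `is_truncated` and `next_continuation_token`, the `S3Object` type, and both function names are assumptions; key-shape filtering is left out for brevity. The truncation flag is accepted as either a boolean or the text "true" because, as a review comment further down notes, the parsed value can arrive as a string.

```python
from datetime import datetime, timedelta
from typing import Any, List, Mapping, NamedTuple, Optional


class S3Object(NamedTuple):
    key: str
    last_modified: datetime


def expired_keys(
    page: List[S3Object], retention_days_by_account: Mapping[str, int], now: datetime
) -> List[str]:
    """Return the keys on this page whose last_modified is older than the plan cutoff.

    retention_days_by_account maps an account handle to its retention window.
    Handles missing from the map are unknown accounts and are skipped, rather
    than being treated as the shortest plan window.
    """
    expired = []
    for obj in page:
        account = obj.key.split("/", 1)[0]
        days = retention_days_by_account.get(account)
        if days is None:
            continue
        if obj.last_modified < now - timedelta(days=days):
            expired.append(obj.key)
    return expired


def next_token(body: Mapping[str, Any]) -> Optional[str]:
    """Return the continuation token when the page is truncated, else None."""
    flag = body.get("is_truncated")
    truncated = flag is True or (isinstance(flag, str) and flag.strip().lower() == "true")
    token = body.get("next_continuation_token")
    return token if truncated and token else None
```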

Cleanup deletes only binary blobs from object storage. Metadata rows such as previews, build runs, test runs, test attachments, and shard plans remain available for analytics, dashboards, and data exports. Tuist does not persist a per-artifact purge ledger; retention status is derived from the artifact timestamp and the current account plan at cleanup time.

All object deletion uses S3 multi-object deletion in chunks of 1000 keys with bounded concurrency. Deletion errors are returned to Oban so failed jobs retry instead of silently succeeding.
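
A rough Python sketch of that deletion path follows, assuming a caller-supplied `send_delete` function that performs one multi-object delete and returns the XML response body. That function, the default concurrency of 4, and the helper names are assumptions for illustration. The 1000-key chunk size matches the S3 DeleteObjects limit, and the body is inspected because an HTTP 200 response can still list per-key Error entries, as a review comment below points out.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

MAX_KEYS_PER_REQUEST = 1000  # S3 DeleteObjects accepts at most 1000 keys per call


def chunked(keys: Sequence[str], size: int) -> List[List[str]]:
    """Split keys into consecutive chunks of at most `size` items."""
    return [list(keys[i : i + size]) for i in range(0, len(keys), size)]


def failed_keys(delete_result_xml: str) -> List[str]:
    """Return the keys listed under <Error> elements in a DeleteObjects response body."""
    failed = []
    for child in ET.fromstring(delete_result_xml):
        # Compare local tag names so the check works with or without an XML namespace.
        if child.tag.rsplit("}", 1)[-1] != "Error":
            continue
        for field in child:
            if field.tag.rsplit("}", 1)[-1] == "Key":
                failed.append(field.text or "")
    return failed


def delete_all(
    keys: Sequence[str],
    send_delete: Callable[[List[str]], str],
    max_concurrency: int = 4,
) -> None:
    """Delete keys in 1000-key chunks with bounded concurrency.

    Raises when any key failed, so a job runner can retry instead of
    recording a silent success.
    """
    chunks = chunked(keys, MAX_KEYS_PER_REQUEST)
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        bodies = list(pool.map(send_delete, chunks))
    failed = [key for body in bodies for key in failed_keys(body)]
    if failed:
        raise RuntimeError(f"{len(failed)} objects were not deleted")
```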

How to test locally

  • mix format lib/tuist/environment.ex lib/tuist/storage.ex lib/tuist/shards.ex lib/tuist/oban/runtime_config.ex lib/tuist/storage/retention_policy.ex lib/tuist/storage/expired_artifacts.ex lib/tuist/storage/cache_artifact_retention.ex lib/tuist/storage/workers/delete_expired_build_archives_worker.ex lib/tuist/storage/workers/delete_expired_preview_artifacts_worker.ex lib/tuist/storage/workers/delete_expired_shard_bundles_worker.ex lib/tuist/storage/workers/delete_expired_test_attachments_worker.ex lib/tuist/storage/workers/delete_expired_xcode_cache_artifacts_worker.ex lib/tuist/storage/workers/delete_expired_xcode_module_cache_artifacts_worker.ex lib/tuist/storage/workers/delete_expired_gradle_cache_artifacts_worker.ex lib/tuist/storage/workers/schedule_expired_artifacts_worker.ex priv/repo/migrations/20260529100000_add_artifact_retention_indexes.exs priv/ingest_repo/migrations/20260529100000_add_artifact_retention_projections.exs test/test_helper.exs test/tuist/storage_test.exs test/tuist/oban/runtime_config_test.exs test/tuist/storage/retention_policy_test.exs test/tuist/storage/cache_artifact_retention_test.exs test/tuist/storage/workers/delete_expired_artifact_workers_test.exs test/tuist/storage/workers/delete_expired_cache_artifact_workers_test.exs test/tuist/storage/workers/schedule_expired_artifacts_worker_test.exs
  • mix compile --warnings-as-errors
  • mix test test/tuist/storage/cache_artifact_retention_test.exs test/tuist/storage/retention_policy_test.exs
  • mise run security
  • aube install --frozen-lockfile
  • Attempted earlier: mix test test/tuist/storage/workers/delete_expired_cache_artifact_workers_test.exs test/tuist/storage/workers/delete_expired_artifact_workers_test.exs test/tuist/storage/workers/schedule_expired_artifacts_worker_test.exs test/tuist/storage/cache_artifact_retention_test.exs. The local test DB failed before any tests ran with ERROR 42P01 (undefined_table): relation "users" does not exist.
Comments
fortmarek Jun 2, 2026

Review findings:

  • [P1] server/lib/tuist/storage/cache_artifact_retention.ex:131 prevents cache cleanup from continuing past the first S3 page. ExAws parses IsTruncated as the XML text value ("true"/"false"), not a boolean, so this comparison never returns the next token for real S3 responses. Once a bucket has more than page_size objects, the worker deletes only page one and never enqueues the continuation job, leaving later expired cache artifacts untouched indefinitely. Please handle both the ExAws string shape and any mocked boolean shape in tests.

  • [P1] server/lib/tuist/storage.ex:307 treats a 2xx multi-object delete response as full success. S3 DeleteObjects can return HTTP 200 while including per-key <Error> entries in the response body, so this can return :ok even when some objects were not deleted. The retention workers then advance their cursor/continuation token and do not retry those failed deletes. Please inspect the parsed body for delete errors before returning success.

  • [P2] server/lib/tuist/storage/expired_artifacts.ex:125 skips legacy test attachments forever. The storage helper still supports the pre-test_run_id path (tests/test-case-runs/<test_case_run_id>/attachments/...) for attachments created before test_run_id was added, but this query requires attachment.test_run_id to be non-null and joins only through test_runs. Those older rows can be older than the retention window but will never be selected, so their blobs remain in object storage. Please add a legacy selection path, or otherwise include those rows when deriving the deletion key.
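
To make the last finding concrete, the Python sketch below derives a deletion key for both the current and the legacy attachment path. The two key shapes are the ones listed in the PR description; the `Attachment` type and the function name `attachment_key` are hypothetical, and how the real query would reach the legacy rows is not shown.

```python
from typing import NamedTuple, Optional


class Attachment(NamedTuple):
    id: str
    file_name: str
    test_run_id: Optional[str]
    test_case_run_id: Optional[str]


def attachment_key(account: str, project: str, attachment: Attachment) -> Optional[str]:
    """Derive the object key, falling back to the legacy path when test_run_id is null."""
    prefix = f"{account}/{project}/tests"
    suffix = f"attachments/{attachment.id}/{attachment.file_name}"
    if attachment.test_run_id is not None:
        return f"{prefix}/runs/{attachment.test_run_id}/{suffix}"
    if attachment.test_case_run_id is not None:
        # Rows created before test_run_id existed live under the legacy prefix.
        return f"{prefix}/test-case-runs/{attachment.test_case_run_id}/{suffix}"
    return None
```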

fortmarek Jun 2, 2026

Additional review finding:

  • [P2] The DB-backed artifact workers will re-attempt deletion for the full expired history on every scheduled run. Each worker selects rows with inserted_at < cutoff, deletes the derived object keys, and advances only within that worker chain via cursor. On the next daily scheduler run, there is no durable purged_at marker or deletion ledger, and the metadata rows are intentionally retained, so the same years-old rows still match inserted_at < cutoff and are selected again. S3 delete is idempotent, so this may not break correctness, but it means the jobs can keep issuing delete calls for already-deleted blobs forever, which works against the storage-cost-reduction goal and can create unnecessary S3/API load as historical metadata grows. We may want to use S3 as the source of truth for these artifact families too, or otherwise persist blob deletion state so the DB-backed cleanup can actually drain old work.
fortmarek Jun 2, 2026

[P2] Keep artifact retention cursors monotonic

The new durable cursor is written with on_conflict: {:replace, ...}, so any worker that finishes later can overwrite a newer cursor with an older (after_inserted_at, after_id). The deletion workers are unique by account_id plus the cursor args, not by account/artifact alone, so a fresh scheduled no-cursor run can coexist with a cursor-followup chain for the same account/type when a backlog spans schedule windows or when multiple server pods consume :storage_retention. In that case one chain can advance the cursor, then an older duplicate page can finish and move it backwards, causing the next scheduled run to re-delete objects that were already purged. Please make the cursor update monotonic (only advance when the incoming cursor is greater than the stored cursor) and/or serialize per account/artifact. Using S3 as the source of truth, or keeping per-object purge state, would also avoid relying on a DB cursor that can regress.

Relevant code: server/lib/tuist/storage/expired_artifacts.ex:292 and worker uniqueness such as server/lib/tuist/storage/workers/delete_expired_build_archives_worker.ex:4.

pepicrft Jun 2, 2026

Follow-up for https://github.com/tuist/tuist/pull/10983#issuecomment-4602655137: addressed in c4e03befaf.

The persisted artifact retention cursor now updates through a guarded ON CONFLICT DO UPDATE that only applies when the incoming (after_inserted_at, after_id) tuple is ahead of the stored one. If a stale worker page finishes after a newer page, the conflict update is skipped and treated as a successful no-op.
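
For readers unfamiliar with the pattern, a guarded upsert of this kind can be sketched with Python's built-in sqlite3 module standing in for PostgreSQL. The table name, column types, and helper names below are assumptions for illustration and are not taken from commit c4e03befaf. Timestamps are stored as ISO-8601 text in one fixed format so that string comparison matches chronological order.

```python
import sqlite3
from typing import Optional, Tuple

# Hypothetical schema: one cursor row per (account, artifact type).
SCHEMA = """
CREATE TABLE artifact_retention_cursors (
  account_id INTEGER NOT NULL,
  artifact_type TEXT NOT NULL,
  after_inserted_at TEXT NOT NULL,
  after_id INTEGER NOT NULL,
  PRIMARY KEY (account_id, artifact_type)
)
"""

# The WHERE clause applies the update only when the incoming tuple is ahead of
# the stored one. Otherwise the statement succeeds without changing the row.
UPSERT = """
INSERT INTO artifact_retention_cursors
  (account_id, artifact_type, after_inserted_at, after_id)
VALUES (?, ?, ?, ?)
ON CONFLICT (account_id, artifact_type) DO UPDATE SET
  after_inserted_at = excluded.after_inserted_at,
  after_id = excluded.after_id
WHERE excluded.after_inserted_at > artifact_retention_cursors.after_inserted_at
   OR (excluded.after_inserted_at = artifact_retention_cursors.after_inserted_at
       AND excluded.after_id > artifact_retention_cursors.after_id)
"""


def advance_cursor(
    conn: sqlite3.Connection, account_id: int, artifact_type: str, inserted_at: str, row_id: int
) -> None:
    """Store the cursor, never moving it backwards."""
    conn.execute(UPSERT, (account_id, artifact_type, inserted_at, row_id))


def read_cursor(
    conn: sqlite3.Connection, account_id: int, artifact_type: str
) -> Optional[Tuple[str, int]]:
    """Return the stored (after_inserted_at, after_id) pair, or None if absent."""
    return conn.execute(
        "SELECT after_inserted_at, after_id FROM artifact_retention_cursors "
        "WHERE account_id = ? AND artifact_type = ?",
        (account_id, artifact_type),
    ).fetchone()
```
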

I also added a worker regression test that seeds a newer cursor, runs a stale page, and asserts the cursor stays on the newer row.

Verification: mix test test/tuist/storage/artifact_retention_cursor_test.exs test/tuist/storage/workers/delete_expired_artifact_workers_test.exs.

fortmarek Jun 2, 2026

[P2] Don’t delete preview-level icons from app-build retention

delete_previews/3 selects expired app_builds and deletes both the app build object and AppBuilds.icon_storage_key(... preview_id ...). The icon key is preview-scoped (previews/<preview_id>/icon.png), while previews can be reused and have multiple app builds: multipart_start calls find_or_create_preview/1 and then creates another AppBuild under the same preview. That means an old app build can age past the retention window and cause this worker to delete the preview icon even when the same preview still has newer, retained app builds. The binary for the fresh app build survives, but the active preview loses its icon. Please only delete the icon when the preview itself is no longer retained / has no non-expired app builds, or track icon retention separately.
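
One way to express the first suggested rule, sketched in Python under the assumption that a preview's icon may go only when every app build under that preview has expired. The `AppBuild` type and the function name are hypothetical, and the real fix may track icon retention differently.

```python
from datetime import datetime
from typing import Dict, Iterable, NamedTuple, Set


class AppBuild(NamedTuple):
    id: str
    preview_id: str
    inserted_at: datetime


def previews_with_deletable_icons(app_builds: Iterable[AppBuild], cutoff: datetime) -> Set[str]:
    """Return the preview ids whose newest app build is older than the cutoff."""
    newest: Dict[str, datetime] = {}
    for build in app_builds:
        current = newest.get(build.preview_id)
        if current is None or build.inserted_at > current:
            newest[build.preview_id] = build.inserted_at
    return {preview_id for preview_id, ts in newest.items() if ts < cutoff}
```

A preview with one old and one recent app build is left out of the result, so its icon survives while the old build's binary can still be deleted.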

Relevant code: server/lib/tuist/storage/expired_artifacts.ex where each expired app build adds icon_storage_key, and server/lib/tuist_web/controllers/api/previews_controller.ex where previews are reused before creating new app builds.

tuist-atlas[bot] Jun 4, 2026

Account-scoped artifact retention jobs are now available in server@1.205.0. Update to this version to enable plan-aware retention for previews, build archives, test attachments, shard bundles, and cache artifacts.

tuist-atlas[bot] Jun 5, 2026

The changes from this PR are now available in release xcresult-processor-image@0.11.0. Account-scoped artifact retention jobs are now live.