
feat(server): exclude unvalidated test cases from flaky-test alert triggers

GitHub issue · Closed

Source: tuist/tuist #11009
Updated: Jun 24, 2026

What changed

Flaky-test alert evaluation now filters its triggered set down to test cases that have at least one successful (status=success, is_flaky=false) run on the project’s default_branch. Test cases that have never been validated on the default branch are excluded from the triggered set entirely, so trigger actions (mute / skip via change_state, labels, Slack) never fire for them.

  • Tuist.Tests.test_case_ids_with_successful_default_branch_run/3 — new batched ClickHouse query returning the subset of given ids with a successful, non-flaky run on the default branch.
  • Tuist.Automations.Workers.AlertEvaluationWorker — applies the gate via reject_unvalidated_test_cases/2 before the baseline/transition split, so both baseline establishment and ongoing transitions honor it. Recovery (unmuting) is deliberately left unfiltered.

Why it changed

Brand-new tests that have never had a successful run on the default branch were being auto-muted on the very pull request that introduced them. When a PR’s early commits produced failing/flaky runs, the branch-blind flakiness monitor tripped its low threshold and quarantined those tests. The selective-tests step then reported every selected test as muted and exited without running any of them, so the PR merged green. The breakage only surfaced on the next default-branch run, where the tests were exercised for the first time and hung.

Root cause

The flakiness monitor reads a branch-blind daily-stats materialized view (TestCaseRunDailyStatsPerCase has no branch dimension), so flakiness accrued on a PR branch counts toward a test’s flakiness even though the test has never landed. Combined with a low flaky threshold, a never-merged test can be auto-muted before it has ever passed.
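The effect of a branch-blind aggregate can be sketched in a few lines of Python. This is an illustration only: the row shape, the threshold value, and the function name are assumptions, not the schema of TestCaseRunDailyStatsPerCase or the monitor's actual configuration.

```python
from collections import defaultdict

# Hypothetical run rows: (test_case_id, branch, is_flaky).
runs = [
    ("new_test", "feature/x", True),
    ("new_test", "feature/x", True),
    ("new_test", "feature/x", False),
    ("old_test", "main", False),
    ("old_test", "main", False),
]

FLAKY_RUN_THRESHOLD = 2  # hypothetical "low threshold"


def flaky_counts_branch_blind(rows):
    """Count flaky runs per test case, ignoring which branch they ran on."""
    counts = defaultdict(int)
    for test_case_id, _branch, is_flaky in rows:
        if is_flaky:
            counts[test_case_id] += 1
    return dict(counts)


counts = flaky_counts_branch_blind(runs)
triggered = sorted(t for t, n in counts.items() if n >= FLAKY_RUN_THRESHOLD)
print(triggered)
```

Here `new_test` has never run on `main`, yet its two flaky PR-branch runs are enough to cross the threshold, which is the situation the gate is meant to prevent.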

Why this approach

Rather than special-casing the change_state action, the gate excludes unvalidated tests from the triggered set at a single chokepoint in the worker. The predicate “≥1 successful non-flaky run on the default branch” covers both the brand-new-test case and the merged-broken case, and being all-time (not windowed) keeps long-established tests eligible. The accepted tradeoff: tests that are genuinely flaky but have only ever run on PR branches get no alert signal (label / Slack) until they merge and accrue a passing default-branch run, at which point they re-enter evaluation naturally.
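To illustrate why an all-time predicate matters, the Python sketch below compares it with a windowed variant. The 30-day window, the row shape, and the `validated` helper are hypothetical and exist only for this comparison.

```python
from datetime import date, timedelta

today = date(2026, 6, 1)

# Hypothetical rows: (test_case_id, branch, status, is_flaky, ran_on).
runs = [
    ("long_established", "main", "success", False, today - timedelta(days=90)),
    ("long_established", "main", "success", True, today - timedelta(days=2)),
    ("brand_new", "feature/x", "failure", False, today - timedelta(days=1)),
]


def validated(rows, default_branch, since=None):
    """Ids with a successful, non-flaky default-branch run, optionally windowed."""
    return {
        tid
        for tid, branch, status, is_flaky, ran_on in rows
        if branch == default_branch
        and status == "success"
        and not is_flaky
        and (since is None or ran_on >= since)
    }


all_time = validated(runs, "main")
windowed = validated(runs, "main", since=today - timedelta(days=30))
print(all_time, windowed)
```

Under the all-time predicate, the long-established test stays eligible for alerts even though its only recent default-branch run was flaky. The windowed variant would drop it from evaluation exactly when it starts misbehaving. The brand-new test is excluded either way.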

How to test locally

  • mix test test/tuist/automations/workers/alert_evaluation_worker_test.exs — covers the new “default-branch validation gate” describe block (skips trigger actions for unvalidated test cases, fires only for validated tests in a mixed set, excludes unvalidated tests from baseline establishment).
  • mix test test/tuist/tests_test.exs — covers test_case_ids_with_successful_default_branch_run/3 (validated vs. PR-only / failing-on-main / flaky-success-on-main rows, and the empty-input case).

🤖 Generated with Claude Code

Comments
tuist-atlas[bot] Jun 3, 2026

The fix to exclude unvalidated test cases from flaky-test alert triggers is now available in server@1.204.0. Update to use this change.

tuist-atlas[bot] Jun 5, 2026

The changes from this PR are now available in release xcresult-processor-image@0.11.0. Flaky-test alert triggers now exclude unvalidated test cases.