feat(server): exclude unvalidated test cases from flaky-test alert triggers
GitHub issue · Closed
What changed
Flaky-test alert evaluation now filters its triggered set down to test cases that have at least one successful (status=success, is_flaky=false) run on the project’s default_branch. Test cases that have never been validated on the default branch are excluded from the triggered set entirely, so trigger actions (mute / skip via change_state, labels, Slack) never fire for them.
- Tuist.Tests.test_case_ids_with_successful_default_branch_run/3: new batched ClickHouse query that returns the subset of the given ids with a successful, non-flaky run on the default branch.
- Tuist.Automations.Workers.AlertEvaluationWorker: applies the gate via reject_unvalidated_test_cases/2 before the baseline/transition split, so both baseline establishment and ongoing transitions honor it. Recovery (unmuting) is deliberately left unfiltered.
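The gate described above can be pictured with a minimal in-memory sketch. This is not the code from the change: the module name GateSketch, both function names, the run-map shape, and the string status values are assumptions made for illustration, and the real lookup is a ClickHouse query rather than a list filter.

```elixir
defmodule GateSketch do
  @moduledoc "Illustrative only; not the Tuist implementation."

  # Each run is assumed to be a map with :test_case_id, :branch,
  # :status, and :is_flaky keys (hypothetical shape).
  # Returns a MapSet of the given ids that have at least one
  # successful, non-flaky run on the default branch.
  def validated_ids(runs, ids, default_branch) do
    wanted = MapSet.new(ids)

    runs
    |> Enum.filter(fn run ->
      MapSet.member?(wanted, run.test_case_id) and
        run.branch == default_branch and
        run.status == "success" and
        run.is_flaky == false
    end)
    |> MapSet.new(& &1.test_case_id)
  end

  # Keeps only triggered ids that passed validation, preserving order.
  def reject_unvalidated(triggered_ids, validated) do
    Enum.filter(triggered_ids, &MapSet.member?(validated, &1))
  end
end
```

In this sketch, a test seen only on a PR branch, a test whose only default-branch success was flagged flaky, and a test that only failed on the default branch are all dropped from the triggered set, so no trigger action would run for them.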
Why it changed
Brand-new tests that have never had a successful run on the default branch were being auto-muted on the very pull request that introduced them. When a PR’s early commits produced failing/flaky runs, the branch-blind flakiness monitor tripped its low threshold and quarantined those tests. The selective-tests step then reported every selected test as muted and exited without running any of them, so the PR merged green. The breakage only surfaced on the next default-branch run, where the tests were exercised for the first time and hung.
Root cause
The flakiness monitor reads a branch-blind daily-stats materialized view (TestCaseRunDailyStatsPerCase has no branch dimension), so flakiness accrued on a PR branch counts toward a test’s flakiness even though the test has never landed. Combined with a low flaky threshold, a never-merged test can be auto-muted before it has ever passed.
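To make the root cause concrete, here is a small sketch of a branch-blind aggregation. The module BranchBlindSketch, its function names, the run-map shape, and the count-based threshold are all hypothetical; the actual monitor reads a materialized view and its threshold semantics may differ.

```elixir
defmodule BranchBlindSketch do
  @moduledoc "Illustrative only; mimics stats with no branch dimension."

  # Counts flaky runs per test case. The :branch key on each run is
  # never consulted, which is the point of the illustration.
  def flaky_counts(runs) do
    runs
    |> Enum.filter(& &1.is_flaky)
    |> Enum.frequencies_by(& &1.test_case_id)
  end

  # Returns the sorted ids whose flaky-run count meets the threshold
  # (a count-based threshold is an assumption of this sketch).
  def over_threshold(runs, threshold) do
    runs
    |> flaky_counts()
    |> Enum.filter(fn {_id, count} -> count >= threshold end)
    |> Enum.map(fn {id, _count} -> id end)
    |> Enum.sort()
  end
end
```

Because the branch is never part of the grouping, two flaky runs on a PR branch are enough to put a never-merged test over a threshold of 2 in this sketch, which matches the failure mode described above.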
Why this approach
Rather than special-casing the change_state action, the gate excludes unvalidated tests from the triggered set at a single chokepoint in the worker. The predicate “≥1 successful non-flaky run on the default branch” covers both the brand-new-test case and the merged-broken case. Because it looks at all history rather than a time window, long-established tests stay eligible even if their last passing default-branch run was long ago. The accepted tradeoff: genuinely PR-only-flaky tests get no alert signal (label / Slack) until they merge and accrue a passing default-branch run, at which point they re-enter evaluation naturally.
How to test locally
- mix test test/tuist/automations/workers/alert_evaluation_worker_test.exs: covers the new “default-branch validation gate” describe block (skips trigger actions for unvalidated test cases, fires only for validated tests in a mixed set, excludes unvalidated tests from baseline establishment).
- mix test test/tuist/tests_test.exs: covers test_case_ids_with_successful_default_branch_run/3 (validated vs. PR-only / failing-on-main / flaky-success-on-main rows, and the empty-input case).
🤖 Generated with Claude Code