Hive
fix(server): look up existing test cases with a single array param
GitHub issue · Closed
What changed
The existing-test-case lookup in Tuist.Tests.get_existing_test_cases/2 now binds all test-case IDs as a single Array(UUID) ClickHouse parameter, instead of an Ecto id in ^ids clause that expands to one bound parameter per ID. It stays on Ecto primitives: fragment("? IN (?)", tc.id, type(^ids_chunk, {:array, Ecto.UUID})).
Why
Large tuist inspect test reports (for example an 8.8 MB xcresult with thousands of test cases) failed to ingest at POST /api/projects/:account/:project/tests with:
** (Ch.Error) Poco::Exception. Code: 1000, e.code() = 0, HTML Form Exception: Too many form fields
This is the same root cause as the earlier 431 Request Header Fields Too Large failure that #10955 addressed, just shifted to a different limit.
The lookup built an Ecto query with tc.id in ^ids_chunk, which ecto_ch expands into one bound parameter per ID (id IN ({$0:UUID}, {$1:UUID}, ...)).
- Originally those parameters rode in the request URL/headers, so a large batch blew the header size limit (431).
- #10955 added multipart: true. But in multipart mode the Ch driver emits one form field per parameter (param_$0, param_$1, …). A 5,000-ID batch produces about 5,001 form fields, which exceeds ClickHouse’s Poco HTMLForm field-count limit (Too many form fields).
In both cases the failure mode is the same: the parameter count scaled with the number of test cases.
Reproduction
Both failure modes were reproduced locally, before and after the fix.
Form-field count (the reported error). Encoding the lookup the way the Ch driver sends it and counting the multipart form fields:
| IDs | pre-fix (id in ^ids) form fields | fixed (single array param) form fields |
|---|---|---|
| 10 | 12 | 3 |
| 100 | 102 | 3 |
| 1000 | 1002 | 3 |
| 5000 | 5002 | 3 |
The pre-fix lookup sends one form field per ID (5002 for a 5k-case report), which overflows the server’s form-field limit. The fixed lookup is constant at 3 fields regardless of report size. The field-count regression test fails on the pre-fix implementation (its SQL is id IN ({$1:String},{$2:String},...)) and passes on the fix.
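The counts in the table can be reproduced with a small model of the multipart body. This is a minimal sketch, not the Ch driver's actual encoder: the module name, the "query" field name, and the ordering of parameters are assumptions for illustration; only the param_$N naming and the query + project_id + ids breakdown come from the description above.

```elixir
defmodule FormFieldSketch do
  # Hypothetical model: the body carries one "query" field plus one
  # "param_$N" field per bound parameter (project_id first, then the IDs).

  # Pre-fix shape: every ID is its own bound parameter.
  def per_id_fields(ids) do
    params = [:project_id | ids]
    ["query" | param_names(length(params))]
  end

  # Fixed shape: project_id plus a single array parameter holding all IDs.
  def single_array_fields(_ids) do
    ["query" | param_names(2)]
  end

  # count is always >= 2 here, so the ascending range is safe.
  defp param_names(count) do
    Enum.map(0..(count - 1), fn i -> "param_$#{i}" end)
  end
end

for n <- [10, 100, 1000, 5000] do
  ids = Enum.to_list(1..n)
  pre = length(FormFieldSketch.per_id_fields(ids))
  fixed = length(FormFieldSketch.single_array_fields(ids))
  IO.puts("#{n} IDs: pre-fix #{pre} fields, fixed #{fixed} fields")
end
```

Under this model the pre-fix count is always the number of IDs plus 2, while the fixed count does not depend on the input at all.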
Per-field value length (a second, related limit found while probing). Sending the IDs as one array parameter must stay under ClickHouse’s per-field value-length limit. Against a real ClickHouse, a single un-chunked request crosses it between 3,000 and 3,500 UUIDs:
n=3000 => :ok
n=3500 => Ch.Error ... HTML Form Exception: Field value too long
This is why the batch size is capped (see below); at 2,000 IDs the encoded value is about 78 KB, comfortably under the limit.
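The 78 KB figure can be sanity-checked with a rough size estimate. The sketch below assumes the array is encoded as a bracketed, comma-separated list of single-quoted 36-character UUID strings (['…','…']); that encoding is an assumption made for the arithmetic, not something confirmed from the driver, and the module name is hypothetical.

```elixir
defmodule ArrayParamSizeSketch do
  # Assumed encoding: ['uuid','uuid',...] with each UUID as 36 characters.
  @uuid_length 36

  # An empty array is just the two brackets.
  def encoded_bytes(0), do: 2

  def encoded_bytes(n) when n > 0 do
    quoted = @uuid_length + 2   # two single quotes around each UUID
    commas = n - 1              # separators between elements
    2 + n * quoted + commas     # plus the surrounding brackets
  end
end

for n <- [2000, 3000, 3500] do
  IO.puts("n=#{n} => #{ArrayParamSizeSketch.encoded_bytes(n)} bytes")
end
```

Under that assumption, 2,000 UUIDs come to 78,001 bytes, in line with the figure above, while 3,000 and 3,500 come to 117,001 and 136,501 bytes, which would place the observed limit somewhere between those two sizes.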
The fix and why this approach
Binding the IDs as a single array parameter makes the request carry a constant number of form fields (query + project_id + ids) regardless of how many test cases the report contains, so neither the header-size nor the form-field-count limit can be tripped.
ecto_ch renders a pinned list inside a fragment as one array parameter rather than expanding it, and type(^ids_chunk, {:array, Ecto.UUID}) types it as Array(UUID) (the generated SQL is id IN (CAST({$1:Array(String)} AS Array(UUID)))). Staying on Ecto primitives keeps the schema-typed select map, so ecto_ch continues to decode UUID, array, datetime, and boolean columns; there is no raw row mapping or manual UUID conversion.
The obvious alternative, keeping the per-ID IN ^ids clause but shrinking the batch below the form-field limit, was rejected: it would require dozens of sequential round trips per large report and stays fragile against an exact, version-dependent limit, whereas the single-array approach is robust by construction.
Notes:
- multipart: true is retained so the array travels in the request body and can’t blow the header limit either.
- The batch size is lowered from 5,000 to 2,000 so the single array parameter’s encoded value (about 78 KB) stays well under ClickHouse’s per-field value-length limit.
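To illustrate the batching, here is a minimal sketch of how a capped batch size splits a large report into lookups. The module name and the lookup_fun callback are hypothetical stand-ins for the real per-chunk query in Tuist.Tests; only the 2,000 cap comes from the notes above.

```elixir
defmodule ChunkedLookupSketch do
  # Batch size from the notes above.
  @batch_size 2_000

  # `lookup_fun` is a hypothetical stand-in for the per-chunk query; it
  # receives one chunk of IDs and returns the rows found for that chunk.
  def lookup(ids, lookup_fun) do
    ids
    |> Enum.chunk_every(@batch_size)
    |> Enum.flat_map(lookup_fun)
  end
end

# A 5,001-ID report, with a pass-through lookup that logs each chunk size.
ids = Enum.map(1..5_001, fn i -> "id-#{i}" end)

found =
  ChunkedLookupSketch.lookup(ids, fn chunk ->
    IO.puts("lookup with #{length(chunk)} IDs")
    chunk
  end)

IO.puts("rows returned: #{length(found)}")
```

With this cap, a 5,001-case report is split into three lookups of 2,000, 2,000, and 1,001 IDs, each carrying its IDs as one array value.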
Impact
Self-hosted and managed servers can ingest test results from large reports without 500s. No API or schema changes.
How to test locally
- cd server && mix test test/tuist/tests_test.exs (full suite: 223 tests, 0 failures).
- Two regression tests cover the failure modes above:
  - A field-count invariant test captures the lookup’s Ecto query, asserts it binds a single Array(UUID) parameter, and that the encoded multipart body has a constant 3 form fields. It fails on the pre-fix code.
  - An end-to-end test ingests a 5,001-case report through create_test/1 against the real ClickHouse, guarding the value-length limit and exercising row decoding at scale.