feat(server): reason-gate operator access to customer projects
GitHub issue · Closed
Summary
Replaces blanket operator access to customer projects with a per-access, reason-gated, time-boxed grant. An operator who isn’t a member of a customer account is redirected to ops.tuist.dev to justify access, then redirected back with a signed grant that the customer `server/` verifies offline. Read access is self-serve (reason only); admin access (“sign in as admins”) goes through the same Slack JIT approval as kubectl writes. A valid grant also bypasses the customer’s SSO enforcement, which is what makes SSO-enforced orgs reachable.
Built on #10988, which has since merged into main, so this targets main directly.
Why
Today operator access is a static blanket: ops_access returns true whenever user.account.name in ops_user_handles(), granting read of any project’s dashboard, runs, bundles, billing, etc. There is no per-access justification, no time bound, and no audit of why an operator looked at a customer’s data. Separately, require_sso_authentication redirects operators into the customer’s own SSO, which they can’t complete, so SSO-enforced orgs are inaccessible today.
This makes operator access a deliberate, reason-logged, time-boxed, auditable act, and reuses #10988’s JIT machinery (Tailscale role as source of truth, Slack approval, fail-closed) for the admin tier.
How it works
Operator -> tuist.dev/{account}/{project} (not a member; org may enforce SSO)
  server: no grant -> redirect to ops.tuist.dev/grants/new?return_to=...&account=...
  ops (Pomerium / Google OIDC; X-Pomerium-Claim-Email):
    read  -> record grant, mint Ed25519 JWT, 302 back with ?operator_grant=<jwt>
    admin -> Slack approval card; second human approves; then 302 back with the jwt
  server: verify jwt offline (EdDSA-strict, iss/aud pinned, TTL capped),
          pin account_id, store in session, strip the param,
          skip SSO enforcement, authorize via the grant
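The checks the server applies after signature verification can be sketched as follows. This is a minimal illustration in Python, not the PR's Elixir implementation: it assumes the Ed25519 signature has already been verified, and the issuer and audience strings, the TTL ceiling, and the `iat`, `tier`, and `account_id` claim names are all hypothetical.

```python
# Illustrative constants; the real values are not given in the PR description.
EXPECTED_ISSUER = "tuist-ops"
EXPECTED_AUDIENCE = "tuist-server"
MAX_TTL_SECONDS = 3600
ALLOWED_ALGS = {"EdDSA"}


class GrantRejected(Exception):
    """Raised when a grant fails any post-signature check."""


def check_grant(header: dict, claims: dict, now: int) -> dict:
    """Validate the header and claims of a token whose signature is already verified."""
    # Allowlist the algorithm so `none` and HS256 confusion tokens are refused.
    if header.get("alg") not in ALLOWED_ALGS:
        raise GrantRejected("algorithm not allowed")
    # Pin issuer and audience.
    if claims.get("iss") != EXPECTED_ISSUER:
        raise GrantRejected("wrong issuer")
    if claims.get("aud") != EXPECTED_AUDIENCE:
        raise GrantRejected("wrong audience")
    iat, exp = claims.get("iat"), claims.get("exp")
    if not isinstance(iat, int) or not isinstance(exp, int):
        raise GrantRejected("missing iat/exp")
    if exp <= now:
        raise GrantRejected("expired")
    # Cap the lifetime regardless of what the issuer asked for.
    if exp - iat > MAX_TTL_SECONDS:
        raise GrantRejected("ttl over ceiling")
    if not isinstance(claims.get("account_id"), int):
        raise GrantRejected("missing account_id")
    # What the session would store: the pinned account, the tier, the expiry.
    return {
        "account_id": claims["account_id"],
        "tier": claims.get("tier", "read"),
        "exp": exp,
    }
```

Enforcing the TTL ceiling on the verifying side means a misconfigured issuer cannot hand out long-lived grants.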
What changed
tuist-ops (the grant issuer)
- `lib/tuist_ops/project_access/` new feature module: `request.ex` + `grant.ex` (schemas), `approvals.ex` (read = self-serve grant inline; admin = pending until Slack approval), `policy.ex` (read self-serve / admin-approver gates, reusing `JIT.TailscaleClient`), `token.ex` (mints the Ed25519 grant), `slack_blocks.ex`.
- `grant_controller.ex` + router `/grants/*`: the reason form, identity from `X-Pomerium-Claim-Email`. The admin “pending approval” page is a plain HTML poll of `/grants/:id/status`, not LiveView, since tuist-ops has no asset pipeline.
- `slack_controller.ex`: `pa_approve` / `pa_deny` interactive actions.
- Migration `create_project_access_tables`, `jose` dep, new env config.
server (offline verification + enforcement)
- `lib/tuist_web/operator_grant.ex`: `verify/1` (`JOSE.JWT.verify_strict(_, ["EdDSA"], _)` only, rejecting `none` / `HS256` confusion tokens; `exp`, max-TTL ceiling, `iss` / `aud` pinning), the `accept_operator_grant` handoff plug, the `load_operator_grant` plug + on_mount, and the `redirect_to_ops_if_operator` gate.
- Authorization: split the shared `ops_access` check. The internal `/ops` panel keeps its static behavior via the new `internal_ops_access`; customer-data objects move to a grant-scoped `ops_access` / `ops_write_access` (admin tier required for writes), resolving each object to its account via `object_account_id/1`.
- SSO bypass in `require_sso_authentication` when the operator holds a valid grant for that account.
- `ops_user_handles` removed. Operator eligibility is now `Tuist.Accounts.tuist_operator?/1` (confirmed `@tuist.dev` email), and the redirect gate additionally requires a Google-authenticated session (the “google sso identity under the @tuist.dev org” signal), so password and test sessions stay on the normal path.
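The grant-scoped authorization split can be pictured with a small sketch. It is written in Python for illustration only (the PR's code is Elixir); the `Grant`, `Account`, `Project`, and `Run` shapes and their field names are hypothetical stand-ins for the real schemas, and only the function names mirror the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Grant:
    account_id: int   # the account the grant is pinned to
    tier: str         # "read" or "admin" (assumed labels)
    expires_at: int   # unix seconds


@dataclass(frozen=True)
class Account:
    id: int


@dataclass(frozen=True)
class Project:
    account_id: int


@dataclass(frozen=True)
class Run:
    project: Project


def object_account_id(obj) -> Optional[int]:
    """Resolve any customer-data object to the account that owns it."""
    if isinstance(obj, Account):
        return obj.id
    if isinstance(obj, Project):
        return obj.account_id
    if isinstance(obj, Run):
        return obj.project.account_id
    return None  # unknown shapes resolve to no account, so access is denied


def ops_access(grant: Optional[Grant], obj, now: int) -> bool:
    """Read access: a live grant pinned to the object's account."""
    if grant is None or grant.expires_at <= now:
        return False
    account_id = object_account_id(obj)
    return account_id is not None and account_id == grant.account_id


def ops_write_access(grant: Optional[Grant], obj, now: int) -> bool:
    """Write access: same scoping as reads, plus the admin tier."""
    return ops_access(grant, obj, now) and grant.tier == "admin"
```

Resolving every object to an account first keeps the policy itself to one comparison, and any object shape the resolver does not recognize is denied by default.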
infra / docs
- `PROJECT_ACCESS_SIGNING_KEY` added to the tuist-ops ESO.
- `infra/k8s/operator-project-access-audit.md`: the three-trail audit story plus the deployment runbook.
Reasoning behind the non-obvious choices
- Signed grant, not a server->ops call. The server verifies offline with only the public key, so operator access and the customer-facing app stay on separate failure domains (the same decoupling #10988 is built on). Asymmetric, so even a compromised server can’t mint grants. Revocation is short TTL + keypair rotation.
- Google-auth gate on the redirect. The eligibility heuristic is a routing decision, not the security boundary (the boundaries are Pomerium/Google-OIDC at ops and the offline verification). Requiring `auth_method == :google` is the faithful reading of “google sso identity” and keeps the many `@tuist.dev` test fixtures on the normal path.
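As a rough illustration of the redirect gate as a pure routing decision, here is a Python sketch. The `Session` shape, its field names, and the `is_member` / `has_valid_grant` inputs are hypothetical; the actual gate is the Elixir `redirect_to_ops_if_operator` plug, whose exact conditions may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Session:
    email: str
    email_confirmed: bool
    auth_method: str  # e.g. "google" or "password" (assumed labels)


def tuist_operator(session: Session) -> bool:
    """Eligibility heuristic: a confirmed @tuist.dev email."""
    return session.email_confirmed and session.email.lower().endswith("@tuist.dev")


def should_redirect_to_ops(
    session: Optional[Session], is_member: bool, has_valid_grant: bool
) -> bool:
    """Decide whether to send the request to the ops reason form."""
    if session is None:
        return False  # unauthenticated users follow the normal login path
    if is_member or has_valid_grant:
        return False  # nothing to justify: access is already established
    # Only Google-authenticated operators are routed to ops; password and
    # test sessions with @tuist.dev emails stay on the normal path.
    return tuist_operator(session) and session.auth_method == "google"
```

A wrong answer here only changes where the request is routed. It never grants access, since that still requires a grant that verifies offline.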
Security notes for the reviewer
- `X-Pomerium-Claim-Email` is the operator identity for `/grants/*` and is spoofable on a raw ingress. Those routes must be Pomerium-fronted (which #10988 deferred); they are deliberately not added to the unprotected public ingress here. Called out in the runbook.
- The `?operator_grant=` token is stripped via redirect before any page renders or any observability plug logs the query string.
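The token handoff amounts to pulling the parameter out of the URL and redirecting to the cleaned location. A minimal Python sketch of that step follows; the helper name is hypothetical, and the PR's `accept_operator_grant` plug does this in Elixir.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_operator_grant(url: str, param: str = "operator_grant") -> Tuple[Optional[str], str]:
    """Return (token, clean_url) with the grant parameter removed from the query."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Take the first occurrence of the parameter, if any.
    token = next((value for key, value in pairs if key == param), None)
    # Keep every other query parameter in its original order.
    kept = [(key, value) for key, value in pairs if key != param]
    clean_url = urlunsplit(parts._replace(query=urlencode(kept)))
    return token, clean_url
```

The caller would verify the token, store the resulting grant in the session, and answer with a redirect to `clean_url`, so the token never reaches a rendered page or a logged query string.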
Deployment (manual, in the runbook)
- Generate an Ed25519 keypair. Private PEM -> 1P `TUIST_OPS_BOT.project_access_signing_key`; public PEM -> server `TUIST_OPERATOR_GRANT_PUBLIC_KEY`.
- Stand up Pomerium in front of `ops.tuist.dev/grants/*` (Google OIDC), carving out `/webhooks/slack/*`.
Validation
- tuist-ops: 131 tests pass (new suites for `policy`, `approvals`, `token`); `mix format` + `mix credo` clean; compiles with `--warnings-as-errors`.
- server: full suite 4923/4924 pass; `mix format` + `mix credo` clean; edited tests compile with `--warnings-as-errors`. The single failing test (`TuistWeb.TestsLiveTest`, a `render_async` 100ms ClickHouse-latency flake) was confirmed pre-existing on the base commit by stashing all changes and re-running.
- New server tests cover `verify/1` (valid / `alg:none` / `HS256` / expired / over-TTL / wrong aud / wrong iss), the grant-based checks across every object shape, the redirect gate (operator vs non-operator vs grant-holder vs unauthenticated), the token handoff, and a full account-settings integration test exercising grant -> plug -> on_mount -> SSO bypass -> `ops_write_access`.
🤖 Generated with Claude Code