feat(server): support OAuth service tokens

GitHub issue · Closed

Metadata
Source: tuist/tuist #11047
Updated: Jun 24, 2026
Details

What

Adds first-class support for OAuth client-credentials service tokens in the server authentication pipeline.

The change introduces an AuthenticatedService subject, teaches Guardian how to sign and resolve it, updates the Boruta token generator to distinguish user tokens from service tokens, and lets plugs and authorization checks handle service-authenticated requests.

Why

Internal services should be able to authenticate to Tuist with short-lived OAuth tokens without minting or storing one token per customer account. This PR is limited to authentication plumbing, so a follow-up can add whichever API surface needs to consume the service subject.

Notes

Service tokens are only issued for explicitly configured static OAuth service clients. Dynamically registered clients and the regular Tuist OAuth client do not become service subjects just because they use client credentials.

The implementation is generic: each configured service client has a client id, a secret, an optional display name, an optional TTL, and explicit scopes that are carried in the generated token. No downstream service name is encoded in the implementation.

Note

Scopes should describe the resource and action, for example account:usage:read. The account boundary comes from the authenticated subject: an account token with that scope can be limited to its own account with accounts_match, while a statically configured service token with the same scope can be authorized by policy to read that resource across accounts. This avoids introducing separate :any scopes while keeping cross-account access explicit in authorization rules.
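
As a rough illustration of the rule above, the sketch below separates the scope check from the account-boundary check. It is written in Python purely for illustration and is not the server's code; `AccountSubject`, `ServiceSubject`, and `can_read_usage` are hypothetical names, and only the `account:usage:read` scope string comes from this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountSubject:
    # Hypothetical stand-in for an account token's subject.
    account_id: int
    scopes: frozenset


@dataclass(frozen=True)
class ServiceSubject:
    # Hypothetical stand-in for a statically configured service subject.
    client_id: str
    scopes: frozenset


def can_read_usage(subject, target_account_id):
    """Return True if the subject may read usage for the target account."""
    # The scope names the resource and action only; it says nothing about
    # which accounts are in reach.
    if "account:usage:read" not in subject.scopes:
        return False
    # Account subjects are limited to their own account (an accounts_match
    # style check).
    if isinstance(subject, AccountSubject):
        return subject.account_id == target_account_id
    # Cross-account access is granted explicitly, per subject type, by policy.
    if isinstance(subject, ServiceSubject):
        return True
    return False
```

The same scope yields different reach depending on the subject type, which is why no separate `:any` scope is needed in this model.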

Important

Service-token usage is intended to be cluster-internal. The follow-up API surface should be protected with network rules so only workloads inside the cluster can reach it. OAuth client credentials and short-lived tokens remain the application-level control, while the network boundary limits exposure if a token or client secret is leaked.

Validation

  • mise exec -- mix format --check-formatted
  • mise exec -- mix test test/tuist/oauth/clients_test.exs test/tuist/oauth/token_generator_test.exs test/tuist_web/authentication_test.exs test/tuist/oauth/introspection_test.exs test/tuist_web/rate_limit/mcp_test.exs test/tuist_web/plugs/sentry_context_plug_test.exs test/tuist/analytics_test.exs test/tuist_web/plugs/api/authorization/authorization_plug_test.exs
Comments
fortmarek Jun 3, 2026

Findings

[P1] Service-token issuance should be constrained to the management-cluster boundary.

This PR adds bearer service tokens that are intentionally cross-account once authorized by policy. Since bearer tokens are portable after minting, OAuth client credentials alone do not prove that the caller is still Atlas, Tuist Server, or even running inside the same management cluster. For defense in depth, both service-token issuance and any API surface that accepts AuthenticatedService should be reachable only through the internal management-cluster path. Concretely, I would keep these routes off the public ingress and enforce the boundary with Kubernetes NetworkPolicy and/or an internal Gateway route. The stronger version is to require workload identity too, for example mTLS/service-mesh identity or a projected Kubernetes service-account JWT with an audience for Tuist Server, so the server verifies the expected namespace/service account in addition to the OAuth token.

[P1] Service detection can diverge from the client Boruta actually authenticated.

TokenGenerator.generate/2 decides to issue a service token with Clients.service_client?(client_id), but Clients.get_client/1 resolves built-in static clients before service clients. If the configured service-client list accidentally reuses the Tuist OAuth client id, Boruta authenticates the Tuist client while the token generator still signs an AuthenticatedService token. That would bypass the intended service-client secret, TTL, and scope configuration. Please either reject duplicate service-client ids at config parsing time or make service-token generation depend on the same resolved client identity/order that Boruta authenticated.

[P2] Service tokens can pass require_authentication and then crash account-owned endpoints.

authenticated_subject_account/1 now returns nil for AuthenticatedService, but several authenticated API paths still assume every authenticated non-user subject has an owning account. For example, OrganizationsController.index/2 preloads authenticated_subject_account(conn) and then dereferences account.organization; a valid service token hitting GET /api/organizations would satisfy authentication and then raise instead of returning a controlled 403/401. Please explicitly reject service subjects on account-required routes, or update those callers to handle nil before dereferencing/storing the account.

pepicrft Jun 4, 2026

Thanks Marek, all three are good catches.

Service detection diverging from Boruta (P1): fixed in fca511f61b. The root issue was that static_client/1 resolves the built-in CLI/Kura clients before service clients, so a service-client config reusing one of those ids would have Boruta authenticate the built-in client while service_client?/1 still treated it as a service client. I now drop any service-client config whose id collides with a reserved built-in id, so client resolution, service_client?/1, and authorized_scopes/1 all agree with what Boruta authenticated.
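
A minimal sketch of the collision filter described above, in Python for illustration only. The reserved ids, the dict shape of a client config, and the function names are assumptions made for the example, not the identifiers used in the commit.

```python
# Hypothetical reserved ids standing in for the built-in static clients.
RESERVED_CLIENT_IDS = frozenset({"builtin-cli", "builtin-kura"})


def usable_service_clients(configured, reserved=RESERVED_CLIENT_IDS):
    """Drop any configured service client whose id collides with a reserved id."""
    return [client for client in configured if client["id"] not in reserved]


def is_service_client(client_id, service_clients):
    """Service detection runs against the filtered list only."""
    return any(client["id"] == client_id for client in service_clients)
```

Because detection only consults the filtered list, a colliding id can never be treated as a service client while a built-in client is the one that was actually authenticated.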

Service tokens crashing account-owned endpoints (P2): also fixed in fca511f61b. The authenticated_api pipeline now rejects service subjects with a 403 right after require_authentication. No route on that surface is meant for service tokens today, and their authorization goes through the let_me policy layer separately, so this kills the whole nil-account crash class at one chokepoint instead of guarding each caller.
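
The chokepoint idea can be sketched as an ordered list of checks, again in Python for illustration only. The request shape, the `kind` field, and the step names other than `require_authentication` are hypothetical; returning 200 simply stands in for reaching the controller.

```python
def require_authentication(request):
    # Unauthenticated requests stop here with a 401.
    if request.get("subject") is None:
        return 401
    return None


def reject_service_subjects(request):
    # Runs right after authentication, so no account-owned controller ever
    # sees a subject without an owning account.
    if request["subject"]["kind"] == "service":
        return 403
    return None


AUTHENTICATED_API = (require_authentication, reject_service_subjects)


def run_pipeline(request, steps=AUTHENTICATED_API):
    """Return the first rejecting status, or 200 if every step passes."""
    for step in steps:
        status = step(request)
        if status is not None:
            return status
    return 200
```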

Management-cluster boundary (P1): I agree, but it’s an infra change rather than something for this PR. Keeping these routes off the public ingress plus a NetworkPolicy/internal Gateway route, and ideally workload identity on top, lives in the Helm/infra layer. I’ll follow up with that separately so we don’t block this on it.

pepicrft Jun 22, 2026

@fortmarek quick update on the Atlas/service-token feedback.

I ended up taking your suggestion and narrowed this away from generic OAuth M2M/service subjects:

  • Removed AuthenticatedService and the configured OAuth service-client token path, so /oauth2/token does not mint accountless service principals.
  • Added a curated Atlas read model at GET /api/internal/atlas/accounts/:account_handle/usage. It takes the account handle because Atlas already has that handle.
  • Kept the Atlas endpoint outside the normal authenticated API pipeline, so user/account policy code does not need to special-case a reusable cross-tenant service subject.
  • Checked the Atlas production cluster. Atlas runs in a separate Kubernetes cluster, so Tuist cannot call its own TokenReview API for these tokens. The server now verifies Atlas projected ServiceAccount JWTs against the pinned Atlas public JWKS, checks the tuist-server audience, and gates access to system:serviceaccount:atlas-production:atlas.
  • Pinned the current Atlas public JWKS in managed production values. This is public verification material, not a secret.
  • The route is still not network-private at the ingress layer today. The important change here is that it is no longer protected by a static OAuth client secret or generic accountless principal. The Atlas-side hardening that remains is mounting a dedicated short-lived projected token with audience: tuist-server for this call.
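
For illustration, the claim checks described in the list above could look roughly like the Python sketch below. It assumes the token's signature has already been verified against the pinned JWKS (that step is not shown) and that `claims` is the decoded payload. The function name is hypothetical, and the `exp` check is an assumption based on standard JWT handling rather than something stated in this thread.

```python
import time

EXPECTED_AUDIENCE = "tuist-server"
ALLOWED_SUBJECT = "system:serviceaccount:atlas-production:atlas"


def claims_allow_atlas(claims, now=None):
    """Check audience, subject, and expiry on already-verified JWT claims."""
    now = time.time() if now is None else now

    # The JWT "aud" claim may be a single string or a list of strings.
    aud = claims.get("aud")
    if isinstance(aud, str):
        audiences = [aud]
    elif isinstance(aud, (list, tuple)):
        audiences = list(aud)
    else:
        audiences = []
    if EXPECTED_AUDIENCE not in audiences:
        return False

    # Only the expected namespace and service account are accepted.
    if claims.get("sub") != ALLOWED_SUBJECT:
        return False

    # Assumed: expired or undated tokens are rejected.
    exp = claims.get("exp")
    return isinstance(exp, (int, float)) and exp > now
```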

Validation: mix test test/tuist/atlas_workload_identity_test.exs test/tuist_web/controllers/internal/atlas_usage_controller_test.exs test/tuist/kubernetes/client_test.exs passed with 24 tests, 0 failures, and I rendered the production Helm chart with dummy release-time image pins to verify the Atlas env block.