fix(server): add stable egress for customer allowlists
GitHub issue · Closed
What changed
- Added optional platform-chart support for `CiliumEgressGatewayPolicy` so Tuist server pods SNAT public outbound traffic through a fixed environment-specific egress IP.
- Added a host-configurer DaemonSet for the selected gateway node so the Hetzner Floating IP and source route are continuously present on the node.
- Added per-cluster platform overlays loaded by `k8s:install-platform`:
  - staging: active Floating IP egress through `78.47.186.71`
  - canary: active Floating IP egress through `78.47.174.50`
  - production: active Floating IP egress through `116.202.0.10`
- Updated the platform README with the shared Floating IP failover runbook.
- Updated the public network guide to document `116.202.0.10/32` as the production outbound address customers should allowlist.
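
For readers unfamiliar with what "keeping the Floating IP and source route present" involves, here is a minimal sketch of one reconcile pass a host configurer might run. It is not the DaemonSet's actual script: the interface name `eth0`, the gateway `172.31.1.1`, the routing table number `100`, and the `configure_once` and `run` helpers are all assumptions for illustration. It defaults to a dry run that only prints the commands.

```shell
# Hypothetical defaults: only the Floating IP comes from this PR (staging).
FLOATING_IP="${FLOATING_IP:-78.47.186.71}"
IFACE="${IFACE:-eth0}"            # assumed public interface name
GATEWAY="${GATEWAY:-172.31.1.1}"  # assumed default gateway on the node
TABLE="${TABLE:-100}"             # assumed dedicated routing table number
DRY_RUN="${DRY_RUN:-1}"           # 1 prints commands, anything else runs them

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# One reconcile pass: address, source route, and policy rule.
configure_once() {
  # `replace` is idempotent: it adds the address or leaves it in place.
  run ip addr replace "$FLOATING_IP/32" dev "$IFACE"
  # Packets sourced from the Floating IP get their own default route.
  run ip route replace default via "$GATEWAY" dev "$IFACE" table "$TABLE"
  # `ip rule add` is not idempotent, so add the rule only when it is missing.
  if [ "$DRY_RUN" = 1 ] || ! ip rule show | grep -qF "from $FLOATING_IP lookup $TABLE"; then
    run ip rule add from "$FLOATING_IP/32" lookup "$TABLE"
  fi
}
```

Calling `configure_once` in a loop with a sleep between passes would approximate the "continuously present" behaviour; with `DRY_RUN=0` the commands need root on the gateway node.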
Why
After the Kubernetes migration, Tuist server pods egressed through whichever worker node handled the traffic. Those node IPs are tied to node lifecycle and no longer matched the documented stable outbound range, which broke customer infrastructure allowlists such as those on GHES instances.
The first staging/canary Floating IP reservations were bad at the provider-routing level: the gateway host emitted packets with the 91.98.* Floating IP source address, but return traffic never arrived, and the IPs were not reachable inbound either. A fresh Floating IP assigned to the same staging node worked immediately with the same host configuration, so the fix replaces the bad reservations instead of changing the Cilium design.
Impact
Customer-facing outbound traffic from Tuist Server now has stable source IPs:
- staging: `78.47.186.71` (`tuist-staging-server-egress`, delete-protected)
- canary: `78.47.174.50` (`tuist-canary-server-egress`, delete-protected)
- production: `116.202.0.10` (`tuist-production-server-egress`, delete-protected)
Private, link-local, and tailnet ranges are excluded from the egress gateway policy so internal traffic keeps its normal route.
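To make the exclusion behaviour concrete, the following sketch classifies a destination address as either bypassing or using the egress gateway. The CIDR list is an assumption (RFC 1918 private ranges, IPv4 link-local, and the `100.64.0.0/10` range commonly used by tailnets); the chart values remain the source of truth. The helper names are hypothetical.

```shell
# Assumed exclusion list; the real CIDRs are defined in the platform chart.
EXCLUDED_CIDRS="10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 169.254.0.0/16 100.64.0.0/10"

# Convert a dotted-quad IPv4 address (no leading zeros) to an integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if address $1 falls inside CIDR $2.
in_cidr() (
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Build a 32-bit netmask with the top $bits bits set.
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
)

# Print the path a destination would take under the assumed list.
egress_path() {
  for cidr in $EXCLUDED_CIDRS; do
    if in_cidr "$1" "$cidr"; then
      echo "normal route"
      return 0
    fi
  done
  echo "egress gateway"
}
```

For example, `egress_path 10.1.2.3` prints `normal route`, while a public destination such as `egress_path 140.82.112.3` prints `egress gateway`.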
If a gateway node is replaced, operators need to move the relevant HCloud Floating IP assignment and tuist.dev/stable-egress-gateway=server node label together; the README includes that runbook.
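The README runbook is authoritative, but the two moves described above can be sketched roughly as follows. This assumes the HCloud server name matches the Kubernetes node name and that the old node still exists when its label is removed; the `move_gateway` and `run` helpers and the step order are illustrative, not taken from the runbook. It prints the commands by default instead of running them.

```shell
DRY_RUN="${DRY_RUN:-1}"  # 1 prints commands, anything else runs them

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# move_gateway <floating-ip-name> <old-node> <new-node>
move_gateway() {
  fip_name=$1
  old_node=$2
  new_node=$3
  # Reassign the Floating IP to the replacement server.
  run hcloud floating-ip assign "$fip_name" "$new_node"
  # Label the new node so it is selected as the gateway.
  run kubectl label node "$new_node" tuist.dev/stable-egress-gateway=server --overwrite
  # A trailing '-' removes the label from the old node.
  run kubectl label node "$old_node" tuist.dev/stable-egress-gateway-
}
```

A staging failover would then look like `move_gateway tuist-staging-server-egress <old-node> <new-node>`, rerun with `DRY_RUN=0` once the printed plan looks right.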
Validation
- `helm lint infra/helm/platform`
- `helm template platform infra/helm/platform --show-only templates/cilium-egress-gateway-policy.yaml -f infra/helm/platform/values-hetzner.yaml -f infra/helm/platform/values-tuist-staging.yaml`
- `helm template platform infra/helm/platform --show-only templates/cilium-egress-gateway-host-configurer.yaml -f infra/helm/platform/values-hetzner.yaml -f infra/helm/platform/values-tuist-staging.yaml`
- `helm template platform infra/helm/platform --show-only templates/cilium-egress-gateway-policy.yaml -f infra/helm/platform/values-hetzner.yaml -f infra/helm/platform/values-tuist-canary.yaml`
- `helm template platform infra/helm/platform --show-only templates/cilium-egress-gateway-host-configurer.yaml -f infra/helm/platform/values-hetzner.yaml -f infra/helm/platform/values-tuist-canary.yaml`
- `helm template platform infra/helm/platform --show-only templates/cilium-egress-gateway-policy.yaml -f infra/helm/platform/values-hetzner.yaml -f infra/helm/platform/values-tuist.yaml`
- Applied the platform chart to staging and verified:
  - `CiliumEgressGatewayPolicy/tuist-server-stable-egress` targets `78.47.186.71`
  - `DaemonSet/kube-system/tuist-server-stable-egress-host-configurer` is rolled out
  - `kubectl -n tuist-staging exec deploy/tuist-tuist-server -- curl -4 https://api.ipify.org` returns `78.47.186.71`
- Applied the platform chart to canary and verified:
  - `CiliumEgressGatewayPolicy/tuist-server-stable-egress` targets `78.47.174.50`
  - `DaemonSet/kube-system/tuist-server-stable-egress-host-configurer` is rolled out
  - `kubectl -n tuist-canary exec deploy/tuist-tuist-server -- curl -4 https://api.ipify.org` returns `78.47.174.50`
- Previously applied the platform chart to production and verified representative server pods report `116.202.0.10` from `https://api.ipify.org`.
- Verified final HCloud Floating IP table:
  - `tuist-staging-server-egress`: `78.47.186.71`, assigned to `tuist-staging-md-0-jmvxt-ljwhq-4jv57`, delete-protected
  - `tuist-canary-server-egress`: `78.47.174.50`, assigned to `tuist-canary-md-0-rwjfn-kh7bn-4ft4f`, delete-protected
  - `tuist-production-server-egress`: `116.202.0.10`, assigned to `tuist-md-0-vq65h-zm8qt-rdln4`, delete-protected
- Deleted the bad old `91.98.110.60` and `91.98.105.79` Floating IP reservations after replacing them.
- Verified temporary debug pods were deleted from staging and canary.
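
The per-environment egress check above could be repeated later with a small helper like the one below. This is a sketch, not part of the PR: `check_egress`, `observed_egress_ip`, and the `KUBECTL` override are hypothetical names, and the `-fsS` curl flags are an addition so that HTTP errors fail the check instead of being compared as an IP.

```shell
# Allow the kubectl binary to be overridden, for example with a wrapper.
KUBECTL="${KUBECTL:-kubectl}"

# observed_egress_ip <namespace>: ask a server pod which public IP it egresses from.
observed_egress_ip() {
  "$KUBECTL" -n "$1" exec deploy/tuist-tuist-server -- curl -4 -fsS https://api.ipify.org
}

# check_egress <namespace> <expected-ip>
# Returns 0 on match, 1 on mismatch, 2 if the IP could not be fetched.
check_egress() {
  actual=$(observed_egress_ip "$1") || return 2
  if [ "$actual" = "$2" ]; then
    echo "ok: $1 egresses via $actual"
  else
    echo "MISMATCH: $1 egresses via $actual, expected $2" >&2
    return 1
  fi
}
```

With the namespaces and addresses from this PR, usage would be `check_egress tuist-staging 78.47.186.71 && check_egress tuist-canary 78.47.174.50`.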