All systems operational
Tuist Status Updated Jun 16, 06:04 PM UTC
Components
API Public REST and OpenAPI surfaces consumed by the CLI and integrations.
Operational
CLI Command-line workflows: generate, build, test, cache, registry.
Operational
Cache Remote build and test cache used by Tuist projects.
Operational
Dashboard Web dashboard at tuist.dev.
Operational
Documentation Tuist's documentation site.
Operational
Active incidents
No active incidents.
Past 14 days

tuist.dev not reachable

Critical Operational
Jun 14, 01:24 PM UTC → Jun 14, 01:59 PM UTC
  1. resolved. Production tuist.dev returned 503 because all main server pods were crash-looping during startup. The immediate failure was a timeout in license validation against Keygen, but Keygen itself was healthy. The root cause was the production stable egress gateway. Server pod traffic is routed through a Cilium egress gateway using the fixed IP 116.202.0.10. After node churn, neither the gateway node label nor the Hetzner floating IP was attached to an active node, so the selected server traffic could not reach the public internet. We restored service by assigning the floating IP to a live general worker, labeling that node as the stable egress gateway, forcing Cilium to refresh the policy, and restarting the server deployment. tuist.dev is now healthy again. To prevent this from recurring, we are working on making the stable egress setup declarative and self-healing instead of relying on a manual node handoff, and adding monitoring for gateway readiness and the server egress path.
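The failure mode described above (gateway label and floating IP not attached to any live node) lends itself to a mechanical check. The following is a minimal sketch of such a readiness check, not Tuist's actual monitoring. The `Node` structure, the label key `egress-gateway`, and the function name are hypothetical; only the IP 116.202.0.10 comes from the incident report. It assumes node state has already been collected from the cluster and the cloud provider.

```python
from dataclasses import dataclass, field

STABLE_EGRESS_IP = "116.202.0.10"  # fixed egress IP named in the incident report
GATEWAY_LABEL = "egress-gateway"   # hypothetical label key, for illustration only


@dataclass
class Node:
    name: str
    ready: bool
    labels: dict = field(default_factory=dict)
    floating_ips: list = field(default_factory=list)


def egress_gateway_problems(nodes, egress_ip=STABLE_EGRESS_IP, label=GATEWAY_LABEL):
    """Return a list of problems; an empty list means the gateway looks healthy."""
    problems = []
    labeled = [n for n in nodes if n.ready and n.labels.get(label) == "true"]
    holders = [n for n in nodes if n.ready and egress_ip in n.floating_ips]
    if not labeled:
        problems.append("no ready node carries the gateway label")
    if not holders:
        problems.append("floating IP is not attached to a ready node")
    # The label and the floating IP must sit on the same ready node.
    if labeled and holders and not any(egress_ip in n.floating_ips for n in labeled):
        problems.append("gateway label and floating IP are on different nodes")
    return problems
```

An alert could fire whenever the returned list is non-empty. An end-to-end probe of the server egress path would complement this check, since the incident first surfaced as a timeout reaching an external service.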

Tests stuck in processing

Major Operational
Jun 8, 10:14 AM UTC → Jun 8, 12:37 PM UTC
  1. resolved. All stuck test results have now been processed. We are adding an alert to catch this condition sooner.

Intermittent failures interacting with the cache

Minor Operational
May 29, 12:56 PM UTC → Jun 12, 08:41 AM UTC
  1. resolved. Some of our caching nodes exhibited intermittent failures under high load, which we mitigated by adding regional nodes. We are developing and testing a new solution, which we plan to deploy per tenant, that will self-regulate under high load.