On March 05, 2025 between 20:14 and 20:39 UTC, workspaces hosted on the ca.leanix.net instance were inaccessible via the UI or APIs.
The incident was resolved by stopping two long-running background jobs.
Two long-running background jobs conflicted and blocked each other, which exhausted server resources and caused the downtime.
We introduced logic to prevent such conflicting jobs from running at the same time. Additionally, we have adjusted our alerting configuration so that we are notified early of resource shortages.
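
As an illustration only, a guard that keeps conflicting jobs from overlapping could look roughly like the sketch below. It is not the actual implementation: the `ConflictGuard` class, its methods, and the job names `job_a` and `job_b` are all hypothetical, and it assumes a single process. A deployment spanning several servers would need a shared lock (for example in a database) instead of an in-memory one.

```python
import threading
from contextlib import contextmanager


class ConflictGuard:
    """Hypothetical in-process guard that refuses to start a job while a
    job declared as conflicting with it is still running."""

    def __init__(self, conflicts):
        # conflicts: iterable of (job_a, job_b) pairs that must never overlap.
        self._conflicts = {}
        for a, b in conflicts:
            self._conflicts.setdefault(a, set()).add(b)
            self._conflicts.setdefault(b, set()).add(a)
        self._running = set()
        self._lock = threading.Lock()

    def try_start(self, job):
        """Register the job as running; return False if it must wait."""
        with self._lock:
            blocked = self._conflicts.get(job, set()) & self._running
            if job in self._running or blocked:
                return False
            self._running.add(job)
            return True

    def finish(self, job):
        """Mark the job as no longer running."""
        with self._lock:
            self._running.discard(job)

    @contextmanager
    def run(self, job):
        """Run a block of work as `job`, failing fast on a conflict."""
        if not self.try_start(job):
            raise RuntimeError(f"{job} conflicts with a running job")
        try:
            yield
        finally:
            self.finish(job)
```

A scheduler could wrap each job in `with guard.run("job_a"):` and reschedule the job when `RuntimeError` is raised, rather than letting both jobs block each other while holding server resources.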