Service Disruption in CA region

Incident Report for SAP LeanIX

Postmortem

Incident Description

On March 05, 2025, between 20:14 and 20:39 UTC, workspaces hosted on the ca.leanix.net instance were inaccessible via both the UI and the APIs.

Incident Resolution

The incident was resolved by stopping two long-running background jobs.

Root Cause Analysis

Two long-running background jobs conflicted and blocked each other, which exhausted server resources and caused the downtime.

Preventative Measures

We introduced logic to prevent such conflicting jobs from running at the same time. Additionally, we have adjusted our alerting configuration so that we are notified early of resource shortages.
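
For illustration only, the kind of guard described above could look roughly like the following sketch. It is not the actual implementation; the conflict-group concept and the names `exclusive_job`, `try_run`, and `ConflictingJobRunning` are hypothetical. It assumes jobs run within a single process; a service running on multiple instances would need a shared lock (for example, in a database) instead.

```python
import threading
from contextlib import contextmanager

# Hypothetical: jobs that must never overlap share a conflict group name.
_group_locks = {}
_registry_lock = threading.Lock()


class ConflictingJobRunning(Exception):
    """Raised when another job in the same conflict group is already running."""


@contextmanager
def exclusive_job(group):
    # Look up (or create) the lock for this conflict group.
    with _registry_lock:
        lock = _group_locks.setdefault(group, threading.Lock())
    # Non-blocking acquire: refuse to start rather than wait and hold resources.
    if not lock.acquire(blocking=False):
        raise ConflictingJobRunning(group)
    try:
        yield
    finally:
        lock.release()


def try_run(group, job):
    """Run job() if no conflicting job is active; return True if it ran."""
    try:
        with exclusive_job(group):
            job()
        return True
    except ConflictingJobRunning:
        return False
```

The second job is rejected immediately rather than queued behind the first, so it cannot hold resources while waiting; a scheduler could retry it later.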

Posted Mar 07, 2025 - 10:30 UTC

Resolved

We are currently experiencing a service disruption in the CA region. Our team is working to identify the root cause and implement a solution.
Posted Mar 05, 2025 - 20:30 UTC