Degraded performance in EU
Incident Report for LeanIX
Postmortem

Summary

On April 11, 2024, between 07:24 and 07:47 UTC, response times for frontend assets increased significantly, which in some cases caused requests to fail for customers.

What happened?

A change to the infrastructure component that serves frontend assets (e.g. JavaScript, CSS) significantly increased that component's CPU usage. As a result, response times rose sharply and some requests timed out. Our alerting system notified us of the high load times, but the alert threshold was set too high for us to be notified earlier.

Mitigation: What did we do about it?

We increased the number of replicas for this component to handle the increase in load.

Follow-ups: How will we improve?

We lowered the alert threshold so that we are notified earlier. In addition, further measurements and mitigation steps will be implemented on the infrastructure component side.
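
To illustrate why a lower threshold leads to earlier notification, here is a minimal sketch in Python. It is not our actual alerting configuration: the function name, the "sustained for two samples" rule, the threshold values, and the sample response times are all hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


def first_alert_index(
    latencies_ms: Sequence[float],
    threshold_ms: float,
    sustained: int = 2,
) -> Optional[int]:
    """Return the index of the sample at which an alert would fire.

    The alert fires once `sustained` consecutive samples exceed
    `threshold_ms`. Returns None if that never happens.
    """
    run = 0  # number of consecutive samples above the threshold
    for i, value in enumerate(latencies_ms):
        run = run + 1 if value > threshold_ms else 0
        if run >= sustained:
            return i
    return None


# Hypothetical per-minute response times in ms (NOT data from the incident).
samples = [120, 130, 400, 900, 1600, 2500, 3200, 3100]

# Hypothetical "old" and "new" thresholds.
high = first_alert_index(samples, threshold_ms=2000)
low = first_alert_index(samples, threshold_ms=500)

print(f"high threshold fires at sample {high}, low threshold at sample {low}")
```

With these made-up numbers, the lower threshold fires two samples earlier than the higher one. The trade-off is that a lower threshold can also fire on brief, harmless spikes, which is why the sketch requires the condition to hold for consecutive samples.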

Posted Apr 15, 2024 - 12:16 CEST

Resolved
This incident has been resolved.
Posted Apr 11, 2024 - 10:09 CEST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Apr 11, 2024 - 09:51 CEST
Investigating
Users may experience degraded performance in EU. Our team is working to identify the root cause and implement a solution.

We will send an additional update in 30 minutes.
Posted Apr 11, 2024 - 09:40 CEST
This incident affected: EU Instances (EAM, VSM, SMP).