Service Disruption for OData Integration in All Regions

Incident Report for SAP LeanIX

Postmortem

Incident Description

Starting on March 24 at 11:29 UTC, a change to the English translation model of Pathfinder introduced a new placeholder naming convention and modifications to the source language used for translations. These changes impacted:

  • The OData API, where labels containing specific placeholders were not resolved, and the resulting changes to field and relation names disrupted customer integrations.
  • Fact Sheet update notifications, where placeholders meant to render relation or field names were not properly resolved. While notifications were still sent, some contained unexpected values.

This led to issues for customers whose integrations relied on exact label values. The issue was mitigated by implementing a transformation layer in our OData API to ensure compatibility with previous naming conventions and by re-translating specific labels.
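
As a rough illustration of what such a transformation layer could look like, the sketch below maps labels produced by a new translation model back to the names that integrations previously relied on. It is not the actual implementation: the label names, the `LEGACY_LABELS` table, and both function names are hypothetical, since the report does not list the affected fields.

```python
from typing import Any, Dict

# Hypothetical mapping: label emitted by the updated translation model ->
# label that integrations saw before the change. The real names are not
# given in the report; these entries are invented for illustration.
LEGACY_LABELS: Dict[str, str] = {
    "Life Cycle": "Lifecycle",
    "Relation To Provider": "Providers",
}


def to_legacy_label(label: str, legacy_labels: Dict[str, str] = LEGACY_LABELS) -> str:
    """Return the previous name for a label, or the label itself if unchanged."""
    return legacy_labels.get(label, label)


def apply_compat_layer(
    record: Dict[str, Any], legacy_labels: Dict[str, str] = LEGACY_LABELS
) -> Dict[str, Any]:
    """Rename the keys of one API record so consumers keep seeing the old names."""
    return {to_legacy_label(key, legacy_labels): value for key, value in record.items()}


if __name__ == "__main__":
    print(apply_compat_layer({"Life Cycle": "active", "Name": "CRM"}))
```

Applying the mapping at the API boundary keeps the translation model change in place while integrations that match on exact names continue to work, which is in line with the approach described above of not reverting the initial change.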

Incident Resolution

  • A fix was deployed on March 25 to address placeholder issues.
  • Affected customers were identified and contacted.
  • The initial change was not reverted; instead, a solution was implemented to maintain compatibility without requiring customer action.
  • On March 26, a fix was applied to the notification system at 08:47 UTC, followed by a fix for the OData API at 13:40 UTC to ensure proper resolution of placeholders.

Root Cause Analysis

The translation model change introduced non-backward-compatible placeholders, which were not accounted for in our OData API. Additionally, customers were unaware of the changes affecting their integrations. There was no monitoring in place for translation model changes impacting downstream services.

Preventative Measures

To prevent similar incidents, the following improvements will be implemented:

  • Testing & Monitoring: Ensure translation model changes are tested against all integrations and establish alerts for changes affecting OData and Notifications.
  • Incident Management: Improve coordination between teams for faster response, treating functionality-breaking changes as incidents with clear internal and external communication.
  • Automation & Prevention: Automate testing for translation model updates and strengthen the review process for transformation-related changes.
  • Detection: Expand logging and alerting mechanisms to detect placeholder resolution issues earlier and proactively identify affected customers.
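
The testing and detection measures above could be approached along the lines of the following minimal sketch. It assumes a `{{name}}` placeholder syntax and a translation model represented as a plain key-to-label dictionary; neither is confirmed by this report, and the function names are hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed placeholder syntax: {{name}}. The real syntax is not stated in the report.
PLACEHOLDER = re.compile(r"\{\{\s*[\w.]+\s*\}\}")


def unresolved_placeholders(rendered: str) -> List[str]:
    """Return placeholders that are still present in an already rendered string."""
    return PLACEHOLDER.findall(rendered)


def changed_labels(
    old: Dict[str, str], new: Dict[str, str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return keys whose label differs or is missing in the new model snapshot."""
    return {
        key: (old[key], new.get(key))
        for key in old
        if new.get(key) != old[key]
    }


if __name__ == "__main__":
    # A rendered notification that still contains a placeholder should raise an alert.
    print(unresolved_placeholders("Relation {{relation.name}} was updated"))
    # A label that changed between two model snapshots should be flagged for review.
    print(changed_labels({"lifecycle": "Lifecycle"}, {"lifecycle": "Life Cycle"}))
```

A check like `changed_labels` could run before a translation model update is released, while `unresolved_placeholders` could run against rendered OData labels and notification text to feed alerting.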
Posted Apr 04, 2025 - 07:47 UTC

Resolved

We have mitigated the problems with the OData integration in all regions.
Posted Mar 26, 2025 - 07:41 UTC

Identified

We are currently experiencing issues with our OData integration in all regions. The root cause has been identified, and we are in the process of mitigating the problem. Further updates regarding this issue will be communicated shortly.
Posted Mar 25, 2025 - 08:31 UTC
This incident affected: EU Instances (EAM), US Instances (EAM), CA Instances (EAM), AU Instances (EAM), DE Instances (EAM), CH Instances (EAM), AE Instances (EAM), UK Instances (EAM), BR Instances (EAM), SG Instances (EAM), and JP Instances (EAM).