What happened?
Between 08:55 UTC and 09:40 UTC on 19 September 2025, a platform issue resulted in an impact to the Azure Resource Manager (ARM) service in the West Europe region. Impacted customers experienced failures when performing service management operations (such as create, delete, update, scaling, start, stop) for resources hosted in this region.
What do we know so far?
We determined this service event was caused by a transient issue in an internal component responsible for initiating API requests to our backend dependencies. This component encountered intermittent failures that interrupted the communication flow, leading to API call timeouts and subsequent failures in service request processing. As a result, service operations could not be completed, manifesting as the impact described above.
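Transient timeouts of this kind are generally retriable from the client side while the platform recovers. The following is a minimal, illustrative sketch only, not part of Microsoft's tooling or response: it assumes a placeholder subscription ID and bearer token, and retries a single ARM management-plane request (listing resource groups) with exponential backoff when a timeout or 5xx response occurs.

# Minimal sketch: retrying an ARM management-plane call on transient failures.
# The subscription ID and token below are placeholders, not real values.
import time
import requests

ARM = "https://management.azure.com"
SUBSCRIPTION_ID = "<subscription-id>"   # placeholder
TOKEN = "<bearer-token>"                # placeholder; obtain via Azure AD in practice

def list_resource_groups_with_retry(max_attempts=5, base_delay=2.0):
    url = f"{ARM}/subscriptions/{SUBSCRIPTION_ID}/resourcegroups?api-version=2021-04-01"
    headers = {"Authorization": f"Bearer {TOKEN}"}
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.get(url, headers=headers, timeout=30)
            if resp.status_code < 500:
                resp.raise_for_status()      # 4xx errors are raised and not retried
                return resp.json()           # success
            # 5xx responses fall through to the retry/backoff path below
        except requests.Timeout:
            pass                             # treat timeouts as transient and retry
        if attempt == max_attempts:
            raise RuntimeError("ARM call did not succeed after retries")
        time.sleep(base_delay * (2 ** (attempt - 1)))   # exponential backoff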
How did we respond?
What happens next?
What happened?
Between 08:55 UTC and 09:25 UTC on 19 September 2025, a platform issue resulted in an impact to the Azure Resource Manager (ARM) service in the West Europe region. Customers may have received error notifications when performing service management operations - such as create, delete, update, scaling, start, stop - for resources hosted in this region.
This issue is now mitigated. An update with more information will be provided shortly.
What happened?
Between 12:05 UTC and 21:33 UTC on 15 September 2025, a platform issue impacted Azure Front Door (Standard, Premium, and Classic SKUs) and Azure CDN Standard from Microsoft. During this time, customers accessing services in the East US region may have experienced 504 (Gateway Timeout) errors. The impact was limited to two Points of Presence (PoPs) in East US and affected only cache-miss traffic.
What do we know so far?
Our investigation determined that the incident was caused by elevated CPU utilization in two Azure Front Door (AFD) environments. This condition led to intermittent 504 (Gateway Timeout) errors affecting cache-miss traffic. Customer retries were usually successful. The issue was identified through service monitoring and impacted approximately 0.25% of overall cache-miss requests in the affected environments. The elevated CPU load was traced to an increase in cumulative traffic, which overloaded the processing capacity of these environments.
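As noted above, customer retries were usually successful during the impact window. The sketch below is purely illustrative and assumes a hypothetical endpoint fronted by Azure Front Door or Azure CDN: it simply retries a request a small number of times when a 504 (Gateway Timeout) response is received.

# Minimal sketch: client-side retry on 504 (Gateway Timeout) responses.
import time
import requests

def get_with_504_retry(url, max_attempts=3, delay=1.0):
    for attempt in range(1, max_attempts + 1):
        resp = requests.get(url, timeout=10)
        if resp.status_code != 504:
            return resp                      # success or a non-retriable error
        if attempt < max_attempts:
            time.sleep(delay * attempt)      # simple linear backoff between retries
    return resp                              # still 504 after all attempts

# Example (hypothetical endpoint behind Azure Front Door / Azure CDN):
# resp = get_with_504_retry("https://example-afd-endpoint.azurefd.net/assets/app.js")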
How did we respond?
What happens next?
What happened?
Between 12:05 UTC and 21:33 UTC on 15 September 2025, a platform issue resulted in an impact to Azure Front Door (Standard, Premium, or Classic SKU) and Azure CDN Standard from Microsoft. Customers may have experienced 504 (Gateway Timeout) HTTP status response code errors when using these services in the East US region. The impact was limited to two Points of Presence (PoPs) in the East US area and occurred for cache-miss traffic.
This issue is now mitigated. An update with more information will be provided shortly.
What happened?
Between 03:00 UTC on 19 August 2025 and 10:00 UTC on 9 September 2025, customers using the Azure App Service in Public Azure may have experienced intermittent service disruptions due to a platform issue. This resulted in intermittent 502 or 503 HTTP status codes or elevated latency affecting application workloads.
What went wrong, and why?
A recent Azure App Service release introduced a bug in an application responsible for customer app startup. This bug prevented automatic retries for certain requests needed to retrieve site startup content. The issue remained undetected until an Azure OS rollout caused some infrastructure VMs to be only partially provisioned. As a result, the application failed to retry startup requests directed to these affected VMs. Other applications successfully retried requests automatically, avoiding impact, which contributed to a delay in identifying the cause.
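The function and endpoint below are hypothetical and do not represent App Service's internal implementation; they only sketch the general pattern described above, in which a fetch of site startup content is retried automatically so that a single unresponsive or partially provisioned VM does not fail the app's startup.

# Illustrative only: a generic "fetch startup content with automatic retries" pattern.
import time
import requests

def fetch_startup_content(url, max_attempts=4, base_delay=1.0):
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.get(url, timeout=15)
            resp.raise_for_status()
            return resp.content              # startup content retrieved successfully
        except requests.RequestException as exc:
            last_error = exc                 # e.g. an unresponsive, partially provisioned VM
            time.sleep(base_delay * attempt) # back off, then try again
    raise RuntimeError("startup content could not be retrieved") from last_error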
How did we respond?
What happens next?
What happened?
Between 03:00 UTC on 19 August 2025 and 10:00 UTC on 9 September 2025, customers using Azure App Service (Windows) in Public Azure may have experienced intermittent service disruptions due to a platform issue. This resulted in occasional 502 or 503 HTTP status codes or elevated latency affecting application workloads.
This issue is now mitigated. An update with more information will be provided shortly.
What happened?
Between 03:47 UTC and 04:50 UTC on 12 September 2025, a platform issue resulted in an impact to the Azure Monitor service. Impacted customers experienced delays in viewing the most recent platform metrics, delayed or missed alert activations, and incorrect auto-scale rule evaluations based on impacted metrics.
What do we know so far?
We determined that a backend processing delay in the Azure Monitor pipeline caused a lag in metric ingestion and alert evaluation. The delay was triggered by multiple service nodes entering a degraded state, which resulted in publication delays for the subset of metrics processed by the impacted nodes and, in turn, triggered an automated pause in alert evaluation to prevent false alerts.
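For illustration only, the sketch below is not Azure Monitor's internal logic; it assumes a simple list of timestamped metric values and shows the general idea behind pausing evaluation when recent data may be incomplete: only data points older than an assumed ingestion-delay allowance are considered, so lagging ingestion does not produce false alerts.

# Minimal sketch (not Azure Monitor's internal logic): evaluate an alert condition
# only over data points old enough to be considered complete.
from datetime import datetime, timedelta, timezone

INGESTION_DELAY_ALLOWANCE = timedelta(minutes=10)   # assumed tolerance, tune per metric

def evaluate_alert(points, threshold, window=timedelta(minutes=5)):
    """points: list of (timestamp, value) tuples in UTC; returns True if the alert should fire."""
    now = datetime.now(timezone.utc)
    cutoff = now - INGESTION_DELAY_ALLOWANCE        # ignore the most recent, possibly lagging data
    window_start = cutoff - window
    recent = [v for ts, v in points if window_start <= ts <= cutoff]
    if not recent:
        return False                                # no complete data: pause evaluation rather than alert
    return max(recent) > threshold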
How did we respond?
What happens next?
Impact Statement: Starting at 03:47 UTC on 12 September 2025, customers using Azure Monitor may have experienced a delay in seeing the most recent data for Azure Monitor platform metrics, and may also have observed delayed or missed alert activations and incorrect auto-scale rule evaluations on impacted metrics as a result of this delay.
This impact is now mitigated; more information will be provided shortly.
Impact Statement: Starting at 03:47 UTC on 12 September 2025, customers using Azure Monitor may experience a delay in seeing the most recent data for Azure Monitor platform metrics, and may also observe delayed or missed alert activations and incorrect auto-scale rule evaluations on impacted metrics as a result of this delay.
Current Status: We detected this issue via our automated service monitoring upon identifying an outage impacting Azure Monitor metrics queries. We are aware of this issue and are actively investigating the underlying cause. The next update will be provided within 60 minutes, or as events warrant.