Degraded Performance

Incident Report for Upstash Status

Postmortem

A routine OS-level maintenance operation applied system updates to multiple EC2 instances in our clusters across several AWS regions. These updates included changes to networking components, which inadvertently triggered restarts.

As a result, several EC2 nodes failed health checks and temporarily dropped out of the cluster, disrupting high availability and causing partial connectivity issues for some clients and operations.

We have since reproduced the issue in a controlled environment and verified the root cause. To prevent a recurrence, we are updating our node maintenance strategy to give us greater control over the timing and impact of system-level changes, and we are excluding networking components from automated upgrades.
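
As a rough illustration of what excluding networking components from automated upgrades could look like, the sketch below splits a list of pending package updates into those left to automation and those held for a controlled rollout. It is not our actual tooling: the function name `split_updates`, the `HELD_PATTERNS` list, and the package names are all hypothetical and chosen only for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical glob patterns for networking-related packages.
# These names are assumptions for illustration, not taken from the incident.
HELD_PATTERNS = ["systemd-networkd*", "netplan*", "network-manager*", "iproute2"]


def split_updates(pending, held_patterns=HELD_PATTERNS):
    """Partition pending package names into (auto, held).

    auto: packages that automation may upgrade unattended.
    held: packages matching a held pattern, reserved for a manual,
          scheduled rollout.
    """
    auto, held = [], []
    for name in pending:
        # Case-sensitive glob match against every held pattern.
        if any(fnmatchcase(name, pattern) for pattern in held_patterns):
            held.append(name)
        else:
            auto.append(name)
    return auto, held


if __name__ == "__main__":
    auto, held = split_updates(["openssl", "netplan.io", "iproute2", "curl"])
    print("auto-upgrade:", auto)
    print("held for manual rollout:", held)
```

Keeping the held list as explicit patterns makes the exclusion easy to review, and anything on it can then be upgraded node by node at a chosen time rather than across a whole cluster at once.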

Posted Jun 11, 2025 - 11:49 UTC

Resolved

This incident has been resolved.

Posted Jun 11, 2025 - 10:02 UTC

Monitoring

A fix has been implemented and we are monitoring the results.

Posted Jun 11, 2025 - 09:20 UTC

Update

We are continuing to investigate this issue.

Posted Jun 11, 2025 - 08:24 UTC

Investigating

We are currently investigating this issue.

Posted Jun 11, 2025 - 06:51 UTC

This incident affected: Redis Global (N. Virginia, USA (us-east-1), N. California, USA (us-west-1), Oregon, USA (us-west-2), Frankfurt, Germany (eu-central-1), Ireland (eu-west-1), Singapore (ap-southeast-1), Sydney, Australia (ap-southeast-2), Mumbai, India (ap-south-1), Tokyo, Japan (ap-northeast-1), São Paulo, Brazil (sa-east-1), Ohio, USA (us-east-2), London, UK (eu-west-2)).