Degraded performance across endpoints / models

Resolved · Degraded performance

🎉 The issue has been resolved and we are back to normal.

The incident is now fully resolved, and we won't need to schedule a maintenance window for the DB scale-up.

Impact of downtime:
* A period of 2 hours with higher latencies.
* An average of 10% of requests timed out.
* /classify was hit hardest, with 80% of requests failing.

Tue, Apr 16, 2024, 07:22 PM

Apr 16, 2024, 04:25 PM to 07:22 PM

Affected components

Coral

Playground

Updates

Monitoring

👀 We are monitoring to make sure the incident has been fully resolved.

A fix has been implemented. Error rates and response latency have been back to normal since 2:10 PM.

Tue, Apr 16, 2024, 06:53 PM (29 minutes earlier)

Identified

🛠️ We have identified the root cause of the incident and are working diligently to fix it.

We have identified a database issue related to increased pressure on the system. A subset of requests experienced high latency starting at 12:05 PM. We have identified the root cause and are deploying mitigations until we can schedule a larger maintenance window for the fix.

Tue, Apr 16, 2024, 04:25 PM (2 hours earlier)