Resolved
The issue looked like it stemmed from a deployment of a networking component. After an initial spike of error responses, this component started to self heal. We saw a reduction in 5xx response count and latencies. However, the problem wasn't fully resolved - 5xxs were still elevated and there was still a moderate elevation for latencies.
The operations team identified the root cause and deployed a fix to this component. The fix returned the dataplane back to operational.
Posted .
Created
We are seeing a spike in 5xx responses for Marqo index API requests. Initial estimates are that less than 10% of requests are affected.
Posted .
Resolved