[Quickbase US Status] Notice: Service Not Available (Unplanned) - Down Time
06/08/2023, 02:35pm EDT
Starting around 2:35 PM Eastern US Time, we are seeing degraded performance on the platform. We are investigating this issue and will provide an update as soon as we have one.
06/08/2023, 03:02pm EDT
We continue to investigate this issue. We will provide another update in the next 30 minutes.
06/08/2023, 03:29pm EDT
We are still investigating this issue with the highest urgency. We will provide another update in 30 minutes.
06/08/2023, 03:35pm EDT
Starting around 2:35 PM Eastern US Time, we are seeing degraded performance on the platform. We are investigating this issue and will provide an update within 30 minutes.
06/08/2023, 03:54pm EDT
We are seeing performance improvements on the Quickbase platform. We continue to monitor this issue and will provide another update in 30 minutes.
06/08/2023, 03:55pm EDT
As of 3:55 PM Eastern US Time, the Quickbase platform has returned to normal performance levels. (Note: you are likely receiving this e-mail close to 4:30 PM Eastern US Time.)
This incident is closed.
06/09/2023, 12:30am EDT
All times shown are Eastern US Time.
Between 2:35 PM and 3:55 PM, performance of the Quickbase US platform was degraded, with many requests timing out or returning errors. Between 2:35 PM and 3:20 PM, some customers may have seen normal performance while others experienced degradation. After 3:20 PM, most customers likely had a poor user experience. This incident was resolved by 3:55 PM.
The preliminary root cause is that a platform service that checks which platform features a user is eligible to access received a large influx of requests, which caused requests to queue. At that point in the incident, performance was slow, but most requests ultimately completed. The queuing in turn caused a platform service that routes requests to eventually exhaust its available resources, at which point platform performance degraded further. We are still evaluating the cause of the large initial influx of requests that triggered the problem, but we believe it originated with unexpected behavior in how pipelines interacted with the feature status service.
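For readers who want a concrete picture of this failure mode, the following is a minimal, hypothetical asyncio sketch. It is not Quickbase's architecture or code; all service names, pool sizes, and latencies are made up. A routing layer with a fixed worker pool calls a slower downstream "feature status" service; when an influx of requests queues at the downstream service, the router's workers sit blocked behind that queue and are exhausted, so new requests fail. A per-call timeout bounds how long a worker can be held, which limits the cascade.

import asyncio

# All names and numbers below are illustrative only.
ROUTER_WORKERS = 10        # routing service's resource pool
FEATURE_CONCURRENCY = 2    # downstream feature-status service capacity
FEATURE_LATENCY_S = 0.08   # time per feature-status check
CALL_TIMEOUT_S = 0.2       # router-side timeout on the downstream call
ARRIVALS = 100             # requests in the simulated influx
ARRIVAL_GAP_S = 0.01       # one request every 10 ms


async def feature_status_check(feature_svc: asyncio.Semaphore) -> None:
    """Downstream check; excess requests queue on the semaphore."""
    async with feature_svc:
        await asyncio.sleep(FEATURE_LATENCY_S)


async def route_request(router_pool: asyncio.Semaphore,
                        feature_svc: asyncio.Semaphore,
                        use_timeout: bool) -> str:
    """The router holds one worker for the whole downstream call."""
    if router_pool.locked():
        return "rejected"              # router resources exhausted
    async with router_pool:
        try:
            check = feature_status_check(feature_svc)
            if use_timeout:
                await asyncio.wait_for(check, CALL_TIMEOUT_S)
            else:
                await check
            return "ok"
        except asyncio.TimeoutError:
            return "timed out"         # check shed; worker freed quickly


async def simulate(use_timeout: bool) -> None:
    router_pool = asyncio.Semaphore(ROUTER_WORKERS)
    feature_svc = asyncio.Semaphore(FEATURE_CONCURRENCY)
    tasks = []
    for _ in range(ARRIVALS):
        tasks.append(asyncio.create_task(
            route_request(router_pool, feature_svc, use_timeout)))
        await asyncio.sleep(ARRIVAL_GAP_S)   # steady influx of requests
    results = await asyncio.gather(*tasks)
    counts = {s: results.count(s) for s in ("ok", "timed out", "rejected")}
    print(f"timeout={use_timeout}: {counts}")


if __name__ == "__main__":
    asyncio.run(simulate(use_timeout=False))  # router wedges behind the queue
    asyncio.run(simulate(use_timeout=True))   # bounded hold time limits cascade

In this toy setup, the run without a timeout rejects most requests because every router worker is stuck waiting on the downstream queue; with the timeout, workers are recycled within a bounded time, so fewer requests are rejected and the excess feature checks are shed instead.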
We have implemented two improvements to our platform monitoring that will allow us to identify the affected area of the platform more quickly if an issue like this recurs. We are also researching ways to improve the scalability of the feature status service and the routing service.
We will continue to update this root cause as we learn more from our investigation of this incident.