Summary
On Wednesday, January 27th, Redtail experienced an issue that impacted users' access to most of our products. One of our central database servers experienced issues. Once the database issue was resolved, simultaneous login attempts from users across multiple time zones added to the load, resulting in a longer recovery time.
What Happened?
On the evening of January 26th, our infrastructure team migrated some data tables from a legacy database server to its AWS replacement. In the hours after the migration, the RDS instance began to experience performance issues. A secondary issue, not related to the data migration, occurred when the user table of the legacy database server became inaccessible. As a result, connections could not authenticate. Because they could not authenticate, connections built up, and once access to the table was restored they hit our services all at the same time. After access to the table was restored, our engineers worked to add infrastructure to mitigate the impact of the connection load flooding our servers.
Another complicating issue was the weather in the Sacramento area. The inclement weather resulted in multiple power outages through the night and led to several technicians being without power. This directly impacted response times and monitoring solutions, delaying the initial response during triage of the incident.
Root Cause(s):
The root cause of the incident was a data table becoming inaccessible within a central database.
We at Redtail would like to extend to you our humble and sincere apology for any negative impact the issue(s) outlined above had on you and your business. We understand how critical it is that we deliver maximum uptime to support your daily operations, and we will increase our efforts to meet and exceed your availability expectations. Please rest assured that we will do everything we can to learn from this event and use it to improve all of our services.