Incident Report: Paragon Primary Database Disruption
On March 2, 2025 at 22:08 UTC, our primary database experienced a sharp increase in connection requests that exceeded its configured connection limit and left it temporarily unresponsive. This caused downtime for several downstream services that rely on the database, including our Dashboard web app, the Connect Portal, and our Billing and Account reconciliation systems. The incident did not affect any inbound connections or non-workflow-related systems.
Impact to Customers:
- Service Downtime: Customers were unable to access or perform actions within affected services for approximately 35 minutes.
- Delayed or Failed Transactions: Some database-dependent operations, such as updates or queries, may have failed or been delayed.
- Potential Data Latency: Any actions taken during the disruption may not have been immediately processed.
Resolution & Next Steps:
- Immediate Recovery: We restored service by raising the maximum number of allowed database connections and adjusting system-wide thresholds.
- Long-Term Improvements: We are increasing connection capacity, enhancing auto-scaling configurations, and refining our monitoring to detect and prevent similar issues.
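As a rough illustration of the kind of monitoring described above, the sketch below classifies connection usage against a configured limit so that an alert can fire before the limit is reached. It is not our production tooling: the names `ConnectionThresholds`, `connection_utilization`, and `classify`, and the 70% and 90% thresholds, are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionThresholds:
    # Hypothetical values; this report does not state the real limits.
    warn_ratio: float = 0.70
    critical_ratio: float = 0.90


def connection_utilization(active: int, max_connections: int) -> float:
    """Return the fraction of the configured connection limit in use."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    if active < 0:
        raise ValueError("active must be non-negative")
    return active / max_connections


def classify(
    active: int,
    max_connections: int,
    thresholds: ConnectionThresholds = ConnectionThresholds(),
) -> str:
    """Map current connection usage to 'ok', 'warn', or 'critical'."""
    ratio = connection_utilization(active, max_connections)
    if ratio >= thresholds.critical_ratio:
        return "critical"
    if ratio >= thresholds.warn_ratio:
        return "warn"
    return "ok"
```

With these assumed thresholds, `classify(480, 500)` falls in the critical band, which is the point at which an alert would ideally page someone before new connection requests start being refused.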
We sincerely apologize for the inconvenience and appreciate your patience as we work to strengthen our systems for better reliability.