Elevated errors in API

Incident Report for Paragon

Postmortem

Incident Report: Paragon Primary Database Disruption

On March 2, 2025 at 22:08 UTC, our primary database experienced a sharp increase in connection requests that exceeded its configured connection limit and left it temporarily unresponsive. This caused downtime for several downstream services that rely on the database, including our Dashboard web app, the Connect Portal, and our Billing and Account reconciliation systems. The incident did not impact any inbound connections or non-workflow-related systems.

Impact to Customers:

  • Service Downtime: Customers were unable to access or perform actions within affected services for approximately 35 minutes.
  • Delayed or Failed Transactions: Some database-dependent operations, such as updates or queries, may have failed or been delayed.
  • Potential Data Latency: Any actions taken during the disruption may not have been immediately processed.

Resolution & Next Steps:

  • Immediate Recovery: We restored service by increasing our allowable database connections and adjusting system-wide thresholds.
  • Long-Term Improvements: We are increasing connection capacity, enhancing auto-scaling configurations, and refining our monitoring to detect and prevent similar issues.
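
As an illustration of the kind of safeguards described above, and not a description of Paragon's actual implementation, the sketch below shows two common ideas: classifying connection usage against a configured limit so that monitoring can alert before the limit is reached, and capping how many connections a single client process may hold at once. The names `utilization_level` and `ConnectionGate`, and the 80% and 95% thresholds, are hypothetical and chosen only for this example.

```python
import threading
from contextlib import contextmanager


def utilization_level(active, max_connections, warn_at=0.80, critical_at=0.95):
    """Classify connection usage as 'ok', 'warning', or 'critical'.

    The thresholds are illustrative assumptions, not real production settings.
    """
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    ratio = active / max_connections
    if ratio >= critical_at:
        return "critical"
    if ratio >= warn_at:
        return "warning"
    return "ok"


class ConnectionGate:
    """Caps concurrent connections held by one client process (hypothetical helper)."""

    def __init__(self, limit):
        self.limit = limit
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.active = 0

    @contextmanager
    def slot(self, timeout=1.0):
        # Wait briefly for a free slot instead of opening one more connection.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no free connection slot")
        with self._lock:
            self.active += 1
        try:
            yield
        finally:
            # Always hand the slot back, even if the caller's work failed.
            with self._lock:
                self.active -= 1
            self._slots.release()
```

In this sketch, a caller would wrap each database operation in `with gate.slot():`, so a burst of requests waits or fails fast on the client side rather than pushing the database past its connection limit. A monitoring job could feed the current connection count into `utilization_level` to raise a warning well before the limit is hit.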

We sincerely apologize for the inconvenience and appreciate your patience as we work to strengthen our systems for better reliability.

Posted Mar 03, 2025 - 17:16 PST

Resolved

This incident has been resolved.
Posted Mar 03, 2025 - 14:48 PST

Investigating

We are currently investigating this issue.
Posted Mar 03, 2025 - 14:21 PST

This incident affected: Workflows.