Notification of Service Degradation and Intermittent Transaction Processing Issues

Incident Report for ApcoPay

Postmortem

1. Summary of the Incident

During the week of June 24, 2025, two separate incidents of intermittent service disruption were observed, both caused by database replication and synchronization issues. These incidents occurred:

  • Tuesday, June 24th between 04:44 and 05:26 CEST

The disruption was caused by unexpected growth of database logs, which interfered with the normal operation of the replication and synchronization processes. Immediate remediation was required to restore system functionality.

During the disruption, users experienced intermittent failures across most functions of the payment gateway, including transaction processing, balance checks, and merchant API responses.

2. Impact Analysis

  • Affected Services: The majority of services were intermittently affected during the disruption windows listed above.
  • Affected Users: The majority of users who were using portal functions.
  • Impact Description: Users experienced intermittent transaction declines and limited portal access.

3. Root Cause

The root cause of the incident was traced to abnormal growth in database logs, which overwhelmed the replication mechanisms and resulted in synchronization failures. The underlying log management processes failed to flag the anomaly early, allowing the issue to escalate into a full service disruption.

4. Resolution and Mitigation Steps Taken

To resolve the issue:

  • Manual intervention was performed to investigate and stabilize the affected databases.
  • Large logs were archived and then deleted to restore normal replication and synchronization.
  • Database health was monitored in real time to ensure restoration was successful and persistent.

5. Preventive Measures

To prevent recurrence:

  • Monitoring and alerting systems have been enhanced to track database log size and replication status proactively.
  • Threshold-based alerts are now in place to warn operators before database logs reach critical levels.
  • The regular log maintenance schedule has been updated to prevent unplanned log growth.
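
This report does not describe the alerting configuration itself. As a minimal sketch of how a threshold-based check on database log size might work, assuming hypothetical warning and critical limits (the names LogThresholds and classify_log_size and the example values are illustrative and not taken from the ApcoPay systems):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogThresholds:
    """Hypothetical size limits, in bytes, for a database log."""
    warn_bytes: int
    critical_bytes: int


def classify_log_size(size_bytes: int, thresholds: LogThresholds) -> str:
    """Return "ok", "warning", or "critical" for the given log size."""
    # Check the higher limit first so a critical size is never reported as a warning.
    if size_bytes >= thresholds.critical_bytes:
        return "critical"
    if size_bytes >= thresholds.warn_bytes:
        return "warning"
    return "ok"


# Illustrative values only: warn at 10 GiB, critical at 20 GiB.
EXAMPLE_THRESHOLDS = LogThresholds(warn_bytes=10 * 1024**3, critical_bytes=20 * 1024**3)
```

The warning level sits well below the critical level so that operators are alerted while there is still time to act, before replication is affected.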

6. Next Steps

  • Conduct a post-mortem analysis to assess incident response and escalation effectiveness, and to identify process improvements.
  • Implement additional monitors, checks, and notifications to track log sizes and database free space in order to mitigate similar occurrences in the future.
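
A check on log sizes and free space could look roughly like the sketch below. It uses only the Python standard library. The function names and the 20% default free-space floor are assumptions made for illustration, not details of the monitors planned here.

```python
import os
import shutil


def free_space_ratio(path: str) -> float:
    """Fraction of the filesystem containing `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def directory_size_bytes(path: str) -> int:
    """Total size in bytes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # A file may be rotated or removed while we walk; skip it.
                continue
    return total


def needs_notification(path: str, min_free_ratio: float = 0.2) -> bool:
    """True when free space drops below the (assumed) minimum ratio."""
    return free_space_ratio(path) < min_free_ratio
```

A scheduler could run directory_size_bytes on the log directory and needs_notification on the database volume at a fixed interval and forward the results to the notification channel.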

Posted Jun 27, 2025 - 15:40 CEST

Resolved

Dear Merchant,

We regret to inform you that we are currently experiencing a degradation of service, and as a result, transaction processing is intermittently affected. We understand the importance of providing you with a reliable and efficient service, and we apologize for any inconvenience this may cause.

Our technical team is working diligently to identify and resolve the root cause of this issue. We are closely monitoring the situation and will keep you updated on the progress.

During this time, you may experience delays in transaction processing, and some transactions may fail. We advise you to retry the transaction at a later time, or alternatively, contact our support team for assistance.

We appreciate your patience and understanding during this period of service disruption. We assure you that we are doing everything we can to restore normal service as quickly as possible.

If you have any questions or concerns, please do not hesitate to contact us.

Kind regards,
Apco Support
Posted Jun 24, 2025 - 04:00 CEST