Incident report: October 15, 2025 (UTC)

Summary

On October 15, Linkly experienced a prolonged outage caused by an automated deployment that introduced a configuration error in production. The issue also disabled key elements of our monitoring and alerting systems, delaying detection and response.

Root cause

A missing safeguard allowed an automated deployment to execute outside approved windows, resulting in a configuration mismatch that disrupted routing. Monitoring checks failed to detect the outage because they were scoped to internal endpoints that continued returning success.

Corrective actions

  1. Disabled automatic deployments; production now requires manual approval.
  2. Implemented external monitoring independent of our core infrastructure.
  3. Added redundant alerting channels (SMS and voice) to ensure immediate awareness of incidents.
  4. Introduced deployment locks during off-hours until 24/7 on-call coverage is established.

Next steps

We are reviewing our operational procedures and expanding incident response coverage to ensure rapid detection and recovery from any future issues.

Closing

We apologise for the disruption and thank our customers for their patience. Reliability remains our top priority, and we are committed to preventing a recurrence.

Track 1000 monthly clicks with all features included.

No credit card required