Cloudflare
15 min read

Cloudflare outage on February 20, 2026

Summary

On February 20, 2026, Cloudflare experienced a significant outage affecting customers of its Bring Your Own IP (BYOIP) service. Approximately 1,100 IP prefixes were withdrawn from Border Gateway Protocol (BGP) advertisement, leaving the services behind them unreachable for many users. The root cause was traced to a bug in the Addressing API, which mishandled a prefix-deletion request and triggered unintended mass withdrawals. Cloudflare's engineers restored service by reverting the changes and guiding customers to re-advertise their prefixes. The incident highlighted the need for better testing and operational safeguards, particularly in the context of the ongoing Code Orange: Fail Small initiative, which aims to make Cloudflare's network operations more resilient.
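The idea of failing small on prefix deletions can be illustrated with a minimal sketch. It is not Cloudflare's Addressing API: the summary does not say how the request was mishandled, and every name here (`plan_withdrawal`, `BlastRadiusExceeded`, `max_fraction`) is hypothetical. It only shows one generic way a deletion path could refuse a request that would withdraw an unexpectedly large share of advertised prefixes.

```python
from __future__ import annotations

from dataclasses import dataclass


class BlastRadiusExceeded(Exception):
    """Raised when one request would withdraw too many prefixes at once."""


@dataclass(frozen=True)
class WithdrawalPlan:
    to_withdraw: frozenset[str]
    remaining: frozenset[str]


def plan_withdrawal(advertised, requested, max_fraction=0.05):
    """Validate a deletion request before any prefix is withdrawn.

    `max_fraction` is an assumed safety threshold, not a documented value.
    """
    advertised = frozenset(advertised)
    requested = frozenset(requested)

    # An empty request is rejected outright instead of being interpreted
    # as "no filter", which could otherwise match every prefix.
    if not requested:
        raise ValueError("empty deletion request")

    # Only prefixes that are currently advertised can be withdrawn.
    unknown = requested - advertised
    if unknown:
        raise ValueError(f"not advertised: {sorted(unknown)}")

    # Cap the share of prefixes a single request may touch.
    limit = max(1, int(len(advertised) * max_fraction))
    if len(requested) > limit:
        raise BlastRadiusExceeded(
            f"{len(requested)} prefixes requested, limit is {limit}"
        )

    return WithdrawalPlan(requested, advertised - requested)
```

With 100 advertised prefixes and the default threshold, a request for two prefixes is planned normally, while a request for fifty is rejected before anything is withdrawn.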

Key Learnings

  • Understanding the critical role of BGP in managing IP address advertisements and the potential impact of misconfigurations.
  • The importance of robust testing environments that accurately reflect production scenarios to catch bugs before deployment.
  • The necessity of having automated rollback mechanisms and clear separation between operational and configured states to facilitate quick recovery from incidents.
  • Recognizing the implications of manual processes in automated systems and the risks they pose to production stability.
  • The value of clear communication and guidance for customers during service outages to mitigate impact and facilitate recovery.
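The separation between configured and operational state, together with rollback, can be sketched as follows. This is an illustrative model under stated assumptions, not a description of Cloudflare's systems: `PrefixConfig` and `reconcile` are hypothetical names, and the prefixes are documentation ranges.

```python
from __future__ import annotations


class PrefixConfig:
    """Versioned record of which prefixes *should* be advertised."""

    def __init__(self, prefixes):
        self._history = [frozenset(prefixes)]

    @property
    def current(self) -> frozenset[str]:
        return self._history[-1]

    def apply(self, prefixes) -> None:
        # Every change is stored as a new version, never edited in place.
        self._history.append(frozenset(prefixes))

    def rollback(self) -> frozenset[str]:
        # Drop the latest version, but always keep the initial one.
        if len(self._history) > 1:
            self._history.pop()
        return self.current


def reconcile(configured, operational) -> dict[str, list[str]]:
    """Compute the actions that bring the network back in line with config."""
    configured, operational = set(configured), set(operational)
    return {
        "advertise": sorted(configured - operational),
        "withdraw": sorted(operational - configured),
    }


config = PrefixConfig({"192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"})
config.apply(set())      # a faulty change removes every prefix
operational = set()      # the network has already withdrawn them
config.rollback()        # restore the last known-good configuration
actions = reconcile(config.current, operational)
```

Because the configured state is kept apart from what the network is currently doing, recovery reduces to rolling back one version and replaying the difference: here `actions["advertise"]` lists all three prefixes and `actions["withdraw"]` is empty.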

Who Should Read This

Senior Network Engineers and Cloud Architects focusing on incident management and network reliability in cloud services.

Test Your Knowledge

  • What specific changes were made to the BYOIP service that led to the outage, and how could they have been avoided?
  • How does the Addressing API function, and what improvements are being proposed to enhance its reliability?
  • What are the implications of BGP Path Hunting for end-user connections during an outage?
  • In what ways can Cloudflare's Code Orange: Fail Small initiative improve the resiliency of their network operations?
  • What lessons can be learned from the incident regarding the balance between automation and manual intervention in network management?

Read Full Article at Cloudflare