BGP zombies and excessive path hunting
Summary
The article examines BGP zombies: routes that persist in the Default-Free Zone (DFZ) after they have been withdrawn, causing operational problems for network operators. It explains how BGP path hunting contributes to zombie formation, particularly when more-specific prefixes are withdrawn. The authors walk through the mechanics of BGP routing, including how the Minimum Route Advertisement Interval (MRAI) affects convergence time and how excessive path hunting can make zombies more likely. The article also discusses strategies for mitigating zombies, emphasizing careful route management and BGP configuration best practices.
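As a rough illustration of path hunting, the toy model below tracks one router's best path to a prefix whose origin has withdrawn it. It is a sketch under simplifying assumptions, not the article's method: the router prefers the shortest AS path, stale alternatives are invalidated one neighbor at a time, and each switch to another stale path is assumed to cost one MRAI before it is advertised onward. The AS numbers (taken from the documentation range), the neighbor names, the function names `hunt` and `hunting_delay`, and the 30-second MRAI are all illustrative choices.

```python
from typing import Dict, List, Optional, Tuple

Path = Tuple[int, ...]  # AS path as seen by the router, origin AS last


def best_path(candidates: Dict[str, Path]) -> Optional[Path]:
    """Pick the shortest AS path; ties are broken by the path value itself."""
    if not candidates:
        return None
    return min(candidates.values(), key=lambda p: (len(p), p))


def hunt(candidates: Dict[str, Path],
         withdrawal_order: List[str]) -> List[Optional[Path]]:
    """Return the successive best paths as each neighbor's withdrawal arrives."""
    remaining = dict(candidates)
    history: List[Optional[Path]] = [best_path(remaining)]
    for neighbor in withdrawal_order:
        remaining.pop(neighbor, None)
        current = best_path(remaining)
        if current != history[-1]:
            history.append(current)
    return history


def hunting_delay(history: List[Optional[Path]],
                  mrai_seconds: float = 30.0) -> float:
    """Toy estimate: every switch to another (stale) path waits one MRAI."""
    stale_switches = sum(1 for p in history[1:] if p is not None)
    return stale_switches * mrai_seconds


# Hypothetical neighbors, all still advertising paths to the withdrawn prefix.
candidates = {
    "peer_a": (64500, 64496),
    "peer_b": (64501, 64500, 64496),
    "peer_c": (64502, 64501, 64500, 64496),
}

# Withdrawals arrive shortest path first: the router hunts through every stale path.
history = hunt(candidates, ["peer_a", "peer_b", "peer_c"])
print(len(history), hunting_delay(history))  # 4 60.0
```

In this model the arrival order matters: if the withdrawals arrive longest path first (`["peer_c", "peer_b", "peer_a"]`), the router never selects a stale alternative and the estimated hunting delay is zero.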
Key Learnings
- BGP zombies can disrupt network traffic by causing routes to become stuck, leading to inefficient routing and potential packet loss.
- Path hunting is a critical process in BGP that can lead to prolonged convergence times, particularly when more-specific prefixes are withdrawn.
- The Minimum Route Advertisement Interval (MRAI) plays a significant role in the timing of BGP updates and can contribute to the persistence of BGP zombies.
- Implementing a multi-step draining process for prefix withdrawals can help prevent BGP zombies from forming and improve overall network stability.
- Understanding the interactions between different Autonomous Systems (ASes) is crucial for diagnosing and mitigating BGP-related issues.
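To make the idea of a stuck route concrete, one simple check is to compare the prefixes an operator has withdrawn with what a vantage point still reports after a grace period. The sketch below is a minimal illustration under stated assumptions: the input shapes (a map of withdrawal timestamps and a list of `(prefix, seen_at)` observations, for example exported from a route collector), the function name `find_zombie_candidates`, and the one-hour grace period are hypothetical and do not come from the article. The prefixes are documentation ranges.

```python
import ipaddress
from typing import Dict, Iterable, List, Tuple


def find_zombie_candidates(
    withdrawn_at: Dict[str, float],
    observations: Iterable[Tuple[str, float]],
    grace_seconds: float,
) -> List[str]:
    """Return withdrawn prefixes still seen more than grace_seconds after withdrawal."""
    # Parse prefixes so that equivalent spellings compare equal.
    withdrawn = {ipaddress.ip_network(p): t for p, t in withdrawn_at.items()}
    zombies = set()
    for prefix, seen_at in observations:
        net = ipaddress.ip_network(prefix)
        withdrawn_time = withdrawn.get(net)
        if withdrawn_time is not None and seen_at - withdrawn_time > grace_seconds:
            zombies.add(str(net))
    return sorted(zombies)


# Hypothetical data: timestamps are seconds on an arbitrary clock.
withdrawn_at = {"192.0.2.0/24": 1000.0, "2001:db8:1::/48": 1000.0}
observations = [
    ("192.0.2.0/24", 1020.0),     # seen 20 s after withdrawal: still converging
    ("2001:db8:1::/48", 9000.0),  # seen long after withdrawal: zombie candidate
    ("198.51.100.0/24", 9000.0),  # never withdrawn: ignored
]

print(find_zombie_candidates(withdrawn_at, observations, grace_seconds=3600.0))
# ['2001:db8:1::/48']
```

The grace period is there so that ordinary convergence, including path hunting, is not mistaken for a zombie; a suitable value depends on the network and is left as a parameter.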
Who Should Read This
Senior Network Engineers managing BGP configurations and troubleshooting routing issues in large-scale networks
Test Your Knowledge
What are the primary causes of BGP zombies, and how can they be identified in a network?
How does the Minimum Route Advertisement Interval (MRAI) affect BGP convergence times and the likelihood of path hunting?
What strategies can be employed to minimize the operational impact of BGP zombies on network traffic?
In what scenarios might excessive path hunting lead to a BGP zombie outbreak, and how can network operators prepare for such events?
How do the behaviors of different Autonomous Systems (AS) influence the formation and duration of BGP zombies?