Cloudflare
7 min read

Cloudflare outage on December 5, 2025

Summary

On December 5, 2025, a configuration change to Cloudflare's Web Application Firewall (WAF) caused an outage across a portion of its network that lasted approximately 25 minutes. The change was part of an effort to increase the WAF's buffer size in response to a critical vulnerability in React Server Components. It exposed a bug in the rules module of the FL1 proxy, which began returning HTTP 500 errors. The article walks through the sequence of events and the technical failures involved, and describes the planned improvements to prevent a recurrence, including enhanced rollouts and fail-open error handling.
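To make the "enhanced rollouts" idea concrete, here is a minimal sketch of a staged rollout gated on a health check. It is not Cloudflare's deployment system: the function `staged_rollout`, its callbacks, the stage fractions, and the 1% error threshold are all hypothetical names and values chosen for illustration.

```python
from typing import Callable, Sequence


def staged_rollout(
    stages: Sequence[float],
    apply_change: Callable[[float], None],
    error_rate: Callable[[], float],
    rollback: Callable[[], None],
    max_error_rate: float = 0.01,  # assumed threshold, purely illustrative
) -> bool:
    """Apply a change one stage at a time; roll back if health degrades.

    `stages` holds the fraction of the fleet to cover at each step,
    for example [0.01, 0.1, 1.0]. Returns True if every stage passed.
    """
    for fraction in stages:
        apply_change(fraction)
        # Validate health before widening the blast radius.
        if error_rate() > max_error_rate:
            rollback()
            return False
    return True
```

The point of the design is that a bad change is stopped at the first unhealthy stage rather than reaching the whole fleet at once; a real system would also wait for metrics to settle between stages.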

Key Learnings

  1. Configuration changes in critical systems must be carefully managed to avoid widespread outages.
  2. The importance of gradual rollouts and health validation in deployment processes to mitigate risks.
  3. Understanding the implications of using 'execute' actions in rulesets and the potential for runtime errors in loosely typed languages.
  4. The necessity of robust error handling mechanisms, such as fail-open strategies, to maintain service availability during failures.
  5. The role of internal testing tools and their compatibility with production configurations in preventing incidents.
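The third and fourth learnings can be illustrated with a small sketch. It is not the FL1 proxy's code (the summary attributes the failure to a Lua exception); the types `RuleResult` and `ExecuteInfo`, and the functions `nested_results` and `evaluate_fail_open`, are hypothetical. The sketch assumes a rule result whose `execute` field is only populated when the nested ruleset actually ran, and shows two defenses: an explicit check for the missing field, and a wrapper that falls back to a default verdict on unexpected errors.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ExecuteInfo:
    # Hypothetical: results produced by evaluating a nested ruleset.
    results: list[str]


@dataclass
class RuleResult:
    action: str
    # Assumed to be set only when the nested ruleset was actually evaluated.
    execute: Optional[ExecuteInfo] = None


def nested_results(result: RuleResult) -> list[str]:
    """Return nested results, treating a skipped 'execute' rule as empty."""
    if result.action == "execute" and result.execute is not None:
        return result.execute.results
    return []


def evaluate_fail_open(evaluate: Callable[[], str], default: str = "allow") -> str:
    """Run an evaluation step; on an unexpected error, return `default`."""
    try:
        return evaluate()
    except Exception:
        # A real system would also log and alert here.
        return default
```

A type checker flags an unguarded `result.execute.results` because the field is `Optional`, which is the kind of error a loosely typed language only surfaces at runtime. Failing open is a trade-off rather than a universal default: for a WAF it keeps traffic flowing but lets requests through unfiltered while the error persists.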

Who Should Read This

Senior Cloud Engineers managing high-availability web services and incident response strategies.

Test Your Knowledge

What are the potential risks associated with increasing buffer sizes in WAF configurations?

How can gradual deployment strategies mitigate the impact of configuration changes in large-scale systems?

What lessons can be learned from the Lua exception encountered during the outage, and how could strong typing have prevented it?

In what ways can fail-open error handling improve service resilience during configuration failures?

What specific changes are being implemented to enhance the robustness of Cloudflare's network following the incident?

Read Full Article at Cloudflare