Cloudflare | 2025-11-18T00:00:00Z

Cloudflare 2025 Outage: Bot Management Feature File Failure

Cloudflare's November 18, 2025 outage was triggered by a database permissions change that made Bot Management feature-file generation include duplicate metadata, pushing the file past a runtime limit.

Incident answer

Impact: Core CDN and security services returned elevated HTTP 5xx errors, with related impact to Turnstile, Workers KV, Access, dashboard login, and other services.

Root cause: A ClickHouse metadata query began returning duplicate column data after a permissions change; the generated Bot Management feature file doubled in size and exceeded a proxy module limit.

Lesson: Generated configuration files need schema assumptions, size limits, validation, staged propagation, and fast rollback just like application code.

Quick Summary

On November 18, 2025, Cloudflare experienced a major outage affecting core network traffic. Cloudflare's postmortem says the incident was not a cyberattack. It was triggered by a database permissions change that caused duplicate entries in a Bot Management feature file.

That generated file was propagated across Cloudflare's network. Because the file exceeded the size limit built into the proxy module that reads it, parts of the request path failed and returned HTTP 5xx errors.
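
One way to keep an oversized file from breaking the request path is to validate it at load time and keep serving the last known good version when validation fails. The sketch below is illustrative only: `MAX_FEATURES`, `FeatureStore`, and the one-feature-per-line format are assumptions made for the example, not Cloudflare's actual implementation or limit.

```python
from dataclasses import dataclass, field

# Hypothetical limit, chosen for illustration; not taken from the source.
MAX_FEATURES = 200


class FeatureFileError(Exception):
    """Raised when a candidate feature file fails validation."""


def parse_features(lines, max_features=MAX_FEATURES):
    """Parse one feature name per line and enforce the entry limit."""
    features = [line.strip() for line in lines if line.strip()]
    if len(features) > max_features:
        raise FeatureFileError(
            f"{len(features)} features exceeds limit of {max_features}"
        )
    return features


@dataclass
class FeatureStore:
    """Holds the active feature set and rejects invalid updates."""

    active: list = field(default_factory=list)
    rejected_updates: int = 0

    def try_update(self, lines):
        """Swap in a new file only if it validates; otherwise keep the old one."""
        try:
            candidate = parse_features(lines)
        except FeatureFileError:
            self.rejected_updates += 1  # worth exposing as an alerting metric
            return False
        self.active = candidate
        return True
```

With this shape, an over-limit file increments `rejected_updates` and leaves `active` untouched, so the failure shows up as an alert rather than as 5xx responses.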

Why It Mattered

Cloudflare sits in front of a large slice of the internet. A failure in core proxy traffic handling can make many unrelated customer sites look broken at the same time.

The incident is also memorable because the original change was not to the proxy itself. It was a permissions and metadata-query interaction in a data system used to generate configuration.

Root Cause Pattern

The pattern was generated config with hidden assumptions. The Bot Management feature-file generator assumed a metadata query would return only one set of columns. After a ClickHouse permissions change, the query also saw underlying table metadata, creating duplicate rows and a larger file.
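
The hidden assumption can be made explicit in the generator. The following sketch simulates the metadata rows in memory rather than querying ClickHouse; the database names, table name, and column names are hypothetical. It shows how an unscoped lookup picks up duplicates once a second database becomes visible, and how scoping the lookup and checking for duplicate names catches the problem before a file is written.

```python
from collections import Counter

# Hypothetical metadata rows: (database, table, column_name, column_type).
# The second database stands in for underlying tables that become visible
# after a permissions change.
METADATA_ROWS = [
    ("default", "features", "score", "Float64"),
    ("default", "features", "ua_hash", "UInt64"),
    ("underlying", "features", "score", "Float64"),
    ("underlying", "features", "ua_hash", "UInt64"),
]


def select_columns(rows, table, database=None):
    """Return (name, type) pairs for a table, optionally scoped to one database."""
    return [
        (name, col_type)
        for db, tbl, name, col_type in rows
        if tbl == table and (database is None or db == database)
    ]


def duplicate_names(columns):
    """Return the sorted column names that appear more than once."""
    counts = Counter(name for name, _ in columns)
    return sorted(name for name, n in counts.items() if n > 1)


# Unscoped lookup: sees both databases and returns every column twice.
unscoped = select_columns(METADATA_ROWS, "features")
# Scoped lookup: the assumption "one set of columns" is now enforced.
scoped = select_columns(METADATA_ROWS, "features", database="default")
```

A generator could refuse to emit a file whenever `duplicate_names` returns a non-empty list, turning a silent doubling into an explicit build failure.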

Warning signs:

Remediation Themes

Practical lessons:

What Engineers Should Practice

When a config file changes frequently, monitor its shape as a production signal. Size, row count, schema, and parse success are all health checks.
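
These health checks can be run as a gate before a generated file is published. The sketch below assumes, purely for illustration, that the file is a JSON array of objects; `check_config_shape`, its parameters, and the thresholds are hypothetical names and values, not part of any Cloudflare tooling.

```python
import json


def check_config_shape(raw, *, max_bytes, max_rows, required_keys,
                       previous_rows=None, max_growth=1.5):
    """Return a list of problem labels; an empty list means the file looks healthy."""
    problems = []
    # Size check on the raw bytes, before any parsing.
    if len(raw) > max_bytes:
        problems.append("size")
    # Parse check: a file that cannot be parsed fails immediately.
    try:
        rows = json.loads(raw)
    except ValueError:  # covers JSON syntax errors and bad encodings
        return problems + ["parse"]
    if not isinstance(rows, list):
        return problems + ["schema"]
    # Row-count check against a hard ceiling.
    if len(rows) > max_rows:
        problems.append("row_count")
    # Schema check: every row must be an object with exactly the expected keys.
    if any(not isinstance(r, dict) or set(r) != set(required_keys) for r in rows):
        problems.append("schema")
    # Growth check: flag a sudden jump relative to the previous version.
    if previous_rows and len(rows) > previous_rows * max_growth:
        problems.append("growth")
    return problems
```

The growth check is the one most relevant to this incident: a file can stay under every absolute ceiling and still be suspicious if it doubles between two consecutive versions.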

The practical takeaway: configuration is code once production software executes it.
