During Developer Week in April 2025, we announced the public beta of R2 Data Catalog, a fully managed Apache Iceberg catalog on top of Cloudflare R2 object storage. Today, we are building on that foundation with three launches:
Cloudflare Pipelines receives events sent via Workers or HTTP, transforms them with SQL, and ingests them into Iceberg or as files on R2
R2 Data Catalog manages the Iceberg metadata and now performs ongoing maintenance, including compaction, to improve query performance
R2 SQL is our in-house distributed SQL engine, designed to perform petabyte-scale queries over your data in R2
Together, these products make up the Cloudflare Data Platform, a complete solution for ingesting, storing, and querying analytical data tables.
Like all Cloudflare Developer Platform products, they run on our global compute infrastructure. They’re built around open standards and interoperability. That means you can bring your own Iceberg query engine (whether that's PyIceberg, DuckDB, or Spark), connect with other platforms like Databricks and Snowflake, and pay no egress fees to access your data.
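To make the "bring your own engine" point concrete, here is a minimal sketch of how a PyIceberg client might be pointed at an Iceberg REST catalog such as R2 Data Catalog. The helper names (`r2_catalog_properties`, `connect`), the catalog name, and every URI, warehouse, and token value are hypothetical placeholders; the real values come from your own account, and this is an illustration under those assumptions rather than the documented setup.

```python
from typing import Dict


def r2_catalog_properties(catalog_uri: str, warehouse: str, token: str) -> Dict[str, str]:
    """Build properties for PyIceberg's REST catalog.

    All three arguments are placeholders supplied by the caller; none of
    the values used in examples here are real endpoints or credentials.
    """
    for label, value in (("catalog_uri", catalog_uri),
                         ("warehouse", warehouse),
                         ("token", token)):
        if not value:
            raise ValueError(f"{label} must be a non-empty string")
    return {
        "type": "rest",          # use PyIceberg's REST catalog implementation
        "uri": catalog_uri,      # catalog endpoint (assumed, from your account)
        "warehouse": warehouse,  # warehouse name (assumed, from your account)
        "token": token,          # API token used as a bearer token
    }


def connect(properties: Dict[str, str], name: str = "r2_data_catalog"):
    """Open the catalog with PyIceberg.

    PyIceberg is imported lazily so the helper above can be used and
    tested without the library installed or a network connection.
    """
    from pyiceberg.catalog import load_catalog

    return load_catalog(name, **properties)
```

With real values in place, `connect(r2_catalog_properties(uri, warehouse, token))` should return a catalog object, and something like `catalog.load_table("default.events").scan().to_arrow()` would read a table into Arrow. The table name `default.events` is only an example.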
Analytical data is critical for modern companies. It lets you understand your users’ behavior and your company’s performance, and it alerts you to issues. But traditional data infrastructure is expensive and hard to operate, requiring fixed cloud infrastructure and in-house expertise. We built the Cloudflare Data Platform to be easy enough for anyone to use, with affordable, usage-based pricing.
If you're ready to get started now, follow the Data Platform tutorial for a step-by-step guide through creating a Pipeline that processes and delivers events to an R2 Data Catalog table, which can then be queried with R2 SQL. Or read on to learn about how we got here and how all of this works.
How did we end up building a Data Platform?