

Embedding User-Defined Indexes in Apache Parquet Files

Posted on: Mon 14 July 2025 by Qi Zhu, Jigao Luo, and Andrew Lamb

It’s a common misconception that Apache Parquet files are limited to basic Min/Max/Null Count statistics and Bloom filters, and that adding more advanced indexes requires changing the specification or creating a new file format. In fact, footer metadata and offset-based addressing already provide everything needed to embed user-defined index structures within Parquet files w
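To make the idea of footer metadata plus offset-based addressing concrete, here is a minimal sketch in plain Python. It is not the authors' implementation and it does not produce a valid Parquet file: it borrows only the `PAR1` magic bytes and the trailing 4-byte little-endian footer length from the real layout, and uses JSON where Parquet uses a Thrift-encoded footer. The function names and the metadata keys `my_index_offset` and `my_index_length` are hypothetical, chosen for illustration.

```python
import io
import json
import struct

MAGIC = b"PAR1"  # real Parquet files begin and end with these four bytes


def write_toy_file(data_pages: bytes, index_blob: bytes) -> bytes:
    """Lay out: magic | data | custom index | footer | footer length | magic."""
    buf = io.BytesIO()
    buf.write(MAGIC)
    buf.write(data_pages)

    # Remember where the custom index starts so the footer can point at it.
    index_offset = buf.tell()
    buf.write(index_blob)

    # Assumption: JSON stands in for Parquet's Thrift-encoded footer.
    # The key names below are hypothetical.
    footer = json.dumps({
        "key_value_metadata": {
            "my_index_offset": str(index_offset),
            "my_index_length": str(len(index_blob)),
        }
    }).encode("utf-8")
    buf.write(footer)
    buf.write(struct.pack("<I", len(footer)))  # 4-byte little-endian length
    buf.write(MAGIC)
    return buf.getvalue()


def read_index(file_bytes: bytes) -> bytes:
    """Find the custom index by reading the footer first, then seeking."""
    if file_bytes[:4] != MAGIC or file_bytes[-4:] != MAGIC:
        raise ValueError("missing magic bytes")

    # The footer length sits just before the trailing magic.
    (footer_len,) = struct.unpack("<I", file_bytes[-8:-4])
    footer_start = len(file_bytes) - 8 - footer_len
    footer = json.loads(file_bytes[footer_start:footer_start + footer_len])

    kv = footer["key_value_metadata"]
    offset = int(kv["my_index_offset"])
    length = int(kv["my_index_length"])
    return file_bytes[offset:offset + length]
```

A reader that knows the two keys can jump straight to the index bytes, for example `read_index(write_toy_file(b"row-group-bytes", b"distinct:a,b,c"))`. In this toy layout the index is opaque bytes between the data and the footer, so the footer is the only place that needs to record where it lives.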
