Apache DataFusion Blog

Articles by Xiangpeng Hao

Efficient Filter Pushdown in Parquet

Editor's Note: This blog post was first published on Xiangpeng Hao's blog. Thanks to InfluxData for sponsoring this work as part of his PhD funding.


In the previous post …

Parquet Pruning in DataFusion: Read Only What Matters

Editor's Note: This blog post was first published on Xiangpeng Hao's blog. Thanks to InfluxData for sponsoring this work as part of his PhD funding.


Apache Parquet has become the industry standard for storing columnar data, and reading Parquet efficiently -- especially from remote storage -- is crucial for query performance.

Apache DataFusion …