Apache DataFusion Blog

Articles by alamb, Dandandan, tustvold

Aggregating Millions of Groups Fast in Apache Arrow DataFusion 28.0.0

Andrew Lamb, Daniël Heres, Raphael Taylor-Davies

Note: this article was originally published on the InfluxData Blog

TLDR

Grouped aggregations are a core part of any analytic tool, creating understandable summaries of huge data volumes. Apache Arrow DataFusion’s parallel aggregation capability …