Aggregating Millions of Groups Fast in Apache Arrow DataFusion
Andrew Lamb, Daniël Heres, Raphael Taylor-Davies
Note: this article was originally published on the InfluxData Blog.
TLDR
Grouped aggregations are a core part of any analytic tool, creating understandable summaries of huge data volumes. Apache Arrow DataFusion’s parallel aggregation capability …