Apache DataFusion Blog

Articles by alamb, akurmustafa

Optimizing SQL (and DataFrames) in DataFusion, Part 1: Query Optimization Overview

Note: this blog post was originally published on the InfluxData blog.

Introduction

Sometimes Query Optimizers are seen as a sort of black magic, “the most challenging problem in computer science,” according to Father Pavlo, or as some behind-the-scenes player. We believe this perception exists because:

  1. One must implement the rest of a …

Optimizing SQL (and DataFrames) in DataFusion, Part 2: Optimizers in Apache DataFusion

Note: this blog post was originally published on the InfluxData blog.

In the first part of this post, we discussed what a Query Optimizer is, what role it plays, and how industrial optimizers are organized. In this second post, we describe various optimizations that are found in Apache DataFusion and …