Apache DataFusion
DataFusion is a very fast, extensible query engine for building high-quality data-centric systems in Rust, using the Apache Arrow in-memory format.
DataFusion offers SQL and DataFrame APIs, excellent performance, built-in support for CSV, Parquet, JSON, and Avro, extensive customization, and a great community.
To get started, see:
- The example usage section of the user guide and the datafusion-examples directory.
- The library user guide for examples of using DataFusion’s extension APIs.
- The developer’s guide for contributing, and the communication page for getting in touch with us.
- Introduction
- Using the SQL API
- Working with Exprs
- Using the DataFrame API
- Write DataFrame to Files
- Building Logical Plans
- Catalogs, Schemas, and Tables
- Adding User Defined Functions: Scalar/Window/Aggregate/Table Functions
- Custom Table Provider
- Extending DataFusion’s operators: custom LogicalPlan and Execution Plans
- Profiling Cookbook
- DataFusion Query Optimizer