.. Licensed to the Apache Software Foundation (ASF) under one
.. or more contributor license agreements. See the NOTICE file
.. distributed with this work for additional information
.. regarding copyright ownership. The ASF licenses this file
.. to you under the Apache License, Version 2.0 (the
.. "License"); you may not use this file except in compliance
.. with the License. You may obtain a copy of the License at
.. http://www.apache.org/licenses/LICENSE-2.0
.. Unless required by applicable law or agreed to in writing,
.. software distributed under the License is distributed on an
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
.. KIND, either express or implied. See the License for the
.. specific language governing permissions and limitations
.. under the License.
.. image:: _static/images/original.svg
   :alt: DataFusion Logo
   :class: light-logo

.. image:: _static/images/original_dark.svg
   :alt: DataFusion Logo
   :class: dark-logo

=================
Apache DataFusion
=================
DataFusion is an extensible query engine written in `Rust `_ that
uses `Apache Arrow `_ as its in-memory format.
The documentation on this site is for the `core DataFusion project `_, which contains
libraries and binaries for developers building fast and feature-rich database and analytic systems,
customized to particular workloads. See `use cases `_ for examples.
The following related subprojects target end users and have separate documentation.
- `DataFusion Python `_ offers a Python interface for SQL and DataFrame
  queries.
- `DataFusion Comet `_ is an accelerator for Apache Spark based on
  DataFusion.
- `DataFusion Ballista `_ is a distributed processing extension for DataFusion.
"Out of the box," DataFusion offers `SQL `_
and `DataFrame `_ APIs,
excellent `performance `_, built-in support for CSV, Parquet, JSON, and Avro,
extensive customization, and a great community.
`Python Bindings `_ are also available.
`Ballista `_ is an Apache DataFusion extension that enables parallelized execution of workloads across multiple nodes in a distributed environment.
DataFusion features a full query planner, a columnar, streaming, multi-threaded,
vectorized execution engine, and partitioned data sources. You can
customize DataFusion at almost every point, including data sources,
query languages, functions, and custom operators.
See the `Architecture `_ section for more details.
To get started, see:
* The `example usage`_ section of the user guide and the `datafusion-examples`_ directory.
* The `library user guide`_ for examples of using DataFusion's extension APIs.
* The `developer’s guide`_ for contributing and `communication`_ for getting in touch with us.
.. _example usage: user-guide/example-usage.html
.. _datafusion-examples: https://github.com/apache/datafusion/tree/main/datafusion-examples
.. _developer’s guide: contributor-guide/index.html#developer-s-guide
.. _library user guide: library-user-guide/index.html
.. _communication: contributor-guide/communication.html
.. _toc.asf-links:
.. toctree::
   :maxdepth: 1
   :caption: ASF Links

   Apache Software Foundation
   License
   Donate
   Thanks
   Security

.. _toc.links:
.. toctree::
   :maxdepth: 1
   :caption: Links

   GitHub and Issue Tracker
   crates.io
   API Docs
   Blog
   Code of conduct
   Download

.. _toc.guide:
.. toctree::
   :maxdepth: 1
   :caption: User Guide

   user-guide/introduction
   user-guide/example-usage
   user-guide/features
   user-guide/concepts-readings-events
   user-guide/crate-configuration
   user-guide/cli/index
   user-guide/dataframe
   user-guide/expressions
   user-guide/sql/index
   user-guide/configs
   user-guide/explain-usage
   user-guide/metrics
   user-guide/faq

.. _toc.library-user-guide:
.. toctree::
   :maxdepth: 1
   :caption: Library User Guide

   library-user-guide/index
   library-user-guide/upgrading
   library-user-guide/extensions
   library-user-guide/using-the-sql-api
   library-user-guide/working-with-exprs
   library-user-guide/using-the-dataframe-api
   library-user-guide/building-logical-plans
   library-user-guide/catalogs
   library-user-guide/functions/index
   library-user-guide/custom-table-providers
   library-user-guide/table-constraints
   library-user-guide/extending-operators
   library-user-guide/profiling
   library-user-guide/query-optimizer

.. _toc.contributor-guide:
.. toctree::
   :maxdepth: 1
   :caption: Contributor Guide

   contributor-guide/index
   contributor-guide/communication
   contributor-guide/development_environment
   contributor-guide/architecture
   contributor-guide/testing
   contributor-guide/api-health
   contributor-guide/howtos
   contributor-guide/roadmap
   contributor-guide/governance
   contributor-guide/inviting
   contributor-guide/specification/index
   contributor-guide/gsoc/index

.. _toc.subprojects:
.. toctree::
   :maxdepth: 1
   :caption: DataFusion Subprojects

   DataFusion Ballista
   DataFusion Comet
   DataFusion Python