Testing¶
Tests are critical to ensure that DataFusion is working properly and is not accidentally broken during refactorings. All new features should have test coverage and the entire test suite is run as part of CI.
DataFusion has several levels of tests in its Test Pyramid and tries to follow the Rust standard Testing Organization described in The Book.
Run tests using cargo:
cargo test
You can also use other runners such as cargo-nextest.
cargo nextest run
Unit tests¶
Tests for code in an individual module are defined in the same source file with a test module, following Rust convention.
The test_util module provides useful macros for writing unit tests effectively, such as assert_batches_sorted_eq and assert_batches_eq for comparing RecordBatches, and assert_contains / assert_not_contains, which are used extensively in the codebase.
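To illustrate the convention (not actual DataFusion code), a minimal sketch of a unit test module living in the same file as the code it exercises; the function `double` is a hypothetical stand-in:

```rust
// Hypothetical example of the Rust unit-test convention: production code
// and its tests live in the same source file, with tests gated behind
// #[cfg(test)] so they are only compiled by `cargo test`.

/// Doubles the input (illustrative function, not part of DataFusion).
pub fn double(x: i32) -> i32 {
    x * 2
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_double() {
        assert_eq!(double(2), 4);
        assert_eq!(double(-3), -6);
    }
}

fn main() {
    // Not present in a real library module; included only so this
    // snippet compiles as a standalone program.
    assert_eq!(double(21), 42);
}
```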
sqllogictests Tests¶
DataFusion’s SQL implementation is tested using sqllogictest, which is run like other tests using cargo test --test sqllogictests.
sqllogictests tests may be less convenient for new contributors who are familiar with writing .rs tests, as they require learning another tool. However, sqllogictest based tests are much easier to develop and maintain because they 1) do not require a slow recompile/link cycle and 2) can be automatically updated via cargo test --test sqllogictests -- --complete.
Like similar systems such as DuckDB, DataFusion has chosen to trade off a slightly higher barrier to contribution for longer term maintainability.
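For illustration, a small sqllogictest file might look like the following. This sketch follows the upstream sqllogictest format (the table name and query are hypothetical): `statement ok` asserts a statement succeeds, `query I` declares a query returning one integer column, and the `----` separator precedes the expected results.

```
# Hypothetical .slt file contents
statement ok
CREATE TABLE t(a INT) AS VALUES (1), (2);

query I
SELECT a + 1 FROM t ORDER BY a;
----
2
3
```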
Rust Integration Tests¶
There are several tests of the public interface of the DataFusion library in the tests directory.
You can run these tests individually using cargo as usual. For example:
cargo test -p datafusion --test parquet_exec
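As a sketch of the convention, an integration test is a file under the tests directory that is compiled as its own crate and can only exercise the library's public API. The file name and test body below are hypothetical stand-ins, not actual DataFusion tests:

```rust
// Hypothetical contents of a file such as tests/my_integration_test.rs.
// `cargo test` compiles each file in tests/ as a separate crate and runs
// every #[test] function in it.

#[test]
fn public_api_behaves_as_expected() {
    // Stand-in logic; a real DataFusion integration test would call the
    // library's public API here.
    let values = vec![1, 2, 3];
    let sum: i32 = values.iter().sum();
    assert_eq!(sum, 6);
}

fn main() {
    // Not present in a real tests/ file; included only so this snippet
    // compiles as a standalone program.
    let sum: i32 = vec![1, 2, 3].iter().sum();
    assert_eq!(sum, 6);
}
```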
SQL “Fuzz” testing¶
DataFusion uses SQLancer for “fuzz” testing: it generates random SQL queries and executes them against DataFusion to find bugs.
The code is in the datafusion-sqllancer repository, and we welcome further contributions. Kudos to @2010YOUY01 for the initial implementation.
Documentation Examples¶
We use Rust doctest to verify that examples from the documentation are correct and up-to-date. These tests are run as part of our CI and you can run them locally with the following command:
cargo test --doc
API Documentation Examples¶
As with other Rust projects, examples in doc comments in .rs files are automatically checked to ensure they work and evolve along with the code.
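A minimal sketch of such a doc comment (the function and the `my_crate` path are hypothetical): the fenced example inside the comment is extracted, compiled, and run by `cargo test --doc`, so it fails CI if the API changes.

```rust
// Hypothetical example of a doctest-bearing doc comment.

/// Adds one to the input.
///
/// # Example
///
/// ```
/// let result = my_crate::add_one(41);
/// assert_eq!(result, 42);
/// ```
pub fn add_one(x: i32) -> i32 {
    x + 1
}

fn main() {
    // Not needed in a real library file; included only so this snippet
    // compiles as a standalone program.
    assert_eq!(add_one(41), 42);
}
```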
User Guide Documentation¶
Rust example code from the user guide (anything marked with ```rust) is also tested in the same way using the doc_comment crate. See the end of core/src/lib.rs for more details.
Benchmarks¶
Criterion Benchmarks¶
Criterion is a statistics-driven micro-benchmarking framework used by DataFusion to evaluate the performance of specific code paths. In particular, the criterion benchmarks help both guide optimisation efforts and prevent performance regressions within DataFusion.
Criterion integrates with Cargo’s built-in benchmark support and a given benchmark can be run with
cargo bench --bench BENCHMARK_NAME
A full list of benchmarks can be found here.
cargo-criterion may also be used for more advanced reporting.
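For context, Criterion benchmarks are declared in a crate's Cargo.toml with the default test harness disabled so that Criterion's own entry point runs instead. A representative (hypothetical) entry might look like:

```
# Hypothetical Cargo.toml fragment; real benchmark names live in the
# Cargo.toml of the relevant DataFusion crate.
[[bench]]
name = "my_benchmark"
harness = false
```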
Parquet SQL Benchmarks¶
The parquet SQL benchmarks can be run with
cargo bench --bench parquet_query_sql
This benchmark randomly generates a parquet file and then benchmarks queries sourced from parquet_query_sql.sql against it. It can therefore be a quick way to add coverage of particular query and/or data paths.
If the environment variable PARQUET_FILE is set, the benchmark will run queries against this file instead of a randomly generated one. This can be useful for performing multiple runs, potentially with different code, against the same source data, or for testing against a custom dataset.
The benchmark will automatically remove any generated parquet file on exit; however, if it is interrupted (e.g. by CTRL+C) it will not. This can be useful for analysing the particular file after the fact, or for preserving it to use with PARQUET_FILE in subsequent runs.
Comparing Baselines¶
By default, Criterion.rs will compare the measurements against the previous run (if any). Sometimes it’s useful to keep a set of measurements around for several runs. For example, you might want to make multiple changes to the code while comparing against the main branch. For this situation, Criterion.rs supports custom baselines.
git checkout main
cargo bench --bench sql_planner -- --save-baseline main
git checkout YOUR_BRANCH
cargo bench --bench sql_planner -- --baseline main
Note: on macOS it may be necessary to run cargo bench with sudo:
sudo cargo bench ...
More information on Baselines
Upstream Benchmark Suites¶
Instructions and tooling for running upstream benchmark suites against DataFusion can be found in benchmarks.
These are valuable for comparative evaluation against alternative Arrow implementations and query engines.