Apache DataFusion Ballista 0.6.0 Changelog

Full Changelog

Breaking changes:

  • File partitioning for ListingTable #1141 (rdettai)

  • Register tables in BallistaContext using TableProviders instead of Dataframe #1028 (rdettai)

  • Make TableProvider.scan() and PhysicalPlanner::create_physical_plan() async #1013 (rdettai)

  • Reorganize table providers by table format #1010 (rdettai)

  • Move CBOs and Statistics to physical plan #965 (rdettai)

  • Update to sqlparser v0.10.0 #934 [sql] (alamb)

  • FilePartition and PartitionedFile for scanning flexibility #932 [sql] (yjshen)

  • Improve SQLMetric APIs, port existing metrics #908 (alamb)

  • Add support for EXPLAIN ANALYZE #858 [sql] (alamb)

  • Rename concurrency to target_partitions #706 (andygrove)

Implemented enhancements:

Fixed bugs:

  • Test execution_plans::shuffle_writer::tests::test Fail #1040

  • Integration test fails to build docker images #918

  • Ballista: Remove hard-coded concurrency from logical plan serde code #708

  • How can I make ballista distributed compute work? #327

  • fix subquery alias #1067 [sql] (xudong963)

  • Fix compilation for ballista in stand-alone mode #1008 (Igosuki)

Documentation updates:

Performance improvements:

  • optimize build profile for datafusion python binding, cli and ballista #1137 (houqp)

Closed issues:

  • InList expr with NULL literals do not work #1190

  • update the homepage README to include values, approx_distinct, etc. #1171

  • [Python]: Inconsistencies with Python package name #1011

  • Wanting to contribute to project where to start? #983

  • delete redundant code #973

  • How to build DataFusion python wheel #853

  • Produce a design for a metrics framework #21

Merged pull requests:

  • [nit] simplify ballista executor CollectExec impl codes #1140 (panarch)

For older versions, see apache/arrow/CHANGELOG.md