Apache DataFusion Ballista 0.7.0 Changelog
Breaking changes:
Add Expr::Exists to represent EXISTS subquery expression #2339 (andygrove)
Remove dependency from LogicalPlan::TableScan to ExecutionPlan #2284 (andygrove)
Move logical expression type-coercion code from physical-expr crate to expr crate #2257 (andygrove)
feat: 2061 create external table ddl table partition cols #2099 [sql] (jychen7)
Reorganize the project folders #2081 (yahoNanJing)
Rename ExecutionContext to SessionContext, ExecutionContextState to SessionState, add TaskContext to support multi-tenancy configurations - Part 1 #1987 (mingmwang)
Add Create Schema functionality in SQL #1959 [sql] (matthewmturner)
remove sync constraint of SendableRecordBatchStream #1884 (doki23)
Implemented enhancements:
Add CREATE VIEW #2279 (matthewmturner)
Add missing aggr_expr to PhysicalExprNode for Ballista. #1989 (Ted-Jiang)
Fixed bugs:
Ballista integration tests no longer work #2440
Ballista crates cannot be released from DataFusion 7.0.0 source release #1980
protobuf OctetLength should be deserialized as octet_length, not length #1834 (carols10cents)
Documentation updates:
docs: Update the Ballista dev env instructions #2419 (haoxins)
Revise document of installing ballista pinned to specified version #2034 (WinkerDu)
Performance improvements:
Introduce StageManager for managing tasks stage by stage #1983 (yahoNanJing)
Closed issues:
Make expected result string in unit tests more readable #2412
remove duplicated fn aggregate() in aggregate expression tests #2399
split distinct_expression.rs into count_distinct.rs and array_agg_distinct.rs #2385
move sql tests in context.rs to corresponding test files in datafusion/core/tests/sql #2328
Date32/Date64 as join keys for merge join #2314
Error precision and scale for decimal coercion in logic comparison #2232
Support Multiple row layout #2188
Discussion: Is Ballista a standalone system or framework #1916
Merged pull requests:
MINOR: Enable multi-statement benchmark queries #2507 (andygrove)
Persist session configs in scheduler #2501 (thinkharderdev)
Limit cpu cores used when generating changelog #2494 (andygrove)
Fix stage key extraction #2472 (thinkharderdev)
minor: update versions and paths in changelog scripts #2429 (andygrove)
Re-organize and rename aggregates physical plan #2388 (yjshen)
Bump follow-redirects from 1.13.2 to 1.14.9 in /ballista/ui/scheduler #2325 (dependabot[bot])
Move FileType enum from sql module to logical_plan module #2290 (andygrove)
Update uuid requirement from 0.8 to 1.0 #2280 (dependabot[bot])
Bump async from 2.6.3 to 2.6.4 in /ballista/ui/scheduler #2277 (dependabot[bot])
Bump minimist from 1.2.5 to 1.2.6 in /ballista/ui/scheduler #2276 (dependabot[bot])
Bump url-parse from 1.5.1 to 1.5.10 in /ballista/ui/scheduler #2275 (dependabot[bot])
Bump nanoid from 3.1.20 to 3.3.3 in /ballista/ui/scheduler #2274 (dependabot[bot])
Update to Arrow 12.0.0, update tonic and prost #2253 (alamb)
Add ExecutorMetricsCollector interface #2234 (thinkharderdev)
[Ballista] Enable ApproxPercentileWithWeight in Ballista and fill UT #2192 (Ted-Jiang)
[Ballista] Make PhysicalAggregateExprNode has repeated PhysicalExprNode #2184 (Ted-Jiang)
Implement fast path of with_new_children() in ExecutionPlan #2168 (mingmwang)
[MINOR] ignore suspicious slow test in Ballista #2167 (Ted-Jiang)
Add delimiter for create external table #2162 (matthewmturner)
Update sqlparser requirement from 0.15 to 0.16 #2152 (dependabot[bot])
Add IF NOT EXISTS to CREATE TABLE and CREATE EXTERNAL TABLE #2143 (matthewmturner)
Update quarterly roadmap for Q2 #2133 (matthewmturner)
[Ballista] Add ballista plugin manager and UDF plugin #2131 (gaojun2048)
Serialize scalar UDFs in physical plan #2130 (thinkharderdev)
Reduce repetition in Decimal binary kernels, upgrade to arrow 11.1 #2107 (alamb)
update zlib version to 1.2.12 #2106 (waitingkuo)
Add CREATE DATABASE command to SQL #2094 [sql] (matthewmturner)
Refactor SessionContext, BallistaContext to support multi-tenancy configurations - Part 3 #2091 (mingmwang)
Remove dependency of common for the storage crate #2076 (yahoNanJing)
[MINOR] fix doc in EXTRACT(field FROM source) #2074 (Ted-Jiang)
[Bug][Datafusion] fix TaskContext session_config bug #2070 (gaojun2048)
split datafusion-object-store module #2065 (yahoNanJing)
Change log level for noisy logs #2060 (thinkharderdev)
use cargo-tomlfmt to check Cargo.toml formatting in CI #2033 (WinkerDu)
Refactor SessionContext, SessionState and SessionConfig to support multi-tenancy configurations - Part 2 #2029 (mingmwang)
Use SessionContext to parse Expr protobuf #2024 (thinkharderdev)
Fix stuck issue for the load testing of Push-based task scheduling #2006 (yahoNanJing)
Make it possible to only scan part of a parquet file in a partition #1990 (yjshen)
Update Dockerfile to fix integration tests #1982 (andygrove)
Update sqlparser requirement from 0.14 to 0.15 #1966 (dependabot[bot])
Allow different types of query variables (@@var) rather than just string #1943 [sql] (maxburke)
Pruning serialization #1941 (thinkharderdev)
Fix select from EmptyExec always return 0 row after optimizer passes #1938 (Ted-Jiang)
Introduce Ballista query stage scheduler #1935 (yahoNanJing)
Add db benchmark script #1928 (matthewmturner)
add metadata to DFSchema, close #1806. #1914 [sql] (jiacai2050)
Refactor scheduler state mod #1913 (yahoNanJing)
Refactor the event channel #1912 (yahoNanJing)
Refactor scheduler server #1911 (yahoNanJing)
Updated Rust version to 1.59 in all the files #1903 (NaincyKumariKnoldus)
Create a datafusion-proto crate for datafusion protobuf serialization #1887 (carols10cents)
Fix clippy lints #1885 (HaoYang670)
Separate cpu-bound (query-execution) and IO-bound(heartbeat) to … #1883 (Ted-Jiang)
Changes after went through “Datafusion as a library section” #1868 (nonontb)
Remove allow unused imports from ballista-core, then fix all warnings #1853 (carols10cents)
Fix compiling ballista in standalone mode, add build to CI #1839 (alamb)
Update documentation example for change in API #1812 (alamb)
Refactor scheduler state with different management policy for volatile and stable states #1810 (yahoNanJing)
DataFusion + Conbench Integration #1791 (dianaclarke)
Enable periodic cleanup of work_dir directories in ballista executor #1783 (Ted-Jiang)
Use eq_dyn, neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn kernels from arrow #1475 (alamb)