Apache DataFusion Ballista 0.12.0 Changelog
Documentation updates:
docs: fix link #799 (haoxins)
Merged pull requests:
[minor] remove outdate todo #683 (Ted-Jiang)
Add executor terminating status for graceful shutdown #667 (thinkharderdev)
Allow BallistaContext::read_* methods to read multiple paths. #679 (luckylsk34)
Update scheduler.md #657 (psvri)
Mark SchedulerState as pub #688 (Dandandan)
Update graphviz-rust requirement from 0.5.0 to 0.6.1 #651 (dependabot[bot])
Upgrade DataFusion to 19.0.0 #691 (r4ntix)
Update release docs #692 (andygrove)
Mark SchedulerServer::with_task_launcher as pub #695 (Dandandan)
Make task_manager pub #698 (Dandandan)
Add ExecutionEngine abstraction #687 (andygrove)
Allow accessing s3 locations in client mode #700 (luckylsk34)
git clone branch incorrect #699 (BubbaJoe)
Fix for error message during testing #707 (yahoNanJing)
Upgrade datafusion to 20.0.0 & sqlparser to 0.32.0 #711 (r4ntix)
Update README.md #729 (jiangzhx)
Update link to scheduler proto file in dev docs #713 (JAicewizard)
Fix show tables fails #715 (r4ntix)
Remove redundant fields in ExecutorManager #728 (yahoNanJing)
Fix parameter '--config-backend' to '--cluster-backend' #720 (paolorechia)
Upgrade DataFusion to 21.0.0 #727 (r4ntix)
[minor] remove useless brackets #739 (Ted-Jiang)
Only decode plan in LaunchMultiTaskParams once #743 (Dandandan)
Upgrade DataFusion to 22.0.0 #740 (r4ntix)
[feature] support shuffle read with retry when facing IO error. #738 (Ted-Jiang)
[log] Print long running task status. #750 (Ted-Jiang)
Upgrade DataFusion to 23.0.0 #755 (yahoNanJing)
Fix plan metrics length and stage metrics length not match #764 (yahoNanJing)
added match arms to create ClusterStorageConfig #766 (BokarevNik)
[Improve] refactor the offer_reservation avoid wait result #760 (Ted-Jiang)
[fea] Avoid multithreaded write lock conflicts in event queue. #754 (Ted-Jiang)
Upgrade DataFusion to 24.0.0, Object_Store to 0.5.6 #769 (r4ntix)
Refine create_datafusion_context() #778 (yahoNanJing)
Remove output_partitioning for task definition #776 (yahoNanJing)
Upgrade DataFusion to 25.0.0 #779 (r4ntix)
Disable the ansi feature of tracing-subscriber #784 (yahoNanJing)
Add config grpc_server_max_decoding_message_size to make the maximum size of a decoded message at the grpc server side configurable #782 (yahoNanJing)
Fix nodejs issues in Docker build #731 (jnaous)
Upgrade node version to fix build in main #794 (avantgardnerio)
Remove redundant mod session_registry #792 (yahoNanJing)
Make last_seen_ts_threshold for getting alive executor at the scheduler side larger than the heartbeat time interval #786 (yahoNanJing)
Remove the prometheus-metrics from the default feature #788 (yahoNanJing)
Refine the ExecuteQuery grpc interface #790 (yahoNanJing)
Add config to collect statistics, enable in TPC-H benchmark #796 (Dandandan)
Add support for GCS data sources #805 (haoxins)
Update DataFusion to 26 #798 (Dandandan)
Issue 162 build docker image in ci #716 (paolorechia)
Fix index out of bounds panic #819 (yahoNanJing)
Refactor the TaskDefinition by changing encoding execution plan to the decoded one #817 (yahoNanJing)
Fix ballista-cli docs #800 (jonahgao)
docs: fix link #799 (haoxins)
Implement the with_new_children for ShuffleReaderExec #821 (yahoNanJing)
Update to point to the correct documentation #838 (dadepo)
Remove ExecutorReservation and change the task assignment philosophy from executor first to task first #823 (yahoNanJing)
Upgrade DataFusion to 27.0.0 #834 (r4ntix)
Reduce the number of calls to create_logical_plan #842 (jonahgao)
Bump semver from 5.7.1 to 5.7.2 in /ballista/scheduler/ui #843 (dependabot[bot])
Bump actions/labeler from 4.1.0 to 4.3.0 #841 (dependabot[bot])
Bump tough-cookie from 4.1.2 to 4.1.3 in /ballista/scheduler/ui #840 (dependabot[bot])
Update flatbuffers requirement from 22.9.29 to 23.5.26 #801 (dependabot[bot])
Update dirs requirement from 4.0.0 to 5.0.1 #767 (dependabot[bot])
Update libloading requirement from 0.7.3 to 0.8.0 #761 (dependabot[bot])
Introduce a cache crate supporting concurrent cache value loading #825 (yahoNanJing)
Fix cargo clippy for latest rust version #848 (yahoNanJing)
Introduce CachedBasedObjectStoreRegistry to use data source cache transparently #827 (yahoNanJing)
Add ConsistentHash for node topology management #830 (yahoNanJing)
Implement 3-phase consistent hash based task assignment policy #833 (yahoNanJing)
Update tonic requirement from 0.8 to 0.9 #733 (dependabot[bot])
Update itertools requirement from 0.10 to 0.11 #844 (dependabot[bot])
Update etcd-client requirement from 0.10 to 0.11 #845 (dependabot[bot])
Update hashbrown requirement from 0.13 to 0.14 #846 (dependabot[bot])
Bump word-wrap from 1.2.3 to 1.2.4 in /ballista/scheduler/ui #849 (dependabot[bot])
Update hdfs requirement from 0.1.1 to 0.1.4 #856 (yahoNanJing)
Update to DataFusion 28 #858 (Dandandan)
Upgrade datafusion to 30.0.0 #866 (r4ntix)
refactor: port get_scan_files to Ballista #877 (alamb)
Upgrade datafusion to 31.0.0 #878 (r4ntix)
Upgrade datafusion to 32.0.0 #899 (r4ntix)
Update to DataFusion 33 #900 (Dandandan)
Refactor lru mod, remove linked_hash_map #918 (PsiACE)
Dynamically optimize aggregate (count) based on shuffle stats #919 (Dandandan)
Use lz4 compression for shuffle files & flight stream, refactoring / improvements #920 (Dandandan)
Make max encoding message size configurable #928 (andygrove)
Set max message size to 16MB in gRPC clients #931 (andygrove)
Upgrade to DataFusion 34.0.0-rc1 #927 (andygrove)
Use official DF 34 release #939 (andygrove)
Use StreamWriter instead of FileWriter #943 (avantgardnerio)
Remove some TODO comments related to context fetching schemas from scheduler #946 (andygrove)
Fix Docker build #947 (andygrove)
Fix regression in DataFrame.write_xxx #945 (andygrove)