Starting a Ballista Cluster using Docker Compose

Docker Compose is a convenient way to launch a cluster when testing locally.

Build Docker Images

Run the following command to download the official Docker image:

docker pull ghcr.io/apache/datafusion-ballista-standalone:0.12.0-rc4

Alternatively, run the following commands to clone the source repository and build the Docker images from source:

git clone git@github.com:apache/datafusion-ballista.git -b 0.12.0
cd datafusion-ballista
./dev/build-ballista-docker.sh

This will create the following images:

  • apache/datafusion-ballista-benchmarks:0.12.0

  • apache/datafusion-ballista-cli:0.12.0

  • apache/datafusion-ballista-executor:0.12.0

  • apache/datafusion-ballista-scheduler:0.12.0

  • apache/datafusion-ballista-standalone:0.12.0
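
To confirm that the build produced all five images, a small helper along the following lines may be useful. This is only a sketch: the function names ballista_images and check_images are hypothetical and not part of the Ballista repository, and it assumes the docker CLI is on the PATH.

```shell
# Hypothetical helper: print the image names listed above for a given version.
ballista_images() {
  for component in benchmarks cli executor scheduler standalone; do
    echo "apache/datafusion-ballista-${component}:${1}"
  done
}

# Hypothetical helper: report which of those images exist in the local image store.
# `docker image inspect` exits non-zero when the image is not present locally.
check_images() {
  for image in $(ballista_images "$1"); do
    if docker image inspect "$image" >/dev/null 2>&1; then
      echo "found:   $image"
    else
      echo "missing: $image"
    fi
  done
}
```

After sourcing these functions, running check_images 0.12.0 prints one line per image, which makes a failed or partial build easy to spot.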

Start a Cluster

Using the docker-compose.yml from the source repository, run the following command to start a cluster:

docker-compose up --build

This should show output similar to the following:

$ docker-compose up
Creating network "ballista-benchmarks_default" with the default driver
Creating ballista-benchmarks_etcd_1 ... done
Creating ballista-benchmarks_ballista-scheduler_1 ... done
Creating ballista-benchmarks_ballista-executor_1  ... done
Attaching to ballista-benchmarks_etcd_1, ballista-benchmarks_ballista-scheduler_1, ballista-benchmarks_ballista-executor_1
ballista-executor_1   | [2021-08-28T15:55:22Z INFO  ballista_executor] Running with config:
ballista-executor_1   | [2021-08-28T15:55:22Z INFO  ballista_executor] work_dir: /tmp/.tmpLVx39c
ballista-executor_1   | [2021-08-28T15:55:22Z INFO  ballista_executor] concurrent_tasks: 4
ballista-scheduler_1  | [2021-08-28T15:55:22Z INFO  ballista_scheduler] Ballista v0.12.0 Scheduler listening on 0.0.0.0:50050
ballista-executor_1   | [2021-08-28T15:55:22Z INFO  ballista_executor] Ballista v0.12.0 Rust Executor listening on 0.0.0.0:50051

The scheduler listens on port 50050, which is the port that clients connect to.
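
When scripting against the cluster, it can help to wait until the scheduler port accepts connections before starting a client. The following is a minimal sketch, not part of Ballista: wait_for_port is a hypothetical helper, and it assumes a netcat (nc) binary that supports the -z flag is installed on the host.

```shell
# Hypothetical helper: poll a TCP port until it accepts connections.
# Arguments: host, port, maximum number of attempts (default 30, one per second).
wait_for_port() {
  host="$1"
  port="$2"
  retries="${3:-30}"
  attempt=0
  while [ "$attempt" -lt "$retries" ]; do
    # nc -z only checks whether the port is open; it sends no data.
    if nc -z "$host" "$port" >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep 1
  done
  # The port never opened within the allowed number of attempts.
  return 1
}
```

For example, wait_for_port localhost 50050 60 returns success as soon as the scheduler is reachable, or failure after roughly 60 seconds.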

Connect from the Ballista CLI

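Run the following command to start the Ballista CLI in a container and connect it to the scheduler:

docker run --network=host -it apache/datafusion-ballista-cli:0.12.0 --host localhost --port 50050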