Updating the DataFusion / protobuf schema version
Three things must move together when bumping DataFusion:
- native/Cargo.toml: the datafusion crate dependency.
- pom.xml: the <datafusion.version> Maven property. It must equal the Cargo version; a mismatch means protobuf plans built on the JVM side won't deserialize on the native side.
- pom.xml: the <sha512> checksums on the two download-maven-plugin executions. These pin the downloaded .proto files, so the build fails if upstream silently re-tags them, which is the desired behavior.
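The requirement that the Cargo and Maven versions agree can be checked mechanically. The following is a minimal sketch, not part of the project's build: the function names are hypothetical, and it assumes the dependency is declared on a single line in native/Cargo.toml, either as `datafusion = "X"` or as `datafusion = { version = "X", ... }` (a workspace-inherited dependency would not be handled).

```shell
# Sketch: compare the DataFusion version in Cargo.toml with the Maven property.
# Function names are illustrative; they are not defined anywhere in the repo.

cargo_datafusion_version() {
  # $1: path to Cargo.toml. Assumes a single-line `datafusion = ...` entry.
  # The first sed expression handles the table form (version = "X"),
  # the second handles the plain string form (datafusion = "X").
  grep -m1 -E '^datafusion[[:space:]]*=' "$1" \
    | sed -E -e 's/.*version[[:space:]]*=[[:space:]]*"([^"]+)".*/\1/' \
             -e 's/^[^"]*"([^"]+)".*/\1/'
}

maven_datafusion_version() {
  # $1: path to pom.xml. Reads the first <datafusion.version> element.
  grep -m1 -oE '<datafusion\.version>[^<]+' "$1" | cut -d'>' -f2
}

check_datafusion_versions() {
  # $1: Cargo.toml, $2: pom.xml. Returns non-zero when the versions differ
  # or when the Cargo version could not be read.
  df_cargo_v=$(cargo_datafusion_version "$1")
  df_maven_v=$(maven_datafusion_version "$2")
  if [ -n "$df_cargo_v" ] && [ "$df_cargo_v" = "$df_maven_v" ]; then
    echo "OK: datafusion $df_cargo_v"
  else
    echo "MISMATCH: Cargo.toml='$df_cargo_v' pom.xml='$df_maven_v'" >&2
    return 1
  fi
}
```

From the repository root this would be invoked as `check_datafusion_versions native/Cargo.toml pom.xml`, for example before opening a pull request.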
Recipe
# 1. Bump the Cargo dep
$EDITOR native/Cargo.toml # set datafusion = "<new>"
(cd native && cargo update -p datafusion)
# 2. Bump the Maven property to match
$EDITOR pom.xml # set <datafusion.version>
# 3. Compute the new SHA-512 hashes for both `.proto` files from the
# upstream tag you just set in step 2, then paste them into the two
# <sha512> elements in pom.xml.
NEW=$(grep -m1 -oE '<datafusion.version>[^<]+' pom.xml | cut -d'>' -f2)
# -f makes curl fail on an HTTP error (e.g. a wrong tag) instead of
# emitting the error page, which would otherwise be hashed silently.
curl -fsSL "https://raw.githubusercontent.com/apache/datafusion/$NEW/datafusion/proto-common/proto/datafusion_common.proto" | shasum -a 512 | awk '{print $1}'
curl -fsSL "https://raw.githubusercontent.com/apache/datafusion/$NEW/datafusion/proto/proto/datafusion.proto" | shasum -a 512 | awk '{print $1}'
$EDITOR pom.xml # paste the two hashes into the <sha512> elements
# Drop the local download cache so the next build re-downloads against
# the new hashes.
rm -rf ~/.m2/repository/.cache/download-maven-plugin target/proto
# 4. Verify
make && make test
Why the protobuf runtime version is separate
The protobuf runtime version (<protobuf.version> in pom.xml) tracks
the Java ecosystem (security and JDK compatibility), not DataFusion.
Bump it independently of DataFusion, for example in response to a
security advisory or a JDK compatibility issue.