Array Expressions#
ArrayExcept#
By default, Comet accelerates ArrayExcept using JVM codegen dispatch, which runs Spark’s generated code inside Comet’s native pipeline and matches Spark exactly. Set spark.comet.expression.ArrayExcept.allowIncompatible=true to use Comet’s faster native implementation instead, which has the following differences from Spark:
Null handling and ordering may differ from Spark
ArrayIntersect#
By default, Comet accelerates ArrayIntersect using JVM codegen dispatch, which runs Spark’s generated code inside Comet’s native pipeline and matches Spark exactly. Set spark.comet.expression.ArrayIntersect.allowIncompatible=true to use Comet’s faster native implementation instead, which has the following differences from Spark:
Result array element order may differ from Spark when the right array is longer than the left (DataFusion probes the longer side).
The following cases are not supported by Comet:
array_intersect on collated strings is not supported.
ArrayJoin#
By default, Comet accelerates ArrayJoin using JVM codegen dispatch, which runs Spark’s generated code inside Comet’s native pipeline and matches Spark exactly. Set spark.comet.expression.ArrayJoin.allowIncompatible=true to use Comet’s faster native implementation instead, which has the following differences from Spark:
Null handling may differ from Spark
ArraysZip#
The following cases are not supported by Comet:
Not all input data types are supported; falls back to Spark for unsupported types
ElementAt#
The following cases are not supported by Comet:
Input must be an array.
Mapinputs are not supported.
Size#
The following cases are not supported by Comet:
Only supports
ArrayTypeinput;MapTypeinput is not supported
SortArray#
The following incompatibilities cause SortArray to fall back to Spark by default. Set spark.comet.expression.SortArray.allowIncompatible=true to enable Comet acceleration despite these differences.
When
spark.comet.exec.strictFloatingPoint=true, sorting on floating-point types is not 100% compatible with Spark
The following cases are not supported by Comet:
Nested arrays with
StructorNullchild values are not supported natively and will fall back to Spark.