Array Expressions
ArrayExcept
By default, Comet runs a Spark-compatible implementation of ArrayExcept. Set spark.comet.expression.ArrayExcept.allowIncompatible=true to use Comet’s faster native implementation instead, which has the following differences from Spark:
Null handling and ordering may differ from Spark
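The opt-in setting above follows a per-expression key pattern, so it can be generated rather than typed by hand. The following is a minimal sketch only: the key pattern is taken from this page, while the helpers `allow_incompatible_conf` and `to_submit_args` are hypothetical names introduced for illustration and are not part of Comet.

```python
def allow_incompatible_conf(*expressions):
    """Map each expression name to its opt-in key, set to "true".

    Hypothetical helper; only the key pattern comes from the documentation.
    """
    return {
        f"spark.comet.expression.{name}.allowIncompatible": "true"
        for name in expressions
    }


def to_submit_args(conf):
    """Flatten a conf dict into spark-submit style `--conf key=value` arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


conf = allow_incompatible_conf("ArrayExcept")
submit_args = to_submit_args(conf)
print(" ".join(submit_args))
```

The same dictionary could also be applied entry by entry when building a Spark session, which keeps the set of opted-in expressions in one reviewable place.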
ArrayIntersect
By default, Comet runs a Spark-compatible implementation of ArrayIntersect. Set spark.comet.expression.ArrayIntersect.allowIncompatible=true to use Comet’s faster native implementation instead, which has the following differences from Spark:
Result array element order may differ from Spark when the right array is longer than the left (DataFusion probes the longer side).
array_intersect does not propagate non-UTF8_BINARY collations to the output array elements (https://github.com/apache/datafusion-comet/issues/2190)
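To make the ordering difference concrete, here is a simplified pure-Python model. It is not Comet's or DataFusion's actual code: it assumes that Spark's result follows the order of the left array and that the native path iterates over whichever input is longer, as described above. Both function names are hypothetical.

```python
def intersect_left_order(left, right):
    """Distinct common elements, in the order they first appear in `left`."""
    right_set = set(right)
    seen = set()
    out = []
    for x in left:
        if x in right_set and x not in seen:
            seen.add(x)
            out.append(x)
    return out


def intersect_probe_longer(left, right):
    """Same elements, but iterates over the longer input (assumed native behavior)."""
    if len(left) >= len(right):
        probe, build = left, right
    else:
        probe, build = right, left
    return intersect_left_order(probe, build)


left = [3, 1, 2]
right = [1, 2, 3, 4]  # longer than `left`, so the probe side switches

spark_like = intersect_left_order(left, right)
native_like = intersect_probe_longer(left, right)
print(spark_like, native_like)
```

In this model the two results contain the same elements and differ only in order, so a query that sorts or otherwise ignores element order would likely be unaffected by the difference.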
ArrayJoin
By default, Comet runs a Spark-compatible implementation of ArrayJoin. Set spark.comet.expression.ArrayJoin.allowIncompatible=true to use Comet’s faster native implementation instead, which has the following differences from Spark:
Null handling may differ from Spark
array_join does not propagate non-UTF8_BINARY collations to the output string (https://github.com/apache/datafusion-comet/issues/2190)
ArraysZip
The following cases are not supported by Comet:
Not all input data types are supported; falls back to Spark for unsupported types
Size
The following cases are not supported by Comet:
Only supports ArrayType input; MapType input is not supported
SortArray
By default, Comet runs a Spark-compatible implementation of SortArray. Set spark.comet.expression.SortArray.allowIncompatible=true to use Comet’s faster native implementation instead, which has the following differences from Spark:
When spark.comet.exec.strictFloatingPoint=true, sorting on floating-point types is not 100% compatible with Spark
The following cases are not supported by Comet:
Nested arrays with Struct or Null child values are not supported natively and will fall back to Spark.