Array Expressions#

ArrayExcept#

By default, Comet runs a Spark-compatible implementation of ArrayExcept. Set spark.comet.expression.ArrayExcept.allowIncompatible=true to use Comet’s faster native implementation instead, which has the following differences from Spark:

  • Null handling and result ordering may differ from Spark.
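The opt-in settings on this page all follow the pattern spark.comet.expression.<ExpressionName>.allowIncompatible. As a minimal sketch of how these keys might be assembled for several expressions at once, the helper names below (allow_incompatible_confs, to_submit_args) are hypothetical and not part of Comet or Spark; only the key pattern comes from this page.

```python
def allow_incompatible_confs(expression_names):
    """Build conf entries that opt in to Comet's native implementations.

    Hypothetical helper: only the key pattern is taken from this page.
    """
    return {
        f"spark.comet.expression.{name}.allowIncompatible": "true"
        for name in expression_names
    }


def to_submit_args(confs):
    """Render conf entries as spark-submit style `--conf key=value` arguments."""
    args = []
    for key, value in sorted(confs.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args


confs = allow_incompatible_confs(["ArrayExcept", "ArrayIntersect"])
for key, value in sorted(confs.items()):
    print(f"{key}={value}")
```

The same key/value pairs could equally be passed one at a time to SparkSession.builder.config(key, value) in PySpark. Opting in per expression keeps the compatibility trade-off limited to the functions a workload actually depends on.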

ArrayIntersect#

By default, Comet runs a Spark-compatible implementation of ArrayIntersect. Set spark.comet.expression.ArrayIntersect.allowIncompatible=true to use Comet’s faster native implementation instead, which has the following differences from Spark:

  • Result array element order may differ from Spark when the right array is longer than the left (DataFusion probes the longer side).

  • array_intersect does not propagate non-UTF8_BINARY collations to the output array elements (https://github.com/apache/datafusion-comet/issues/2190)
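To make the ordering difference concrete, here is a small pure-Python model. It is an illustration rather than either engine's actual code: it assumes the result order follows whichever array is iterated (probed), that the Spark-compatible path iterates the left array, and that the native path iterates the longer array as described above. Null handling is deliberately left out, and all function names are hypothetical.

```python
def intersect_probing(probe, lookup):
    """Distinct elements of `probe` that also occur in `lookup`, in probe order."""
    lookup_set = set(lookup)
    seen = set()
    result = []
    for item in probe:
        if item in lookup_set and item not in seen:
            seen.add(item)
            result.append(item)
    return result


def spark_like(left, right):
    # Assumption: the result order follows the left array.
    return intersect_probing(left, right)


def longer_side_probe(left, right):
    # Assumption: probe the longer array; on equal lengths, keep the left.
    if len(right) > len(left):
        return intersect_probing(right, left)
    return intersect_probing(left, right)


left = [3, 1, 2]
right = [2, 3, 4, 1]
print(spark_like(left, right))         # [3, 1, 2]
print(longer_side_probe(left, right))  # [2, 3, 1]
```

Both results contain the same elements, so queries that treat the output as a set, or sort it afterwards, would not observe the difference under this model.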

ArrayJoin#

By default, Comet runs a Spark-compatible implementation of ArrayJoin. Set spark.comet.expression.ArrayJoin.allowIncompatible=true to use Comet’s faster native implementation instead, which has the following differences from Spark:

  • Null handling may differ from Spark

  • array_join does not propagate non-UTF8_BINARY collations to the output string (https://github.com/apache/datafusion-comet/issues/2190)

ArraysZip#

The following cases are not supported by Comet:

  • Some input data types are not supported; Comet falls back to Spark for those types.

Size#

The following cases are not supported by Comet:

  • MapType input is not supported; only ArrayType input is handled by Comet.

SortArray#

By default, Comet runs a Spark-compatible implementation of SortArray. Set spark.comet.expression.SortArray.allowIncompatible=true to use Comet’s faster native implementation instead, which has the following differences from Spark:

  • When spark.comet.exec.strictFloatingPoint=true, sorting on floating-point types is not 100% compatible with Spark

The following cases are not supported by Comet:

  • Nested arrays with Struct or Null child values are not supported natively; Comet falls back to Spark for them.