collection_funcs Expression Audits#
Audit notes for expressions in this category that have been audited. Absence of an entry means the expression has not been audited yet, not that it is unsupported. See the user guide Spark Expression Support for current support status.
array_size#
Native via size; returns -1 instead of NULL for NULL input (https://github.com/apache/datafusion-comet/issues/4560).
concat#
Spark 3.4.3 (audited 2026-05-27): identical to 3.5.8.
Spark 3.5.8 (audited 2026-05-27): baseline. Concat(children) extends ComplexTypeMergingExpression with QueryErrorsBase; allowedTypes = Seq(StringType, BinaryType, ArrayType); result type is the merged child type. Empty children is allowed and returns the empty string of the result type.
Spark 4.0.1 (audited 2026-05-27): allowedTypes widens StringType to StringTypeWithCollation(supportsTrimCollation = true). Error-formatting helper changes from paramIndex to ordinalNumber. Runtime semantics unchanged for UTF8_BINARY.
Spark 4.1.1 (audited 2026-05-27): identical to 4.0.1.
Known limitations: Comet only supports StringType children natively; BinaryType and ArrayType inputs fall back to Spark (https://github.com/apache/datafusion-comet/issues/4471). Non-default Spark 4.0 string collations are not propagated (https://github.com/apache/datafusion-comet/issues/2190).
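The routing described in the concat notes can be sketched as a small decision function. This is a plain-Python model for illustration only, not Comet or Spark code: the function name concat_route and the string type tags are hypothetical, and the sketch deliberately does not model an empty child list or collations.

```python
from typing import Sequence

# Child types Spark's Concat accepts, per the audit notes (hypothetical string tags).
SPARK_ALLOWED = {"string", "binary", "array"}
# Child types the notes say Comet currently handles natively.
COMET_NATIVE = {"string"}


def concat_route(child_types: Sequence[str]) -> str:
    """Hypothetical helper: classify a concat call as 'native', 'fallback', or 'invalid'."""
    if not child_types:
        # The notes do not say how Comet treats an empty child list, so it is not modeled.
        raise ValueError("empty child list is outside the scope of this sketch")
    if any(t not in SPARK_ALLOWED for t in child_types):
        return "invalid"  # Spark itself would reject these inputs
    if all(t in COMET_NATIVE for t in child_types):
        return "native"
    return "fallback"  # BinaryType / ArrayType inputs run in Spark
```

For example, concat_route(["string", "string"]) yields "native", while concat_route(["array", "array"]) yields "fallback".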
reverse#
Spark 3.4.3 (audited 2026-05-27): identical to 3.5.8.
Spark 3.5.8 (audited 2026-05-27): baseline. Reverse(child) extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant; inputTypes = Seq(TypeCollection(StringType, ArrayType)); dataType = child.dataType. For string, calls UTF8String.reverse(); for array, reverses element order in-place via GenericArrayData.
Spark 4.0.1 (audited 2026-05-27): NullIntolerant trait replaced by override def nullIntolerant: Boolean = true; inputTypes widened to Seq(TypeCollection(StringTypeWithCollation(supportsTrimCollation = true), ArrayType)). Semantics unchanged for UTF8_BINARY.
Spark 4.1.1 (audited 2026-05-27): identical to 4.0.1.
Known limitation: Reverse on an array containing BinaryType elements is reported as Incompatible and falls back unless explicitly enabled (https://github.com/apache/datafusion-comet/issues/2763).
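As a rough illustration of the baseline semantics noted for reverse, the following plain-Python model treats NULL as None, strings as str, and arrays as list. The name spark_reverse is hypothetical and the sketch ignores collations; it is meant only to show the null-intolerant behaviour and the two input shapes.

```python
def spark_reverse(value):
    """Hypothetical model of Reverse: NULL in gives NULL out (null-intolerant)."""
    if value is None:
        return None
    if isinstance(value, str):
        # Reverses by code point; Python slicing does not mutate the input.
        return value[::-1]
    if isinstance(value, list):
        # Element order is reversed; NULL elements are kept as-is.
        return value[::-1]
    raise TypeError("reverse accepts only string or array input in this sketch")
```

Under this model, spark_reverse("abc") gives "cba" and spark_reverse([1, None, 3]) gives [3, None, 1].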
size#
Spark 3.4.3 (audited 2026-05-27): identical to 3.5.8.
Spark 3.5.8 (audited 2026-05-27): baseline. Size(child, legacySizeOfNull) extends UnaryExpression with ExpectsInputTypes; inputTypes = Seq(TypeCollection(ArrayType, MapType)) -> IntegerType. legacySizeOfNull=true returns -1 for NULL input; false returns NULL. Comet routes via CometSize, which emits a CaseWhen(isNotNull(child), size_scalar(child), Literal(legacySizeOfNull)).
Spark 4.0.1 (audited 2026-05-27): byte-for-byte identical to 3.5.8.
Spark 4.1.1 (audited 2026-05-27): byte-for-byte identical to 3.5.8.
Known limitation: Size over MapType falls back to Spark (https://github.com/apache/datafusion-comet/issues/4472).
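The CaseWhen shape described for size can be modeled in a few lines of plain Python. This is a sketch under stated assumptions, not the CometSize implementation: spark_size is a hypothetical name, None stands in for NULL, and the else-branch literal is taken to be -1 or NULL depending on legacySizeOfNull, as the notes describe.

```python
def spark_size(value, legacy_size_of_null: bool):
    """Hypothetical model of Size(child, legacySizeOfNull) for array or map input."""
    if value is not None:
        # Corresponds to the size_scalar(child) branch of the CaseWhen.
        return len(value)
    # NULL input: the literal branch, -1 in legacy mode and NULL otherwise.
    return -1 if legacy_size_of_null else None
```

With this model, spark_size(None, True) gives -1 and spark_size(None, False) gives None, which is the difference the legacySizeOfNull flag controls.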