String Expressions#
BitLength#
The following cases are not supported by Comet:
BinaryType input is not supported
Concat#
The following incompatibilities cause Concat to fall back to Spark by default. Set spark.comet.expression.Concat.allowIncompatible=true to enable Comet acceleration despite these differences.
concat does not support non-UTF8_BINARY collations (https://github.com/apache/datafusion-comet/issues/2190)
The following cases are not supported by Comet:
CONCAT supports only string input parameters
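The opt-in setting above follows the key pattern spark.comet.expression.<Expression>.allowIncompatible. As a minimal sketch, the helper below builds the matching `--conf` arguments for spark-submit. The function names `allow_incompatible_key` and `spark_submit_conf_args` are hypothetical and are not part of Comet or Spark; only the configuration key itself comes from this page.

```python
from typing import Iterable, List


def allow_incompatible_key(expression: str) -> str:
    """Return the opt-in config key for one expression (hypothetical helper)."""
    return f"spark.comet.expression.{expression}.allowIncompatible"


def spark_submit_conf_args(expressions: Iterable[str]) -> List[str]:
    """Build `--conf key=true` argument pairs for spark-submit (hypothetical helper)."""
    args: List[str] = []
    for name in expressions:
        args += ["--conf", f"{allow_incompatible_key(name)}=true"]
    return args


# Opt in to Comet acceleration of Concat despite the listed differences.
conf_args = spark_submit_conf_args(["Concat"])
print(" ".join(conf_args))
```

The same key can instead be set on a session or in a defaults file. Enabling it only makes sense when the listed incompatibilities do not affect the workload.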
Left#
The following cases are not supported by Comet:
Only supports BinaryType and StringType input
The length argument must be a literal value
Length#
The following cases are not supported by Comet:
BinaryType input is not supported
OctetLength#
The following cases are not supported by Comet:
BinaryType input is not supported
Reverse#
By default, Comet accelerates Reverse using JVM codegen dispatch, which runs Spark’s generated code inside Comet’s native pipeline and matches Spark exactly. Set spark.comet.expression.Reverse.allowIncompatible=true to use Comet’s faster native implementation instead, which has the following differences from Spark:
reverse on an array containing binary values is not supported
reverse does not support non-UTF8_BINARY collations (https://github.com/apache/datafusion-comet/issues/2190)
Right#
The following cases are not supported by Comet:
Only supports StringType input
StringLPad#
The following cases are not supported by Comet:
Scalar values are not supported for the str argument.
Only scalar values are supported for the pad argument.
StringRPad#
The following cases are not supported by Comet:
Scalar values are not supported for the str argument.
Only scalar values are supported for the pad argument.
StringRepeat#
The following differences from Spark are always present and do not require any additional configuration:
A negative argument for the number of times to repeat throws an exception instead of returning an empty string as Spark does
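To make the difference concrete, here is a small pure-Python model of the two behaviors described above. It is an illustration only, not Comet's or Spark's code: `spark_like_repeat` and `strict_repeat` are hypothetical names, and `ValueError` merely stands in for whatever exception the native implementation raises. The last line sketches one possible workaround, assuming the count can be clamped to zero before it reaches repeat.

```python
def spark_like_repeat(s: str, n: int) -> str:
    """Model of the Spark behavior described above: a negative count yields ''."""
    return s * max(n, 0)


def strict_repeat(s: str, n: int) -> str:
    """Model of the Comet behavior described above: a negative count raises.

    ValueError is a stand-in; the actual exception type is not specified here.
    """
    if n < 0:
        raise ValueError(f"repeat count must be non-negative, got {n}")
    return s * n


count = -1
print(repr(spark_like_repeat("ab", count)))  # empty string

# Possible workaround (an assumption, not documented guidance):
# clamp the count so the strict implementation never sees a negative value.
print(repr(strict_repeat("ab", max(count, 0))))
```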
StringTranslate#
The following incompatibilities cause StringTranslate to fall back to Spark by default. Set spark.comet.expression.StringTranslate.allowIncompatible=true to enable Comet acceleration despite these differences.
DataFusion’s translate iterates over Unicode graphemes (Spark uses code points) and substitutes U+0000 instead of treating it as a deletion sentinel
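The U+0000 difference can be illustrated with a simplified pure-Python model. This is a sketch built only from the description above, not the Spark or DataFusion source: `translate_model` and its `nul_deletes` flag are hypothetical, and the rule that characters in `matching` with no counterpart in `replace` are dropped is an assumption about both engines. Python iterates strings by code point, so the model does not reproduce the grapheme-based iteration.

```python
from typing import Dict, Optional


def translate_model(text: str, matching: str, replace: str, nul_deletes: bool) -> str:
    """Simplified translate model (hypothetical, for illustration only).

    nul_deletes=True  models Spark: a U+0000 replacement means "delete".
    nul_deletes=False models the native path: U+0000 is substituted as-is.
    """
    mapping: Dict[str, Optional[str]] = {}
    for i, ch in enumerate(matching):
        # Assumption: characters without a counterpart in `replace` are deleted.
        rep = replace[i] if i < len(replace) else None
        mapping.setdefault(ch, rep)  # first mapping for a character wins

    out = []
    for ch in text:  # iterates by code point, not by grapheme
        if ch not in mapping:
            out.append(ch)
            continue
        rep = mapping[ch]
        if rep is None or (nul_deletes and rep == "\u0000"):
            continue  # drop the character
        out.append(rep)
    return "".join(out)


spark_like = translate_model("abc", "b", "\u0000", nul_deletes=True)
native_like = translate_model("abc", "b", "\u0000", nul_deletes=False)
print(repr(spark_like), repr(native_like))

# "e" followed by a combining acute accent: one grapheme, two code points.
print(len("e\u0301"))
```

Inputs made only of single-code-point characters, with no U+0000 in the replacement string, behave the same in both variants of this model. That is the case in which opting in looks lowest risk.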