String Expressions#
Concat#
The following incompatibilities cause Concat to fall back to Spark by default. Set spark.comet.expression.Concat.allowIncompatible=true to enable Comet acceleration despite these differences.
concat does not support non-UTF8_BINARY collations (https://github.com/apache/datafusion-comet/issues/2190)
The following cases are not supported by Comet:
CONCAT supports only string input parameters
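The opt-in flags on this page all follow the pattern `spark.comet.expression.<Name>.allowIncompatible`. As a minimal sketch, the helper below builds those settings for a list of expression names and renders them as `spark-submit --conf` arguments. The function names `allow_incompatible_conf` and `to_submit_args` are hypothetical and not part of Comet or Spark; only the configuration key pattern comes from this page.

```python
def allow_incompatible_conf(expressions):
    """Build the opt-in settings for the given expression names.

    Hypothetical helper: only the key pattern is taken from this page.
    """
    return {
        f"spark.comet.expression.{name}.allowIncompatible": "true"
        for name in expressions
    }


def to_submit_args(conf):
    """Render a settings dict as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Opt in only the expressions whose documented differences are acceptable.
conf = allow_incompatible_conf(["Concat", "Reverse"])
print(" ".join(to_submit_args(conf)))
```

Enabling the flag per expression, rather than for every expression at once, keeps the Spark fallback in place wherever the listed differences have not been reviewed.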
Left#
The following cases are not supported by Comet:
Only supports BinaryType and StringType input
The length argument must be a literal value
Length#
The following cases are not supported by Comet:
BinaryType input is not supported
RLike#
The following incompatibilities cause RLike to fall back to Spark by default. Set spark.comet.expression.RLike.allowIncompatible=true to enable Comet acceleration despite these differences.
Uses the Rust regexp engine, which behaves differently from the Java regexp engine
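One way to gauge the risk before enabling the flag is to scan patterns for syntax that Java accepts but the Rust `regex` crate rejects, namely look-around and backreferences. The sketch below assumes that crate is the engine in use, which this page does not state. The helper `risky_constructs` is hypothetical, and the check is a heuristic rather than a full parser, so it can misjudge patterns that use these character sequences inside character classes.

```python
import re

# Constructs that Java's regex engine accepts but the Rust `regex` crate
# rejects. Assumption: that crate is the engine behind Comet's RLike.
_RISKY = {
    "look-around": re.compile(r"(?<!\\)\(\?<?[=!]"),  # (?= (?! (?<= (?<!
    "backreference": re.compile(r"\\[1-9]|\\k<"),     # \1 ... \9 and \k<name>
}


def risky_constructs(pattern):
    """Return the names of risky constructs found in a regex pattern."""
    # Drop escaped backslashes so a literal backslash followed by a digit
    # is not mistaken for a backreference.
    stripped = pattern.replace("\\\\", "")
    return [name for name, rx in _RISKY.items() if rx.search(stripped)]


print(risky_constructs(r"foo(?=bar)"))       # flags look-around
print(risky_constructs(r"(\w+)\s+\1"))       # flags backreference
print(risky_constructs(r"^[a-z]+\d{2}$"))    # nothing flagged
```

A clean result does not guarantee identical matching, since the two engines can also differ in areas such as Unicode handling; it only rules out the constructs listed here.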
RegExpReplace#
The following incompatibilities cause RegExpReplace to fall back to Spark by default. Set spark.comet.expression.RegExpReplace.allowIncompatible=true to enable Comet acceleration despite these differences.
Regexp pattern may not be compatible with Spark
The following cases are not supported by Comet:
Only supports regexp_replace with an offset of 1 (no offset)
Reverse#
The following incompatibilities cause Reverse to fall back to Spark by default. Set spark.comet.expression.Reverse.allowIncompatible=true to enable Comet acceleration despite these differences.
reverse on an array containing binary is not supported
reverse does not support non-UTF8_BINARY collations (https://github.com/apache/datafusion-comet/issues/2190)
Right#
The following cases are not supported by Comet:
Only supports StringType input
StringLPad#
The following cases are not supported by Comet:
Scalar values are not supported for the str argument
Only scalar values are supported for the pad argument
StringRPad#
The following cases are not supported by Comet:
Scalar values are not supported for the str argument
Only scalar values are supported for the pad argument
StringRepeat#
The following differences from Spark are always present and do not require any additional configuration:
A negative argument for the number of times to repeat throws an exception instead of returning an empty string as Spark does
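The difference can be modelled in plain Python. This is a sketch of the two behaviors as described above, not code from Spark or Comet; the function names are hypothetical, and `ValueError` stands in for whatever exception Comet actually raises.

```python
def repeat_spark(s, n):
    """Model of Spark: a negative count yields an empty string."""
    return s * max(n, 0)


def repeat_comet(s, n):
    """Model of Comet: a negative count raises instead.

    ValueError is a placeholder for the real exception type.
    """
    if n < 0:
        raise ValueError("negative repeat count")
    return s * n


def clamp_count(n):
    """Guard that maps negative counts to zero before repeating."""
    return max(n, 0)


# With the guard applied, both models agree for any count.
for n in (-2, 0, 3):
    assert repeat_comet("ab", clamp_count(n)) == repeat_spark("ab", n)
```

In a query, a comparable guard could be written as a CASE expression on the count, so that negative values never reach the repeat call.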
StringSplit#
The following incompatibilities cause StringSplit to fall back to Spark by default. Set spark.comet.expression.StringSplit.allowIncompatible=true to enable Comet acceleration despite these differences.
Regex engine differences between Java and Rust