String Expressions#

Concat#

The following incompatibilities cause Concat to fall back to Spark by default. Set spark.comet.expression.Concat.allowIncompatible=true to enable Comet acceleration despite these differences.

  • Comet's CONCAT supports only string inputs, whereas Spark's concat also accepts binary and array inputs
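The opt-in keys on this page all follow the pattern spark.comet.expression.<ExpressionName>.allowIncompatible. As a minimal sketch, the hypothetical helper below (allow_incompatible_conf is an illustrative name, not part of Comet or Spark) builds those entries for a set of expressions and prints them in the --conf form accepted by spark-submit.

```python
def allow_incompatible_conf(*expressions):
    """Build conf entries that opt the named expressions into Comet.

    Hypothetical helper for illustration only; the key pattern is the
    one documented on this page.
    """
    return {
        f"spark.comet.expression.{name}.allowIncompatible": "true"
        for name in expressions
    }


conf = allow_incompatible_conf("Concat", "InitCap")

# Print the entries as spark-submit style arguments.
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

The same dictionary could be applied one entry at a time when building a session; enable only the expressions whose documented differences are acceptable for the workload.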

GetJsonObject#

The following incompatibilities cause GetJsonObject to fall back to Spark by default. Set spark.comet.expression.GetJsonObject.allowIncompatible=true to enable Comet acceleration despite these differences.

  • Spark accepts single-quoted JSON and unescaped control characters, neither of which Comet supports
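To get a feel for which inputs fall into this category, the sketch below uses Python's standard json module as a stand-in for a strict parser. This is an assumption made for illustration: it is not Comet's parser, and the helper name is_strict_json is hypothetical. It only shows that single-quoted JSON and raw control characters inside strings are rejected by a strict parser.

```python
import json


def is_strict_json(text):
    """Return True if text parses under Python's strict JSON rules."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


samples = {
    "double-quoted": '{"a": 1}',
    "single-quoted": "{'a': 1}",
    # The \t below becomes a raw tab character inside the JSON string.
    "raw tab in string": '{"a": "x\ty"}',
}

for label, text in samples.items():
    print(label, is_strict_json(text))
```

A check like this could be run over a sample of the data before enabling the option, to estimate how many rows rely on the lenient forms.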

InitCap#

The following incompatibilities cause InitCap to fall back to Spark by default. Set spark.comet.expression.InitCap.allowIncompatible=true to enable Comet acceleration despite these differences.

  • Comet treats a hyphen as a word separator (e.g. robert rose-smith produces Robert Rose-Smith instead of Spark’s Robert Rose-smith). See https://github.com/apache/datafusion-comet/issues/1052
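The difference can be reproduced with a small model of title-casing in which the set of word separators is a parameter. This is a sketch of the behavior described above, not the code of either engine; the function name initcap and its separators parameter are illustrative.

```python
def initcap(text, separators):
    """Lowercase text, then uppercase the first character of each word.

    A new word starts at the beginning of the string and after any
    character listed in separators.
    """
    out = []
    start_of_word = True
    for ch in text.lower():
        out.append(ch.upper() if start_of_word else ch)
        start_of_word = ch in separators
    return "".join(out)


name = "robert rose-smith"
print(initcap(name, " "))   # space only, the Spark result quoted above
print(initcap(name, " -"))  # space and hyphen, the Comet result quoted above
```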

Left#

The following cases are not supported by Comet:

  • Input types other than BinaryType and StringType

  • A length argument that is not a literal value

Length#

The following cases are not supported by Comet:

  • BinaryType input is not supported

Lower#

The following incompatibilities cause Lower to fall back to Spark by default. Set spark.comet.expression.Lower.allowIncompatible=true to enable Comet acceleration despite these differences.

  • Results can vary depending on locale and character set. Acceleration requires spark.comet.caseConversion.enabled=true.
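Case conversion is not a simple one-to-one character mapping, which is why results can depend on locale and character set. The short example below uses Python's own Unicode case mapping purely as an illustration; it makes no claim about what Spark or Comet return for these inputs.

```python
# U+0130 (LATIN CAPITAL LETTER I WITH DOT ABOVE) lowercases to two
# code points under the default Unicode mapping: "i" plus U+0307.
dotted_capital_i = "\u0130"
lowered = dotted_capital_i.lower()
print(len(dotted_capital_i), len(lowered))

# German sharp s uppercases to two characters.
print("ß".upper())
```

Inputs like these are good candidates for a comparison test before enabling the option.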

RLike#

The following incompatibilities cause RLike to fall back to Spark by default. Set spark.comet.expression.RLike.allowIncompatible=true to enable Comet acceleration despite these differences.

  • Comet uses the Rust regexp engine, which behaves differently from the Java regexp engine used by Spark
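One well-known class of differences is that the Rust regex crate does not support lookaround assertions or backreferences, both of which Java accepts. The sketch below is a rough, non-exhaustive heuristic for flagging such patterns before enabling the option; the function name uses_java_only_syntax is hypothetical, and the check can report false positives (for example, an escaped backslash followed by a digit).

```python
import re

# Matches lookahead/lookbehind openers, numeric backreferences (\1..\9),
# and named backreferences (\k<name>). Heuristic only.
_JAVA_ONLY = re.compile(r"\(\?<?[=!]|\\[1-9]|\\k<")


def uses_java_only_syntax(pattern):
    """Return True if pattern appears to use lookaround or backreferences."""
    return bool(_JAVA_ONLY.search(pattern))


for pattern in [r"^[a-z]+\d*$", r"foo(?=bar)", r"(\w+)\s+\1"]:
    print(pattern, uses_java_only_syntax(pattern))
```

Patterns that pass this check may still differ in other ways, so comparing results on representative data remains the safer test.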

RegExpReplace#

The following incompatibilities cause RegExpReplace to fall back to Spark by default. Set spark.comet.expression.RegExpReplace.allowIncompatible=true to enable Comet acceleration despite these differences.

  • A regexp pattern may not behave the same way as it does in Spark

The following cases are not supported by Comet:

  • regexp_replace with an offset other than 1 (an offset of 1 means no offset)

Reverse#

The following incompatibilities cause Reverse to fall back to Spark by default. Set spark.comet.expression.Reverse.allowIncompatible=true to enable Comet acceleration despite these differences.

  • reverse on an array containing binary values is not supported

StringLPad#

The following cases are not supported by Comet:

  • Scalar values for the str argument are not supported. The pad argument must be a scalar value.

StringRPad#

The following cases are not supported by Comet:

  • Scalar values for the str argument are not supported. The pad argument must be a scalar value.

StringRepeat#

The following differences from Spark are always present and do not require any additional configuration:

  • A negative repeat count throws an exception, whereas Spark returns an empty string
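The two behaviors can be modeled in a few lines. The functions below are illustrative stand-ins with hypothetical names, not the engines' code: one returns an empty string for a negative count, as described for Spark, and the other raises, as described for Comet. Clamping the count to zero before the call is one way to get the same result from both.

```python
def repeat_lenient(text, count):
    """Model of the Spark behavior: a negative count yields ''."""
    return text * max(count, 0)


def repeat_strict(text, count):
    """Model of the Comet behavior: a negative count is an error."""
    if count < 0:
        raise ValueError("repeat count must not be negative")
    return text * count


def safe_count(count):
    """Clamp the count so both models agree."""
    return max(count, 0)


print(repr(repeat_lenient("ab", -1)))
print(repr(repeat_strict("ab", safe_count(-1))))
```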

StringSplit#

The following incompatibilities cause StringSplit to fall back to Spark by default. Set spark.comet.expression.StringSplit.allowIncompatible=true to enable Comet acceleration despite these differences.

  • The Java regex engine used by Spark and the Rust regex engine used by Comet behave differently

Upper#

The following incompatibilities cause Upper to fall back to Spark by default. Set spark.comet.expression.Upper.allowIncompatible=true to enable Comet acceleration despite these differences.

  • Results can vary depending on locale and character set. Acceleration requires spark.comet.caseConversion.enabled=true.