Supported Spark Expressions¶
Comet supports the following Spark expressions. Expressions that are marked as Spark-compatible will either run natively in Comet and provide the same results as Spark, or will fall back to Spark for cases that would not be compatible.
All expressions are enabled by default, but can be disabled by setting
spark.comet.expression.EXPRNAME.enabled=false
, where EXPRNAME
is the expression name as specified in
the following tables, such as Length
, or StartsWith
.
Expressions that are not Spark-compatible will fall back to Spark by default and can be enabled by setting
spark.comet.expression.EXPRNAME.allowIncompatible=true
.
It is also possible to specify spark.comet.expression.allowIncompatible=true
to enable all
incompatible expressions.
Conditional Expressions¶
Expression |
SQL |
Spark-Compatible? |
---|---|---|
CaseWhen |
|
Yes |
If |
|
Yes |
Predicate Expressions¶
Expression |
SQL |
Spark-Compatible? |
---|---|---|
And |
|
Yes |
EqualTo |
|
Yes |
EqualNullSafe |
|
Yes |
GreaterThan |
|
Yes |
GreaterThanOrEqual |
|
Yes |
LessThan |
|
Yes |
LessThanOrEqual |
|
Yes |
In |
|
Yes |
IsNotNull |
|
Yes |
IsNull |
|
Yes |
InSet |
|
Yes |
Not |
|
Yes |
Or |
|
Yes |
String Functions¶
Expression |
Spark-Compatible? |
Compatibility Notes |
---|---|---|
Ascii |
Yes |
|
BitLength |
Yes |
|
Chr |
Yes |
|
ConcatWs |
Yes |
|
Contains |
Yes |
|
EndsWith |
Yes |
|
InitCap |
No |
Behavior is different in some cases, such as hyphenated names. |
Length |
Yes |
|
Like |
Yes |
|
Lower |
No |
Results can vary depending on locale and character set. Requires |
OctetLength |
Yes |
|
Reverse |
Yes |
|
RLike |
No |
Uses Rust regexp engine, which has different behavior to Java regexp engine |
StartsWith |
Yes |
|
StringInstr |
Yes |
|
StringRepeat |
Yes |
Negative argument for number of times to repeat causes exception |
StringReplace |
Yes |
|
StringRPad |
Yes |
|
StringSpace |
Yes |
|
StringTranslate |
Yes |
|
StringTrim |
Yes |
|
StringTrimBoth |
Yes |
|
StringTrimLeft |
Yes |
|
StringTrimRight |
Yes |
|
Substring |
Yes |
|
Upper |
No |
Results can vary depending on locale and character set. Requires |
Date/Time Functions¶
Expression |
SQL |
Spark-Compatible? |
Compatibility Notes |
---|---|---|---|
DateAdd |
|
Yes |
|
DateSub |
|
Yes |
|
DatePart |
|
Yes |
Supported values of |
Extract |
|
Yes |
Supported values of |
FromUnixTime |
|
No |
Does not support format, supports only -8334601211038 <= sec <= 8210266876799 |
Hour |
|
Yes |
|
Minute |
|
Yes |
|
Second |
|
Yes |
|
TruncDate |
|
Yes |
|
TruncTimestamp |
|
Yes |
|
Year |
|
Yes |
|
Month |
|
Yes |
|
DayOfMonth |
|
Yes |
|
DayOfWeek |
|
Yes |
|
WeekDay |
|
Yes |
|
DayOfYear |
|
Yes |
|
WeekOfYear |
|
Yes |
|
Quarter |
|
Yes |
Math Expressions¶
Expression |
SQL |
Spark-Compatible? |
Compatibility Notes |
---|---|---|---|
Acos |
|
Yes |
|
Add |
|
Yes |
ANSI mode is not supported. |
Asin |
|
Yes |
|
Atan |
|
Yes |
|
Atan2 |
|
Yes |
|
BRound |
|
Yes |
ANSI mode is not supported. |
Ceil |
|
Yes |
|
Cos |
|
Yes |
|
Divide |
|
Yes |
ANSI mode is not supported. |
Exp |
|
Yes |
|
Expm1 |
|
Yes |
|
Floor |
|
Yes |
|
Hex |
|
Yes |
|
IntegralDivide |
|
Yes |
ANSI mode is not supported. |
IsNaN |
|
Yes |
|
Log |
|
Yes |
|
Log2 |
|
Yes |
|
Log10 |
|
Yes |
|
Multiply |
|
Yes |
ANSI mode is not supported. |
Pow |
|
Yes |
|
Rand |
|
Yes |
|
Randn |
|
Yes |
|
Remainder |
|
Yes |
ANSI mode is not supported. |
Round |
|
Yes |
ANSI mode is not supported. |
Signum |
|
Yes |
|
Sin |
|
Yes |
|
Sqrt |
|
Yes |
|
Subtract |
|
Yes |
ANSI mode is not supported. |
Tan |
|
Yes |
|
TryAdd |
|
Yes |
Only integer inputs are supported |
TryDivide |
|
Yes |
Only integer inputs are supported |
TryMultiply |
|
Yes |
Only integer inputs are supported |
TrySubtract |
|
Yes |
Only integer inputs are supported |
UnaryMinus |
|
Yes |
|
Unhex |
|
Yes |
Hashing Functions¶
Expression |
Spark-Compatible? |
---|---|
Md5 |
Yes |
Murmur3Hash |
Yes |
Sha2 |
Yes |
XxHash64 |
Yes |
Bitwise Expressions¶
Expression |
SQL |
Spark-Compatible? |
---|---|---|
BitwiseAnd |
|
Yes |
BitwiseCount |
Yes |
|
BitwiseGet |
Yes |
|
BitwiseOr |
|
Yes |
BitwiseNot |
|
Yes |
BitwiseXor |
|
Yes |
ShiftLeft |
|
Yes |
ShiftRight |
|
Yes |
Aggregate Expressions¶
Expression |
SQL |
Spark-Compatible? |
Compatibility Notes |
---|---|---|---|
Average |
Yes, except for ANSI mode |
||
BitAndAgg |
Yes |
||
BitOrAgg |
Yes |
||
BitXorAgg |
Yes |
||
BoolAnd |
|
Yes |
|
BoolOr |
|
Yes |
|
Corr |
Yes |
||
Count |
Yes |
||
CovPopulation |
Yes |
||
CovSample |
Yes |
||
First |
No |
This function is not deterministic. Results may not match Spark. |
|
Last |
No |
This function is not deterministic. Results may not match Spark. |
|
Max |
Yes |
||
Min |
Yes |
||
StddevPop |
Yes |
||
StddevSamp |
Yes |
||
Sum |
Yes, except for ANSI mode |
||
VariancePop |
Yes |
||
VarianceSamp |
Yes |
Array Expressions¶
Expression |
Spark-Compatible? |
Compatibility Notes |
---|---|---|
ArrayAppend |
No |
|
ArrayCompact |
No |
|
ArrayContains |
Yes |
|
ArrayDistinct |
No |
Behaves differently than spark. Comet first sorts then removes duplicates while Spark preserves the original order. |
ArrayExcept |
No |
|
ArrayInsert |
No |
|
ArrayIntersect |
No |
|
ArrayJoin |
No |
|
ArrayMax |
Yes |
|
ArrayMin |
Yes |
|
ArrayRemove |
Yes |
|
ArrayRepeat |
No |
|
ArrayUnion |
No |
Behaves differently than spark. Comet sorts the input arrays before performing the union, while Spark preserves the order of the first array and appends unique elements from the second. |
ArraysOverlap |
No |
|
CreateArray |
Yes |
|
ElementAt |
Yes |
Input must be an array. Map inputs are not supported. |
Flatten |
Yes |
|
GetArrayItem |
Yes |
Map Expressions¶
Expression |
Spark-Compatible? |
---|---|
GetMapValue |
Yes |
MapKeys |
Yes |
MapEntries |
Yes |
MapValues |
Yes |
MapFromArrays |
Yes |
Struct Expressions¶
Expression |
Spark-Compatible? |
---|---|
CreateNamedStruct |
Yes |
GetArrayStructFields |
Yes |
GetStructField |
Yes |
StructsToJson |
Yes |
Conversion Expressions¶
Expression |
Spark-Compatible |
Compatibility Notes |
---|---|---|
Cast |
Depends on specific cast |
See the Comet Compatibility Guide for list of supported cast expressions and known issues |
Other¶
Expression |
Spark-Compatible? |
Compatibility Notes |
---|---|---|
Alias |
Yes |
|
AttributeRefernce |
Yes |
|
BloomFilterMightContain |
Yes |
|
Coalesce |
Yes |
|
CheckOverflow |
Yes |
|
KnownFloatingPointNormalized |
Yes |
|
Literal |
Yes |
|
MakeDecimal |
Yes |
|
MonotonicallyIncreasingID |
Yes |
|
NormalizeNaNAndZero |
Yes |
|
PromotePrecision |
Yes |
|
RegExpReplace |
No |
Uses Rust regexp engine, which has different behavior to Java regexp engine |
ScalarSubquery |
Yes |
|
SparkPartitionID |
Yes |
|
ToPrettyString |
Yes |
|
UnscaledValue |
Yes |