Supported Spark Expressions

Comet supports the following Spark expressions. Expressions that are marked as Spark-compatible will either run natively in Comet and provide the same results as Spark, or will fall back to Spark for cases that would not be compatible.

All expressions are enabled by default, but can be disabled by setting spark.comet.expression.EXPRNAME.enabled=false, where EXPRNAME is the expression name as specified in the following tables, such as Length, or StartsWith.

Expressions that are not Spark-compatible will fall back to Spark by default and can be enabled by setting spark.comet.expression.EXPRNAME.allowIncompatible=true.

It is also possible to specify spark.comet.expression.allowIncompatible=true to enable all incompatible expressions.

Conditional Expressions

Expression

SQL

Spark-Compatible?

CaseWhen

CASE WHEN expr THEN expr ELSE expr END

Yes

If

IF(predicate_expr, true_expr, false_expr)

Yes

Predicate Expressions

Expression

SQL

Spark-Compatible?

And

AND

Yes

EqualTo

=

Yes

EqualNullSafe

<=>

Yes

GreaterThan

>

Yes

GreaterThanOrEqual

>=

Yes

LessThan

<

Yes

LessThanOrEqual

<=

Yes

In

IN

Yes

IsNotNull

IS NOT NULL

Yes

IsNull

IS NULL

Yes

InSet

IN (...)

Yes

Not

NOT

Yes

Or

OR

Yes

String Functions

Expression

Spark-Compatible?

Compatibility Notes

Ascii

Yes

BitLength

Yes

Chr

Yes

ConcatWs

Yes

Contains

Yes

EndsWith

Yes

InitCap

No

Behavior is different in some cases, such as hyphenated names.

Length

Yes

Like

Yes

Lower

No

Results can vary depending on locale and character set. Requires spark.comet.caseConversion.enabled=true

OctetLength

Yes

Reverse

Yes

RLike

No

Uses Rust regexp engine, which has different behavior to Java regexp engine

StartsWith

Yes

StringInstr

Yes

StringRepeat

Yes

Negative argument for number of times to repeat causes exception

StringReplace

Yes

StringRPad

Yes

StringSpace

Yes

StringTranslate

Yes

StringTrim

Yes

StringTrimBoth

Yes

StringTrimLeft

Yes

StringTrimRight

Yes

Substring

Yes

Upper

No

Results can vary depending on locale and character set. Requires spark.comet.caseConversion.enabled=true

Date/Time Functions

Expression

SQL

Spark-Compatible?

Compatibility Notes

DateAdd

date_add

Yes

DateSub

date_sub

Yes

DatePart

date_part(field, source)

Yes

Supported values of field: year/month/week/day/dayofweek/dayofweek_iso/doy/quarter/hour/minute

Extract

extract(field FROM source)

Yes

Supported values of field: year/month/week/day/dayofweek/dayofweek_iso/doy/quarter/hour/minute

FromUnixTime

from_unixtime

No

Does not support format, supports only -8334601211038 <= sec <= 8210266876799

Hour

hour

Yes

Minute

minute

Yes

Second

second

Yes

TruncDate

trunc

Yes

TruncTimestamp

trunc_date

Yes

Year

year

Yes

Month

month

Yes

DayOfMonth

day/dayofmonth

Yes

DayOfWeek

dayofweek

Yes

WeekDay

weekday

Yes

DayOfYear

dayofyear

Yes

WeekOfYear

weekofyear

Yes

Quarter

quarter

Yes

Math Expressions

Expression

SQL

Spark-Compatible?

Compatibility Notes

Acos

acos

Yes

Add

+

Yes

ANSI mode is not supported.

Asin

asin

Yes

Atan

atan

Yes

Atan2

atan2

Yes

BRound

bround

Yes

ANSI mode is not supported.

Ceil

ceil

Yes

Cos

cos

Yes

Divide

/

Yes

ANSI mode is not supported.

Exp

exp

Yes

Expm1

expm1

Yes

Floor

floor

Yes

Hex

hex

Yes

IntegralDivide

div

Yes

ANSI mode is not supported.

IsNaN

isnan

Yes

Log

log

Yes

Log2

log2

Yes

Log10

log10

Yes

Multiply

*

Yes

ANSI mode is not supported.

Pow

power

Yes

Rand

rand

Yes

Randn

randn

Yes

Remainder

%

Yes

ANSI mode is not supported.

Round

round

Yes

ANSI mode is not supported.

Signum

signum

Yes

Sin

sin

Yes

Sqrt

sqrt

Yes

Subtract

-

Yes

ANSI mode is not supported.

Tan

tan

Yes

TryAdd

try_add

Yes

Only integer inputs are supported

TryDivide

try_div

Yes

Only integer inputs are supported

TryMultiply

try_mul

Yes

Only integer inputs are supported

TrySubtract

try_sub

Yes

Only integer inputs are supported

UnaryMinus

-

Yes

Unhex

unhex

Yes

Hashing Functions

Expression

Spark-Compatible?

Md5

Yes

Murmur3Hash

Yes

Sha2

Yes

XxHash64

Yes

Bitwise Expressions

Expression

SQL

Spark-Compatible?

BitwiseAnd

&

Yes

BitwiseCount

Yes

BitwiseGet

Yes

BitwiseOr

|

Yes

BitwiseNot

~

Yes

BitwiseXor

^

Yes

ShiftLeft

<<

Yes

ShiftRight

>>

Yes

Aggregate Expressions

Expression

SQL

Spark-Compatible?

Compatibility Notes

Average

Yes, except for ANSI mode

BitAndAgg

Yes

BitOrAgg

Yes

BitXorAgg

Yes

BoolAnd

bool_and

Yes

BoolOr

bool_or

Yes

Corr

Yes

Count

Yes

CovPopulation

Yes

CovSample

Yes

First

No

This function is not deterministic. Results may not match Spark.

Last

No

This function is not deterministic. Results may not match Spark.

Max

Yes

Min

Yes

StddevPop

Yes

StddevSamp

Yes

Sum

Yes, except for ANSI mode

VariancePop

Yes

VarianceSamp

Yes

Array Expressions

Expression

Spark-Compatible?

Compatibility Notes

ArrayAppend

No

ArrayCompact

No

ArrayContains

Yes

ArrayDistinct

No

Behaves differently than spark. Comet first sorts then removes duplicates while Spark preserves the original order.

ArrayExcept

No

ArrayInsert

No

ArrayIntersect

No

ArrayJoin

No

ArrayMax

Yes

ArrayMin

Yes

ArrayRemove

Yes

ArrayRepeat

No

ArrayUnion

No

Behaves differently than spark. Comet sorts the input arrays before performing the union, while Spark preserves the order of the first array and appends unique elements from the second.

ArraysOverlap

No

CreateArray

Yes

ElementAt

Yes

Input must be an array. Map inputs are not supported.

Flatten

Yes

GetArrayItem

Yes

Map Expressions

Expression

Spark-Compatible?

GetMapValue

Yes

MapKeys

Yes

MapEntries

Yes

MapValues

Yes

MapFromArrays

Yes

Struct Expressions

Expression

Spark-Compatible?

CreateNamedStruct

Yes

GetArrayStructFields

Yes

GetStructField

Yes

StructsToJson

Yes

Conversion Expressions

Expression

Spark-Compatible

Compatibility Notes

Cast

Depends on specific cast

See the Comet Compatibility Guide for list of supported cast expressions and known issues

Other

Expression

Spark-Compatible?

Compatibility Notes

Alias

Yes

AttributeRefernce

Yes

BloomFilterMightContain

Yes

Coalesce

Yes

CheckOverflow

Yes

KnownFloatingPointNormalized

Yes

Literal

Yes

MakeDecimal

Yes

MonotonicallyIncreasingID

Yes

NormalizeNaNAndZero

Yes

PromotePrecision

Yes

RegExpReplace

No

Uses Rust regexp engine, which has different behavior to Java regexp engine

ScalarSubquery

Yes

SparkPartitionID

Yes

ToPrettyString

Yes

UnscaledValue

Yes