math_funcs Expression Audits#
Audit notes for expressions in this category that have been audited. Absence of an entry means the expression has not been audited yet, not that it is unsupported. See the user guide Spark Expression Support for current support status.
%#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
Remainder(left, right, evalMode) signature identical across versions. Native path uses the Rust spark_modulo UDF; non-ANSI returns NULL on divide-by-zero, ANSI raises DIVIDE_BY_ZERO / REMAINDER_BY_ZERO. CometRemainder rejects EvalMode.TRY, so try_mod (Spark 4.0+) falls back to Spark (https://github.com/apache/datafusion-comet/issues/4484).
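The divide-by-zero split between the two modes can be illustrated with a small model. This is a sketch only, not the spark_modulo implementation: the function name, the exception class, and the assumption that the result takes the sign of the dividend (JVM remainder semantics) are illustrative choices.

```python
from typing import Optional


class DivideByZeroError(ArithmeticError):
    """Hypothetical stand-in for the DIVIDE_BY_ZERO / REMAINDER_BY_ZERO error."""


def spark_remainder(left: Optional[int], right: Optional[int], ansi: bool) -> Optional[int]:
    # NULL in, NULL out.
    if left is None or right is None:
        return None
    if right == 0:
        if ansi:
            raise DivideByZeroError("remainder by zero")
        return None  # non-ANSI: NULL instead of an error
    # Assumed JVM-style remainder: the result takes the sign of the dividend,
    # unlike Python's % operator, which follows the divisor.
    magnitude = abs(left) % abs(right)
    return -magnitude if left < 0 else magnitude
```

Calling `spark_remainder(7, 0, ansi=False)` yields `None`, while the same call with `ansi=True` raises.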
*#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
Multiply(left, right, evalMode) signature identical across versions. Decimal results exceeding DECIMAL128_MAX_PRECISION go through WideDecimalBinaryExpr (Decimal256 intermediate); smaller decimals and primitives use DataFusion BinaryExpr. ANSI integer overflow uses Rust checked_mul. Interval multiplication falls back to Spark.
+#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
Add(left, right, evalMode) with the same Decimal / ANSI plumbing as *. Date + Int8/16/32 dispatches to the Rust date_add UDF to work around DataFusion’s Date32 + Interval-only kernel.
-#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
Subtract(left, right, evalMode) mirrors +. Date - Int8/16/32 uses the Rust date_sub UDF.
/#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
Divide(left, right, evalMode). Non-ANSI mode wraps the divisor in If(EqualTo(right, 0), null, right) so DataFusion never throws. Decimal output is wrapped in CheckOverflow(failOnError = ANSI); ANSI surfaces NUMERIC_VALUE_OUT_OF_RANGE, non-ANSI returns NULL.
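The non-ANSI divisor rewrite can be modelled as a small wrapper applied before the division runs. The helper names below are hypothetical and the sketch covers only the floating-point case; it shows why the native division never sees a zero divisor.

```python
from typing import Optional


def null_if_zero(divisor: Optional[float]) -> Optional[float]:
    # Models If(EqualTo(right, 0), null, right): a zero divisor becomes NULL.
    if divisor is None or divisor == 0:
        return None
    return divisor


def divide_non_ansi(left: Optional[float], right: Optional[float]) -> Optional[float]:
    safe_divisor = null_if_zero(right)
    # Any NULL operand makes the whole expression NULL, so no error is raised.
    if left is None or safe_divisor is None:
        return None
    return left / safe_divisor
```

Because the zero check happens in the wrapper, the division itself never needs its own divide-by-zero handling in this mode.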
abs#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
Abs(child, failOnError) over NumericType plus the two interval types. failOnError (ANSI) is propagated to the native abs UDF, which throws ARITHMETIC_OVERFLOW on Int.MinValue / Long.MinValue / Decimal MIN. DayTimeIntervalType and YearMonthIntervalType fall back to Spark. Spark 4.0 / 4.1 do the NullIntolerant -> nullIntolerant: Boolean refactor; behaviour unchanged.
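The overflow case arises because a 32-bit two's-complement integer has no positive counterpart for its minimum value. The sketch below models that for Int only; the exception class is a hypothetical stand-in for ARITHMETIC_OVERFLOW, and the wraparound result when failOnError is false is an assumption based on JVM Math.abs behaviour, not something these notes state.

```python
from typing import Optional

INT_MIN = -(2**31)
INT_MAX = 2**31 - 1


class ArithmeticOverflow(ArithmeticError):
    """Hypothetical stand-in for the ARITHMETIC_OVERFLOW error."""


def abs_int32(value: Optional[int], fail_on_error: bool) -> Optional[int]:
    if value is None:
        return None
    if value == INT_MIN:
        # |INT_MIN| is INT_MAX + 1, which does not fit in 32 bits.
        if fail_on_error:
            raise ArithmeticOverflow("abs overflow on Int.MinValue")
        return INT_MIN  # assumed two's-complement wraparound
    return abs(value)
```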
acos#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.acos, "ACOS") unchanged across versions; wired as CometScalarFunction("acos") to DataFusion’s acos UDF. NaN for |x| > 1.
acosh#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): custom StrictMath.log(x + sqrt(x*x - 1)) unchanged across versions. NaN for x < 1. Routes to DataFusion’s acosh.
asin#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.asin, "ASIN") unchanged. NaN for |x| > 1.
asinh#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): special-cases Double.NegativeInfinity to avoid log(NaN), otherwise StrictMath.log(x + sqrt(x*x + 1)). Identical across versions.
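Why the special case is needed can be seen by evaluating the formula at negative infinity: the sum inside the logarithm becomes -inf + inf, which is NaN. The sketch below uses Python's math module in place of StrictMath and hypothetical function names. It illustrates only the infinity case, since Python's math.log raises on zero where the JVM would return -Infinity.

```python
import math


def naive_asinh(x: float) -> float:
    # Direct use of the formula with no special case.
    return math.log(x + math.sqrt(x * x + 1.0))


def guarded_asinh(x: float) -> float:
    # Special-case negative infinity so the formula never evaluates log(NaN).
    if x == -math.inf:
        return -math.inf
    return math.log(x + math.sqrt(x * x + 1.0))
```

With the guard in place, `guarded_asinh(-math.inf)` returns negative infinity, whereas `naive_asinh(-math.inf)` returns NaN.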
atan#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.atan, "ATAN") unchanged.
atan2#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
BinaryMathExpression(math.atan2, "ATAN2") with both inputs adjusted by +0.0 to flip -0.0 to +0.0. CometAtan2 reproduces this by wrapping each child in Add(child, Literal.default(child.dataType)) before dispatching to DataFusion’s atan2.
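The effect of the +0.0 adjustment is visible on the negative x-axis, where the sign of a zero y decides between -pi and +pi. A minimal sketch, with a hypothetical function name and Python's math.atan2 standing in for both the JVM and DataFusion implementations:

```python
import math


def spark_atan2(y: float, x: float) -> float:
    # Under IEEE 754 round-to-nearest, -0.0 + 0.0 is +0.0, so adding 0.0
    # normalizes a negative zero and leaves every other value unchanged.
    return math.atan2(y + 0.0, x + 0.0)


raw = math.atan2(-0.0, -1.0)        # negative zero kept: result is -pi
adjusted = spark_atan2(-0.0, -1.0)  # negative zero flipped: result is +pi
```

The wrapper mirrors the Add(child, zero literal) rewrite described above; for inputs that are not negative zero the result is unchanged.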
atanh#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): custom 0.5 * (log1p(x) - log1p(-x)) (SPARK-28519). NaN for |x| > 1, +/-Infinity for x = +/-1.
bin#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
Bin(child) over LongType -> StringType. Spark 4.x gains DefaultStringProducingExpression and the nullIntolerant: Boolean refactor; no behaviour change. Routes to datafusion-spark SparkBin.
cbrt#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): passthrough to DataFusion cbrt.
ceil#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): one-arg ceil(expr) supported (LongType / DoubleType / DecimalType with scale >= 0). Decimal with negative scale falls back at convert time. The two-arg ceil(expr, scale) form (RoundCeil) is not wired and falls back to Spark.
ceiling#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): registry alias for Ceil. Same support as ceil.
cos#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.cos, "COS") unchanged across versions.
cosh#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.cosh, "COSH") unchanged.
cot#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): custom 1 / math.tan(x). DataFusion’s cot is also 1.0 / tan(x), so the result matches.
csc#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): custom 1 / math.sin(x). Routed to datafusion-spark’s SparkCsc (registered in jni_api.rs).
degrees#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.toDegrees, "DEGREES") unchanged across versions.
div#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
IntegralDivide(left, right, evalMode). Non-decimal operands are cast to DecimalType(19, 0); result is recomputed per IntegralDivide.resultDecimalType, wrapped in CheckOverflow, then cast to Long. ANSI overflow for Long.MinValue div -1 and decimal-overflow ANSI cases are covered by existing tests.
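The Long.MinValue div -1 case overflows because the mathematical quotient is one larger than Long.MaxValue. The following sketch models only ANSI behaviour on Long operands; the function and exception names are hypothetical, truncation toward zero is assumed, and the decimal cast chain described above is not reproduced.

```python
from typing import Optional

LONG_MIN = -(2**63)
LONG_MAX = 2**63 - 1


class AnsiArithmeticError(ArithmeticError):
    """Hypothetical stand-in for the ANSI divide-by-zero and overflow errors."""


def integral_divide_ansi(left: Optional[int], right: Optional[int]) -> Optional[int]:
    if left is None or right is None:
        return None
    if right == 0:
        raise AnsiArithmeticError("division by zero")
    # Assumed truncation toward zero (Python's // floors instead).
    quotient = abs(left) // abs(right)
    if (left < 0) != (right < 0):
        quotient = -quotient
    if quotient < LONG_MIN or quotient > LONG_MAX:
        raise AnsiArithmeticError("integral divide overflow")
    return quotient
```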
e#
Foldable; rewritten to a literal by ConstantFolding (like pi).
exp#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(StrictMath.exp, "EXP") unchanged. ULP-level differences vs DataFusion exp are possible but unflagged.
expm1#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(StrictMath.expm1, "EXPM1") unchanged.
factorial#
3.4.3 (audited 2026-05-15): identical to v3.5.8.
3.5.8 (audited 2026-05-15): canonical reference; extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant. Returns NULL for NULL input or values outside [0, 20].
4.0.1 (audited 2026-05-15): NullIntolerant trait replaced by nullIntolerant: Boolean method override; behavior unchanged.
4.1.1 (audited 2026-05-27): identical to 4.0.1.
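The upper bound of 20 corresponds to the largest factorial that fits in a signed 64-bit integer. A minimal model of the NULL rule, using a hypothetical function name:

```python
import math
from typing import Optional

LONG_MAX = 2**63 - 1


def spark_factorial(n: Optional[int]) -> Optional[int]:
    # NULL for NULL input or for values outside [0, 20].
    if n is None or n < 0 or n > 20:
        return None
    return math.factorial(n)


# 20! still fits in a signed 64-bit integer; 21! does not.
fits_20 = math.factorial(20) <= LONG_MAX
fits_21 = math.factorial(21) <= LONG_MAX
```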
floor#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): mirror of ceil. Two-arg floor(expr, scale) form (RoundFloor) falls back to Spark.
greatest#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): NULL-skipping variadic. Wired as CometScalarFunction("greatest") to DataFusion’s GreatestFunc. Comet does not gate input types, so interval inputs and other Spark-only orderings rely on the native UDF accepting them; no explicit fallback path.
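"NULL-skipping variadic" means NULL arguments are ignored rather than propagated, and the result is NULL only when every argument is NULL. A small model with a hypothetical function name, which assumes all non-NULL arguments are mutually comparable:

```python
from typing import Optional


def greatest(*values: Optional[int]) -> Optional[int]:
    # Drop NULLs first; they never win and never poison the result.
    present = [v for v in values if v is not None]
    # Only an all-NULL argument list yields NULL.
    return max(present) if present else None
```

The `least` counterpart would follow the same pattern with `min`.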
hex#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): accepts LongType / BinaryType / StringType. Spark 4.x widens StringType to StringTypeWithCollation and preserves collation in dataType; CometHex passes expr.dataType to native SparkHex, which always returns Utf8, so collation propagation may diverge on Spark 4.x.
least#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): mirror of greatest; same caveats. Spark 4.1.1 adds contextIndependentFoldable (no Comet impact).
ln#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): registry alias for Log. Comet wires through CometLog to DataFusion ln with a nullIfNegative rewrite to match Spark’s NULL behaviour for x <= 0.
log#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): one-arg log(x) -> CometLog (DataFusion ln); two-arg log(base, x) -> CometLogarithm (custom spark_log UDF, returns NULL when base <= 0 or x <= 0 to match Logarithm.nullSafeEval).
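The NULL rule for the two-arg form can be written out as a guard in front of a change-of-base computation. This is a sketch with a hypothetical function name, not the spark_log UDF itself; the change-of-base formula is an assumption about how the value is computed, and Python's math.log stands in for the JVM and Rust logarithms.

```python
import math
from typing import Optional


def spark_log(base: Optional[float], x: Optional[float]) -> Optional[float]:
    if base is None or x is None:
        return None
    # NULL when either argument is non-positive.
    if base <= 0 or x <= 0:
        return None
    # base == 1 is not modelled here: log(1) is 0, so this division would
    # raise ZeroDivisionError in Python.
    return math.log(x) / math.log(base)
```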
log10#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryLogExpression(StrictMath.log10, "LOG10"); returns NULL for x <= 0. Possible ULP differences from StrictMath.
log2#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryLogExpression(StrictMath.log(x) / StrictMath.log(2), "LOG2"); returns NULL for x <= 0.
mod#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): registry alias for Remainder. Same support as %.
negative#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMinus(child, failOnError) -> Rust NegativeExpr. ANSI overflow is detected for Int8 / Int16 / Int32 / Int64 and IntervalYearMonth / IntervalDayTime. Float / Double / Decimal cannot overflow on negate. Spark 4.0 NullIntolerant -> nullIntolerant: Boolean refactor; no impact.
pi#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
LeafMathExpression(math.Pi, "PI"); foldable, so Spark ConstantFolding rewrites it to a Literal before Comet sees the plan. The CometScalarFunction("pi") registration is exercised only when ConstantFolding is excluded.
positive#
Spark 3.4.3, 3.5.8 (audited 2026-05-27): UnaryPositive(child) is a regular expression. There is no Comet serde for UnaryPositive, so projections containing +col silently disable Comet for the projection on 3.4/3.5.
Spark 4.0.1, 4.1.1 (audited 2026-05-27): UnaryPositive is RuntimeReplaceable with replacement = child; the optimizer removes it before Comet sees the plan, so the gap is transparent on 4.x.
pow#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
Pow(left, right) extends BinaryMathExpression(StrictMath.pow, "POWER"); routes to DataFusion pow. ULP-level differences possible.
power#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): registry alias for Pow. Same support as pow.
radians#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.toRadians, "RADIANS") unchanged across versions.
rand#
See misc_funcs / rand.
randn#
See misc_funcs / randn.
rint#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.rint, "ROUND") with funcName = "rint". Passthrough to DataFusion rint (round-half-to-even).
round#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): HALF_UP rounding for integer / decimal. Float / Double child types always fall back to Spark because BigDecimal-via-toString rounding cannot be precisely matched (documented inline in CometRound). ANSI failOnError is propagated for integer overflow. BRound (HALF_EVEN) is not wired.
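The difference between the two rounding modes, and the reason string-based rounding of doubles is hard to reproduce, can be shown with Python's decimal module. This is an illustration under stated assumptions rather than the Spark or Comet code path: Decimal stands in for BigDecimal, and repr is assumed to play the role of the JVM's shortest-representation toString.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

ONE = Decimal("1")
CENT = Decimal("0.01")

# HALF_UP (round) and HALF_EVEN (bround) disagree on exact ties.
half_up = Decimal("2.5").quantize(ONE, rounding=ROUND_HALF_UP)
half_even = Decimal("2.5").quantize(ONE, rounding=ROUND_HALF_EVEN)

# The double nearest to 2.675 is slightly below 2.675. Rounding its exact
# binary value and rounding its printed form give different answers.
value = 2.675
via_string = Decimal(repr(value)).quantize(CENT, rounding=ROUND_HALF_UP)
via_binary = Decimal(value).quantize(CENT, rounding=ROUND_HALF_UP)
```

Here `via_string` is 2.68 while `via_binary` is 2.67, which is the kind of divergence a native implementation working on the raw double would have to avoid.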
sec#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): custom 1 / math.cos(x). Routed to datafusion-spark’s SparkSec.
shiftleft#
See bitwise_funcs / << (audited in PR #4479). Same support as the operator alias added in 4.0.
sign#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): registry alias for Signum. Same support as signum.
signum#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
Signum(child) over DoubleType. Spark also restricts to the two interval types via inputTypes; Comet handles only the Double case via DataFusion signum.
sin#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.sin, "SIN") unchanged.
sinh#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.sinh, "SINH") unchanged.
sqrt#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.sqrt, "SQRT") unchanged.
tan#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.tan, "TAN") unchanged.
tanh#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
UnaryMathExpression(math.tanh, "TANH") unchanged.
try_add#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
TryAdd is RuntimeReplaceable and rewrites to Add(.., EvalMode.TRY) for numeric inputs (datetime / interval go through TryEval(Add(.., ANSI)) and fall back). Numeric path uses the Rust checked_add UDF, returning NULL on overflow. Decimal goes through WideDecimalBinaryExpr with EvalMode.Try.
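The TRY contract for integer addition is that an out-of-range sum becomes NULL instead of raising or wrapping. A minimal model for 64-bit inputs, with a hypothetical function name; it is not the checked_add UDF itself:

```python
from typing import Optional

LONG_MIN = -(2**63)
LONG_MAX = 2**63 - 1


def try_add_int64(left: Optional[int], right: Optional[int]) -> Optional[int]:
    if left is None or right is None:
        return None
    # Python integers are unbounded, so the exact sum can be range-checked.
    total = left + right
    if total < LONG_MIN or total > LONG_MAX:
        return None  # TRY mode: overflow becomes NULL
    return total
```

The same shape applies to the subtract and multiply variants, with only the arithmetic operator changed.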
try_divide#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
TryDivide rewrites to Divide(.., EvalMode.TRY). The nullIfWhenPrimitive wrapper swaps zero divisors to NULL; integer / float divide uses checked_div; decimal uses decimal_div + CheckOverflow(failOnError = false) returning NULL.
try_multiply#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): rewrites to Multiply(.., EvalMode.TRY). Integer path uses checked_mul; decimal uses WideDecimalBinaryExpr with EvalMode.Try, returning NULL on overflow.
try_subtract#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27): rewrites to Subtract(.., EvalMode.TRY). Integer path uses checked_sub; decimal uses WideDecimalBinaryExpr as needed.
unhex#
Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 (audited 2026-05-27):
Unhex(child, failOnError). Spark 4.x widens input to StringTypeWithCollation and wraps the inner call in try/catch; Comet's CometUnhex forwards failOnError to native spark_unhex but does not gate on collation.
width_bucket#
Spark 3.5.8 (audited 2026-05-27): introduced; not available in 3.4.3.
Spark 4.0.1, 4.1.1 (audited 2026-05-27): same semantics; NullIntolerant -> nullIntolerant: Boolean refactor.
Known limitation: wired via per-version CometExprShim rather than a CometExpressionSerde, so it bypasses the support-level framework and the auto-generated compatibility doc (https://github.com/apache/datafusion-comet/issues/4485). Native path uses datafusion-spark SparkWidthBucket; interval input types are not exercised by Comet tests.