Map Expressions
MapSort (Spark 4.0+)
Spark 4.0 inserts MapSort to normalize map-typed values wherever map ordering must be
deterministic, for example in shuffle hash partitioning keys and in try_element_at. Comet
runs MapSort natively, so shuffles and group-by operations on map columns stay on Comet under Spark 4.0.
When spark.comet.exec.strictFloatingPoint=true, MapSort falls back to Spark for maps whose
keys contain Float or Double (consistent with SortOrder and SortArray). The fallback exists
because Arrow’s sort uses IEEE total ordering for floating-point values, which differs from
Spark’s Double.compare semantics for NaN and -0.0.
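To make the ordering difference concrete, here is a minimal Python sketch that sorts the same keys two ways. It is an illustration only, not Comet or Spark code: `total_order_key` models IEEE 754 total ordering through the bit pattern of the double, and `spark_style_compare` is an assumed model of Spark's comparison in which -0.0 equals 0.0 and every NaN compares equal to every other NaN and greater than all other values. Both function names are hypothetical.

```python
import math
import struct
from functools import cmp_to_key


def total_order_key(x: float) -> int:
    """Map a double to an int whose natural order matches IEEE 754 totalOrder."""
    bits = struct.unpack("<q", struct.pack("<d", x))[0]  # signed 64-bit view
    # For negative doubles, flip the low 63 bits so larger magnitudes sort first.
    return bits ^ 0x7FFFFFFFFFFFFFFF if bits < 0 else bits


def spark_style_compare(x: float, y: float) -> int:
    """Assumed model: -0.0 == 0.0, NaN == NaN, NaN is the largest value."""
    if x == y:
        return 0
    x_nan, y_nan = math.isnan(x), math.isnan(y)
    if x_nan or y_nan:
        return int(x_nan) - int(y_nan)
    return -1 if x < y else 1


def label(x: float) -> str:
    """Readable label that keeps the sign of NaN and of zero visible."""
    if math.isnan(x):
        return "-NaN" if math.copysign(1.0, x) < 0 else "+NaN"
    return repr(x)


# Build NaNs from explicit bit patterns so their sign bits are unambiguous.
pos_nan = struct.unpack("<d", struct.pack("<Q", 0x7FF8000000000000))[0]
neg_nan = struct.unpack("<d", struct.pack("<Q", 0xFFF8000000000000))[0]

keys = [0.0, neg_nan, 1.0, -0.0, pos_nan, -1.0]

total_order = [label(k) for k in sorted(keys, key=total_order_key)]
spark_style = [label(k) for k in sorted(keys, key=cmp_to_key(spark_style_compare))]

print("total order:", total_order)
print("spark style:", spark_style)
```

Under these assumptions the two orderings disagree on where a negative NaN lands and on whether -0.0 and 0.0 are distinct, which is the kind of mismatch the strict floating-point fallback avoids.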
MapFromEntries
The following incompatibilities cause MapFromEntries to fall back to Spark by default. Set spark.comet.expression.MapFromEntries.allowIncompatible=true to enable Comet acceleration despite these differences.
Using BinaryType as Map keys is not allowed in map_from_entries
Using BinaryType as Map values is not allowed in map_from_entries
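One way to supply the opt-in setting is on the spark-submit command line through its `--conf key=value` option. The sketch below only assembles the arguments; the helper name `spark_submit_conf_args` and the script name `app.py` are hypothetical, and the sketch assumes Comet is already installed and enabled for the job.

```python
def spark_submit_conf_args(conf: dict) -> list:
    """Render Spark settings as spark-submit `--conf key=value` arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# Opt in to Comet's MapFromEntries despite the incompatibilities listed above.
comet_conf = {
    "spark.comet.expression.MapFromEntries.allowIncompatible": "true",
}

# app.py is a placeholder for the actual application script.
command = ["spark-submit", *spark_submit_conf_args(comet_conf), "app.py"]
print(" ".join(command))
```

Keeping the settings in a dictionary makes it easy to add further Comet options, such as spark.comet.exec.strictFloatingPoint, without changing how the command is built.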