Spark Data Type Support
This page is the complete reference for how Apache Comet handles each Spark data type. Comet’s native execution path is built on Apache Arrow, so the set of types Comet can express natively is constrained by Arrow’s type system. When a query references a type Comet does not support, the relevant operator falls back to Spark; results are unaffected.
For per-scan and per-operator type caveats (for example, Parquet read-time conversions or hash-aggregate group-key restrictions), see the Compatibility Guide.
Status legend
| Status | Meaning |
|---|---|
| ✅ Supported | Native support; enabled by default. |
| ⚠️ Supported (caveats) | Works, but with limits: certain values, contexts, or configurations fall back to Spark. |
| 🔜 Planned | Intended; tracked by an open issue or pull request. |
Not currently planned
The following types fall back to Spark and are not on the current roadmap. They are omitted from the tables below and may be reconsidered based on demand:
UserDefinedType: user-defined types are application-specific and outside the scope of native acceleration; queries referencing UDTs fall back to Spark.
Numeric
| Type | Status | Notes |
|---|---|---|
| | ✅ | |
| | ✅ | |
| | ✅ | |
| | ✅ | |
| | ✅ | NaN and signed-zero handling can diverge from Spark in comparisons and aggregations. See Floating-point Compatibility. |
| | ✅ | NaN and signed-zero handling can diverge from Spark in comparisons and aggregations. See Floating-point Compatibility. |
| | ✅ | |
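The NaN and signed-zero caveat above comes down to IEEE 754 semantics, which a few lines of plain Python can illustrate. This is a minimal sketch and does not use Comet or Spark. The helper names `bits` and `nan_last_key` are hypothetical. The NaN-last ordering imitates what the Spark SQL documentation describes for NaN (NaN equals NaN and sorts after every other value); it is not taken from Comet's implementation.

```python
import math
import struct


def bits(x: float) -> str:
    """Return the big-endian IEEE 754 bit pattern of a double as hex."""
    return struct.pack(">d", x).hex()


nan = float("nan")

# Under IEEE 754, NaN is not equal to itself.
nan_equals_itself = (nan == nan)

# 0.0 and -0.0 compare equal, yet their bit patterns differ, so an engine
# that groups or hashes on raw bits would see two distinct keys.
zeros_equal = (0.0 == -0.0)
zeros_same_bits = (bits(0.0) == bits(-0.0))


def nan_last_key(x: float):
    """Sort key placing NaN after all other values (assumed Spark-like ordering)."""
    is_nan = math.isnan(x)
    return (is_nan, 0.0 if is_nan else x)


sorted_values = sorted([nan, 1.0, -0.0, float("inf")], key=nan_last_key)

print(nan_equals_itself, zeros_equal, zeros_same_bits)
print(sorted_values)
```

An engine that follows raw IEEE comparison and one that follows the NaN-last convention can therefore disagree on filters, sort order, and grouping for the same input, which is the kind of divergence the notes refer to.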
String and binary
| Type | Status | Notes |
|---|---|---|
| | ✅ | Default UTF-8 binary collation is supported. Non-default collations (Spark 4.0+) fall back (#2190). |
| | ✅ | |
| | ✅ | Spark normalizes |
| | ✅ | Spark normalizes |
Boolean
| Type | Status | Notes |
|---|---|---|
| | ✅ | |
Datetime
| Type | Status | Notes |
|---|---|---|
| | ✅ | |
| | ✅ | |
| | ✅ | |
| | ⚠️ | Spark 4.1+. Native serialization is in place; some operators (sort, shuffle, min/max) are still being wired up (#4288). |
Interval
Interval types fall back to Spark today. Native acceleration is tracked by #4540.
| Type | Status | Notes |
|---|---|---|
| | 🔜 | Tracked by #4540. |
| | 🔜 | Tracked by #4540. |
| | 🔜 | Tracked by #4540. |
Complex
| Type | Status | Notes |
|---|---|---|
| | ✅ | Empty structs (no fields) fall back. |
| | ✅ | |
| | ✅ | Hash aggregate group keys cannot contain a |
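Several fallback rules on this page apply to nested schemas: user-defined types, interval types, non-default string collations, and empty structs. The sketch below shows one way to collect those reasons by walking a schema recursively. It is an illustration only, not Comet's implementation. The `Field` class, the `kind` strings, and `fallback_reasons` are all hypothetical, and applying the empty-struct rule at every nesting level is an assumption.

```python
from dataclasses import dataclass, field

# Spark's default collation name; other collations fall back per the notes above.
DEFAULT_COLLATION = "UTF8_BINARY"


@dataclass
class Field:
    """Hypothetical, simplified stand-in for a Spark schema field."""
    name: str
    kind: str  # e.g. "long", "string", "struct", "interval", "udt"
    children: list = field(default_factory=list)
    collation: str = DEFAULT_COLLATION


def fallback_reasons(f: Field, prefix: str = "") -> list:
    """Collect reasons why a field (or any nested field) would fall back."""
    path = f"{prefix}.{f.name}" if prefix else f.name
    reasons = []
    if f.kind == "udt":
        reasons.append(f"{path}: user-defined type")
    elif f.kind == "interval":
        reasons.append(f"{path}: interval type (planned)")
    elif f.kind == "string" and f.collation != DEFAULT_COLLATION:
        reasons.append(f"{path}: non-default collation {f.collation}")
    elif f.kind == "struct" and not f.children:
        reasons.append(f"{path}: empty struct")
    # Recurse into nested fields (assumption: rules apply at every depth).
    for child in f.children:
        reasons.extend(fallback_reasons(child, path))
    return reasons


schema = Field("root", "struct", [
    Field("id", "long"),
    Field("name", "string", collation="UNICODE_CI"),
    Field("meta", "struct", [Field("empty", "struct")]),
])
reasons = fallback_reasons(schema)
print(reasons)
```

In a real deployment the authoritative answer comes from the query plan and Comet's own fallback reporting; a checker like this is only useful as a quick pre-flight review of a schema against the tables on this page.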
Variant
Other
| Type | Status | Notes |
|---|---|---|
| | ✅ | |
See also
Comet Compatibility Guide - known incompatibilities and edge cases.
Parquet Scan Compatibility - per-type behavior at scan time.
Supported Spark Operators - the equivalent reference for operators.
Supported Spark Expressions - the equivalent reference for expressions.