Comet Roadmap
Comet is an open-source project and contributors are welcome to work on any issues at any time, but we find it helpful to have a roadmap for some of the major items that require coordination between contributors.
Major Initiatives
Iceberg Integration
Reads of Iceberg tables with Parquet data files are fully native and enabled by default, powered by a scan operator backed by Iceberg-rust (#2528). We anticipate major improvements in the next few releases, including bringing Iceberg table format V3 features (e.g., encryption) to the reader.
Spark 4.0 Support
Comet has experimental support for Spark 4.0, but there is more work to do (#1637), such as enabling more Spark SQL tests and fully implementing ANSI support (#313) for all supported expressions.
Dynamic Partition Pruning
Both Iceberg table scans and Parquet V1 native scans (CometNativeScanExec) support non-AQE Dynamic Partition Pruning
(DPP) filters generated by Spark’s PlanDynamicPruningFilters optimizer rule (#3349, #3511). However, Spark’s
PlanAdaptiveDynamicPruningFilters optimizer rule runs after Comet’s rules, so DPP with Adaptive Query Execution
requires a redesign of Comet’s plan translation. This effort can be tracked at #3510.
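As a rough illustration of the distinction above, the following sketch collects Spark settings that would be expected to steer a query onto the non-AQE DPP path, assuming that Spark applies PlanDynamicPruningFilters only when Adaptive Query Execution is disabled. The helper names `non_aqe_dpp_conf` and `to_submit_args` are hypothetical and not part of Comet or Spark; the configuration keys are standard Spark and Comet settings, but the exact combination required may differ by release.

```python
# Sketch only: settings that would likely exercise non-AQE Dynamic Partition Pruning.
# The function names are hypothetical helpers, not Comet or Spark APIs.

def non_aqe_dpp_conf():
    """Return Spark conf entries for running Comet with DPP enabled and AQE disabled."""
    return {
        # Load the Comet plugin and enable Comet execution.
        "spark.plugins": "org.apache.spark.CometPlugin",
        "spark.comet.enabled": "true",
        # DPP is on by default in Spark; set explicitly here for clarity.
        "spark.sql.optimizer.dynamicPartitionPruning.enabled": "true",
        # Assumption: with AQE off, Spark plans DPP filters via the non-adaptive rule.
        "spark.sql.adaptive.enabled": "false",
    }


def to_submit_args(conf):
    """Flatten a conf dict into spark-submit style '--conf key=value' arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(non_aqe_dpp_conf())))
```

The resulting arguments could be appended to a `spark-submit` or `spark-shell` invocation; whether a given query actually uses DPP still depends on Spark's own heuristics and thresholds.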
Ongoing Improvements
In addition to the major initiatives above, we have the following ongoing areas of work:
Adding support for more Spark expressions
Moving more expressions to the datafusion-spark crate in the core DataFusion repository
Performance tuning
Nested type support improvements