Comet Roadmap

Comet is an open-source project, and contributors are welcome to work on any issue at any time. This roadmap covers the major items that require coordination between contributors.

Major Initiatives

Iceberg Integration

Reads of Iceberg tables with Parquet data files are fully native and enabled by default, powered by a scan operator backed by Iceberg-rust (#2528). We anticipate major improvements in the next few releases, including bringing Iceberg table format V3 features (e.g., encryption) to the reader.

Spark 4.0 Support

Comet has experimental support for Spark 4.0, but there is more work to do (#1637), such as enabling more Spark SQL tests and fully implementing ANSI support (#313) for all supported expressions.

Dynamic Partition Pruning

Both Iceberg table scans and Parquet V1 native scans (CometNativeScanExec) support non-AQE Dynamic Partition Pruning (DPP) filters generated by Spark’s PlanDynamicPruningFilters optimizer rule (#3349, #3511). However, Spark’s PlanAdaptiveDynamicPruningFilters optimizer rule runs after Comet’s rules, so DPP with Adaptive Query Execution (AQE) requires a redesign of Comet’s plan translation. This work is tracked in #3510.
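To illustrate why rule ordering matters here, the following is a toy model in Python, not Comet or Spark code. Every name in it (`Scan`, `add_dpp_filter`, `translate_to_native`, `apply_rules`) is hypothetical, and it assumes, purely for illustration, that plan translation captures whatever filters are attached to a scan at the moment the translation rule runs.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Scan:
    """Hypothetical scan node; not a Spark or Comet class."""
    table: str
    filters: Tuple[str, ...] = ()
    # Filters captured at translation time; None means "not translated yet".
    native_filters: Optional[Tuple[str, ...]] = None


Rule = Callable[[Scan], Scan]


def add_dpp_filter(scan: Scan) -> Scan:
    # Toy stand-in for a rule that attaches a dynamic pruning filter.
    return replace(scan, filters=scan.filters + ("dpp(date_id)",))


def translate_to_native(scan: Scan) -> Scan:
    # Toy stand-in for plan translation (assumption: it snapshots the
    # filters that are visible at the time it runs).
    return replace(scan, native_filters=scan.filters)


def apply_rules(scan: Scan, rules: Sequence[Rule]) -> Scan:
    # Apply each rule in order, feeding the result into the next rule.
    for rule in rules:
        scan = rule(scan)
    return scan


scan = Scan("sales")

# Filter is planned before translation: the native plan sees it.
early = apply_rules(scan, [add_dpp_filter, translate_to_native])

# Filter is planned after translation: the native plan was already built.
late = apply_rules(scan, [translate_to_native, add_dpp_filter])

print(early.native_filters)  # ('dpp(date_id)',)
print(late.native_filters)   # ()
```

In this sketch the first ordering loosely corresponds to the non-AQE case and the second to a pruning rule that runs after translation. The real interaction between Spark's optimizer rules and Comet's rules is more involved; see #3510 for the actual design discussion.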

Ongoing Improvements

In addition to the major initiatives above, we have the following ongoing areas of work:

  • Adding support for more Spark expressions

  • Moving more expressions to the datafusion-spark crate in the core DataFusion repository

  • Performance tuning

  • Nested type support improvements