Comet Roadmap

Comet is an open-source project and contributors are welcome to work on any issues at any time, but we find it helpful to have a roadmap for some of the major items that require coordination between contributors.

Major Initiatives

Iceberg Integration

Iceberg table reads are now fully native, powered by a scan operator backed by Iceberg-rust (#2528). We anticipate major improvements in the next few releases, including bringing Iceberg table format V3 features (e.g., encryption) to the reader.

Spark 4.0 Support

Comet has experimental support for Spark 4.0, but there is more work to do (#1637), such as enabling more Spark SQL tests and fully implementing ANSI support (#313) for all supported expressions.

Dynamic Partition Pruning

Iceberg table scans support Dynamic Partition Pruning (DPP) filters generated by Spark’s PlanDynamicPruningFilters optimizer rule (#3349). However, we still need to bring this functionality to our Parquet reader. Furthermore, Spark’s PlanAdaptiveDynamicPruningFilters optimizer rule runs after Comet’s rules, so DPP with Adaptive Query Execution requires a redesign of Comet’s plan translation. We are focused on implementing DPP to keep Comet competitive on benchmarks that benefit from this feature, such as TPC-DS. This effort can be tracked at #3510.
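For readers unfamiliar with DPP, the following is a minimal conceptual sketch of the idea in plain Python. It is not Comet's or Spark's implementation: the toy tables and the helper names `dimension_keys` and `pruned_scan` are hypothetical and exist only for illustration. The sketch shows how the join keys that survive a filter on a small dimension table can be used at runtime to skip whole partitions of a large fact table.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy layout: the fact table is stored as partitions keyed by
# date_id; the dimension table is a small list of (date_id, year) rows.
FactPartitions = Dict[int, List[Tuple[int, float]]]


def dimension_keys(dim_rows: List[Tuple[int, int]], year: int) -> Set[int]:
    """Evaluate the dimension-side filter and collect the surviving join keys."""
    return {date_id for date_id, row_year in dim_rows if row_year == year}


def pruned_scan(
    fact: FactPartitions, keep: Set[int]
) -> Tuple[List[int], List[Tuple[int, float]]]:
    """Scan only the fact partitions whose key appears in the runtime key set."""
    scanned: List[int] = []
    rows: List[Tuple[int, float]] = []
    for partition_key in sorted(fact):
        if partition_key in keep:  # all other partitions are never read
            scanned.append(partition_key)
            rows.extend(fact[partition_key])
    return scanned, rows


fact = {1: [(1, 10.0)], 2: [(2, 20.0)], 3: [(3, 30.0)]}
dim = [(1, 2023), (2, 2024), (3, 2024)]

keys = dimension_keys(dim, 2024)
scanned, rows = pruned_scan(fact, keys)
```

In this toy run, only partitions 2 and 3 are scanned because partition 1 has no matching dimension row. In a real engine the key set is produced while the query executes, which is why the ordering of optimizer rules relative to plan translation matters.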

Ongoing Improvements

In addition to the major initiatives above, we have the following ongoing areas of work:

Adding support for more Spark expressions
Moving more expressions to the datafusion-spark crate in the core DataFusion repository
Performance tuning
Nested type support improvements