Comet Roadmap#
Comet is an open-source project and contributors are welcome to work on any issues at any time, but we find it helpful to have a roadmap for some of the major items that require coordination between contributors.
Major Initiatives#
Iceberg Integration#
Iceberg table reads are now fully native, powered by a scan operator backed by Iceberg-rust (#2528). We anticipate major improvements in the next few releases, including bringing Iceberg table format V3 features (e.g., encryption) to the reader.
Spark 4.0 Support#
Comet has experimental support for Spark 4.0, but there is more work to do (#1637), such as enabling more Spark SQL tests and fully implementing ANSI support (#313) for all supported expressions.
Dynamic Partition Pruning#
Iceberg table scans support Dynamic Partition Pruning (DPP) filters generated by Spark’s PlanDynamicPruningFilters
optimizer rule (#3349). However, we still need to bring this functionality to our Parquet reader. Furthermore,
Spark’s PlanAdaptiveDynamicPruningFilters optimizer rule runs after Comet’s rules, so DPP with Adaptive Query
Execution requires a redesign of Comet’s plan translation. We are focused on implementing DPP to keep Comet competitive
on benchmarks that benefit from this feature, such as TPC-DS. This effort is tracked at #3510.
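For readers unfamiliar with the technique, the idea behind DPP can be sketched in a few lines of plain Python. This is a conceptual illustration only, not Comet or Spark code: the tables, the `dynamic_pruning_keys` and `scan_with_pruning` helpers, and the data are all hypothetical, and a real engine would broadcast the keys into the scan rather than filter a dictionary.

```python
# Conceptual sketch of dynamic partition pruning; not Comet or Spark code.
from typing import Dict, List, Set, Tuple

# Hypothetical fact table, stored as partition key -> rows of (store_id, amount).
sales_partitions: Dict[str, List[Tuple[int, float]]] = {
    "2024-01-01": [(1, 10.0), (2, 5.0)],
    "2024-01-02": [(1, 7.5)],
    "2024-01-03": [(3, 2.0)],
}

# Hypothetical dimension table with rows of (date, is_holiday).
dates: List[Tuple[str, bool]] = [
    ("2024-01-01", True),
    ("2024-01-02", False),
    ("2024-01-03", True),
]


def dynamic_pruning_keys(dim_rows: List[Tuple[str, bool]]) -> Set[str]:
    """Evaluate the dimension-side filter first and collect the join keys."""
    return {date for date, is_holiday in dim_rows if is_holiday}


def scan_with_pruning(partitions, keys):
    """Read only the partitions whose key survived the dimension filter."""
    scanned = [key for key in partitions if key in keys]
    rows = [row for key in scanned for row in partitions[key]]
    return scanned, rows


keys = dynamic_pruning_keys(dates)
scanned, rows = scan_with_pruning(sales_partitions, keys)
print(scanned)  # partitions actually read
```

The pruning keys are only known at runtime, after the dimension-side filter has been evaluated, which is why the ordering of optimizer rules relative to plan translation matters for this feature.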
Ongoing Improvements#
In addition to the major initiatives above, we have the following ongoing areas of work:
Adding support for more Spark expressions
Moving more expressions to the datafusion-spark crate in the core DataFusion repository
Performance tuning
Nested type support improvements