Comet Roadmap
Comet is an open-source project and contributors are welcome to work on any issues at any time, but we find it helpful to have a roadmap for some of the major items that require coordination between contributors.
Major Initiatives
Iceberg Integration
Iceberg integration is still a work in progress (#2060), with major improvements expected in the next few releases. Once this integration is complete, we plan on switching from the native_comet scan to the native_iceberg_compat scan (#2189) so that complex types can be supported.
Spark 4.0.0 Support
Comet has experimental support for Spark 4.0.0, but there is more work to do (#1637), such as enabling more Spark SQL tests and fully implementing ANSI support (#313) for all supported expressions.
Removing the native_comet scan implementation
We are working towards deprecating (#2186) and removing (#2177) the native_comet scan implementation. This is the original scan implementation; it uses mutable buffers (which is incompatible with best practices around Arrow FFI) and does not support complex types.
Once we are using the native_iceberg_compat scan (which is based on DataFusion’s DataSourceExec) in the Iceberg integration, we will be able to remove the native_comet scan implementation, and can then improve the efficiency of our use of Arrow FFI (#2171).
Ongoing Improvements
In addition to the major initiatives above, we have the following ongoing areas of work:
Adding support for more Spark expressions
Moving more expressions to the datafusion-spark crate in the core DataFusion repository
Performance tuning
Nested type support improvements