Custom Table Provider

If you have a custom data source that you want to integrate with DataFusion, you can implement the TableProvider interface in Rust and expose it to Python. This requires DataFusion 43.0.0 or later, and your class must expose an FFI_TableProvider via a PyCapsule.

A complete example can be found in the examples folder.

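use std::{ffi::CString, sync::Arc};

use datafusion_ffi::table_provider::FFI_TableProvider;
use pyo3::{prelude::*, types::PyCapsule};

#[pymethods]
impl MyTableProvider {
    // Assumes MyTableProvider is a #[pyclass] that implements Clone and TableProvider.
    fn __datafusion_table_provider__<'py>(
        &self,
        py: Python<'py>,
    ) -> PyResult<Bound<'py, PyCapsule>> {
        // Name the capsule so the Python side can identify its contents.
        let name = CString::new("datafusion_table_provider").unwrap();

        // Wrap a clone of the provider in the FFI struct; `false` means the
        // provider does not support filter pushdown.
        let provider = FFI_TableProvider::new(Arc::new(self.clone()), false);

        PyCapsule::new_bound(py, provider, Some(name))
    }
}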

Once this library is available, you can register your table provider with the SessionContext in Python.

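from datafusion import SessionContext

# MyTableProvider is imported from your compiled Rust extension module.
ctx = SessionContext()
provider = MyTableProvider()
ctx.register_table_provider("my_table", provider)

ctx.table("my_table").show()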
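
Registration depends on the object exposing the `__datafusion_table_provider__` method defined in the Rust snippet. The following is a minimal sketch of a pre-flight check you could run before registering. Both `exports_table_provider` and `FakeProvider` are hypothetical names used only for illustration; they are not part of DataFusion. The check only confirms that the method exists and is callable, and does not validate the capsule it would return.

```python
def exports_table_provider(obj: object) -> bool:
    """Return True if obj has a callable __datafusion_table_provider__ attribute.

    Hypothetical helper: it inspects the attribute only and never calls it.
    """
    method = getattr(obj, "__datafusion_table_provider__", None)
    return callable(method)


class FakeProvider:
    """Hypothetical stand-in for a class compiled from the Rust code."""

    def __datafusion_table_provider__(self):
        # A real provider returns a PyCapsule here; this stand-in has
        # nothing to export, so it raises if anyone calls it.
        raise NotImplementedError


print(exports_table_provider(FakeProvider()))  # True
print(exports_table_provider(object()))        # False
```

A check like this can turn a confusing registration failure into an early, readable error, for example when the extension module was built without the `#[pymethods]` block.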