datafusion.catalog
Data catalog providers.
Classes
Catalog | DataFusion data catalog.
CatalogProvider | Abstract class for defining a Python-based Catalog Provider.
Schema | DataFusion Schema.
SchemaProvider | Abstract class for defining a Python-based Schema Provider.
Table | A DataFusion table.
Module Contents
- class datafusion.catalog.Catalog(catalog: datafusion._internal.catalog.RawCatalog)
DataFusion data catalog.
This constructor is not typically called by the end user.
- __repr__() -> str
Print a string representation of the catalog.
- deregister_schema(name: str, cascade: bool = True) -> Schema | None
Deregister a schema from this catalog.
- names() -> set[str]
This is an alias for schema_names.
- register_schema(name: str, schema: Schema | SchemaProvider | SchemaProviderExportable) -> Schema | None
Register a schema with this catalog.
- schema_names() -> set[str]
Returns the set of schema names in this catalog.
- catalog
- class datafusion.catalog.CatalogProvider
Bases: abc.ABC
Abstract class for defining a Python-based Catalog Provider.
- deregister_schema(name: str, cascade: bool) -> None
Remove a schema from this catalog.
This method is optional. If your catalog provides a fixed list of schemas, you do not need to implement this method.
- Parameters:
name – The name of the schema to remove.
cascade – If true, deregister the tables within the schema.
- register_schema(name: str, schema: SchemaProviderExportable | SchemaProvider | Schema) -> None
Add a schema to this catalog.
This method is optional. If your catalog provides a fixed list of schemas, you do not need to implement this method.
- abstract schema_names() -> set[str]
Set of the names of all schemas in this catalog.
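The split between the one abstract method and the two optional ones can be sketched as follows. This is a minimal illustration only: `CatalogProviderSketch` is a hypothetical stand-in that mirrors the signatures documented above so the example runs without DataFusion installed, and `InMemoryCatalog` is likewise hypothetical. Raising an error when a non-empty schema is removed with `cascade=False` is an assumption of this sketch, not documented behavior. In real use you would subclass `datafusion.catalog.CatalogProvider` instead.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class CatalogProviderSketch(ABC):
    """Stand-in mirroring the documented CatalogProvider method signatures."""

    @abstractmethod
    def schema_names(self) -> set[str]:
        """Set of the names of all schemas in this catalog."""

    def register_schema(self, name: str, schema: object) -> None:
        """Optional hook: add a schema to this catalog."""

    def deregister_schema(self, name: str, cascade: bool) -> None:
        """Optional hook: remove a schema from this catalog."""


class InMemoryCatalog(CatalogProviderSketch):
    """Hypothetical provider that keeps its schemas in a dict."""

    def __init__(self) -> None:
        # Schema name -> set of table names (a simplified stand-in for a schema).
        self._schemas: dict[str, set[str]] = {}

    def schema_names(self) -> set[str]:
        return set(self._schemas)

    def register_schema(self, name: str, schema: set[str]) -> None:
        self._schemas[name] = set(schema)

    def deregister_schema(self, name: str, cascade: bool) -> None:
        tables = self._schemas.get(name)
        if tables is None:
            return  # Nothing registered under this name.
        if tables and not cascade:
            # Assumption of this sketch: refuse to drop a non-empty schema.
            raise ValueError(f"schema {name!r} still contains tables")
        del self._schemas[name]


catalog = InMemoryCatalog()
catalog.register_schema("sales", {"orders"})
catalog.register_schema("staging", set())
catalog.deregister_schema("staging", cascade=False)  # Empty, so no cascade needed.
remaining = catalog.schema_names()
```

A provider with a fixed set of schemas would implement only `schema_names` and leave the two optional methods untouched.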
- class datafusion.catalog.Schema(schema: datafusion._internal.catalog.RawSchema)
DataFusion Schema.
This constructor is not typically called by the end user.
- __repr__() -> str
Print a string representation of the schema.
- deregister_table(name: str) -> None
Deregister a table provider from this schema.
- names() -> set[str]
This is an alias for table_names.
- register_table(name: str, table: Table | datafusion.context.TableProviderExportable | datafusion.DataFrame | pyarrow.dataset.Dataset) -> None
Register a table in this schema.
- table_names() -> set[str]
Returns the set of names of all tables in this schema.
- _raw_schema
- class datafusion.catalog.SchemaProvider
Bases: abc.ABC
Abstract class for defining a Python-based Schema Provider.
- deregister_table(name: str, cascade: bool) -> None
Remove a table from this schema.
This method is optional. If your schema provides a fixed list of tables, you do not need to implement this method.
- owner_name() -> str | None
Returns the owner of the schema.
This is an optional method. The default return is None.
- register_table(name: str, table: Table | datafusion.context.TableProviderExportable | Any) -> None
Add a table to this schema.
This method is optional. If your schema provides a fixed list of tables, you do not need to implement this method.
- abstract table_exist(name: str) -> bool
Returns true if the table exists in this schema.
- abstract table_names() -> set[str]
Set of the names of all tables in this schema.
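For a schema whose table list never changes, only the abstract methods need an implementation. The sketch below illustrates that case under stated assumptions: `SchemaProviderSketch` is a hypothetical stand-in that mirrors only the methods documented above (the real `datafusion.catalog.SchemaProvider` may require more), and `FixedSchema` and its table names are invented for illustration.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class SchemaProviderSketch(ABC):
    """Stand-in mirroring the documented SchemaProvider method signatures."""

    @abstractmethod
    def table_names(self) -> set[str]:
        """Set of the names of all tables in this schema."""

    @abstractmethod
    def table_exist(self, name: str) -> bool:
        """Return True if the table exists in this schema."""

    def owner_name(self) -> str | None:
        """Optional: owner of the schema. Defaults to None, as documented."""
        return None


class FixedSchema(SchemaProviderSketch):
    """Hypothetical read-only provider exposing a fixed set of tables."""

    def __init__(self, tables: list[str]) -> None:
        # Frozen so the table list cannot change after construction.
        self._tables = frozenset(tables)

    def table_names(self) -> set[str]:
        return set(self._tables)

    def table_exist(self, name: str) -> bool:
        return name in self._tables


schema = FixedSchema(["orders", "customers"])
```

Because the table list is fixed, `register_table` and `deregister_table` are simply not implemented here, and `owner_name` falls back to its default.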
- class datafusion.catalog.Table(table: Table | datafusion.context.TableProviderExportable | datafusion.DataFrame | pyarrow.dataset.Dataset)
A DataFusion table.
Internally we currently support the following types of tables:
- Tables created using built-in DataFusion methods, such as reading from CSV or Parquet
- pyarrow datasets
- DataFusion DataFrames, which will be converted into a view
- Externally provided tables implemented with the FFI PyCapsule interface (advanced)
Constructor.
- __repr__() -> str
Print a string representation of the table.
- static from_dataset(dataset: pyarrow.dataset.Dataset) -> Table
Turn a pyarrow.dataset Dataset into a Table.
- __slots__ = ('_inner',)
- _inner
- property kind: str
Returns the kind of table.
- property schema: pyarrow.Schema
Returns the schema associated with this table.