datafusion.html_formatter
HTML formatting utilities for DataFusion DataFrames.
Classes
CellFormatter | Protocol for cell value formatters.
DataFrameHtmlFormatter | Configurable HTML formatter for DataFusion DataFrames.
DefaultStyleProvider | Default implementation of StyleProvider.
FormatterManager | Manager class for the global DataFrame HTML formatter instance.
StyleProvider | Protocol for HTML style providers.
Functions
_refresh_formatter_reference | Refresh formatter reference in any modules using it.
_validate_bool | Validate that a parameter is a boolean.
_validate_positive_int | Validate that a parameter is a positive integer.
configure_formatter | Configure the global DataFrame HTML formatter.
get_formatter | Get the current global DataFrame HTML formatter.
reset_formatter | Reset the global DataFrame HTML formatter to default settings.
reset_styles_loaded_state | Reset the styles loaded state to force reloading of styles.
set_formatter | Set the global DataFrame HTML formatter.
Module Contents
- class datafusion.html_formatter.CellFormatter
Bases: Protocol
Protocol for cell value formatters.
- __call__(value: Any) → str
Format a cell value to string representation.
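Because this is a protocol, any callable that accepts a value and returns a string should satisfy it structurally. The sketch below uses a local stand-in, `CellFormatterLike`, rather than importing from datafusion, and `format_float` is a hypothetical formatter chosen only for illustration.

```python
from typing import Any, Protocol


class CellFormatterLike(Protocol):
    """Local stand-in mirroring the documented protocol (not the library class)."""

    def __call__(self, value: Any) -> str: ...


def format_float(value: Any) -> str:
    """Hypothetical formatter: render a number with two decimal places."""
    return f"{value:.2f}"


# A plain function matches the protocol structurally; no subclassing is needed.
formatter: CellFormatterLike = format_float
result = formatter(3.14159)
```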
- class datafusion.html_formatter.DataFrameHtmlFormatter(max_cell_length: int = 25, max_width: int = 1000, max_height: int = 300, max_memory_bytes: int = 2 * 1024 * 1024, min_rows_display: int = 20, repr_rows: int = 10, enable_cell_expansion: bool = True, custom_css: str | None = None, show_truncation_message: bool = True, style_provider: StyleProvider | None = None, use_shared_styles: bool = True)
Configurable HTML formatter for DataFusion DataFrames.
This class handles the HTML rendering of DataFrames for display in Jupyter notebooks and other rich display contexts.
This class supports extension through composition. Key extension points:
- Provide a custom StyleProvider for styling cells and headers
- Register custom formatters for specific types
- Provide custom cell builders for specialized cell rendering
- Parameters:
max_cell_length – Maximum characters to display in a cell before truncation
max_width – Maximum width of the HTML table in pixels
max_height – Maximum height of the HTML table in pixels
max_memory_bytes – Maximum memory in bytes for rendered data (default: 2 * 1024 * 1024 = 2,097,152 bytes)
min_rows_display – Minimum number of rows to display
repr_rows – Default number of rows to display in repr output
enable_cell_expansion – Whether to add expand/collapse buttons for long cell values
custom_css – Additional CSS to include in the HTML output
show_truncation_message – Whether to display a message when data is truncated
style_provider – Custom provider for cell and header styles
use_shared_styles – Whether to load styles and scripts only once per notebook session
Initialize the HTML formatter.
- Parameters:
max_cell_length (int, default 25) – Maximum length of cell content before truncation.
max_width (int, default 1000) – Maximum width of the displayed table in pixels.
max_height (int, default 300) – Maximum height of the displayed table in pixels.
max_memory_bytes (int, default 2097152 (2MB)) – Maximum memory in bytes for rendered data.
min_rows_display (int, default 20) – Minimum number of rows to display.
repr_rows (int, default 10) – Default number of rows to display in repr output.
enable_cell_expansion (bool, default True) – Whether to allow cells to expand when clicked.
custom_css (str, optional) – Custom CSS to apply to the HTML table.
show_truncation_message (bool, default True) – Whether to show a message indicating that content has been truncated.
style_provider (StyleProvider, optional) – Provider of CSS styles for the HTML table. If None, DefaultStyleProvider is used.
use_shared_styles (bool, default True) – Whether to use shared styles across multiple tables.
- Raises:
ValueError – If max_cell_length, max_width, max_height, max_memory_bytes, min_rows_display, or repr_rows is not a positive integer.
TypeError – If enable_cell_expansion, show_truncation_message, or use_shared_styles is not a boolean, or if custom_css is provided but is not a string, or if style_provider is provided but does not implement the StyleProvider protocol.
- _build_expandable_cell(formatted_value: str, row_count: int, col_idx: int, table_uuid: str) → str
Build an expandable cell for long content.
Build the HTML footer with JavaScript and messages.
- _build_html_header() → list[str]
Build the HTML header with CSS styles.
- _build_regular_cell(formatted_value: str) → str
Build a regular table cell.
- _build_table_body(batches: list, table_uuid: str) → list[str]
Build the HTML table body with data rows.
- _build_table_container_start() → list[str]
Build the opening tags for the table container.
- _build_table_header(schema: Any) → list[str]
Build the HTML table header with column names.
- _format_cell_value(value: Any) → str
Format a cell value for display.
Uses registered type formatters if available.
- Parameters:
value – The cell value to format
- Returns:
Formatted cell value as string
- _get_cell_value(column: Any, row_idx: int) → Any
Extract a cell value from a column.
- Parameters:
column – Arrow array
row_idx – Row index
- Returns:
The raw cell value
- _get_default_css() → str
Get default CSS styles for the HTML table.
- _get_javascript() → str
Get JavaScript code for interactive elements.
- format_html(batches: list, schema: Any, has_more: bool = False, table_uuid: str | None = None) → str
Format record batches as HTML.
This method is used by DataFrame’s _repr_html_ implementation and can be called directly when custom HTML rendering is needed.
- Parameters:
batches – List of Arrow RecordBatch objects
schema – Arrow Schema object
has_more – Whether there are more batches not shown
table_uuid – Unique ID for the table, used for JavaScript interactions
- Returns:
HTML string representation of the data
- Raises:
TypeError – If schema is invalid and no batches are provided
- classmethod is_styles_loaded() → bool
Check if HTML styles have been loaded in the current session.
This method is primarily intended for debugging UI rendering issues related to style loading.
- Returns:
True if styles have been loaded, False otherwise
Example
>>> from datafusion.html_formatter import DataFrameHtmlFormatter
>>> DataFrameHtmlFormatter.is_styles_loaded()
False
- register_formatter(type_class: type, formatter: CellFormatter) → None
Register a custom formatter for a specific data type.
- Parameters:
type_class – The type to register a formatter for
formatter – Function that takes a value of the given type and returns a formatted string
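To illustrate how per-type formatters might be looked up, here is a minimal sketch of a registry keyed by type. `TinyFormatterRegistry` is a hypothetical stand-in, not the library's implementation; in particular, the `isinstance` lookup order and the `str()` fallback are assumptions.

```python
from typing import Any, Callable


class TinyFormatterRegistry:
    """Hypothetical stand-in for a type-to-formatter registry."""

    def __init__(self) -> None:
        self._type_formatters: dict[type, Callable[[Any], str]] = {}

    def register_formatter(
        self, type_class: type, formatter: Callable[[Any], str]
    ) -> None:
        # Later registrations for the same type replace earlier ones.
        self._type_formatters[type_class] = formatter

    def format_cell_value(self, value: Any) -> str:
        # Assumption: the first registered type that matches via isinstance wins.
        for type_class, formatter in self._type_formatters.items():
            if isinstance(value, type_class):
                return formatter(value)
        # Assumption: fall back to str() when nothing is registered.
        return str(value)


registry = TinyFormatterRegistry()
registry.register_formatter(float, lambda v: f"{v:,.2f}")
formatted_float = registry.format_cell_value(1234.5)
formatted_text = registry.format_cell_value("abc")
```

With the real class, the equivalent registration would presumably be `formatter.register_formatter(float, ...)` on a DataFrameHtmlFormatter instance.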
- set_custom_cell_builder(builder: Callable[[Any, int, int, str], str]) → None
Set a custom cell builder function.
- Parameters:
builder – Function that takes (value, row, col, table_id) and returns HTML
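A builder matching the documented `(value, row, col, table_id)` signature could look like the sketch below. The function name, the `id` attribute scheme, and the red-for-negative styling are all hypothetical; the source does not say whether the builder must return a complete `<td>` element, so that is also an assumption.

```python
import html
from typing import Any


def highlight_negative_cell(value: Any, row: int, col: int, table_id: str) -> str:
    """Hypothetical builder: render negative numbers in red."""
    # Escape the text so cell content cannot inject markup.
    text = html.escape(str(value))
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    style = "color: red;" if is_number and value < 0 else ""
    return f'<td id="{table_id}-r{row}-c{col}" style="{style}">{text}</td>'


negative_html = highlight_negative_cell(-5, 0, 1, "t1")
escaped_html = highlight_negative_cell("<b>", 2, 0, "t1")
```

Such a function would then be passed to `set_custom_cell_builder` on a formatter instance.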
- set_custom_header_builder(builder: Callable[[Any], str]) → None
Set a custom header builder function.
- Parameters:
builder – Function that takes a field and returns HTML
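As a rough sketch, a header builder might read the field's name and wrap it in a `<th>` element. This assumes the field object exposes a `name` attribute (as Arrow fields do); `upper_header` and the `SimpleNamespace` stand-in field are hypothetical and used only so the example runs without datafusion.

```python
import html
from types import SimpleNamespace
from typing import Any


def upper_header(field: Any) -> str:
    """Hypothetical builder: upper-case the column name in a <th> element."""
    # Assumption: the field has a `name` attribute.
    name = html.escape(str(field.name))
    return f"<th>{name.upper()}</th>"


# Stand-in for a schema field, used only for this illustration.
fake_field = SimpleNamespace(name="price")
header_html = upper_header(fake_field)
```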
- _custom_cell_builder: Callable[[Any, int, int, str], str] | None = None
- _custom_header_builder: Callable[[Any], str] | None = None
- _styles_loaded = False
- _type_formatters: dict[type, CellFormatter]
- custom_css = None
- enable_cell_expansion = True
- max_cell_length = 25
- max_height = 300
- max_memory_bytes = 2097152
- max_width = 1000
- min_rows_display = 20
- repr_rows = 10
- show_truncation_message = True
- style_provider
- class datafusion.html_formatter.DefaultStyleProvider
Default implementation of StyleProvider.
- get_cell_style() → str
Get the CSS style for table cells.
- Returns:
CSS style string
- get_header_style() → str
Get the CSS style for header cells.
- Returns:
CSS style string
- class datafusion.html_formatter.FormatterManager
Manager class for the global DataFrame HTML formatter instance.
- classmethod get_formatter() → DataFrameHtmlFormatter
Get the current global DataFrame HTML formatter.
- Returns:
The global HTML formatter instance
- classmethod set_formatter(formatter: DataFrameHtmlFormatter) → None
Set the global DataFrame HTML formatter.
- Parameters:
formatter – The formatter instance to use globally
- _default_formatter: DataFrameHtmlFormatter
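One way to picture this manager is as a class-level holder for a single shared instance. The sketch below uses hypothetical stand-ins (`SketchFormatter`, `SketchManager`) and only illustrates the pattern; it makes no claim about how the library implements it.

```python
class SketchFormatter:
    """Hypothetical stand-in for a formatter with one setting."""

    def __init__(self, max_cell_length: int = 25) -> None:
        self.max_cell_length = max_cell_length


class SketchManager:
    """Hypothetical manager holding one shared formatter at class level."""

    _default_formatter: SketchFormatter = SketchFormatter()

    @classmethod
    def get_formatter(cls) -> SketchFormatter:
        return cls._default_formatter

    @classmethod
    def set_formatter(cls, formatter: SketchFormatter) -> None:
        # Replacing the class attribute changes what every caller sees next.
        cls._default_formatter = formatter


default_length = SketchManager.get_formatter().max_cell_length
SketchManager.set_formatter(SketchFormatter(max_cell_length=100))
updated_length = SketchManager.get_formatter().max_cell_length
```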
- class datafusion.html_formatter.StyleProvider
Bases: Protocol
Protocol for HTML style providers.
- get_cell_style() → str
Get the CSS style for table cells.
- get_header_style() → str
Get the CSS style for header cells.
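A custom provider only needs the two methods listed above. The following sketch defines a local `StyleProviderLike` protocol as a stand-in for the documented one, plus a hypothetical `CompactStyleProvider`; the CSS values are arbitrary examples.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class StyleProviderLike(Protocol):
    """Local stand-in mirroring the two documented protocol methods."""

    def get_cell_style(self) -> str: ...

    def get_header_style(self) -> str: ...


class CompactStyleProvider:
    """Hypothetical provider with tighter padding than a typical default."""

    def get_cell_style(self) -> str:
        return "padding: 2px 6px; border: 1px solid #ccc;"

    def get_header_style(self) -> str:
        return "padding: 2px 6px; font-weight: bold; background-color: #eee;"


provider = CompactStyleProvider()
# runtime_checkable protocols only verify that the methods exist.
conforms = isinstance(provider, StyleProviderLike)
```

An instance like this would presumably be passed as the `style_provider` argument of DataFrameHtmlFormatter.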
- datafusion.html_formatter._refresh_formatter_reference() → None
Refresh formatter reference in any modules using it.
This helps ensure that changes to the formatter are reflected in existing DataFrames that might be caching the formatter reference.
- datafusion.html_formatter._validate_bool(value: Any, param_name: str) → None
Validate that a parameter is a boolean.
- Parameters:
value – The value to validate
param_name – Name of the parameter (used in error message)
- Raises:
TypeError – If the value is not a boolean
- datafusion.html_formatter._validate_positive_int(value: Any, param_name: str) → None
Validate that a parameter is a positive integer.
- Parameters:
value – The value to validate
param_name – Name of the parameter (used in error message)
- Raises:
ValueError – If the value is not a positive integer
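The two validators above can be approximated as follows. This is a sketch under stated assumptions, not the library's code: the function names drop the leading underscore, the error messages are invented, and rejecting `bool` in the integer check is an extra assumption (in Python, `bool` is a subclass of `int`).

```python
from typing import Any


def validate_positive_int(value: Any, param_name: str) -> None:
    """Sketch: raise ValueError unless value is an int greater than zero."""
    # Assumption: True/False are rejected even though bool subclasses int.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError(f"{param_name} must be a positive integer")


def validate_bool(value: Any, param_name: str) -> None:
    """Sketch: raise TypeError unless value is exactly a bool."""
    if not isinstance(value, bool):
        raise TypeError(f"{param_name} must be a boolean")


validate_positive_int(25, "max_cell_length")  # passes silently
validate_bool(True, "enable_cell_expansion")  # passes silently
```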
- datafusion.html_formatter.configure_formatter(**kwargs: Any) → None
Configure the global DataFrame HTML formatter.
This function creates a new formatter with the provided configuration and sets it as the global formatter for all DataFrames.
- Parameters:
**kwargs – Formatter configuration parameters like max_cell_length, max_width, max_height, enable_cell_expansion, etc.
- Raises:
ValueError – If any invalid parameters are provided
Example
>>> from datafusion.html_formatter import configure_formatter
>>> configure_formatter(
...     max_cell_length=50,
...     max_height=500,
...     enable_cell_expansion=True,
...     use_shared_styles=True
... )
- datafusion.html_formatter.get_formatter() → DataFrameHtmlFormatter
Get the current global DataFrame HTML formatter.
This function is used by the DataFrame._repr_html_ implementation to access the shared formatter instance. It can also be used directly when custom HTML rendering is needed.
- Returns:
The global HTML formatter instance
Example
>>> from datafusion.html_formatter import get_formatter
>>> formatter = get_formatter()
>>> formatter.max_cell_length = 50  # Increase cell length
- datafusion.html_formatter.reset_formatter() → None
Reset the global DataFrame HTML formatter to default settings.
This function creates a new formatter with default configuration and sets it as the global formatter for all DataFrames.
Example
>>> from datafusion.html_formatter import reset_formatter
>>> reset_formatter()  # Reset formatter to default settings
- datafusion.html_formatter.reset_styles_loaded_state() → None
Reset the styles loaded state to force reloading of styles.
This can be useful when switching between notebook sessions or when styles need to be refreshed.
Example
>>> from datafusion.html_formatter import reset_styles_loaded_state
>>> reset_styles_loaded_state()  # Force styles to reload in next render
- datafusion.html_formatter.set_formatter(formatter: DataFrameHtmlFormatter) → None
Set the global DataFrame HTML formatter.
- Parameters:
formatter – The formatter instance to use globally
Example
>>> from datafusion.html_formatter import DataFrameHtmlFormatter, set_formatter
>>> custom_formatter = DataFrameHtmlFormatter(max_cell_length=100)
>>> set_formatter(custom_formatter)