# EXPLAIN The `EXPLAIN` command shows the logical and physical execution plan for the specified SQL statement. ## Syntax
EXPLAIN [ANALYZE] [VERBOSE] [FORMAT format] statement## `EXPLAIN` Shows the execution plan of a statement. If you need more detailed output, use `EXPLAIN VERBOSE`. Note that `EXPLAIN VERBOSE` only supports the `indent` format. The optional `[FORMAT format]` clause controls how the plan is displayed as explained below. If this clause is not specified, the plan is displayed using the format from the [configuration value] `datafusion.explain.format`. ### `tree` format (default) The `tree` format is modeled after [DuckDB plans] and is designed to be easier to see the high level structure of the plan [duckdb plans]: https://duckdb.org/docs/stable/guides/meta/explain.html ```sql > EXPLAIN FORMAT TREE SELECT SUM(x) FROM t GROUP BY b; +---------------+-------------------------------+ | plan_type | plan | +---------------+-------------------------------+ | physical_plan | ┌───────────────────────────┐ | | | │ ProjectionExec │ | | | │ -------------------- │ | | | │ sum(t.x): sum(t.x)@1 │ | | | └─────────────┬─────────────┘ | | | ┌─────────────┴─────────────┐ | | | │ AggregateExec │ | | | │ -------------------- │ | | | │ aggr: sum(t.x) │ | | | │ group_by: b@0 as b │ | | | │ │ | | | │ mode: │ | | | │ FinalPartitioned │ | | | └─────────────┬─────────────┘ | | | ┌─────────────┴─────────────┐ | | | │ CoalesceBatchesExec │ | | | └─────────────┬─────────────┘ | | | ┌─────────────┴─────────────┐ | | | │ RepartitionExec │ | | | │ -------------------- │ | | | │ input_partition_count: │ | | | │ 16 │ | | | │ │ | | | │ partitioning_scheme: │ | | | │ Hash([b@0], 16) │ | | | └─────────────┬─────────────┘ | | | ┌─────────────┴─────────────┐ | | | │ RepartitionExec │ | | | │ -------------------- │ | | | │ input_partition_count: │ | | | │ 1 │ | | | │ │ | | | │ partitioning_scheme: │ | | | │ RoundRobinBatch(16) │ | | | └─────────────┬─────────────┘ | | | ┌─────────────┴─────────────┐ | | | │ AggregateExec │ | | | │ -------------------- │ | | | │ aggr: sum(t.x) │ | | | │ group_by: b@1 as b │ | | | │ mode: Partial │ | | | └─────────────┬─────────────┘ | | | ┌─────────────┴─────────────┐ | | | │ DataSourceExec │ | | | │ -------------------- │ | | | │ bytes: 224 │ | | | │ format: memory │ | | | │ rows: 1 │ | | | └───────────────────────────┘ | | | | +---------------+-------------------------------+ 1 row(s) fetched. Elapsed 0.016 seconds. ``` ### `indent` format The `indent` format shows both the logical and physical plan, with one line for each operator in the plan. Child plans are indented to show the hierarchy. See [Reading Explain Plans](../explain-usage.md) for more information on how to interpret these plans. ```sql > CREATE TABLE t(x int, b int) AS VALUES (1, 2), (2, 3); 0 row(s) fetched. Elapsed 0.004 seconds. > EXPLAIN FORMAT INDENT SELECT SUM(x) FROM t GROUP BY b; +---------------+-------------------------------------------------------------------------------+ | plan_type | plan | +---------------+-------------------------------------------------------------------------------+ | logical_plan | Projection: sum(t.x) | | | Aggregate: groupBy=[[t.b]], aggr=[[sum(CAST(t.x AS Int64))]] | | | TableScan: t projection=[x, b] | | physical_plan | ProjectionExec: expr=[sum(t.x)@1 as sum(t.x)] | | | AggregateExec: mode=FinalPartitioned, gby=[b@0 as b], aggr=[sum(t.x)] | | | CoalesceBatchesExec: target_batch_size=8192 | | | RepartitionExec: partitioning=Hash([b@0], 16), input_partitions=16 | | | RepartitionExec: partitioning=RoundRobinBatch(16), input_partitions=1 | | | AggregateExec: mode=Partial, gby=[b@1 as b], aggr=[sum(t.x)] | | | DataSourceExec: partitions=1, partition_sizes=[1] | | | | +---------------+-------------------------------------------------------------------------------+ 2 row(s) fetched. Elapsed 0.004 seconds. ``` ### `pgjson` format The `pgjson` format is modeled after [Postgres JSON] format. You can use this format to visualize the plan in existing plan visualization tools, such as [dalibo] [postgres json]: https://www.postgresql.org/docs/current/sql-explain.html [dalibo]: https://explain.dalibo.com/ ```sql > EXPLAIN FORMAT PGJSON SELECT SUM(x) FROM t GROUP BY b; +--------------+----------------------------------------------------+ | plan_type | plan | +--------------+----------------------------------------------------+ | logical_plan | [ | | | { | | | "Plan": { | | | "Expressions": [ | | | "sum(t.x)" | | | ], | | | "Node Type": "Projection", | | | "Output": [ | | | "sum(t.x)" | | | ], | | | "Plans": [ | | | { | | | "Aggregates": "sum(CAST(t.x AS Int64))", | | | "Group By": "t.b", | | | "Node Type": "Aggregate", | | | "Output": [ | | | "b", | | | "sum(t.x)" | | | ], | | | "Plans": [ | | | { | | | "Node Type": "TableScan", | | | "Output": [ | | | "x", | | | "b" | | | ], | | | "Plans": [], | | | "Relation Name": "t" | | | } | | | ] | | | } | | | ] | | | } | | | } | | | ] | +--------------+----------------------------------------------------+ 1 row(s) fetched. Elapsed 0.008 seconds. ``` ### `graphviz` format The `graphviz` format uses the [DOT language] that can be used with [Graphviz] to generate a visual representation of the plan. [dot language]: https://graphviz.org/doc/info/lang.html [graphviz]: https://graphviz.org/ ```sql > EXPLAIN FORMAT GRAPHVIZ SELECT SUM(x) FROM t GROUP BY b; +--------------+------------------------------------------------------------------------------------------------------------------------------+ | plan_type | plan | +--------------+------------------------------------------------------------------------------------------------------------------------------+ | logical_plan | | | | // Begin DataFusion GraphViz Plan, | | | // display it online here: https://dreampuf.github.io/GraphvizOnline | | | | | | digraph { | | | subgraph cluster_1 | | | { | | | graph[label="LogicalPlan"] | | | 2[shape=box label="Projection: sum(t.x)"] | | | 3[shape=box label="Aggregate: groupBy=[[t.b]], aggr=[[sum(CAST(t.x AS Int64))]]"] | | | 2 -> 3 [arrowhead=none, arrowtail=normal, dir=back] | | | 4[shape=box label="TableScan: t projection=[x, b]"] | | | 3 -> 4 [arrowhead=none, arrowtail=normal, dir=back] | | | } | | | subgraph cluster_5 | | | { | | | graph[label="Detailed LogicalPlan"] | | | 6[shape=box label="Projection: sum(t.x)\nSchema: [sum(t.x):Int64;N]"] | | | 7[shape=box label="Aggregate: groupBy=[[t.b]], aggr=[[sum(CAST(t.x AS Int64))]]\nSchema: [b:Int32;N, sum(t.x):Int64;N]"] | | | 6 -> 7 [arrowhead=none, arrowtail=normal, dir=back] | | | 8[shape=box label="TableScan: t projection=[x, b]\nSchema: [x:Int32;N, b:Int32;N]"] | | | 7 -> 8 [arrowhead=none, arrowtail=normal, dir=back] | | | } | | | } | | | // End DataFusion GraphViz Plan | | | | +--------------+------------------------------------------------------------------------------------------------------------------------------+ 1 row(s) fetched. Elapsed 0.010 seconds. ``` ## `EXPLAIN ANALYZE` Shows the execution plan and metrics of a statement. Note that `EXPLAIN ANALYZE` only supports the `indent` format. ```sql EXPLAIN ANALYZE SELECT SUM(x) FROM table GROUP BY b; +-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+ | plan_type | plan | +-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+ | Plan with Metrics | CoalescePartitionsExec, metrics=[] | | | ProjectionExec: expr=[SUM(table.x)@1 as SUM(x)], metrics=[] | | | HashAggregateExec: mode=FinalPartitioned, gby=[b@0 as b], aggr=[SUM(x)], metrics=[outputRows=2] | | | CoalesceBatchesExec: target_batch_size=4096, metrics=[] | | | RepartitionExec: partitioning=Hash([Column { name: "b", index: 0 }], 16), metrics=[sendTime=839560, fetchTime=122528525, repartitionTime=5327877] | | | HashAggregateExec: mode=Partial, gby=[b@1 as b], aggr=[SUM(x)], metrics=[outputRows=2] | | | RepartitionExec: partitioning=RoundRobinBatch(16), metrics=[fetchTime=5660489, repartitionTime=0, sendTime=8012] | | | DataSourceExec: file_groups={1 group: [[/tmp/table.csv]]}, has_header=false, metrics=[] | +-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+ ``` By default `EXPLAIN ANALYZE` shows the aggregated metrics from all partitions for each operator. If you need to display per-partition metrics, use `EXPLAIN ANALYZE VERBOSE`. You can also set `datafusion.explain.analyze_level` from the [configuration value] to control the detail level for the metrics displayed. [configuration value]: ../configs.md