EXPLAIN¶
The EXPLAIN
command shows the logical and physical execution plan for the specified SQL statement.
Syntax¶
EXPLAIN [ANALYZE] [VERBOSE] [FORMAT format] statement
EXPLAIN
¶
Shows the execution plan of a statement.
If you need more detailed output, use EXPLAIN VERBOSE
.
Note that EXPLAIN VERBOSE
only supports the indent
format.
The optional [FORMAT format]
clause controls how the plan is displayed as
explained below. If this clause is not specified, the plan is displayed using
the format from the configuration value datafusion.explain.format
.
indent
format (default)¶
The indent
format shows both the logical and physical plan, with one line for
each operator in the plan. Child plans are indented to show the hierarchy.
See Reading Explain Plans for more information on how to interpret these plans.
> CREATE TABLE t(x int, b int) AS VALUES (1, 2), (2, 3);
0 row(s) fetched.
Elapsed 0.004 seconds.
> EXPLAIN SELECT SUM(x) FROM t GROUP BY b;
+---------------+-------------------------------------------------------------------------------+
| plan_type | plan |
+---------------+-------------------------------------------------------------------------------+
| logical_plan | Projection: sum(t.x) |
| | Aggregate: groupBy=[[t.b]], aggr=[[sum(CAST(t.x AS Int64))]] |
| | TableScan: t projection=[x, b] |
| physical_plan | ProjectionExec: expr=[sum(t.x)@1 as sum(t.x)] |
| | AggregateExec: mode=FinalPartitioned, gby=[b@0 as b], aggr=[sum(t.x)] |
| | CoalesceBatchesExec: target_batch_size=8192 |
| | RepartitionExec: partitioning=Hash([b@0], 16), input_partitions=16 |
| | RepartitionExec: partitioning=RoundRobinBatch(16), input_partitions=1 |
| | AggregateExec: mode=Partial, gby=[b@1 as b], aggr=[sum(t.x)] |
| | DataSourceExec: partitions=1, partition_sizes=[1] |
| | |
+---------------+-------------------------------------------------------------------------------+
2 row(s) fetched.
Elapsed 0.004 seconds.
tree
format¶
The tree
format is modeled after DuckDB plans and is designed to be easier
to see the high level structure of the plan
> EXPLAIN FORMAT TREE SELECT SUM(x) FROM t GROUP BY b;
+---------------+-------------------------------+
| plan_type | plan |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
| | │ ProjectionExec │ |
| | │ -------------------- │ |
| | │ sum(t.x): sum(t.x)@1 │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ AggregateExec │ |
| | │ -------------------- │ |
| | │ aggr: sum(t.x) │ |
| | │ group_by: b@0 as b │ |
| | │ │ |
| | │ mode: │ |
| | │ FinalPartitioned │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ CoalesceBatchesExec │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ RepartitionExec │ |
| | │ -------------------- │ |
| | │ output_partition_count: │ |
| | │ 16 │ |
| | │ │ |
| | │ partitioning_scheme: │ |
| | │ Hash([b@0], 16) │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ RepartitionExec │ |
| | │ -------------------- │ |
| | │ output_partition_count: │ |
| | │ 1 │ |
| | │ │ |
| | │ partitioning_scheme: │ |
| | │ RoundRobinBatch(16) │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ AggregateExec │ |
| | │ -------------------- │ |
| | │ aggr: sum(t.x) │ |
| | │ group_by: b@1 as b │ |
| | │ mode: Partial │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ DataSourceExec │ |
| | │ -------------------- │ |
| | │ bytes: 224 │ |
| | │ format: memory │ |
| | │ rows: 1 │ |
| | └───────────────────────────┘ |
| | |
+---------------+-------------------------------+
1 row(s) fetched.
Elapsed 0.016 seconds.
pgjson
format¶
The pgjson
format is modeled after Postgres JSON format.
You can use this format to visualize the plan in existing plan visualization tools, such as dalibo
> EXPLAIN FORMAT PGJSON SELECT SUM(x) FROM t GROUP BY b;
+--------------+----------------------------------------------------+
| plan_type | plan |
+--------------+----------------------------------------------------+
| logical_plan | [ |
| | { |
| | "Plan": { |
| | "Expressions": [ |
| | "sum(t.x)" |
| | ], |
| | "Node Type": "Projection", |
| | "Output": [ |
| | "sum(t.x)" |
| | ], |
| | "Plans": [ |
| | { |
| | "Aggregates": "sum(CAST(t.x AS Int64))", |
| | "Group By": "t.b", |
| | "Node Type": "Aggregate", |
| | "Output": [ |
| | "b", |
| | "sum(t.x)" |
| | ], |
| | "Plans": [ |
| | { |
| | "Node Type": "TableScan", |
| | "Output": [ |
| | "x", |
| | "b" |
| | ], |
| | "Plans": [], |
| | "Relation Name": "t" |
| | } |
| | ] |
| | } |
| | ] |
| | } |
| | } |
| | ] |
+--------------+----------------------------------------------------+
1 row(s) fetched.
Elapsed 0.008 seconds.
graphviz
format¶
The graphviz
format uses the DOT language that can be used with Graphviz to
generate a visual representation of the plan.
> EXPLAIN FORMAT GRAPHVIZ SELECT SUM(x) FROM t GROUP BY b;
+--------------+------------------------------------------------------------------------------------------------------------------------------+
| plan_type | plan |
+--------------+------------------------------------------------------------------------------------------------------------------------------+
| logical_plan | |
| | // Begin DataFusion GraphViz Plan, |
| | // display it online here: https://dreampuf.github.io/GraphvizOnline |
| | |
| | digraph { |
| | subgraph cluster_1 |
| | { |
| | graph[label="LogicalPlan"] |
| | 2[shape=box label="Projection: sum(t.x)"] |
| | 3[shape=box label="Aggregate: groupBy=[[t.b]], aggr=[[sum(CAST(t.x AS Int64))]]"] |
| | 2 -> 3 [arrowhead=none, arrowtail=normal, dir=back] |
| | 4[shape=box label="TableScan: t projection=[x, b]"] |
| | 3 -> 4 [arrowhead=none, arrowtail=normal, dir=back] |
| | } |
| | subgraph cluster_5 |
| | { |
| | graph[label="Detailed LogicalPlan"] |
| | 6[shape=box label="Projection: sum(t.x)\nSchema: [sum(t.x):Int64;N]"] |
| | 7[shape=box label="Aggregate: groupBy=[[t.b]], aggr=[[sum(CAST(t.x AS Int64))]]\nSchema: [b:Int32;N, sum(t.x):Int64;N]"] |
| | 6 -> 7 [arrowhead=none, arrowtail=normal, dir=back] |
| | 8[shape=box label="TableScan: t projection=[x, b]\nSchema: [x:Int32;N, b:Int32;N]"] |
| | 7 -> 8 [arrowhead=none, arrowtail=normal, dir=back] |
| | } |
| | } |
| | // End DataFusion GraphViz Plan |
| | |
+--------------+------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.010 seconds.
EXPLAIN ANALYZE
¶
Shows the execution plan and metrics of a statement. If you need more
information output, use EXPLAIN ANALYZE VERBOSE
. Note that EXPLAIN ANALYZE
only supports the indent
format.
EXPLAIN ANALYZE SELECT SUM(x) FROM table GROUP BY b;
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type | plan |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | CoalescePartitionsExec, metrics=[] |
| | ProjectionExec: expr=[SUM(table.x)@1 as SUM(x)], metrics=[] |
| | HashAggregateExec: mode=FinalPartitioned, gby=[b@0 as b], aggr=[SUM(x)], metrics=[outputRows=2] |
| | CoalesceBatchesExec: target_batch_size=4096, metrics=[] |
| | RepartitionExec: partitioning=Hash([Column { name: "b", index: 0 }], 16), metrics=[sendTime=839560, fetchTime=122528525, repartitionTime=5327877] |
| | HashAggregateExec: mode=Partial, gby=[b@1 as b], aggr=[SUM(x)], metrics=[outputRows=2] |
| | RepartitionExec: partitioning=RoundRobinBatch(16), metrics=[fetchTime=5660489, repartitionTime=0, sendTime=8012] |
| | DataSourceExec: file_groups={1 group: [[/tmp/table.csv]]}, has_header=false, metrics=[] |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+