<!---
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing,
  software distributed under the License is distributed on an
  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  KIND, either express or implied.  See the License for the
  specific language governing permissions and limitations
  under the License.
-->

# EXPLAIN

The `EXPLAIN` command shows the logical and physical execution plan for the specified SQL statement.

## Syntax

<pre>
EXPLAIN [ANALYZE] [VERBOSE] [FORMAT format] statement
</pre>

## `EXPLAIN`

Shows the execution plan of a statement.
If you need more detailed output, use `EXPLAIN VERBOSE`.
Note that `EXPLAIN VERBOSE` only supports the `indent` format.

The optional `[FORMAT format]` clause controls how the plan is displayed as
explained below. If this clause is not specified, the plan is displayed using
the format from the [configuration value] `datafusion.explain.format`.

### `tree` format (default)

The `tree` format is modeled after [DuckDB plans] and is designed to be easier
to see the high level structure of the plan

[duckdb plans]: https://duckdb.org/docs/stable/guides/meta/explain.html

```sql
> EXPLAIN FORMAT TREE SELECT SUM(x) FROM t GROUP BY b;
+---------------+-------------------------------+
| plan_type     | plan                          |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
|               | │       ProjectionExec      │ |
|               | │    --------------------   │ |
|               | │    sum(t.x): sum(t.x)@1   │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       AggregateExec       │ |
|               | │    --------------------   │ |
|               | │       aggr: sum(t.x)      │ |
|               | │     group_by: b@0 as b    │ |
|               | │                           │ |
|               | │           mode:           │ |
|               | │      FinalPartitioned     │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │    CoalesceBatchesExec    │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │      RepartitionExec      │ |
|               | │    --------------------   │ |
|               | │   input_partition_count:  │ |
|               | │             1             │ |
|               | │                           │ |
|               | │    partitioning_scheme:   │ |
|               | │      Hash([b@0], 16)      │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       AggregateExec       │ |
|               | │    --------------------   │ |
|               | │       aggr: sum(t.x)      │ |
|               | │     group_by: b@1 as b    │ |
|               | │       mode: Partial       │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       DataSourceExec      │ |
|               | │    --------------------   │ |
|               | │         bytes: 224        │ |
|               | │       format: memory      │ |
|               | │          rows: 1          │ |
|               | └───────────────────────────┘ |
|               |                               |
+---------------+-------------------------------+
1 row(s) fetched.
Elapsed 0.016 seconds.
```

### `indent` format

The `indent` format shows both the logical and physical plan, with one line for
each operator in the plan. Child plans are indented to show the hierarchy.

See [Reading Explain Plans](../explain-usage.md) for more information on how to interpret these plans.

```sql
> CREATE TABLE t(x int, b int) AS VALUES (1, 2), (2, 3);
0 row(s) fetched.
Elapsed 0.004 seconds.

> EXPLAIN FORMAT INDENT SELECT SUM(x) FROM t GROUP BY b;
+---------------+-------------------------------------------------------------------------------+
| plan_type     | plan                                                                          |
+---------------+-------------------------------------------------------------------------------+
| logical_plan  | Projection: sum(t.x)                                                          |
|               |   Aggregate: groupBy=[[t.b]], aggr=[[sum(CAST(t.x AS Int64))]]                |
|               |     TableScan: t projection=[x, b]                                            |
| physical_plan | ProjectionExec: expr=[sum(t.x)@1 as sum(t.x)]                                 |
|               |   AggregateExec: mode=FinalPartitioned, gby=[b@0 as b], aggr=[sum(t.x)]       |
|               |     CoalesceBatchesExec: target_batch_size=8192                               |
|               |       RepartitionExec: partitioning=Hash([b@0], 16), input_partitions=1       |
|               |         AggregateExec: mode=Partial, gby=[b@1 as b], aggr=[sum(t.x)]          |
|               |           DataSourceExec: partitions=1, partition_sizes=[1]                   |
|               |                                                                               |
+---------------+-------------------------------------------------------------------------------+
2 row(s) fetched.
Elapsed 0.004 seconds.
```

### `pgjson` format

The `pgjson` format is modeled after [Postgres JSON] format.

You can use this format to visualize the plan in existing plan visualization
tools, such as [dalibo]

[postgres json]: https://www.postgresql.org/docs/current/sql-explain.html
[dalibo]: https://explain.dalibo.com/

```sql
> EXPLAIN FORMAT PGJSON SELECT SUM(x) FROM t GROUP BY b;
+--------------+----------------------------------------------------+
| plan_type    | plan                                               |
+--------------+----------------------------------------------------+
| logical_plan | [                                                  |
|              |   {                                                |
|              |     "Plan": {                                      |
|              |       "Expressions": [                             |
|              |         "sum(t.x)"                                 |
|              |       ],                                           |
|              |       "Node Type": "Projection",                   |
|              |       "Output": [                                  |
|              |         "sum(t.x)"                                 |
|              |       ],                                           |
|              |       "Plans": [                                   |
|              |         {                                          |
|              |           "Aggregates": "sum(CAST(t.x AS Int64))", |
|              |           "Group By": "t.b",                       |
|              |           "Node Type": "Aggregate",                |
|              |           "Output": [                              |
|              |             "b",                                   |
|              |             "sum(t.x)"                             |
|              |           ],                                       |
|              |           "Plans": [                               |
|              |             {                                      |
|              |               "Node Type": "TableScan",            |
|              |               "Output": [                          |
|              |                 "x",                               |
|              |                 "b"                                |
|              |               ],                                   |
|              |               "Plans": [],                         |
|              |               "Relation Name": "t"                 |
|              |             }                                      |
|              |           ]                                        |
|              |         }                                          |
|              |       ]                                            |
|              |     }                                              |
|              |   }                                                |
|              | ]                                                  |
+--------------+----------------------------------------------------+
1 row(s) fetched.
Elapsed 0.008 seconds.
```

### `graphviz` format

The `graphviz` format uses the [DOT language] that can be used with [Graphviz] to
generate a visual representation of the plan.

[dot language]: https://graphviz.org/doc/info/lang.html
[graphviz]: https://graphviz.org/

```sql
> EXPLAIN FORMAT GRAPHVIZ SELECT SUM(x) FROM t GROUP BY b;
+--------------+------------------------------------------------------------------------------------------------------------------------------+
| plan_type    | plan                                                                                                                         |
+--------------+------------------------------------------------------------------------------------------------------------------------------+
| logical_plan |                                                                                                                              |
|              | // Begin DataFusion GraphViz Plan,                                                                                           |
|              | // display it online here: https://dreampuf.github.io/GraphvizOnline                                                         |
|              |                                                                                                                              |
|              | digraph {                                                                                                                    |
|              |   subgraph cluster_1                                                                                                         |
|              |   {                                                                                                                          |
|              |     graph[label="LogicalPlan"]                                                                                               |
|              |     2[shape=box label="Projection: sum(t.x)"]                                                                                |
|              |     3[shape=box label="Aggregate: groupBy=[[t.b]], aggr=[[sum(CAST(t.x AS Int64))]]"]                                        |
|              |     2 -> 3 [arrowhead=none, arrowtail=normal, dir=back]                                                                      |
|              |     4[shape=box label="TableScan: t projection=[x, b]"]                                                                      |
|              |     3 -> 4 [arrowhead=none, arrowtail=normal, dir=back]                                                                      |
|              |   }                                                                                                                          |
|              |   subgraph cluster_5                                                                                                         |
|              |   {                                                                                                                          |
|              |     graph[label="Detailed LogicalPlan"]                                                                                      |
|              |     6[shape=box label="Projection: sum(t.x)\nSchema: [sum(t.x):Int64;N]"]                                                    |
|              |     7[shape=box label="Aggregate: groupBy=[[t.b]], aggr=[[sum(CAST(t.x AS Int64))]]\nSchema: [b:Int32;N, sum(t.x):Int64;N]"] |
|              |     6 -> 7 [arrowhead=none, arrowtail=normal, dir=back]                                                                      |
|              |     8[shape=box label="TableScan: t projection=[x, b]\nSchema: [x:Int32;N, b:Int32;N]"]                                      |
|              |     7 -> 8 [arrowhead=none, arrowtail=normal, dir=back]                                                                      |
|              |   }                                                                                                                          |
|              | }                                                                                                                            |
|              | // End DataFusion GraphViz Plan                                                                                              |
|              |                                                                                                                              |
+--------------+------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.010 seconds.
```

## `EXPLAIN ANALYZE`

Shows the execution plan and metrics of a statement. Note that `EXPLAIN ANALYZE`
only supports the `indent` format.

```sql
EXPLAIN ANALYZE SELECT SUM(x) FROM table GROUP BY b;

+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                      |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | CoalescePartitionsExec, metrics=[]                                                                                                                        |
|                   |   ProjectionExec: expr=[SUM(table.x)@1 as SUM(x)], metrics=[]                                                                                             |
|                   |     HashAggregateExec: mode=FinalPartitioned, gby=[b@0 as b], aggr=[SUM(x)], metrics=[outputRows=2]                                                       |
|                   |       CoalesceBatchesExec: target_batch_size=4096, metrics=[]                                                                                             |
|                   |         RepartitionExec: partitioning=Hash([Column { name: "b", index: 0 }], 16), metrics=[sendTime=839560, fetchTime=122528525, repartitionTime=5327877] |
|                   |           HashAggregateExec: mode=Partial, gby=[b@1 as b], aggr=[SUM(x)], metrics=[outputRows=2]                                                          |
|                   |             RepartitionExec: partitioning=RoundRobinBatch(16), metrics=[fetchTime=5660489, repartitionTime=0, sendTime=8012]                              |
|                   |               DataSourceExec: file_groups={1 group: [[/tmp/table.csv]]}, has_header=false, metrics=[]                                                        |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
```

By default `EXPLAIN ANALYZE` shows the aggregated metrics from all partitions for each operator. If you need to display per-partition metrics, use `EXPLAIN ANALYZE VERBOSE`.

You can also set `datafusion.explain.analyze_level` from the [configuration value] to control the detail level for the metrics displayed.

[configuration value]: ../configs.md