EXPLAIN Command
The EXPLAIN command is used to display the execution plan of an SQL query, showing how the query will be processed by the SQL engine.
It provides insights into the relational operators used, their configuration, and the estimated number of rows processed at each step.
This information is essential for diagnosing performance bottlenecks and understanding query optimization decisions.
Syntax
EXPLAIN [ { PLAN | MAPPING } FOR ] query_or_dml
If neither PLAN nor MAPPING is specified, PLAN is implied.
Parameters:

- PLAN - explains the query in terms of a tree of relational operators. This representation is suitable for investigating performance issues related to the optimizer.
- MAPPING - explains the query in terms of the mapping of query fragments to particular nodes of the cluster. This representation is suitable for investigating performance issues related to data colocation.
Examples:
EXPLAIN SELECT * FROM lineitem;
EXPLAIN PLAN FOR SELECT * FROM lineitem;
EXPLAIN MAPPING FOR SELECT * FROM lineitem;
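Because the grammar accepts query_or_dml, DML statements can be explained in the same way. A minimal sketch, assuming lineitem has TPC-H style columns l_quantity and l_orderkey (the column names are an assumption for illustration):

-- EXPLAIN also accepts DML statements (per the query_or_dml grammar rule).
-- Column names below are assumed for illustration.
EXPLAIN PLAN FOR
UPDATE lineitem SET l_quantity = l_quantity + 1 WHERE l_orderkey = 1;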
Understanding The Output
Each query plan is represented as a tree-like structure composed of relational operators.
A node in the plan includes:
- A name, indicating the relational operator (e.g., TableScan, IndexScan, Sort, Join types)
- A set of attributes relevant to that specific operator

OperatorName
    attribute1: value1
    attribute2: value2
Examples:

TableScan // Full table access
    table: PUBLIC.EMP
    fieldNames: [NAME, SALARY]
    est: (rows=1)

IndexScan // Index-based access
    table: PUBLIC.EMP
    index: EMP_NAME_DESC_IDX
    type: SORTED
    fields: [NAME]
    collation: [NAME DESC]
    est: (rows=1)

Sort // Sort rows
    collation: [NAME DESC NULLS LAST]
    est: (rows=1)
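As a rough illustration, statements like the following could produce plans similar to the ones above. The schema is a sketch: the EMP table definition and the EMP_NAME_DESC_IDX index are assumed to exist as shown, and the actual operator choice always depends on the optimizer:

-- Assumed schema, matching the names used in the sample output above.
CREATE TABLE EMP (ID INT PRIMARY KEY, NAME VARCHAR, SALARY DECIMAL);
CREATE INDEX EMP_NAME_DESC_IDX ON EMP (NAME DESC);

-- Likely a TableScan: no index covers the SALARY filter.
EXPLAIN PLAN FOR SELECT NAME, SALARY FROM EMP WHERE SALARY > 1000;

-- May use an IndexScan over EMP_NAME_DESC_IDX, because the requested
-- ordering matches the index collation; otherwise an explicit Sort appears.
EXPLAIN PLAN FOR SELECT NAME FROM EMP ORDER BY NAME DESC;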
Operator Naming
The operator name reflects the specific algorithm or strategy used. For example:
- TableScan – Full scan of a base table.
- IndexScan – Access via an index, possibly sorted.
- Sort – Explicit sorting step.
- HashJoin, MergeJoin, NestedLoopJoin – Types of join algorithms.
- Limit, Project, Exchange – Execution-related transformations and controls.
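Several of these operators often appear together in one plan. As a sketch, assuming the EMP table from the earlier example, a query that computes a derived column, orders, and limits its output may surface Project, Sort, and Limit at once (the exact plan shape is up to the optimizer):

-- One query whose plan may combine several of the operators listed above.
EXPLAIN PLAN FOR
SELECT NAME, SALARY * 12 AS ANNUAL_SALARY   -- Project: computed column
FROM EMP
ORDER BY SALARY DESC                        -- Sort: no index on SALARY assumed
LIMIT 10;                                   -- Limit: row-count cap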
Hierarchical Plan Structure
The plan is structured as a tree, where:
- Leaf nodes represent data sources (e.g., TableScan)
- Internal nodes represent data transformations (e.g., Join, Sort)
- The root node (topmost) is the final operator that produces the result
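For instance, a minimal sketch of this structure for a two-table join (operator attributes omitted for brevity; real output includes the attributes described above):

Project                 // root: produces the final result
    HashJoin            // internal node: transforms data
        TableScan       // leaf: data source (first table)
        TableScan       // leaf: data source (second table)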
Examples
Example: Complex Join
EXPLAIN PLAN FOR
SELECT
U.UserName, P.ProductName, R.ReviewText, R.Rating
FROM Users U, Reviews R, Products P
WHERE U.UserID = R.UserID
AND R.ProductID = P.ProductID
AND P.ProductName = 'Product_' || ?::varchar
The resulting output is:
Project
    fieldNames: [USERNAME, PRODUCTNAME, REVIEWTEXT, RATING]
    projection: [USERNAME, PRODUCTNAME, REVIEWTEXT, RATING]
    est: (rows=16650)
  HashJoin
      predicate: =(USERID$0, USERID)
      fieldNames: [PRODUCTID, USERID, REVIEWTEXT, RATING, PRODUCTID$0, PRODUCTNAME, USERID$0, USERNAME]
      type: inner
      est: (rows=16650)
    HashJoin
        predicate: =(PRODUCTID, PRODUCTID$0)
        fieldNames: [PRODUCTID, USERID, REVIEWTEXT, RATING, PRODUCTID$0, PRODUCTNAME]
        type: inner
        est: (rows=16650)
      Exchange
          distribution: single
          est: (rows=50000)
        TableScan
            table: PUBLIC.REVIEWS
            fieldNames: [PRODUCTID, USERID, REVIEWTEXT, RATING]
            est: (rows=50000)
      Exchange
          distribution: single
          est: (rows=1665)
        TableScan
            table: PUBLIC.PRODUCTS
            predicate: =(PRODUCTNAME, ||(_UTF-8'Product_', CAST(?0):VARCHAR CHARACTER SET "UTF-8"))
            fieldNames: [PRODUCTID, PRODUCTNAME]
            est: (rows=1665)
    Exchange
        distribution: single
        est: (rows=10000)
      TableScan
          table: PUBLIC.USERS
          fieldNames: [USERID, USERNAME]
          est: (rows=10000)
This execution plan represents a query that joins three tables: USERS, REVIEWS, and PRODUCTS, and selects four fields after filtering by product name.

- Project (root node): Outputs the final selected fields: USERNAME, PRODUCTNAME, REVIEWTEXT, and RATING.
- HashJoins (two levels): Perform the inner joins.
  - The first (bottom-most) joins REVIEWS with PRODUCTS on PRODUCTID.
  - The second joins the result with USERS on USERID.
- TableScans: Each table is scanned:
  - REVIEWS is fully scanned.
  - PRODUCTS is scanned with a filter on PRODUCTNAME.
  - USERS is fully scanned.
- Exchange nodes: Indicate data redistribution between operators.

Each node includes:

- fieldNames: Output columns at that stage.
- predicate: Join or filter condition.
- est: Estimated number of rows at that point in the plan.
Example: Query Mapping
The output of the EXPLAIN MAPPING command includes additional metadata that provides insight into how the query is mapped onto the cluster topology. For example, for the following command:
EXPLAIN MAPPING FOR
SELECT
U.UserName, P.ProductName, R.ReviewText, R.Rating
FROM Users U, Reviews R, Products P
WHERE U.UserID = R.UserID
AND R.ProductID = P.ProductID
AND P.ProductName = 'Product_' || ?::varchar
The resulting output is:
Fragment#0 root
  distribution: single
  executionNodes: [node_1]
  tree:
    Project
        fieldNames: [USERNAME, PRODUCTNAME, REVIEWTEXT, RATING]
        projection: [USERNAME, PRODUCTNAME, REVIEWTEXT, RATING]
        est: (rows=1)
      HashJoin
          predicate: =(USERID$0, USERID)
          fieldNames: [PRODUCTID, USERID, REVIEWTEXT, RATING, PRODUCTID$0, PRODUCTNAME, USERID$0, USERNAME]
          type: inner
          est: (rows=1)
        HashJoin
            predicate: =(PRODUCTID, PRODUCTID$0)
            fieldNames: [PRODUCTID, USERID, REVIEWTEXT, RATING, PRODUCTID$0, PRODUCTNAME]
            type: inner
            est: (rows=1)
          Receiver
              fieldNames: [PRODUCTID, USERID, REVIEWTEXT, RATING]
              sourceFragmentId: 1
              est: (rows=1)
          Receiver
              fieldNames: [PRODUCTID, PRODUCTNAME]
              sourceFragmentId: 2
              est: (rows=1)
        Receiver
            fieldNames: [USERID, USERNAME]
            sourceFragmentId: 3
            est: (rows=1)

Fragment#1
  distribution: random
  executionNodes: [node_1, node_2, node_3]
  partitions: [REVIEWS=[node_1={0, 2, 5, 6, 7, 8, 9, 10, 12, 13, 20}, node_2={1, 3, 11, 19, 21, 22, 23, 24}, node_3={4, 14, 15, 16, 17, 18}]]
  tree:
    Sender
        distribution: single
        targetFragmentId: 0
        est: (rows=50000)
      TableScan
          table: PUBLIC.REVIEWS
          fieldNames: [PRODUCTID, USERID, REVIEWTEXT, RATING]
          est: (rows=50000)

Fragment#2
  distribution: table PUBLIC.PRODUCTS in zone "Default"
  executionNodes: [node_1, node_2, node_3]
  partitions: [PRODUCTS=[node_1={0, 2, 5, 6, 7, 8, 9, 10, 12, 13, 20}, node_2={1, 3, 11, 19, 21, 22, 23, 24}, node_3={4, 14, 15, 16, 17, 18}]]
  tree:
    Sender
        distribution: single
        targetFragmentId: 0
        est: (rows=1665)
      TableScan
          table: PUBLIC.PRODUCTS
          predicate: =(PRODUCTNAME, ||(_UTF-8'Product_', CAST(?0):VARCHAR CHARACTER SET "UTF-8"))
          fieldNames: [PRODUCTID, PRODUCTNAME]
          est: (rows=1665)

Fragment#3
  distribution: table PUBLIC.USERS in zone "Default"
  executionNodes: [node_1, node_2, node_3]
  partitions: [USERS=[node_1={0, 2, 5, 6, 7, 8, 9, 10, 12, 13, 20}, node_2={1, 3, 11, 19, 21, 22, 23, 24}, node_3={4, 14, 15, 16, 17, 18}]]
  tree:
    Sender
        distribution: single
        targetFragmentId: 0
        est: (rows=10000)
      TableScan
          table: PUBLIC.USERS
          fieldNames: [USERID, USERNAME]
          est: (rows=10000)
where:

- Fragment#0 denotes the fragment with id=0.
- The root marker denotes the root fragment, i.e. the fragment that represents the user's cursor.
- The distribution attribute shows which mapping strategy was applied to this particular fragment.
- The executionNodes attribute lists the nodes this fragment will be executed on.
- The partitions attribute shows which partitions of which tables will be read from which nodes.
- The tree attribute specifies which part of the relational tree corresponds to this fragment.
The output above shows how the query is broken into multiple execution fragments and distributed across the cluster. It gives insight into both the logical execution plan and how it maps to the physical topology.
The query starts execution in Fragment#0, which serves as the root of the plan and is where the final result is produced. It runs on a single node (node_1) and contains the main logic of the query, including the projection and two nested hash joins. Instead of scanning tables directly, it receives data from other fragments through Receiver operators. These incoming streams correspond to the REVIEWS, PRODUCTS, and USERS tables.
The actual table scans happen in Fragments 1 through 3, each responsible for one of the involved tables. These fragments operate in parallel across the cluster. Each performs a scan on its respective table and then sends the results back to Fragment#0.
- Fragment#1 handles the REVIEWS table. It runs on all nodes and uses a random distribution strategy. Data is partitioned across nodes, and after scanning the table, results are sent upstream.
- Fragment#2 is in charge of the PRODUCTS table. It also spans all nodes but follows a zone-based distribution linked to the table’s partitioning. There’s a filter applied to PRODUCTNAME, which limits the amount of data sent to the root.
- Fragment#3 covers the USERS table. Like the others, it’s distributed and reads from table partitions spread across the cluster.
Each fragment includes metadata such as the nodes it’s executed on, how data is partitioned, and how results are sent between fragments. This layout provides a clear view of not only how the query is logically processed, but also how the workload is split and coordinated in a distributed environment.
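Since EXPLAIN MAPPING is aimed at diagnosing data colocation, a common follow-up is to colocate joined tables so that matching rows live on the same nodes. The sketch below is illustrative only: it assumes a simplified Reviews schema in which UserID is part of the primary key (colocation columns must belong to it) and assumes that colocating by UserID suits the workload.

-- Illustrative sketch: store Reviews colocated with Users by UserID so that
-- rows with the same UserID land on the same nodes (schema is assumed).
CREATE TABLE Reviews (
    ReviewID INT,
    UserID INT,
    ProductID INT,
    ReviewText VARCHAR,
    Rating INT,
    PRIMARY KEY (ReviewID, UserID)
) COLOCATE BY (UserID);

-- Re-run the mapping to compare the fragment distribution before and after:
EXPLAIN MAPPING FOR
SELECT U.UserName, R.Rating
FROM Users U JOIN Reviews R ON U.UserID = R.UserID;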