Hello, everyone. Welcome to one more tutorial in the HDPCD certification series.
In the last tutorial, we saw how to enable vectorization in Hive.
In this tutorial, we are going to see how to print the execution plan of a Hive query.
Let us begin, then.
This is one of the simplest tutorials in this certification series. We are going to look at the EXPLAIN command, which prints the execution plan of a Hive query.
The execution plan of a Hive query is useful while debugging that query. If a Hive query does not return the expected output, you can put the EXPLAIN keyword in front of the query to print its execution plan instead of running it.
The output of the EXPLAIN command is a step-by-step listing of the operations performed during the Hive query execution. In the output format shown in this tutorial, these operations appear in reverse order, from the last operation back to the first one.
An example of the EXPLAIN command is as follows.
explain select count(*) from post41;
The output of the above command is shown in the following screenshot.
The screenshot above shows that the query execution starts at the bottom of the plan, marked as Arrow 1, and ends at the Group By operation, marked as Arrow 5.
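If the basic plan does not give enough detail while debugging, Hive also accepts a few optional keywords after EXPLAIN. The following is a minimal sketch that reuses the post41 table from the example above; the exact layout and level of detail in the output depend on your Hive version and execution engine, so treat it as a starting point and not as a reference.

```sql
-- Basic plan: the stages and the operators inside each stage.
EXPLAIN SELECT count(*) FROM post41;

-- EXTENDED prints additional detail about the operators in the plan,
-- such as the file paths and table properties involved.
EXPLAIN EXTENDED SELECT count(*) FROM post41;

-- DEPENDENCY reports the inputs (tables and partitions) the query reads.
EXPLAIN DEPENDENCY SELECT count(*) FROM post41;
```

None of these statements runs the query itself. They only print information about how Hive plans to execute it.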
EXPLAIN produces the same type of output for every query; the operators listed depend on the operations needed to execute that particular query.
With that, the objective of this tutorial is met, and we can conclude here.
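To see how the plan changes with the query, you can run EXPLAIN on a few different query shapes against the same table. This is only a sketch: the column name user_id is hypothetical and is used purely for illustration, since the columns of post41 are not listed in this tutorial. Replace it with a real column from your table.

```sql
-- A simple read with a row limit: the plan should show a scan,
-- a select, and a limit step, without any Group By operator.
EXPLAIN SELECT * FROM post41 LIMIT 10;

-- An aggregation per key: the plan should now include Group By operators.
-- user_id is a hypothetical column, assumed here for illustration only.
EXPLAIN SELECT user_id, count(*) FROM post41 GROUP BY user_id;
```

Comparing the two plans side by side is a quick way to get used to reading which operators Hive adds for each clause in a query.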
In the next tutorial, we are going to see how to run a subquery inside a Hive query.