Post 46 | HDPCD | Run a Hive Query using Tez Execution Engine

Hello, everyone. Welcome to one more tutorial in the HDPCD certification series.

In the last tutorial, we saw how to perform a JOIN operation between two Hive tables.

In this tutorial, we are going to see how to run a Hive query using the Tez execution engine.

Let us begin, then.

Apache Hive: Setting the execution engine to Tez

As the infographic above shows, this process includes a few housekeeping steps. They are not always necessary, but it is good practice to check the output of those commands before moving forward.

With that in mind, we will follow the steps below to complete this objective.

  • CHECKING THE DEFAULT EXECUTION ENGINE FOR HDP SANDBOX

As soon as you log in to Hive with the “hive” command, you can check the default value of the execution engine for the HDP Sandbox.

We use the following command for checking the value of the execution engine.

set hive.execution.engine;

The output of the above command is as follows.

Step 1: Checking the default execution engine for HDP Sandbox

As can be seen, the default value of the execution engine in the HDP Sandbox is tez. This means we do not actually need to change anything to meet this objective. For the sake of practice, though, we will not stop here.

We will first change the execution engine to MapReduce and then change it back to Tez.
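
The plan described above can be sketched as a single Hive session. This is only an outline of the commands used in the rest of this post; the table post41 comes from the earlier tutorials in this series and is assumed to exist in your sandbox.

```sql
-- 1. Print the current engine (tez by default on the HDP Sandbox)
set hive.execution.engine;

-- 2. Switch to MapReduce and confirm the new value
set hive.execution.engine=mr;
set hive.execution.engine;

-- 3. Run a query that needs a job, to see MapReduce in action
select * from post41 order by id desc;

-- 4. Switch back to Tez and confirm the new value
set hive.execution.engine=tez;
set hive.execution.engine;

-- 5. Run the same query again, this time on Tez
select * from post41 order by id desc;
```

Keep in mind that a value changed with “set” applies only to the current Hive session, so a new session starts again with the configured default.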

  • SETTING THE EXECUTION ENGINE TO MAPREDUCE

We use the “set” command to set the execution engine in Hive.

set hive.execution.engine=mr;

Notice that “mr” in the above command indicates the MapReduce execution engine.

We use the following command to check the effect of the above command.

set hive.execution.engine;

The output of the above two commands is as follows.

Step 2: Changing the execution engine to MapReduce

As you can see, the above screenshot indicates that the value of the execution engine was changed to MapReduce.

Let us confirm this change.

  • CONFIRMING THAT THE EXECUTION ENGINE CHANGED TO MAPREDUCE

We are going to run a slightly more complex query to check which execution engine Hive is using.

Whenever a query needs real processing, such as the sorting requested by an ORDER BY clause, Hive hands it to the execution engine, and the output of the job tells us which engine ran it.

We are going to run the following query in Hive.

select * from post41 order by id desc;

The output of the above command is as follows.

Step 3: Confirming the execution engine was changed to MapReduce

As you can see, the above query triggers a MapReduce job, as is evident from the progress messages in the output.

This confirms that we were able to change the execution engine in Hive.
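
The reason for using ORDER BY in this check can be illustrated with a small contrast. The sketch below is not part of the original steps. It assumes the sandbox keeps Hive's fetch-task optimization enabled (the hive.fetch.task.conversion property, which this tutorial never changes). Under that assumption, a plain SELECT * is typically answered by reading the table files directly, so it would not show which engine is active.

```sql
-- Assumption: hive.fetch.task.conversion is not set to "none".
-- Print its current value to check.
set hive.fetch.task.conversion;

-- Typically served by a simple fetch: no job is launched,
-- so the output says nothing about the execution engine.
select * from post41;

-- The sort forces a real job, so the progress output
-- reveals the engine (MapReduce at this point in the tutorial).
select * from post41 order by id desc;
```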

Now, let us change it back to Tez.

  • SETTING THE EXECUTION ENGINE TO TEZ

We use the following command to set the Hive execution engine to Tez.

set hive.execution.engine=tez;

And the following command is used for checking the effect of the above command.

set hive.execution.engine;

The output of the above commands is as follows.

Step 4: Changing the execution engine to Tez

The above screenshot confirms that the Hive execution engine was changed from MapReduce back to Tez.

Let us confirm this change.

  • CONFIRMING THAT THE EXECUTION ENGINE CHANGED TO TEZ

As before, we run the same Hive query to see which execution engine handles it.

select * from post41 order by id desc;

The output of the above command is as follows.

Step 5: Confirming the execution engine was changed to Tez

You can clearly see the difference between the output of the two execution engines. The above screenshot shows that Hive used the Tez components to execute the query, which completes the objective of this tutorial.
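
If you would rather not run the job at all, Hive's EXPLAIN statement offers another way to see which engine a query would use. The sketch below is not part of the original steps, and the exact wording of the plan differs between Hive versions, so treat the comments as a rough guide rather than exact output.

```sql
-- With MapReduce, the sorting stage of the plan is typically labelled "Map Reduce".
set hive.execution.engine=mr;
explain select * from post41 order by id desc;

-- With Tez, the same stage is typically labelled "Tez" and lists its vertices.
set hive.execution.engine=tez;
explain select * from post41 order by id desc;
```

The sketch ends with the engine set to tez, which matches the state reached at the end of this tutorial.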

We can conclude this tutorial here. In the next tutorial, we are going to see how to execute a Hive query using Vectorization.

Until then, stay tuned and keep sharing the content.

We are now just five more posts away from completing all the tutorials.

I hope you guys like the content.

