Post 19 | HDPCD | Sort the output of a Pig Relation

Hi everyone, thanks for coming back again to continue with this tutorial series. We are almost done with this section, and once we finish it, we will jump into Hive, which will not take much time. In the last tutorial, we saw the process to store the data from Pig to Hive using …

Post 18 | HDPCD | Storing Pig Relation in Hive Table

Wassup everyone. Thanks for coming back once again. This section is coming to an end with less than 10 tutorials remaining. Once we are done with this section (Apache Pig), we will start with the next section which focuses on Apache Hive. In the last tutorial, we saw the process of storing the data stored …

Post 17 | HDPCD | Storing Pig Relation in HDFS Directory

Thanks for coming back for the next tutorial in the HDPCD certification series. In the last tutorial, we saw how to remove the records with the NULL values, whereas in this tutorial, we are going to see the process of storing the output of a Pig Relation in the HDFS directory. This is one of …

Post 16 | HDPCD | Removing records with NULL values from a Pig Relation

Removing records with NULL values from a Pig relation

Post 13 | HDPCD | Data Transformation using Apache Pig

In the previous tutorial, we saw how to load the data from Apache Hive to Apache Pig. If you remember, we used HCatalog for performing that operation. In this tutorial, we are going to see the process of doing the data transformation using Apache Pig. The process of data transformation itself is too involved and …

Post 12 | HDPCD | Load data from Hive to Pig

Hello, everyone. Thanks for coming back! I hope the tutorials are inspiring you to take each task seriously and perform each operation by understanding why we are performing each step. In the last tutorial, we saw how to create the Pig Relation with a defined schema. This tutorial is about creating a Pig Relation, but instead …

Post 11 | HDPCD | Load Pig Relation WITH schema

In the previous tutorial, we saw how to load the Pig Relation without a defined schema. In this tutorial, we are going to load a Pig Relation with a properly defined schema. It is nearly identical to the last tutorial, except for one step, which I will discuss in a moment. Please have a look at the …

Post 10 | HDPCD | Load Pig Relation WITHOUT schema

Hello everyone, hope you are finding the tutorials useful. In the previous tutorial, we started off with the Data Transformation category of the HDPCD certification. This tutorial, being the second objective in this category, focuses on creating a sample Pig relation without the schema. Before starting with the actual process, let us define what is …

Apache Drill: Installation and Configuration

In this tutorial, we are going to install and configure Apache Drill 1.3.0 on Ubuntu 16.04. But before starting with the installation and configuration, let us get to know Apache Drill. The following is the minimum information we should know before going ahead with Apache Drill installation and configuration. Drill converts CSV files, NoSQL Databases …