Post 18 | HDPCD | Storing Pig Relation in Hive Table

Wassup everyone. Thanks for coming back once again. This section is coming to an end, with fewer than 10 tutorials remaining. Once we are done with this section (Apache Pig), we will start the next section, which focuses on Apache Hive. In the last tutorial, we saw the process of storing the data stored …

Post 17 | HDPCD | Storing Pig Relation in HDFS Directory

Thanks for coming back for the next tutorial in the HDPCD certification series. In the last tutorial, we saw how to remove records with NULL values, whereas in this tutorial, we are going to see the process of storing the output of a Pig Relation in an HDFS directory. This is one of …

Post 16 | HDPCD | Removing records with NULL values from a Pig Relation

Removing records with NULL values from a Pig relation

Post 15 | HDPCD | Group Data in one or more PIG Relations

Grouping in Apache Pig

Post 14 | HDPCD | Data Transformation to match Hive Schema using Apache Pig

The last tutorial talked about transforming data by reducing the number of columns from input to output records. This tutorial is similar, but takes the data transformation process one step further. It focuses on matching your input records with the Hive table schema. This includes splitting the …

Post 13 | HDPCD | Data Transformation using Apache Pig

In the previous tutorial, we saw how to load data from Apache Hive into Apache Pig. If you remember, we used HCatalog to perform that operation. In this tutorial, we are going to see how to perform data transformation using Apache Pig. The process of data transformation itself is too involved and …

Post 12 | HDPCD | Load data from Hive to Pig

Hello, everyone. Thanks for coming back! I hope the tutorials are inspiring you to take each task seriously and to understand why we perform each step. In the last tutorial, we saw how to create a Pig Relation with a defined schema. This tutorial is about creating a Pig Relation, but instead …

Post 11 | HDPCD | Load Pig Relation WITH schema

In the previous tutorial, we saw how to load a Pig Relation without a defined schema. In this tutorial, we are going to load a Pig Relation with a properly defined schema. It is identical to the last tutorial except for one step, which I will discuss in a moment. Please have a look at the …

Post 10 | HDPCD | Load Pig Relation WITHOUT schema

Hello everyone, hope you are finding the tutorials useful. In the previous tutorial, we started off with the Data Transformation category of the HDPCD certification. This tutorial, being the second objective in this category, focuses on creating a sample Pig relation without a schema. Before starting with the actual process, let us define what is …

Post 9 | HDPCD | Pig Script Execution

This is the first post in the Data Transformation category, which is essential to clear the HDPCD certification offered by Hortonworks Inc. In the last eight tutorials, we focused on Data Ingestion tasks. The next twenty-one, yeah, that's right, I said the next twenty-one tutorials, including this one, will focus on the Data Transformation category of the …