Post 20 | HDPCD | Removing Duplicate tuples from a PIG Relation

Hi everyone, welcome to one more tutorial in this HDPCD certification series. As you might notice, I have changed the blog layout a little bit; I hope you like it. Kindly let me know your feedback in the COMMENT SECTION. In the last tutorial, we saw how to perform the SORT OPERATION in Apache...

Post 19 | HDPCD | Sort the output of a Pig Relation

Hi everyone, thanks for coming back again to continue with this tutorial series. We are almost there with this section, and once we are done with this, we will jump into Hive, which will not take much time. In the last tutorial, we saw the process to store the data from PIG to HIVE using...

Post 18 | HDPCD | Storing Pig Relation in Hive Table

Wassup everyone. Thanks for coming back once again. This section is coming to an end with less than 10 tutorials remaining. Once we are done with this section (Apache Pig), we will start with the next section, which focuses on Apache Hive. In the last tutorial, we saw the process of storing the data stored...

Post 17 | HDPCD | Storing Pig Relation in HDFS Directory

Thanks for coming back for the next tutorial in the HDPCD certification series. In the last tutorial, we saw how to remove the records with NULL values, whereas in this tutorial, we are going to see the process of storing the output of a Pig Relation in an HDFS directory. This is one of...

Post 14 | HDPCD | Data Transformation to match Hive Schema using Apache Pig

The last tutorial talked about transforming data by reducing the number of columns from input to output records. This tutorial is similar, but takes the data transformation process one step further: it focuses on matching your input records with the Hive table schema. This includes splitting the...

Post 12 | HDPCD | Load data from Hive to Pig

Hello, everyone. Thanks for coming back! I hope the tutorials are inspiring you to take each task seriously and perform each operation by understanding why we are performing each step. In the last tutorial, we saw how to create the Pig Relation with a defined schema. This tutorial is about creating a Pig Relation, but instead...

Hadoop Commands Example

cat => hadoop fs -cat /hdfs_home/input/abc.txt
chmod => hadoop fs -chmod 777 /hdfs_home/input/abc.txt
chown => hadoop fs -chown hduser:hadoop /hdfs_home/input/abc.txt
copyFromLocal => hadoop fs -copyFromLocal /home/anoop/abc.txt /hdfs_home/input/
copyToLocal => hadoop fs -copyToLocal /hdfs_home/input/abc.txt /home/anoop/
cp => hadoop fs -cp /hdfs_home/input/abc.txt /hdfs_home/output/
du => hadoop fs -du /hdfs_home/input/
get => hadoop fs -get /hdfs_home/input/abc.txt /home/anoop/
...