Post 7 | ML | Data Preprocessing – Part 5

Hello, everyone. Thanks for joining me in this 5th tutorial of the Data Preprocessing part of the Machine Learning tutorials. In the last tutorial, we saw how to convert the CATEGORICAL VARIABLES from the STRING format to an INTEGER format. In this tutorial, we are going a step further and are going to split the original data …

Post 3 | ML | Data Preprocessing – Part 1

In the last tutorial, we saw the installation steps for both R and Python along with their respective IDEs. In this tutorial, we are going to start our actual journey into Machine Learning. We are going to start off with the Data Preprocessing part, which is one of the most important aspects of Machine Learning. We …

Post 1 | ML | Introduction

Hello, people. In this new tutorial series, we are going to talk about the different aspects of Machine Learning. As an aspiring Data Scientist, I always wanted to get my hands dirty with the concepts of Machine Learning, and the summer break gave me exactly what I wanted - "TIME TO LEARN MACHINE LEARNING …

Post 52 | HDPCD | The conclusion

Hi, everyone. Finally, we have reached the end of this tutorial series. It's been so long. We started this journey together on January 15th, 2017, and 276 days later, this beautiful journey is coming to an end. But we do not need to worry, because I am working on something new and would love to …

Post 50 | HDPCD | Order Hive query output across multiple reducers

Hello, everyone. Welcome to one more tutorial in the HDPCD certification series. In the last tutorial, we saw how to enable vectorization in Hive. In this tutorial, we are going to see how to order the Hive query output across multiple reducers. Let us begin, then. The following infographics show the step-by-step process of performing this operation. From …

Post 39 | HDPCD | Load data into a Hive table from an HDFS directory

Hello, everyone. Thanks for returning for the next tutorial in the HDPCD certification series. In the last tutorial, we saw how to load data into a Hive table from a local directory. In this tutorial, we are going to see how to load the data from an HDFS directory into the Hive table. Let us begin, then. …

Post 37 | HDPCD | Specifying delimiter of a Hive table

Hello, everyone. Thanks for coming back for one more tutorial in this HDPCD certification series. In the last tutorial, we saw how to specify the storage format of a Hive table. In this tutorial, we are going to see how to specify the delimiter of a Hive table. We are going to follow the process …

Post 34 | HDPCD | Defining Hive Table using an ORC File Format

Hi, everyone. Thanks for joining me today for this tutorial. In the last tutorial, we saw how to create a Hive table using the SELECT query. In this tutorial, we are going to see how to create a Hive table which stores the data in the ORC File Format. The process of creating this table …

Post 29 | HDPCD | Define a Hive-managed Table

Hello, everyone. Welcome to the second post in the Data Analysis section of the HDPCD certification series. In the last tutorial, we saw the three ways in which we can run Hive commands. In this tutorial, we are going to create a Hive-managed table, i.e., a Hive internal table. For creating a Hive-managed or internal table, …

Post 26 | HDPCD | Define an ALIAS for a User Defined Function

Hi, everyone. Thank you for returning to this certification series. In the last tutorial, we saw the process of registering the jar file in the Apache Pig session. This tutorial is an extension of the previous one, and in it, we are going to see how to define an alias for the UDF present …