Post 2 | Installations – R and Python

Hello, everyone. We are going to start learning the concepts of Machine Learning. If you have been following my blog posts on Hadoop and Big Data Analytics, you will know that I place a lot of importance on hands-on exercises. The same is going to be the case for these tutorials. Here, we…

Post 52 | HDPCD | The conclusion

Hi everyone. Finally, we have reached the end of this tutorial series. It’s been so long. We started this journey together on January 15th, 2017, and 276 days later, this beautiful journey is coming to an end. But we do not need to worry, because I am working on something new and would love to…

Post 27 | HDPCD | Invoke a User Defined Function in Apache Pig

Hello everyone, thanks for coming back to the last tutorial in the DATA TRANSFORMATION category of the HDPCD certification. We are going to pick up where we left off in the last tutorial, in which we saw how to define an ALIAS for a function present in the JAR file. In this tutorial, we are going to see how to…

Post 26 | HDPCD | Define an ALIAS for a User Defined Function

Hi, everyone. Thank you for returning to this certification series. In the last tutorial, we saw the process of registering the JAR file in the Apache Pig session. This tutorial is an extension of the previous one, and in it we are going to see how to define an alias for the UDF present…

Spark + Python : Passing Function

In this tutorial, we are going to look at the various ways in which we can pass functions in Spark using the Python API. I have shown two ways in which functions can be called/created (for a user-defined function). We are going to compare them based on the filtering capabilities of Spark. For this, I have created a user-defined function called containsMilind() which…