Post 2 | Installations – R and Python

Hello, everyone. We are going to start learning the concepts of Machine Learning. If you have been following my blog posts on Hadoop and Big Data Analytics, you will know that I give more importance to performing hands-on exercises. The same is going to be the case for these tutorials. Here, we…

Post 52 | HDPCD | The conclusion

Hi everyone. Finally, we have reached the end of this tutorial series. It’s been so long. We started this journey together on January 15th, 2017, and 276 days later, this beautiful journey is coming to an end. But we do not need to worry, because I am working on something new and would love to…

Spark + Python : reduce action

This tutorial is an introduction to actions in Spark. So far, we have seen transformations like map() and flatMap(). reduce() is one of the actions provided by Spark. Here, we are going to perform an addition operation with the help of the reduce action. We are going to follow the steps below for achieving…
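
As a rough illustration of the addition described above, here is a minimal sketch. The input list is hypothetical, since the post's actual data is not shown in this excerpt. To stay runnable without a Spark installation, it uses Python's built-in `functools.reduce`, which takes the same kind of two-argument function that the Spark action expects. The PySpark form is shown only as a comment and assumes an existing SparkContext named `sc`.

```python
from functools import reduce

# Hypothetical input; the original post's data is not shown in the excerpt.
numbers = [1, 2, 3, 4, 5]


def add(a, b):
    """Combine two elements into one.

    Spark's reduce action expects a function like this to be
    commutative and associative, which addition is.
    """
    return a + b


# Local stand-in for the Spark action: fold the list down to a single value.
total = reduce(add, numbers)
print(total)

# With PySpark (assuming a SparkContext named `sc`), the same function
# would be passed to the action like this:
#   total = sc.parallelize(numbers).reduce(add)
```

Unlike map() and flatMap(), which produce a new RDD, reduce returns a single value to the driver program, which is why it counts as an action rather than a transformation.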

Spark + Python : Passing Function

In this tutorial, we are going to look at the various ways in which we can pass functions in Spark using the Python API. I have shown two ways in which functions can be called/created (for user-defined functions). We are going to do the comparison based on the filtering capabilities of Spark. For doing this, I have created a user-defined function called containsMilind() which…
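
The excerpt does not say which two ways the post compares, so the sketch below assumes they are a named function and an inline lambda. The body of `containsMilind()` is also an assumption inferred from its name, and the input lines are hypothetical. It uses Python's built-in `filter` so it runs without Spark; the PySpark equivalents appear as comments and assume an existing SparkContext named `sc`.

```python
# Hypothetical input lines; the post's actual data set is not shown.
lines = [
    "Milind writes about Spark",
    "Hadoop tutorial",
    "Hello from Milind",
]


def containsMilind(line):
    """Return True when the line mentions "Milind".

    Assumed behavior, inferred from the function name in the post.
    """
    return "Milind" in line


# Way 1 (assumed): pass the named, user-defined function.
by_named = list(filter(containsMilind, lines))

# Way 2 (assumed): pass an inline lambda with the same logic.
by_lambda = list(filter(lambda line: "Milind" in line, lines))

print(by_named)
print(by_named == by_lambda)

# With PySpark (assuming a SparkContext named `sc`), both forms are
# passed to the filter transformation in the same way:
#   rdd = sc.parallelize(lines)
#   rdd.filter(containsMilind).collect()
#   rdd.filter(lambda line: "Milind" in line).collect()
```

A named function is easier to reuse and test, while a lambda keeps short, one-off predicates next to the call that uses them.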