Spark + Python : Union Operation

In this tutorial, we will see how the Union operation works. In English, union means combining two things, and that is what we do here: we combine two RDDs using the Union operation. We use the same input.txt file as in the last tutorial. Continue reading “Spark + Python : Union Operation”
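As a rough idea of what combining two RDDs looks like, here is a minimal sketch, not the program from the full post. The helper name `union_of_matches`, the driver `run_union_example`, the app name, and the two keywords are all made up for illustration. We assume the two RDDs are built by filtering the lines of input.txt on two different keywords.

```python
def union_of_matches(lines_rdd, first_keyword, second_keyword):
    """Combine the lines matching either keyword into a single RDD.

    `union_of_matches` is a hypothetical helper name used for illustration.
    """
    first = lines_rdd.filter(lambda line: first_keyword in line)
    second = lines_rdd.filter(lambda line: second_keyword in line)
    # union() does not remove duplicates: a line that contains both
    # keywords shows up twice in the result.
    return first.union(second)


def run_union_example(path="input.txt"):
    # Hypothetical driver; the keywords below are assumptions, not from the post.
    # pyspark is imported here so the helper above can be read on its own.
    from pyspark import SparkContext

    sc = SparkContext("local", "UnionExample")
    try:
        lines = sc.textFile(path)
        for line in union_of_matches(lines, "spark", "python").collect():
            print(line)
    finally:
        sc.stop()
```

Calling `run_union_example()` on a machine with PySpark installed prints the combined lines. If the repeated lines are not wanted, `distinct()` can be chained after `union()`.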

Spark + Python – Filter Operation

This is the first program in the Spark + Python series. In this tutorial, we look at the Filter operation. The objective is to print only those lines that contain a specified keyword. To do this, we follow the steps below. But before diving into the actual operations, please look for the… Continue reading “Spark + Python – Filter Operation”
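To give a feel for the objective, here is a minimal sketch of printing only the lines that contain a keyword. It is not the program from the full post: the helper name `lines_containing`, the driver `run_filter_example`, the app name, the file path, and the keyword are all assumptions made for illustration.

```python
def lines_containing(lines_rdd, keyword):
    """Return an RDD holding only the lines that contain `keyword`.

    `lines_containing` is a hypothetical helper name used for illustration.
    """
    # filter() is a lazy transformation: nothing is computed until an
    # action such as collect() is called on the result.
    return lines_rdd.filter(lambda line: keyword in line)


def run_filter_example(path="input.txt", keyword="spark"):
    # Hypothetical driver; the path and keyword are assumptions.
    # pyspark is imported here so the helper above can be read on its own.
    from pyspark import SparkContext

    sc = SparkContext("local", "FilterExample")
    try:
        lines = sc.textFile(path)  # one RDD element per line of the file
        for line in lines_containing(lines, keyword).collect():
            print(line)
    finally:
        sc.stop()
```

Calling `run_filter_example()` on a machine with PySpark installed prints the matching lines. The match is case-sensitive, since it relies on Python's `in` operator for strings.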

Spark + Python – Tools Setup

In this series, we talk about simple concepts and basic Spark programming with the Python API. To make our development work faster and easier, we use a few basic tools and software. The tools we are talking about are Notepad++ and Putty. We use Putty to connect to the… Continue reading “Spark + Python – Tools Setup”