Post 21 | HDPCD | Specify number of reduce tasks for Pig MapReduce job

Hello everyone. Thanks for coming back for one more tutorial in this HDPCD certification series. In the last tutorial, we saw how to remove duplicate tuples from a Pig relation. In this tutorial, we are going to see how to specify the number of reduce tasks for a Pig MapReduce job. Let us get started […]

Read Excel File using MapReduce

The code below reads Excel files using the MapReduce API. The entire source code has been taken from this link. The files are ExcelDriver.java, ExcelInputFormat.java, ExcelMapper.java, ExcelParser.java, ExcelRecordReader.java, and pom.xml. If you clean and build the above project, it will create two jar files, of which we have to use the jar file with dependencies. I have used the following […]