Post 52 | HDPCD | The conclusion

Hi, everyone. We have finally reached the end of this tutorial series. It has been a long time. We started this journey together on January 15th, 2017, and 276 days later, this beautiful journey is coming to an end. But we do not need to worry, because I am working on something new and would love to …

Post 48 | HDPCD | Printing the execution plan of a Hive query

Hello, everyone. Welcome to one more tutorial in the HDPCD certification series. In the last tutorial, we saw how to enable vectorization in Hive. In this tutorial, we are going to see how to print the execution plan of a Hive query. Let us begin, then. This is one of the simplest tutorials in this certification series. In …

Read Excel File using MapReduce

The code below reads Excel files using the MapReduce API. The entire source code has been taken from this link.

ExcelDriver.java: https://gist.github.com/milindjagre/84cc1c230ffd10b7ec0b5db5a47f4c80
ExcelInputFormat.java: https://gist.github.com/milindjagre/a8f2f35908ad0ce2d0d63a0725e3de4f
ExcelMapper.java: https://gist.github.com/milindjagre/3a77b5430111ead3eb3f538a5db72210
ExcelParser.java: https://gist.github.com/milindjagre/34966d289da2e6d33dfbf0f76fc75271
ExcelRecordReader.java: https://gist.github.com/milindjagre/d45935abc259d594e1ed495ca2a67d7a
pom.xml: https://gist.github.com/milindjagre/f95e366cf4766070652608c05783be0f

If you clean and build the above project, it will create two jar files, out of which we have to use the jar file …