Hello people,
Welcome to the third and last installment of Prerequisites for Hadoop.
This post covers the following topics:
- Apache Maven
- Core Java
- Java Project
- Java Packages
- Java Classes
- NetBeans IDE Installation
Apache Maven
Maven is an Apache project used for project management and comprehension.
It is based on the Project Object Model (POM).
In the simplest terms, we as Java developers use the pom.xml file to specify the dependencies that we are going to use during the course of project development.
Now you might be wondering why we specify the dependencies at all: can't we just copy the jar files into the classpath and be done with it?
The answer is YES, you can do that. But what if there are hundreds of jar files? Will you copy them every time you create a new project? The answer is definitely NO. With Maven, you declare each dependency once in pom.xml, and Maven downloads it, along with the jars it depends on, from a repository.
When I started working on MapReduce, I kept falling into the trap of missing one or more jar files. I found that approach tedious, so adopting Apache Maven for project configuration helped me a lot, and this is why I encourage people to use Maven instead of manually copying jar files into classpaths.
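As a rough illustration, a dependency entry in pom.xml might look like the fragment below. The hadoop-client artifact and the version number are assumptions picked for this sketch, not something this post prescribes; use whichever version matches your cluster.

```xml
<!-- Goes inside the <project> element of pom.xml -->
<dependencies>
  <dependency>
    <!-- Assumed artifact and version, for illustration only -->
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.1</version>
  </dependency>
</dependencies>
```

Once an entry like this is saved, Maven fetches the jar and the jars it needs in turn, so nothing has to be copied by hand.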
Core Java
We use Java mainly for writing MapReduce code and UDFs in Hive or Pig, so we need to know core Java rather than the advanced parts.
The Core Java part of the Hadoop prerequisites covers the following:
- NetBeans IDE Installation
- Java Project
- Java Packages
- Java Classes
NetBeans IDE Installation
We are going to use the NetBeans IDE for Java and MapReduce development.
You can also use the Eclipse IDE; the choice is entirely yours.
Use the following steps to install NetBeans IDE 8.0.2 on Ubuntu 14.04:
cd
wget http://download.netbeans.org/netbeans/8.0.2/final/bundles/netbeans-8.0.2-javase-linux.sh
chmod +x netbeans*
sh netbeans-8.0.2-javase-linux.sh
Java Project
As already discussed, we are going to use an Apache Maven Java project.
We can create one as follows.
- Open NetBeans IDE
- Click on File -> New Project

- Select Maven Category -> Java Application Project -> Click Next

- Give a suitable Project Name -> Click Finish

Once you have followed all the steps above, you will see the newly created project in the left panel of the NetBeans IDE window, as shown in the figure below.

You can see the default package that gets created, the dependency jar files (if any), and pom.xml, which is used for specifying the dependencies in XML format.
Java Packages
We use Java packages to separate the components of a single project.
For example, say we are writing the code for an online shopping website. Instead of keeping all the code in one package, we can create different packages and write the code for each department in its own package. In plain English, we write the code for the Electronics department in an electronics package, for the Fashion department in a fashion package, and so on.
Packages make debugging and project structuring much simpler. Because a stack trace names the package of every class involved, we can narrow down errors and exceptions quite easily when we use packages.
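To see how a package name tells components apart, here is a small sketch that uses two JDK classes that share the simple name Date. The class name PackageDemo and the helper packageOf are hypothetical names made up for this example; the same idea would apply to your own electronics and fashion packages.

```java
public class PackageDemo {

    // Hypothetical helper: returns the name of the package a class lives in.
    static String packageOf(Class<?> type) {
        return type.getPackage().getName();
    }

    public static void main(String[] args) {
        // Both classes are called "Date"; only the package distinguishes them.
        System.out.println(packageOf(java.util.Date.class)); // java.util
        System.out.println(packageOf(java.sql.Date.class));  // java.sql

        // A class of your own would declare its package on the first line
        // of its source file, for example (hypothetical name):
        //   package com.mycompany.shop.electronics;
    }
}
```
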
In the screenshot below, com.mycompany.test_project is the package name.

Java Classes
A Java class is where we actually write our code.
If you want to run a class directly, it must include a main() method.
public static void main(String[] args) is the main() method signature.
Without a main() method, a class cannot be executed on its own.
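To make the signature concrete, here is a minimal sketch of a runnable class. The class name HelloHadoop and the helper greeting are hypothetical names chosen for this example.

```java
public class HelloHadoop {

    // Hypothetical helper, kept separate from main() so it can be reused.
    static String greeting(String name) {
        return "Hello, " + name + "!";
    }

    // Entry point: the JVM looks for this exact signature when it runs a class.
    public static void main(String[] args) {
        // Use the first command-line argument if one was given, else a default.
        String name = args.length > 0 ? args[0] : "Hadoop";
        System.out.println(greeting(name));
    }
}
```

Note that args is an array of String, which is how any command-line arguments reach the program.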
You can create a class with the following steps.
- Right click on Project -> New -> Java Class

- Give a Class Name -> Click on Finish

- Class Definition

- A class without a main() method cannot be run
You can see in the screenshot that the Run File option is disabled because the class does not contain a main() method.

- A class with a main() method can be run
The screenshot below shows that as soon as you include a main() method, the Run File option is enabled.

I think this much information is sufficient as an introduction to Java for Hadoop.
I hope you had a great read.
Please do give some feedback so that I can improve the content of this blog.