Prerequisites for Hadoop – Part 1

Hello everyone,

Welcome back to another post about the Hadoop ecosystem.

A lot of people have asked me about the prerequisites for learning Hadoop, so this post is for beginners who want to learn the Hadoop ecosystem.

I consider the following three topics to be the prerequisites for learning Hadoop.

  • Linux File System and Commands
  • SQL
  • Core Java

This might sound a little intimidating, but these prerequisites are easy to pick up if we work through them one by one on a daily basis.

We will start with Linux File System and Commands in this post; SQL follows in Part 2, and the series finishes with Core Java.

Linux File System and Commands

Please go through this link to learn about the Linux file system and its components.

Now let's talk about Linux commands.
Consider the following directory structure.

Figure: Linux Directory Structure

For the directory structure shown above, we will run some essential Linux commands.
Assumptions:

  1. We have logged in with hduser as the username.
  2. We are currently at hduser’s home directory.
  3. There are no other users logged into the system.

Commands Description

  • who : print currently logged in users
  • whoami : print current user name
  • pwd : print current working directory location
  • ifconfig : print network interface details, including the system’s IP address
  • hostname : print system’s host name
  • ls : print the names of the files and subdirectories in the current directory
  • cp : copy a file
  • mv : either rename or move a file or directory
  • rm : remove a file
  • mkdir : create a directory
  • rmdir : remove an empty directory
  • touch : create an empty file
  • chmod : change permissions of files and directories
  • chown : change owner and group of files and directories
  • du : print the disk space used by files and directories
  • head : print the first 10 lines of a file (by default)
  • tail : print the last 10 lines of a file (by default)
  • cat : print the contents of a file
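To make the difference between head, tail and cat concrete, here is a minimal sketch. The file name numbers.txt and the temporary working directory are assumptions made only for this illustration; they are not part of the directory structure used in this post.

```shell
# Work in a throwaway directory so nothing in the home directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 25-line sample file (hypothetical name): each line holds its own line number.
seq 1 25 > numbers.txt

# head prints the first 10 lines by default; tail prints the last 10.
first_of_head=$(head numbers.txt | head -n 1)
last_of_head=$(head numbers.txt | tail -n 1)
first_of_tail=$(tail numbers.txt | head -n 1)

# cat prints the whole file, so counting its lines gives 25.
total_lines=$(cat numbers.txt | wc -l)

echo "head shows lines $first_of_head to $last_of_head"
echo "tail starts at line $first_of_tail"
echo "cat printed $total_lines lines"

# Clean up the throwaway directory.
cd / && rm -r "$workdir"
```

If you need a different number of lines, both head and tail accept the -n option, for example head -n 3 numbers.txt.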

Commands Examples and Output

  • $who => hduser (along with the terminal and login time)
  • $whoami => hduser
  • $pwd => /home/hduser
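  • $ifconfig => (output depends on your system’s network configuration)
  • $hostname => MilindUbuntu (assumed host name)
  • $ls => (no output, because hduser’s home directory is empty; refer to the directory structure diagram)
  • $cp => cp /usr/local/* /home/hduser (copies files only; add -r to copy directories as well)
  • $mv => mv /usr/bin/* /home/hduser (illustrative only: on a real system this would move the system binaries and requires root privileges)
  • $rm => rm /home/hduser/* (removes every file in the home directory; directories are left in place unless -r is used)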
  • $mkdir => mkdir /home/hduser/test
  • $rmdir => rmdir /home/hduser/test
  • $touch => touch /home/hduser/abc.txt
  • $chmod => chmod 777 /home/hduser/abc.txt
  • $chown => chown hduser:hadoop /home/hduser/abc.txt
  • $du => du /home/hduser
  • $head => head /home/hduser/abc.txt
  • $tail => tail /home/hduser/abc.txt
  • $cat => cat /home/hduser/abc.txt

P.S. I have inserted the $ symbol before every command to indicate that it is a command. You do not have to type the $ when executing the command.

Hope you enjoy the read.
I will post Part 2, covering SQL, very soon.
Stay Tuned. Cheers.

Published by milindjagre

I founded my blog four years ago and am currently working as a Data Scientist Analyst at the Ford Motor Company. I graduated from the University of Connecticut with a Master of Science in Business Analytics and Project Management. I am working hard and learning a lot of new things in the field of Data Science. I am a strong believer in constant and directional efforts, keeping teamwork as the highest priority. Please reach out to me for further information. Cheers!
