Load CSV File in Hive Table

We can load CSV data into hive table with the help of CSV SERDE JAR FILE which is freely available. You can download it manually by clicking below text.

Download CSV SERDE Jar File

Here, we are trying to load two types of CSV data in hive table.

First type of data contains header i.e. first line in the file is header information and Second type of CSV file contains only data and no header information is given.

Following script will load CSV data containing header as first line in hive table called csvtohive

# Assuming we are using hduser for doing Hadoop Operations
cd /home/hduser
# Loading .bashrc file in order to access HIVE Installation Directory Path
source .bashrc
# Removing already existing CSV SERDE JAR FILE
rm csv-serde-1.1.2-0.11.0-all.jar
# Downloading CSV SERDE JAR FILE during run time
wget https://drone.io/github.com/ogrodnek/csv-serde/files/target/csv-serde-1.1.2-0.11.0-all.jar
# Copying Downloaded CSV SERDE FILE to HIVE lib Directory
cp csv-serde-1.1.2-0.11.0-all.jar $HIVE_HOME/lib
# Creating Input Data File
echo "column1,column2" > input.txt
echo "1,this is first line" >> input.txt
echo "2,this is second line" >> input.txt
# Creating Hive Table by using CSV SERDE and skipping the first line since it is an header. At last printing records in hive table.
hive -e "drop table csvtohive;create table csvtohive(column1 string,column2 string) row format serde 'com.bizo.hive.serde.csv.CSVSerde' stored as textfile tblproperties ('skip.header.line.count'='1');LOAD DATA LOCAL INPATH '/home/hduser/input.txt' INTO TABLE csvtohive;select * from csvtohive;"

After this,
Below script will load CSV data (without containing header) in hive table called csvtohive.

# Assuming we are using hduser for doing Hadoop Operations
cd /home/hduser
# Loading .bashrc file in order to access HIVE Installation Directory Path
source .bashrc
# Removing already existing CSV SERDE JAR FILE
rm csv-serde-1.1.2-0.11.0-all.jar
# Downloading CSV SERDE JAR FILE during run time
wget https://drone.io/github.com/ogrodnek/csv-serde/files/target/csv-serde-1.1.2-0.11.0-all.jar
# Copying Downloaded CSV SERDE FILE to HIVE lib Directory
cp csv-serde-1.1.2-0.11.0-all.jar $HIVE_HOME/lib
# Creating Input Data File
echo "1,this is first line" > input.txt
echo "2,this is second line" >> input.txt
# Creating Hive Table by using CSV SERDE. At last printing records in hive table.
hive -e "drop table csvtohive;create table csvtohive(column1 string,column2 string) row format serde 'com.bizo.hive.serde.csv.CSVSerde' stored as textfile;LOAD DATA LOCAL INPATH '/home/hduser/input.txt' INTO TABLE csvtohive;select * from csvtohive;"

Thanks for having a read.
I hope the explanation makes the scripts easier to understand.

Published by milindjagre

I founded my blog www.milindjagre.co four years ago and am currently working as a Data Scientist Analyst at the Ford Motor Company. I graduated from the University of Connecticut pursuing Master of Science in Business Analytics and Project Management. I am working hard and learning a lot of new things in the field of Data Science. I am a strong believer of constant and directional efforts keeping the teamwork at the highest priority. Please reach out to me at milindjagre@gmail.com for further information. Cheers!

One thought on “Load CSV File in Hive Table

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: