Hello, everyone. Thanks for returning for the next tutorial in the HDPCD certification series. In the last tutorial, we saw how to specify the delimiter of the Hive table. In this tutorial, we are going to see how to load the data from the local Directory into the Hive table.
Let us begin then.

The above infographics show the step by step process to perform this activity.
We will perform these tasks as shown below.
- CREATING INPUT CSV FILE IN THE LOCAL FILE SYSTEM
We use the vi editor for creating the input CSV file in the local file system.
I have uploaded this input CSV file to my GitHub profile under HDPCD repository with name “49_input_to_load_from_local.csv“. You can download this file by clicking here and it looks as follows.
1 | tina | F | |
---|---|---|---|
2 | jerry | M | |
3 | tom | M | |
4 | wonder woman | F | |
5 | scoobydoo | M | |
6 | donald duck | M | |
7 | pink panther | F | |
8 | oggy | M | |
9 | shinchan | M |
The following commands are used for creating this input CSV file.
vi post38.csv
################################
PASTE THE COPIED CONTENTS HERE
################################
cat post38.csv
The output of the above commands looks as follows.

The above screenshot shows that the file post38.csv was created successfully and it is present in the local file system.
Let us check whether table post38 exists or not.
- CHECKING THE PRE-EXISTENCE OF THE HIVE TABLE POST38
We use the “show tables;” command to get the list of tables present in the “default” databse of hive.
show tables;
The output of the above command is as follows.

As you can see from the above screenshot, the list of tables does not contain a table with name post38. This indicates that we can go ahead and create the hive table with name post38.
- CREATING HIVE TABLE WITH NAME POST38
We use the “CREATE” command to create the hive table to load the data from the local file system.
I have uploaded this input CSV file to my GitHub profile under HDPCD repository with name “50_create_hive_table_for_local_load.sql“. You can download this file by clicking here and it looks as follows.
create table post38 ( | |
id int, | |
name string, | |
gender string | |
) | |
row format delimited | |
fields terminated by ',' | |
stored as textfile; |
The execution of the above command looks as follows.

The above screenshot shows that we get “OK” message after executing this CREATE statement. This indicates that the table post38 was created successfully.
Let us confirm the existence of the hive table post38.
- CONFIRMING THE EXISTENCE OF THE HIVE TABLE POST38
We use the same command “show tables;” to check the existence of the hive table.
show tables;
The output of the above command is as follows.

As you can see, the hive table post38 was created successfully. This allows us to continue with the further checks.
Let us check the schema of the hive table post38.
- CHECKING THE SCHEMA OF THE HIVE TABLE POST38
We use the “desc” command to check the detailed schema of the hive table post38.
desc formatted post38;
The output of the above command is as follows.

As you can see from the above screenshot, the table post38 contains three columns with defined name and data types. The input format of the hive table post38 is also as expected and that is the TextInputFormat.
This confirms that the hive table post38 got created successfully with correct schema.
It is time to check how many records exists in the table post38 before actually loading the data.
- RECORDS BEFORE DATA LOADING FROM LOCAL FILE SYSTEM
We use the “SELECT” command to get the records stored in the hive table.
select * from post38;
The output of the above command is as follows.

As you can see, we got zero records after executing the above command. This indicates that, by default, there are no records in the hive table post38.
From this, we can conclude that after loading the data from post38.csv file, the hive table post38 should have 9 records with 3 columns.
Let us load the data into the hive table post38 from the local file system.
- LOADING DATA INTO HIVE TABLE FROM LOCAL FILE SYSTEM
We use the “LOAD” command to load the data from the local file system into the hive table. This command is as follows.
load data local inpath ‘post38.csv’ into table post38;
The output of the above command is as follows.

The above screenshot shows that the input local file post38.csv was successfully loaded into the hive table post38. The screenshot shows the stats like the number of files and the size of the input file(s).
This enables us to look for the records stored in the hive table post38.
- CHECKING THE RECORDS IN HIVE TABLE POST38
We use the same command “SELECT” command to check the records that got stored in the hive table.
select * from post38;
The output of the above command is as follows.

The above screenshot shows the that there are a total of 9 records in the hive table post38 with 3 columns. This completes our expectations and matches exactly with the input file post38.csv.
We can conclude this tutorial here.
I hope you guys like the contents. Keep sharing it with others.
In the next tutorial, we are going to see how to load data into a Hive table from an HDFS Directory.
You can check out my LinkedIn profile here. Please like my Facebook page here. Follow me on Twitter here and subscribe to my YouTube channel here for the video tutorials.
One thought on “Post 38 | HDPCD | Load data into Hive table from a Local Directory”