Hello, everyone. Thanks for coming back for one more tutorial in this HDPCD certification series.
In the last tutorial, we saw how to specify the storage format of a Hive table. In this tutorial, we are going to see how to specify the delimiter of a Hive table.
We are going to follow the process mentioned in the following infographics.

This process is similar to the most of the Hive tutorials we have seen in this certification series. Let us start with this process.
- CREATING INPUT FILE WITH TAB DELIMITER IN LOCAL FILE SYSTEM AND PUSHING IT INTO HDFS
We can create the input file in the local file system with the help of vi editor. We are going to use the following commands to create the input file in local file system.
I have uploaded this input TSV file to my GitHub profile under HDPCD repository with name “47_input_delimiter_hive.tsv“. You can download this file by clicking here. This file looks as follows.
1 | tina | F | |
---|---|---|---|
2 | jerry | M | |
3 | tom | M | |
4 | wonder woman | F | |
5 | scoobydoo | M | |
6 | donald duck | M | |
7 | pink panther | F | |
9 | shinchan | M | |
8 | oggy | M |
We are going to use the following commands to create the input file in local file system.
vi post37.tsv
################################
PASTE THE COPIED CONTENTS HERE
################################
cat post37.tsv
The above commands create the input file post37.tsv in the local file system.
Once the file is created in the local file system, you can push this file to HDFS with the help of put command.
The commands look as follows.
hadoop fs -mkdir /hdpcd/input/post37
hadoop fs -put post37.tsv /hdpcd/input/post37
hadoop fs -cat /hdpcd/input/post37/post37.tsv
The output of the above commands is shown in the following screenshot.

As you can see from the above screenshot, the file post37.tsv was successfully created in HDFS. This file will be inserted into post37 Hive table as an input.
Now, let us check the existence of the Hive table post37.
- CHECKING THE PRE-EXISTENCE OF THE HIVE TABLE
You can use the “show tables;” command to check the existence of the Hive table post37.
show tables;
The output of the above command looks like this.

As you can see from the above screenshot, the table post37 does not exist. This indicates that we can go ahead and create the Hive table post37.
Let us create the Hive table post37, now.
- CREATING THE HIVE TABLE WITH TAB DELIMITER
We use the CREATE command for creating a Hive table with TAB delimiter for the records in post37.tsv.
I have uploaded the SQL file to my GitHub profile under HDPCD repository with name “48_hive_table_tab_delimiter.sql“. You can download this file by clicking here. This file looks as follows.
create external table post37 ( | |
id int, | |
name string, | |
gender string | |
) | |
row format delimited | |
fields terminated by '\t' | |
stored as textfile | |
location '/hdpcd/input/post37/'; |
The output of the above command is shown in the following screenshot.

As it is shown in the above screenshot, the “OK” message shows that the Hive table post37 was created successfully. Having said that, let us confirm the existence of the Hive table post37.
You can use the following command to check the existence of the Hive table post37.
show tables;
The output of the above command is as follows.

The above screenshot confirms the existence of the Hive table post37. This enables us to go for the schema check of the Hive table post37.
- CHECK THE SCHEMA OF THE NEWLY CREATED HIVE TABLE
You can use the “desc” command to check the schema of the Hive table post37.
desc formatted post37;
The output of the above command is shown in the following screenshot.

The above screenshot confirms the schema of the Hive table post37. At the bottom of the screenshot, as you can see, the FIELD DELIMITER is the TAB character (\t).
Now, the only thing remaining is the records check.
- CHECK THE RECORDS STORED IN THE HIVE TABLE
You can use the “SELECT” command to check the records stored in the Hive table post37.
The command is as follows.
select * from post37;
The output of the above command is as follows.

The above screenshot confirms that we were successful in loading the TAB delimited records into the Hive table post37.
This completes the tutorial here.
I hope you guys like the content.
In the next tutorial, we are going to see how to load data into a Hive table from a Local Directory.
You can check out my LinkedIn profile here. Please like my Facebook page here. Follow me on Twitter here and subscribe to my YouTube channel here for the video tutorials.
One thought on “Post 37 | HDPCD | Specifying delimiter of a Hive table”