Create a new table with Apache Hive

We are going to create a new table with Apache Hive from a previous one, populate it and then perform a UNION ALL of both tables. Below is the script that will create the new table.

-- Below script creates a new table
USE testdb;
-- show current tables
SHOW tables;
-- describe mytable2, table we will use to create mytable4
DESCRIBE mytable2;
-- create new table copying format from mytable2
CREATE TABLE mytable4 LIKE mytable2 ;

SHOW tables;
-- describe newly created table
DESCRIBE mytable4;
-- select content from newly created table
SELECT * FROM mytable4;

We proceed executing via hive in a linux shell.

hive    -f  create-new-table.hql

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
OK
Time taken: 0.898 seconds
OK
mytable
mytable2
newtable3
Time taken: 0.206 seconds, Fetched: 3 row(s)
OK
id                  	int                 	                    
lname               	string              	                    
fname               	string              	                    
Time taken: 0.263 seconds, Fetched: 3 row(s)
OK
Time taken: 0.272 seconds
OK
mytable
mytable2
mytable4
newtable3
Time taken: 0.043 seconds, Fetched: 4 row(s)
OK
id                  	int                 	                    
lname               	string              	                    
fname               	string              	                    
Time taken: 0.166 seconds, Fetched: 3 row(s)
OK
Time taken: 0.666 seconds

We now need to populate the newly created mytable4, because right now it’s empty. We will perform this task using hdfs. We will populate with text file with the below content.

4	New	John
5	Milan	Donald

Populating from hadoop cli

hdfs dfs -put   mytable4-data.txt   /user/hive/warehouse/testdb.db/mytable4

We now proceed to check the content of the newly populated table.

hive -e 'use testdb; SELECT * FROM mytable4;'

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
OK
Time taken: 0.875 seconds
OK
4	New	John
5	Milan	Donald
Time taken: 0.953 seconds, Fetched: 2 row(s)

Which corresponds to the content we have copied from local to hadoop fs.

Now we will use UNION ALL to perform a query on tables mytable2 and mytable4.

hive -S -e 'use testdb; SELECT * FROM mytable4 UNION ALL SELECT * FROM mytable2;'
1	Doe	John
2	Lancaster	William
3	Gentoo	Burp
4	New	John
5	Milan	Donald

We can see above that we have performed a query on both tables.

Leave a Reply