How to export complete Hive table to a local file?

Generally, solutions provided over the internet point towards:

hive -e ‘select * from dbname.tablename;’ > /path/to/datadump.csv

However, this will not work for Big Data!

For a table with multi million rows, I generally have to use a command similar to this:

INSERT OVERWRITE LOCAL DIRECTORY 
'/path/to/datadump_FOLDER' 
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ',' 
select * from dbname.tablename;

The important thing to notice here is that you are providing path to a folder and not just the filename. (Why? discussed in a bit!) Also, as you notice, it is easy to change the delimiter to ‘\t’ or ‘|’. The files in the folder location will have the delimiters of your choice.

Now, about the folder. Make sure your user have necessary permissions to write to that folder (duh!). Since we are dealing with Big Data here, the data will be written in parallel to this folder as multiple files. As, the output is from reducers, it is written as 000_001 … 000_00n.

You may then navigate to that folder and generate a single file with all the files combined.

cd /path/to/datadump_FOLDER 
cat 00* >> datadump.csv

Following this blog post will be “How to exclude certain column(s) while exporting a Hive Table to local file?”

How to export complete Hive table to a local file?

Published by Piyush

Leave a comment Cancel reply

Share this:

Published by Piyush

Leave a comment Cancel reply