As a Data Engineer, one faces the need to share files securely over the internet. An easy way of doing …
Author: Piyush
Optimized Import of Text Data for Analytics
Text data analysis is a staple use case for the Data Analytics world! There are multiple firms which enable capture …
How to copy a file, recreating the directory structure, using python?
To copy “Folder1/Folder2/file1” to “Folder3”:
Source structure:
Folder1/FolderA (not to be copied)
Folder1/fileX (not to be copied)
Folder1/Folder2/file1
Desired destination structure:
Folder3/Folder1/Folder2/file1
P.S. …
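The excerpt above is cut off before the post's own solution, so the following is only a minimal sketch of one way to get the described result, not necessarily the author's method. It assumes the source is given as a relative path and uses a hypothetical helper name, `copy_with_structure`, built on the standard library's `pathlib` and `shutil`.

```python
import shutil
from pathlib import Path


def copy_with_structure(src_file, dest_root):
    """Copy src_file under dest_root, recreating its relative directory path.

    Hypothetical helper for illustration; assumes src_file is a relative path
    such as Folder1/Folder2/file1.
    """
    src = Path(src_file)
    if src.is_absolute():
        # Joining an absolute path would discard dest_root, so reject it.
        raise ValueError("src_file must be a relative path")

    target = Path(dest_root) / src
    # Create Folder3/Folder1/Folder2 (and any missing parents) if needed.
    target.parent.mkdir(parents=True, exist_ok=True)
    # copy2 copies the file contents and tries to preserve metadata.
    shutil.copy2(src, target)
    return target
```

Calling `copy_with_structure("Folder1/Folder2/file1", "Folder3")` from the directory that contains Folder1 should leave only Folder3/Folder1/Folder2/file1 in the destination, since sibling entries such as Folder1/FolderA and Folder1/fileX are never touched.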
Copying different S3 bucket folders with space(s) in their names.
s3cmd is a common and handy tool for dealing with data in S3 from the command line. Usage Doc. For simple …
Dealing with ‘Blocks with no live replicas’ in the HDFS
In a previous post, we dealt with ‘Under-replicated blocks in the HDFS’. However, while decommissioning a couple of worker nodes, …
How to exclude certain column(s) while exporting a Hive Table to local file?
In a previous post, I documented How to export complete Hive table to a local file? While being required to get …
How to export complete Hive table to a local file?
Generally, solutions provided over the internet point towards:
hive -e 'select * from dbname.tablename;' > /path/to/datadump.csv
However, this will not …
How to color code ‘diff’ output on CLI?
TL;DR: vim -d <file1> <file2>
Comparing changes in files from time to time is very much the part and …