Piyush Routray

Author: Piyush

How to encrypt files before sharing online?

As a Data Engineer, one faces the need to share files securely over the internet. An easy way of doing … More

Data Engineering, PGP, security

Optimized Import of Text Data for Analytics

Text data analysis is a staple use case for the Data Analytics world! There are multiple firms which enable capture … More

AWS Firehose, AWS S3, Data Engineering, Snowflake

How to copy a file, recreating the directory structure, using python?

To copy “Folder1/Folder2/file1” to “Folder3”

source structure:
Folder1/FolderA (not to be copied)
Folder1/fileX (not to be copied)
Folder1/Folder2/file1

desired destination structure:
Folder3/Folder1/Folder2/file1

P.S. … More
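
One way to approach this, as a minimal sketch rather than the post's own solution: build the target path by joining the destination folder with the file's relative path, create the missing parent folders, then copy. The function name `copy_with_structure` and its `base` parameter are hypothetical names chosen for illustration.

```python
import shutil
from pathlib import Path


def copy_with_structure(src, dest_root, base="."):
    """Copy one file into dest_root, recreating its relative directory path.

    src       -- path of the file relative to base, e.g. "Folder1/Folder2/file1"
    dest_root -- destination folder, e.g. "Folder3"
    base      -- folder that src is relative to (hypothetical helper parameter)
    """
    rel = Path(src)
    if rel.is_absolute():
        # Joining an absolute path would discard dest_root, so reject it.
        raise ValueError("src must be a path relative to base")
    source = Path(base) / rel
    target = Path(dest_root) / rel
    # Create Folder3/Folder1/Folder2 if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves file metadata such as modification time.
    shutil.copy2(source, target)
    return target
```

Calling `copy_with_structure("Folder1/Folder2/file1", "Folder3")` from the folder that contains Folder1 copies only that one file; FolderA and fileX are left alone because nothing else is walked.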

Python, stackoverflow

Copying different S3 bucket folders with space(s) in their names.

s3cmd is a common and handy tool for dealing with data in S3 from the command line. Usage Doc. For simple … More

AWS S3

Dealing with ‘Blocks with no live replicas’ in the HDFS

In a previous post, we dealt with ‘Under-replicated blocks in the HDFS’. However, while decommissioning a couple of worker nodes, … More

datanode, hadoop, HDP

How to exclude certain column(s) while exporting a Hive Table to local file?

In a previous post, I documented ‘How to export complete Hive table to a local file?’ While being required to get … More

hive

How to export complete Hive table to a local file?

Generally, solutions provided over the internet point towards:

hive -e 'select * from dbname.tablename;' > /path/to/datadump.csv

However, this will not … More

hive

How to color code ‘diff’ output on CLI?

TL;DR: vim -d <file1> <file2>

Comparing changes in files from time to time is very much the part and … More
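
For cases where the comparison has to happen inside a script rather than in vim, here is a rough Python sketch of the same idea. This swaps in a different technique from the `vim -d` shown above: the standard library's `difflib.unified_diff` plus ANSI escape codes. The function name `colored_diff` and the labels "file1" and "file2" are placeholders chosen for illustration.

```python
import difflib

# ANSI escape codes; most terminals render these as red and green text.
RED, GREEN, RESET = "\033[31m", "\033[32m", "\033[0m"


def colored_diff(old_lines, new_lines):
    """Return unified-diff lines with removals in red and additions in green."""
    out = []
    diff = difflib.unified_diff(old_lines, new_lines, "file1", "file2", lineterm="")
    for i, line in enumerate(diff):
        if i < 2:
            # The first two lines are the '---' / '+++' file headers.
            out.append(line)
        elif line.startswith("-"):
            out.append(RED + line + RESET)
        elif line.startswith("+"):
            out.append(GREEN + line + RESET)
        else:
            out.append(line)
    return out


if __name__ == "__main__":
    for row in colored_diff(["alpha", "beta"], ["alpha", "gamma"]):
        print(row)
```

The inputs are lists of lines without trailing newlines, which is why `lineterm=""` is passed; lines read with `readlines()` would need the newline stripped first.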
