Dealing with ‘under-replicated’ blocks in HDFS

One fine evening, I came across a notification in our HDP cluster: a number of blocks were under-replicated. This was the result of a planned node modification, which had caused the ephemeral disks to lose their data.

HDP would have handled this over time and re-replicated the affected blocks on its own. However, I wanted to trigger the process as soon as possible, so that I could take advantage of a period of lower traffic on our cluster.
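
Before touching individual files, it helps to get a rough sense of the scale of the problem. A minimal, read-only check, assuming you have access to HDFS admin commands, is to look at the under-replication counter in the dfsadmin report:

# The cluster-wide report includes an "Under replicated blocks" counter
hdfs dfsadmin -report | grep -i 'under replicated'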

The first step is to build a list of the affected files. hdfs fsck prints an ‘Under replicated’ line for every such file, and awk keeps only the path in front of the colon:

rm /tmp/under_replicated_files

hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files

A quick sanity check that the list actually contains something before looping over it:

ls -lh /tmp/under_replicated_files

Then, for each file in the list, explicitly set the replication factor back to 3 so that the NameNode schedules the missing replicas:

for hdfsfile in `cat /tmp/under_replicated_files`; do echo "Fixing $hdfsfile :" ; hadoop fs -setrep 3 $hdfsfile; done
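
As a side note, the backtick-and-cat loop splits on whitespace. A minimal alternative, assuming bash and that no path contains embedded newlines, reads the list line by line instead:

# Same idea, but robust against spaces in HDFS paths
while read -r hdfsfile; do
  echo "Fixing $hdfsfile :"
  hadoop fs -setrep 3 "$hdfsfile"
done < /tmp/under_replicated_files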

Either way, the loop prints one line per file as it works through the list:

Fixing /apps/hive/warehouse/database.db/tablename/000161_0 :
Replication 3 set: /apps/hive/warehouse/database.db/tablename/000161_0.
.
.
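
Once the loop has finished and the NameNode has had some time to schedule the new copies, re-running the original check should show the number of under-replicated files dropping towards zero:

hdfs fsck / | grep 'Under replicated' | wc -l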
