One fine evening, I came across a notification in our HDP cluster. This was result of planned node modification which resulted in the ephemeral disks losing data.
HDP would have managed this over time and replicated the relevant blocks. However, I wanted to trigger the process as soon as possible, so that I could take advantage of lower traffic to our cluster.
rm /tmp/under_replicated_files
hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files
ls -lh /tmp/under_replicated_files
for hdfsfile in `cat /tmp/under_replicated_files`; do echo "Fixing $hdfsfile :" ; hadoop fs -setrep 3 $hdfsfile; done
Fixing /apps/hive/warehouse/database.db/tablename/000161_0 :Replication 3 set: /apps/hive/warehouse/database.db/tablename/000161_0.
.
.
1 Comment