Providing Elasticity to Data Node Storage in Hadoop through LVM
Set up the Master Node:
Configure the /etc/hadoop/hdfs-site.xml file:
Configure the /etc/hadoop/core-site.xml file:
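A minimal sketch of these two files, following the Hadoop 1.x property names used by this setup (the bind address and port 9001 are assumptions, so adjust them to your cluster). For hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/namenode</value>
  </property>
</configuration>

And for core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:9001</value>
  </property>
</configuration>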
Temporarily stop the firewall:
systemctl stop firewalld
Make a directory for the NameNode:
mkdir /namenode
Format the NameNode directory (this initializes the HDFS metadata; it does not format the disk):
hadoop namenode -format
Start the NameNode:
hadoop-daemon.sh start namenode
The NameNode has started successfully.
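One quick way to verify is jps, which lists the running Java processes; a NameNode entry should appear in its output:

jps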
Set up the Data Node:
Configure the /etc/hadoop/hdfs-site.xml file:
Configure the /etc/hadoop/core-site.xml file:
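Again, a minimal sketch of the two files (the master IP is a placeholder, and the port must match the one configured on the Master Node). For hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/datanode</value>
  </property>
</configuration>

And for core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://<master-ip>:9001</value>
  </property>
</configuration>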
Make a directory for the Data Node:
mkdir /datanode
Start the Data Node service:
hadoop-daemon.sh start datanode
The Data Node has started successfully.
Check the Hadoop cluster report:
hadoop dfsadmin -report
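Note: on newer Hadoop versions this command has moved to hdfs dfsadmin -report; the older hadoop dfsadmin form still works but prints a deprecation warning.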
If you look at the report, the Data Node's entire hard disk has been contributed to the Hadoop cluster!!
Integrating LVM with the Data Node Storage Directory:
Now, I've attached two new hard disks, 5 GiB and 4 GiB in size. Run the command below to view them:
fdisk -l
In my VM, the hard disks show up as "nvme0n2" and "nvme0n3".
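You can also get a compact tree view of all block devices with:

lsblk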
The first step is to create physical volumes from the hard disks:
pvcreate /dev/nvme0n2
pvcreate /dev/nvme0n3
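Both physical volumes can also be created in a single command:

pvcreate /dev/nvme0n2 /dev/nvme0n3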
Check whether the physical volumes were created:
pvdisplay /dev/nvme0n2
pvdisplay /dev/nvme0n3
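For a one-line summary of all physical volumes, pvs works as well:

pvs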
The next step is to create a Volume Group from the two physical volumes:
vgcreate hadoop_VG /dev/nvme0n2 /dev/nvme0n3
Check whether the Volume Group was created:
vgdisplay hadoop_VG
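The VG Size field should show roughly 9 GiB, i.e. the 5 GiB and 4 GiB disks combined, minus a small amount reserved for LVM metadata.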
Next, create a Logical Volume from the Volume Group. Here I'm creating a Logical Volume of 6 GiB:
lvcreate --size 6G --name hadoop_LV hadoop_VG
Check whether the Logical Volume was created:
lvdisplay hadoop_VG/hadoop_LV
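Note that the Logical Volume is exposed as a device node at /dev/hadoop_VG/hadoop_LV (equivalently /dev/mapper/hadoop_VG-hadoop_LV), which is the path used in the commands below.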
Now format the Logical Volume created above with ext4 (a good fit here, since ext4 can be grown online and shrunk offline, which the later sections rely on):
mkfs.ext4 /dev/hadoop_VG/hadoop_LV
Now, create a directory and mount the Logical Volume on it:
mkdir /dir
mount /dev/hadoop_VG/hadoop_LV /dir
Check whether the LV is mounted:
df -h
Now, update the data directory path inside the Data Node's /etc/hadoop/hdfs-site.xml file so it points at /dir:
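Only the data directory value changes; the rest of the file stays the same (again a Hadoop 1.x-style sketch):

<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/dir</value>
  </property>
</configuration>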
Stop and then start the Data Node:
hadoop-daemon.sh stop datanode
hadoop-daemon.sh start datanode
Check the Hadoop report:
hadoop dfsadmin -report
See, the Data Node is now contributing only ~6 GiB of storage to the cluster!!
Increasing the size of the Logical Volume:
Command to increase the size of the Logical Volume by 2 GiB:
lvextend --size +2G /dev/hadoop_VG/hadoop_LV
Next, grow the filesystem so it covers the newly extended 2 GiB (resize2fs does not format anything; for ext4 this works online, while the volume is mounted):
resize2fs /dev/hadoop_VG/hadoop_LV
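If your LVM version supports it, the extend and the filesystem resize can be combined into one step:

lvextend --size +2G --resizefs /dev/hadoop_VG/hadoop_LV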
Check the Hadoop cluster report again; the storage size will have increased.
Let's upload a file to the cluster:
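For example, assuming a local file named file.txt (the one we check again at the end of this article):

hadoop fs -put file.txt /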
You can also view the file in the Hadoop web UI.
Decreasing the size of the Logical Volume:
Stop the Data Node:
hadoop-daemon.sh stop datanode
First, unmount the Logical Volume:
umount /dir
Next, check the filesystem for errors (resize2fs refuses to shrink a filesystem that hasn't been checked; the -f flag forces a check even if the filesystem looks clean):
e2fsck -f /dev/hadoop_VG/hadoop_LV
Now shrink the filesystem to the required size; the target must still be large enough to hold the data already stored, or data will be lost. I'm decreasing it to 3 GiB:
resize2fs /dev/hadoop_VG/hadoop_LV 3G
Now reduce the Logical Volume itself to match, with the lvreduce command:
lvreduce --size 3G /dev/hadoop_VG/hadoop_LV
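As with lvextend, newer LVM versions can run the filesystem check and shrink for you in a single step:

lvreduce --resizefs --size 3G /dev/hadoop_VG/hadoop_LV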
Now, mount the LV again:
mount /dev/hadoop_VG/hadoop_LV /dir
Start the Data Node service:
hadoop-daemon.sh start datanode
Check the Hadoop cluster report:
hadoop dfsadmin -report
So, we have successfully reduced the storage to 3 GiB. Let's check the data inside file.txt, which we created earlier:
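Assuming the file was uploaded to the root of HDFS as above, its contents can be read back with:

hadoop fs -cat /file.txt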
Wow!! The file and the data inside it are safe!!
So, in this way, we can provide elasticity to the Data Node's storage.
Thanks for reading