Providing Elasticity to Data Node Storage in Hadoop through LVM

Akurathi Sri Krishna Sagar
Dec 27, 2020

Set up Master Node :

Configuring /etc/hadoop/hdfs-site.xml file :
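A minimal sketch of what this file typically contains on the master (Hadoop 1.x property names; the /namenode directory is created in a later step) :

<configuration>
  <!-- directory where the NameNode stores its metadata -->
  <property>
    <name>dfs.name.dir</name>
    <value>/namenode</value>
  </property>
</configuration>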

Configuring /etc/hadoop/core-site.xml file :
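And a minimal sketch of core-site.xml on the master; the port (9001 here) is an assumption, so match whatever your setup uses :

<configuration>
  <!-- 0.0.0.0 makes the NameNode listen on all interfaces -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:9001</value>
  </property>
</configuration>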

Stop the firewall temporarily :

systemctl stop firewalld

Make a directory for namenode :

mkdir /namenode

Format the namenode directory :

hadoop namenode -format

Start the namenode :

hadoop-daemon.sh start namenode
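To confirm the process is actually running, you can also use jps (a Java process-listing tool that ships with the JDK); it should show a NameNode entry :

jps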

The NameNode has started successfully.

Set up Data Node :

Configuring /etc/hadoop/hdfs-site.xml file :
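A minimal sketch for the Data Node side (again Hadoop 1.x property names) :

<configuration>
  <!-- directory where the DataNode stores HDFS blocks -->
  <property>
    <name>dfs.data.dir</name>
    <value>/datanode</value>
  </property>
</configuration>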

Configuring /etc/hadoop/core-site.xml file :
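Here core-site.xml must point at the master; <Master-IP> is a placeholder for the Name Node’s IP, and the port must match the master’s core-site.xml :

<configuration>
  <!-- address of the NameNode this DataNode reports to -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://<Master-IP>:9001</value>
  </property>
</configuration>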

Make a directory for Data Node :

mkdir /datanode

Start the Data Node services :

hadoop-daemon.sh start datanode

The Data Node has started successfully.

Check the hadoop cluster report :

hadoop dfsadmin -report

If you look at the Configured Capacity in the report, the Data Node has contributed its entire hard disk to the Hadoop cluster!! This is because the /datanode directory lives on the root filesystem.

Integrating LVM with the Data Node Storage Directory :

Now, I’ve attached two new hard disks of sizes 5GiB and 4GiB. Run the command below to view them :

fdisk -l

In my VM, the names of hard disks are “nvme0n2” and “nvme0n3”.

The first step is to create physical volumes from the hard disks :

pvcreate /dev/nvme0n2
pvcreate /dev/nvme0n3

Check if the physical volumes are created or not :

pvdisplay /dev/nvme0n2
pvdisplay /dev/nvme0n3

The next step is to create a Volume Group spanning the two physical volumes :

vgcreate hadoop_VG /dev/nvme0n2 /dev/nvme0n3

Check if the volume group is created or not :

vgdisplay hadoop_VG
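Since the volume group pools both disks, its size should come out to roughly 9GiB (5GiB + 4GiB, minus a small amount of LVM metadata overhead).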

Next, create Logical Volume from the Volume Group. Here I’m creating a Logical Volume of 6GiB :

lvcreate --size 6G --name hadoop_LV hadoop_VG

Check if the Logical Volume is created or not :

lvdisplay hadoop_VG/hadoop_LV

Now format the Logical Volume created above :

mkfs.ext4 /dev/hadoop_VG/hadoop_LV

Now, create a directory and mount the logical volume to that directory :

mkdir /dir
mount /dev/hadoop_VG/hadoop_LV /dir

Check if the LV is mounted or not with the command “df -h” :
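For example :

df -h /dir

The logical volume shows up under its device-mapper name, /dev/mapper/hadoop_VG-hadoop_LV.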

Now, update the data directory inside the Data Node’s /etc/hadoop/hdfs-site.xml file :
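Only the dfs.data.dir value changes, from /datanode to the new mount point :

<configuration>
  <!-- now points at the LVM-backed mount -->
  <property>
    <name>dfs.data.dir</name>
    <value>/dir</value>
  </property>
</configuration>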

Stop and then start the Data Node so it picks up the new directory :

hadoop-daemon.sh stop datanode
hadoop-daemon.sh start datanode

Check the hadoop report :

hadoop dfsadmin -report

See, the Data Node is now contributing only ~6GiB of storage to the cluster!!

Increasing the size of Logical Volume :

Command to increase the size of Logical Volume by 2GiB :

lvextend --size +2G /dev/hadoop_VG/hadoop_LV

We have to grow the filesystem over the newly extended 2GiB with the resize2fs command (it resizes the existing ext4 filesystem in place, so no reformatting is needed) :

resize2fs /dev/hadoop_VG/hadoop_LV
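As an aside, lvextend can do both steps at once with its -r/--resizefs flag, which runs the filesystem resize for you :

lvextend --size +2G --resizefs /dev/hadoop_VG/hadoop_LV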

Check the hadoop cluster report. The storage contributed by the Data Node will have increased to ~8GiB (6GiB + 2GiB).

Let’s upload a file, file.txt, to the cluster :
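Using the HDFS shell (uploading to the root of HDFS here) :

hadoop fs -put file.txt /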

You can view the file in the Hadoop web UI (served by the NameNode on port 50070 in Hadoop 1.x).

Decreasing the size of Logical Volume :

Stop the datanode :

hadoop-daemon.sh stop datanode

Next, unmount the logical volume :

umount /dir

Run a filesystem check on the logical volume (resize2fs requires a clean filesystem before it will shrink one) :

e2fsck -f /dev/hadoop_VG/hadoop_LV

Shrink the filesystem down to the required size. I’m decreasing it to 3GiB :

resize2fs /dev/hadoop_VG/hadoop_LV 3G

Now reduce the size of the Logical Volume with the lvreduce command. Since the filesystem was already shrunk to 3GiB, reducing the LV to the same size is safe (lvreduce will still ask for confirmation) :

lvreduce --size 3G /dev/hadoop_VG/hadoop_LV

Now, mount the LV :

mount /dev/hadoop_VG/hadoop_LV /dir

Start the datanode services :

hadoop-daemon.sh start datanode

Check the hadoop cluster report :

So, we have successfully reduced the storage to 3GiB. Let’s check the data inside the file.txt we uploaded previously :
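Assuming it was uploaded to the root of HDFS as above :

hadoop fs -cat /file.txt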

Wow!! The file and the data inside it are safe!!

So, in this way we can provide elasticity to the Data Node storage.

Thanks for reading
