Providing Elasticity to Data Node Storage in Hadoop through LVM
Set up the Master Node:
Configure the /etc/hadoop/hdfs-site.xml file:
Configure the /etc/hadoop/core-site.xml file:
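A minimal sketch of these two files, following the Hadoop 1.x property names used by this setup (the bind address and port 9001 are assumptions, so adjust them to your cluster). For hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/namenode</value>
  </property>
</configuration>

And for core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:9001</value>
  </property>
</configuration>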
Temporarily stop the firewall:
systemctl stop firewalld
Make a directory for the NameNode:
mkdir /namenode
Format the NameNode directory (this initializes the HDFS metadata; it does not format the disk):
hadoop namenode -format
Start the NameNode:
hadoop-daemon.sh start namenode
The NameNode has started successfully.
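One quick way to verify is jps, which lists the running Java processes; a NameNode entry should appear in its output:

jps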
Set up the Data Node:
Configure the /etc/hadoop/hdfs-site.xml file:
Configure the /etc/hadoop/core-site.xml file:
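Again, a minimal sketch of the two files (the master IP is a placeholder, and the port must match the one configured on the Master Node). For hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/datanode</value>
  </property>
</configuration>

And for core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://<master-ip>:9001</value>
  </property>
</configuration>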
Make a directory for the Data Node:
mkdir /datanode
Start the Data Node service:
hadoop-daemon.sh start datanode
The Data Node has started successfully.
Check the Hadoop cluster report:
hadoop dfsadmin -report
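Note: on newer Hadoop versions this command has moved to hdfs dfsadmin -report; the older hadoop dfsadmin form still works but prints a deprecation warning.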
If you look at the report, the Data Node's entire hard disk has been contributed to the Hadoop cluster!!
Integrating LVM with the Data Node Storage Directory:
Now, I've attached two new hard disks, 5 GiB and 4 GiB in size. Run the command below to view them:
fdisk -l
In my VM, the hard disks show up as "nvme0n2" and "nvme0n3".
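You can also get a compact tree view of all block devices with:

lsblk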
The first step is to create physical volumes from the hard disks:
pvcreate /dev/nvme0n2
pvcreate /dev/nvme0n3
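Both physical volumes can also be created in a single command:

pvcreate /dev/nvme0n2 /dev/nvme0n3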
Check whether the physical volumes were created:
pvdisplay /dev/nvme0n2
pvdisplay /dev/nvme0n3
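For a one-line summary of all physical volumes, pvs works as well:

pvs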
The next step is to create a Volume Group from the two physical volumes:
vgcreate hadoop_VG /dev/nvme0n2 /dev/nvme0n3
Check whether the Volume Group was created:
vgdisplay hadoop_VG
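The VG Size field should show roughly 9 GiB, i.e. the 5 GiB and 4 GiB disks combined, minus a small amount reserved for LVM metadata.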
Next, create a Logical Volume from the Volume Group. Here I'm creating a Logical Volume of 6 GiB:
lvcreate --size 6G --name hadoop_LV hadoop_VG
Check whether the Logical Volume was created:
lvdisplay hadoop_VG/hadoop_LV
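Note that the Logical Volume is exposed as a device node at /dev/hadoop_VG/hadoop_LV (equivalently /dev/mapper/hadoop_VG-hadoop_LV), which is the path used in the commands below.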
Now format the Logical Volume created above with ext4 (a good fit here, since ext4 can be grown online and shrunk offline, which the later sections rely on):
mkfs.ext4 /dev/hadoop_VG/hadoop_LV
Now, create a directory and mount the Logical Volume on it:
mkdir /dir
mount /dev/hadoop_VG/hadoop_LV /dir
Check whether the LV is mounted:
df -h
Now, update the data directory path inside the Data Node's /etc/hadoop/hdfs-site.xml file so it points at /dir:
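Only the data directory value changes; the rest of the file stays the same (again a Hadoop 1.x-style sketch):

<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/dir</value>
  </property>
</configuration>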
Stop and then start the Data Node:
hadoop-daemon.sh stop datanode
hadoop-daemon.sh start datanode
Check the Hadoop report:
hadoop dfsadmin -report
See, the Data Node is now contributing only ~6 GiB of storage to the cluster!!
Increasing the size of the Logical Volume:
Command to increase the size of the Logical Volume by 2 GiB:
lvextend --size +2G /dev/hadoop_VG/hadoop_LV
Next, grow the filesystem so it covers the newly extended 2 GiB (resize2fs does not format anything; for ext4 this works online, while the volume is mounted):
resize2fs /dev/hadoop_VG/hadoop_LV
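If your LVM version supports it, the extend and the filesystem resize can be combined into one step:

lvextend --size +2G --resizefs /dev/hadoop_VG/hadoop_LV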
Check the Hadoop cluster report again; the storage size will have increased.
Let's upload a file to the cluster:
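For example, assuming a local file named file.txt (the one we check again at the end of this article):

hadoop fs -put file.txt /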
You can also view the file in the Hadoop web UI.
Decreasing the size of the Logical Volume:
Stop the Data Node:
hadoop-daemon.sh stop datanode
First, unmount the Logical Volume:
umount /dir
Next, check the filesystem for errors (resize2fs refuses to shrink a filesystem that hasn't been checked; the -f flag forces a check even if the filesystem looks clean):
e2fsck -f /dev/hadoop_VG/hadoop_LV
Now shrink the filesystem to the required size; the target must still be large enough to hold the data already stored, or data will be lost. I'm decreasing it to 3 GiB:
resize2fs /dev/hadoop_VG/hadoop_LV 3G
Now reduce the Logical Volume itself to match, with the lvreduce command:
lvreduce --size 3G /dev/hadoop_VG/hadoop_LV
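As with lvextend, newer LVM versions can run the filesystem check and shrink for you in a single step:

lvreduce --resizefs --size 3G /dev/hadoop_VG/hadoop_LV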
Now, mount the LV again:
mount /dev/hadoop_VG/hadoop_LV /dir
Start the Data Node service:
hadoop-daemon.sh start datanode
Check the Hadoop cluster report:
hadoop dfsadmin -report
So, we have successfully reduced the storage to 3 GiB. Let's check the data inside file.txt, which we created earlier:
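Assuming the file was uploaded to the root of HDFS as above, its contents can be read back with:

hadoop fs -cat /file.txt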
Wow!! The file and the data inside it are safe!!
So, in this way, we can provide elasticity to the Data Node's storage.
Thanks for reading