Integrating LVM with Hadoop and providing Elasticity to DataNode Storage

Gursimar Singh
2 min read · Mar 26, 2021


Logical Volume Management (LVM) makes it possible to combine multiple individual hard drives or disk partitions into a single volume group (VG). That volume group can then be subdivided into logical volumes (LV) or used as a single large volume. Regular file systems, such as ext3 or ext4, can then be created on a logical volume.
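In short, the workflow we will follow below is: create physical volumes, group them into a volume group, carve out a logical volume, format it, and mount it. As a rough sketch (the device names, sizes, and mount point here are just placeholders):

# pvcreate /dev/sdX /dev/sdY
# vgcreate my_vg /dev/sdX /dev/sdY
# lvcreate --size 25G --name my_lv my_vg
# mkfs.ext4 /dev/my_vg/my_lv
# mount /dev/my_vg/my_lv /mnt/point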

Step 1 : Add physical hard disks to the datanode. Here I have added two disks:

/dev/sdb (20GiB) and

/dev/sdc (20GiB)

* To check whether the disks are attached successfully, run the command:

# fdisk -l
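If the disks are attached, the output should include entries roughly like the following (the exact byte and sector counts depend on the disks):

Disk /dev/sdb: 20 GiB, 21474836480 bytes, 41943040 sectors
Disk /dev/sdc: 20 GiB, 21474836480 bytes, 41943040 sectors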

Step 2 : Convert these disks into physical volumes (PV):

# pvcreate /dev/sdb /dev/sdc
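* Optionally, to verify that both physical volumes were created, the pvs summary (or pvdisplay for full details) can be used:

# pvs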

Step 3 : Create Volume Group (VG) with physical volumes

# vgcreate vg_name /dev/sdb /dev/sdc

* To see whether the VG is created or not, use the command:

# vgdisplay vg_name
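In the vgdisplay output, the VG Size field should show roughly the combined capacity of both disks (about 40 GiB here), and the Free PE / Size line shows how much of that is still unallocated, for example:

VG Size               <about 40 GiB>
Free  PE / Size       <free extents> / <about 40 GiB>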

Step 4 : Create a partition, i.e. a logical volume (LV), from the volume group with the size you want to contribute to the namenode. Here we will be contributing 25 GB.

# lvcreate --size 25G --name LV_name vg_name
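* To confirm that the logical volume was created with the requested size, lvdisplay (or the shorter lvs) can be used:

# lvdisplay /dev/vg_name/LV_name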

To use the new partition for storing data, we first have to format it.

Step 5 : Format the partition using the command:

# mkfs.ext4 /dev/vg_name/LV_name

Step 6 : Mount that partition on the datanode directory (/dn) using the command:

# mount /dev/vg_name/LV_name /dn
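* To confirm the mount, df -h can be used; the logical volume should appear mounted on /dn with roughly the 25 GB we allocated:

# df -h /dn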

Step 7 : Start the datanode daemon service and check the volume contribution to namenode.

# hadoop-daemon.sh start datanode
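The jps utility (shipped with the JDK) is a quick way to confirm that the DataNode process is actually running before checking the report:

# jps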

On the fly, we can increase or decrease the storage contributed to the namenode without unmounting the volume or stopping any services.

We can only increase the size up to the space currently available in the volume group (here 40 GB), so check the available space first.
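One way to check the remaining space is vgdisplay; the Free PE / Size line shows how much of the volume group can still be allocated:

# vgdisplay vg_name | grep Free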

Step 8 : To extend the volume contribution, use the command:

# lvextend --size +7G /dev/vg_name/LV_name
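lvextend should report that the size of the logical volume changed from 25 GiB to 32 GiB; we can double-check with lvs:

# lvs vg_name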

Step 9 : Resize the filesystem to cover the extended part using the command:

# resize2fs /dev/vg_name/LV_name
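After resizing, df -h should show that the filesystem mounted on /dn has grown to roughly 32 GB, without the mount ever being taken offline:

# df -h /dn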

Step 10 : Now check the size of the volume contributed by the datanode to the namenode again.

# hadoop dfsadmin -report
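In the report, the Configured Capacity for this datanode should now reflect the extended size; the relevant part of the output looks roughly like this (addresses and exact byte counts will differ):

Name: <datanode address>
Configured Capacity: <about 32 GB>
DFS Remaining: <remaining space>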

We can clearly see that on the fly we have increased the size of storage from 25 GB to 32 GB.
