Managing computer storage has been a long-standing challenge for system administrators, especially in setups involving multiple disks or accommodating increasing needs for disk space. Solving these issues with physical disks alone will often be too inflexible for the specific demands of a particular system.
What is LVM?
LVM stands for Logical Volume Management and is a technology to ease the management of multiple physical storage devices by turning them into virtual storage pools. To achieve this, LVM abstracts storage devices on three levels:
- Physical Volumes (PV): Represents a physical storage device, like an entire disk or a partition
- Volume Groups (VG): Combines multiple PVs into a single pool of storage
- Logical Volumes (LV): A logical partition from a VG that is used by the filesystem
These abstractions allow any set of disks to be turned into usable storage pools, which can then be allocated dynamically into logical volumes. Using this approach, a logical volume can span multiple disks without needing any further configuration.
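As a quick preview of the workflow detailed in the rest of this article, turning two disks into a single mountable volume might look like this (a sketch; the device names and sizes are examples, and with two 20 GiB disks the 25 GiB LV necessarily spans both):
sudo pvcreate /dev/sdb /dev/sdc
sudo vgcreate mypool /dev/sdb /dev/sdc
sudo lvcreate --name data --size 25G mypool
sudo mkfs.ext4 /dev/mypool/data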
Physical volumes (PV)
As noted earlier, a PV is created from a physical block storage device. This device can be almost anything, from a physical disk to a partition or even network-attached storage. A PV is created with pvcreate:
sudo pvcreate /dev/sdb
The disk /dev/sdb is now ready to be used by a VG. To view it, you can use pvs for a quick overview:
sudo pvs
which prints a list of all created PVs:
PV VG Fmt Attr PSize PFree
/dev/sdb lvm2 --- 20.00g 20.00g
For more detailed information on a PV, use pvdisplay:
sudo pvdisplay /dev/sdb
The output contains more useful information about the storage device:
"/dev/sdb" is a new physical volume of "20.00 GiB"
--- NEW Physical volume ---
PV Name /dev/sdb
VG Name
PV Size 20.00 GiB
Allocatable NO
PE Size 0
Total PE 0
Free PE 0
Allocated PE 0
PV UUID 95yrzQ-FW8f-G7TE-E5Fk-tOoV-ItCH-7NYhOS
If a PV is not needed anymore, remove it with pvremove:
sudo pvremove /dev/sdb
If successful, the output will confirm the operation:
Labels on physical volume "/dev/sdb" successfully wiped.
Volume Groups (VG)
VGs represent a virtual storage pool by combining one or more PVs:
sudo vgcreate mypool /dev/sdb
The first argument mypool is the name of the newly created VG. The example provided /dev/sdb as the only PV, but you could also create a VG from multiple PVs at once:
sudo vgcreate mypool /dev/sdb /dev/sdc /dev/sdd
A short list of all available VGs can be viewed with vgs:
sudo vgs
The output provides a brief summary of all VGs on the system:
VG #PV #LV #SN Attr VSize VFree
mypool 1 0 0 wz--n- <20.00g <20.00g
For more information about any single VG, use vgdisplay instead:
sudo vgdisplay mypool
The output contains details about the VG in question:
--- Volume group ---
VG Name mypool
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 1
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 0
Open LV 0
Max PV 0
Cur PV 1
Act PV 1
VG Size <20.00 GiB
PE Size 4.00 MiB
Total PE 5119
Alloc PE / Size 0 / 0
Free PE / Size 5119 / <20.00 GiB
VG UUID ePY7PB-mOfL-tlw6-yGJa-n4Rm-JTE4-0cNsTD
If a VG is not needed anymore, it can also be removed:
sudo vgremove mypool
Note that removing a VG does not change any of the PVs assigned to it. They must be manually removed or assigned to a different VG.
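For example, after removing mypool, its former PV could either be wiped with pvremove as shown earlier, or be added to another pool (a sketch; otherpool is a hypothetical second VG):
sudo vgextend otherpool /dev/sdb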
Adding and removing disks from volume groups
A volume group remains dynamic after creation, allowing PVs to be added or removed without downtime - as long as enough space remains to store all LVs.
Adding a disk is done through vgextend:
sudo vgextend mypool /dev/sdc
The PV is now available to the mypool VG. Removing it is done just as easily:
sudo vgreduce mypool /dev/sdc
Note that all data from LVs stored on the PV will be moved to a different PV in the pool. This operation may cause significant write overhead, and frequent resizing of VGs or LVs can lead to fragmentation of the VG. When decommissioning a disk from a VG, for example because SMART data indicates imminent failure or because it has reached end of life, you may want to move its contents to a different PV manually using pvmove:
sudo pvmove /dev/sdc /dev/sdb -i 1
The -i flag instructs the command to report progress in 1-second intervals:
/dev/sdc: Moved: 0.78%
/dev/sdc: Moved: 11.99%
/dev/sdc: Moved: 24.69%
/dev/sdc: Moved: 37.23%
/dev/sdc: Moved: 49.34%
/dev/sdc: Moved: 61.72%
/dev/sdc: Moved: 73.91%
/dev/sdc: Moved: 85.90%
/dev/sdc: Moved: 97.19%
/dev/sdc: Moved: 100.00%
Note that both PVs need to be part of the same VG for pvmove to work. After moving the PV contents, the old one can be safely removed from the VG.
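Putting these steps together, decommissioning /dev/sdc might look like this (a sketch; when no destination PV is given, pvmove distributes the extents across the remaining PVs with free space):
sudo pvmove /dev/sdc
sudo vgreduce mypool /dev/sdc
sudo pvremove /dev/sdc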
Logical Volumes (LV)
A logical volume represents a virtual disk backed by the storage of a VG. An LV can span the entire VG, being physically stored across multiple PVs without any additional configuration. It can be created from a VG with lvcreate:
sudo lvcreate --name data --size 10G mypool
The command creates an LV named data in the mypool VG, with a size of 10 GiB.
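Instead of a fixed size, lvcreate can also allocate space in extents, for example all remaining free space of the VG (a sketch; the LV name bulk is just an example):
sudo lvcreate --name bulk --extents 100%FREE mypool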
A list of all existing LVs can be obtained with lvs:
sudo lvs
More details about any one LV are available through lvdisplay:
sudo lvdisplay mypool/data
LVs are typically referred to by their VG name followed by the LV name, as in mypool/data.
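The device node created for the LV works just as well; both of the following commands refer to the same volume (a brief illustration):
sudo lvdisplay mypool/data
sudo lvdisplay /dev/mypool/data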
An LV can be used just like any other disk, for example by creating an ext4 filesystem and mounting it to a directory:
sudo mkfs.ext4 /dev/mypool/data
sudo mount /dev/mypool/data /mnt
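To mount the LV automatically at boot, an entry can be added to /etc/fstab (a sketch reusing the mount point from above):
/dev/mypool/data /mnt ext4 defaults 0 2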
When an LV is no longer needed, you can remove it:
sudo lvremove mypool/data
Running lvremove will typically prompt you to confirm the loss of data stored on the LV before removing it:
Do you really want to remove active logical volume mypool/data? [y/n]: y
Logical volume "data" successfully removed
Logical volume types
While simple LVs are already quite useful, different types of LVs can accommodate more complex needs as well. Providing a --type argument to the lvcreate command changes how the LV handles its data:
sudo lvcreate --type mirror --size 1G --name data2 mypool
The most common types are:
- linear: The default when --type is missing. Simply writes data linearly across all PVs in the VG
- mirror: Writes data to two PVs, keeping one as backup in case of PV failure
- striped: Writes data across PVs in stripes, ensuring that read/write operations are distributed among multiple PVs to combine their I/O speeds (see the example after this list)
- raid5, raid6, raid10: Common RAID levels. May need minimum numbers of PVs in the VG
- cache, writecache: Turns the LV into a read or write cache for other LVs in the VG. Usually done to speed up LVs relying on slow disks (HDDs) by adding one or more SSDs as cache
- vdo: Supports virtual data optimizer to provide inline compression and deduplication features to the LV
- snapshot: A copy-on-write snapshot storing changes made to the origin LV
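For example, a striped LV spreading its data across three PVs might be created like this (a sketch, assuming mypool contains at least three PVs):
sudo lvcreate --type striped --stripes 3 --size 3G --name fast mypool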
Using redundant types (mirror, raid5/6) comes with additional restrictions: they all need evenly sized underlying PVs, and may not use all available space on larger PVs when mixed with smaller ones, reducing the available storage space. Additionally, raid5 needs at least 3 PVs and raid6 at least 4. Using partitions instead of disks for the PVs backing RAID volumes may undo their data protection capabilities: if a RAID5 LV is created from 3 PVs that are partitions on the same physical disk, then a failure of that disk will destroy the entirety of the RAID5 data, rendering the RAID setup useless.
More advanced LV types may use multiple LVs internally. These internal LVs are hidden from lvs by default, but can be shown with the -a flag:
sudo lvs -a
The output will mark internal LVs by enclosing their names in square brackets:
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
data2 mypool mwi-a-m--- 1.00g [data2_mlog] 84.38
[data2_mimage_0] mypool Iwi-aom--- 1.00g
[data2_mimage_1] mypool Iwi-aom--- 1.00g
[data2_mlog] mypool lwi-aom--- 4.00m
The example above shows data2 as the actual usable LV, then provides a list of the internal LVs needed to support it. Since data2 is a mirrored LV, it consists of two mirror images ("mimage") and a mirror log ("mlog") to keep track of syncing progress between the mirrors.
Recovering from a failed disk
For LVM types that provide redundancy (like mirror or raid5/6), a logical volume can be repaired through LVM tools. Let's assume a server has a volume group mypool, consisting of 3 physical volumes. The mypool volume group is used to create a logical volume called data that uses RAID 5 for redundancy:
sudo pvcreate /dev/sd{b,c,d}
sudo vgcreate mypool /dev/sd{b,c,d}
sudo lvcreate --type raid5 --name data --size 30g mypool
If a disk in this volume group fails, the RAID5 algorithm will keep the data available, but the disk definitely needs to be replaced. Disk failure can take many forms; for our example, we simulate a failure by destroying the data on one of the disks:
sudo dd if=/dev/zero of=/dev/sdd bs=4M count=1000
LVM will immediately pick up on this error when running the next command. Check the status of the logical volume:
sudo lvdisplay -m mypool/data
The output should start with several error lines indicating that the disk is damaged:
WARNING: Couldn't find device with uuid 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633.
WARNING: VG mypool is missing PV 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633 (last written to /dev/sdd).
WARNING: Couldn't find all devices for LV mypool/data_rimage_2 while checking used and assumed devices.
WARNING: Couldn't find all devices for LV mypool/data_rmeta_2 while checking used and assumed devices.
Look for the line starting with "LV Status":
LV Status available (partial)
If the status is "NOT available", you will have to first enable the LV before repairing it:
sudo lvchange -ay mypool/data
This may not be necessary, depending on how the disk failed. Our sample should not need this step, but a physical disk failure typically will.
Now that the logical volume is available again, we can start repairs by adding a new disk /dev/sde to the mypool volume group:
sudo pvcreate /dev/sde
sudo vgextend mypool /dev/sde
If you don't want to replace the failed disk and have enough disk space, you may be able to skip this step.
Once enough disks and space are available to the mypool volume group, the mypool/data logical volume can be repaired:
sudo lvconvert --repair mypool/data
The errors in the output are expected; the command worked fine as long as the last message indicates success:
WARNING: Couldn't find device with uuid 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633.
WARNING: VG mypool is missing PV 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633 (last written to [unknown]).
WARNING: Couldn't find device with uuid 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633.
Attempt to replace failed RAID images (requires full device resync)? [y/n]: y
WARNING: Couldn't find device with uuid 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633.
WARNING: Couldn't find device with uuid 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633.
Faulty devices in mypool/data successfully replaced.
Looking at the logical volume again:
sudo lvdisplay -m mypool/data
the status should be "available" again:
LV Status available
Finally, remove the failed device from the volume group:
sudo vgreduce --removemissing mypool
The last line of the output should read
Wrote out consistent volume group mypool.
You may still see errors when running LVM commands after all the recovery steps. This is likely caused by cached data from lvmdevices. Check all cached LVM device data:
sudo lvmdevices
The output should list all devices used by LVM:
Device /dev/sdb IDTYPE=sys_wwid IDNAME=t10.ATA_VBOX_HARDDISK_VB6c60d6e4-f66d3928 DEVNAME=/dev/sdb PVID=wK5s6SO9DB4UfAYzMrDkd2cs8uEnxgQ6
Device /dev/sdc IDTYPE=sys_wwid IDNAME=t10.ATA_VBOX_HARDDISK_VB602df9bf-7b938418 DEVNAME=/dev/sdc PVID=W63QkONHeYPlc5gCHVnar4fqzyJ3wGaH
Device /dev/sdd IDTYPE=sys_wwid IDNAME=t10.ATA_VBOX_HARDDISK_VB1b058113-2773e437 DEVNAME=/dev/sdd PVID=none
Device /dev/sde IDTYPE=sys_wwid IDNAME=t10.ATA_VBOX_HARDDISK_VB3ccacebc-3c8c610c DEVNAME=/dev/sde PVID=mjvAs52GbIL0TX73l9njpfZa4YDdjGdw
Compare the devices in the output to find the faulty disk; it will typically contain PVID=none at the end of the line. The sample above clearly shows /dev/sdd as the problematic device, so we remove it from the cache:
sudo lvmdevices --deldev /dev/sdd
The errors should now be gone from all LVM commands.
Snapshots
Storage managed through LVM offers copy-on-write snapshots to support consistent backups or point-in-time recovery. A snapshot is treated like an LV, but will only store the changes made to the origin LV since its creation, so it can be significantly smaller than the LV it is created from. The origin LV is passed as the last argument:
sudo lvcreate --type snapshot --name snap --size 100m mypool/data
Snapshots behave like any other LV when listing:
sudo lvs
The output includes the new mypool/snap
snapshot LV:
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
data mypool owi-a-s--- 10.00g
snap mypool swi-a-s--- 100.00m data 0.00
Note that the snapshot includes a value for the Data% column. This value indicates how much of the snapshot's space is used (for storing changes to origin data). Once the space usage reaches 100%, the snapshot becomes unusable.
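To reduce the risk of the snapshot filling up, it can be grown like any other LV while it exists (a sketch):
sudo lvextend --size +100m mypool/snap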
The snapshot can be accessed just like the origin LV, allowing backup programs a consistent view of the data as of the time of the snapshot, without running into issues with partial data changes during the backup process.
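A minimal backup workflow based on the snapshot might look like this (a sketch; the mount point and archive path are examples):
sudo mkdir -p /mnt/snap
sudo mount -o ro /dev/mypool/snap /mnt/snap
sudo tar -czf /backup/data-snapshot.tar.gz -C /mnt/snap .
sudo umount /mnt/snap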
An LV can also be reverted back to the state of a snapshot:
sudo lvconvert --merge /dev/mypool/snap
The origin LV's contents will be reverted to the state they were in at the time mypool/snap was created. Using this point-in-time recovery feature can be useful during system upgrades or when making large changes to the system configuration, adding the ability to undo the operations in case of errors.