A primer on LVM

Managing computer storage has been a long-standing challenge for system administrators, especially in setups that involve multiple disks or have to accommodate growing demands for disk space. Solving these issues with physical disks alone is often too inflexible for the specific demands of such environments.

What is LVM?

LVM stands for Logical Volume Management and is a technology to ease the management of multiple physical storage devices by turning them into virtual storage pools. To achieve this, LVM abstracts storage devices on three levels:


  • Physical Volumes (PV): Represents a physical storage device, like an entire disk or a partition

  • Volume Groups (VG): Combines multiple PVs into a single pool of storage

  • Logical Volumes (LV): A logical partition carved out of a VG, on which a filesystem is created

These abstractions allow disks to be combined into usable storage pools, which can then be allocated dynamically to logical volumes. Using this approach, a logical volume can span multiple disks without any further configuration.
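
As a quick preview of the workflow covered in the sections below, turning two disks into a single pool with one logical volume takes only three commands (device names and sizes chosen for illustration):

sudo pvcreate /dev/sdb /dev/sdc               # register both disks as PVs
sudo vgcreate mypool /dev/sdb /dev/sdc        # combine them into one VG
sudo lvcreate --name data --size 25G mypool   # carve out a 25 GiB LV that spans both disks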

Physical volumes (PV)

As noted earlier, a PV is created from a physical block storage device. This device can be almost anything, from a physical disk to a partition or even network attached storage. A PV is created with pvcreate:

sudo pvcreate /dev/sdb
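
Partitions work the same way; for example, if a partition /dev/sdb1 existed, it could be registered with the same command:

sudo pvcreate /dev/sdb1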

The disk /dev/sdb is now ready to be used by a VG. To view it, you can use pvs for a quick overview:

sudo pvs

which prints a list of all created PVs:

PV        VG Fmt  Attr PSize  PFree 
/dev/sdb     lvm2 ---  20.00g 20.00g

For more detailed information on a PV, use pvdisplay:

sudo pvdisplay /dev/sdb

The output contains more useful information about the storage device:

"/dev/sdb" is a new physical volume of "20.00 GiB"
--- NEW Physical volume ---
PV Name              /dev/sdb
VG Name              
PV Size              20.00 GiB
Allocatable          NO
PE Size              0  
Total PE             0
Free PE              0
Allocated PE         0
PV UUID              95yrzQ-FW8f-G7TE-E5Fk-tOoV-ItCH-7NYhOS

If a PV is not needed anymore, remove it with pvremove:

sudo pvremove /dev/sdb

If successful, the output will confirm the operation:

Labels on physical volume "/dev/sdb" successfully wiped.

Volume Groups (VG)

A VG represents a virtual storage pool built by combining one or more PVs. One is created with vgcreate:

sudo vgcreate mypool /dev/sdb

The first argument mypool is the name of the newly created VG. The example provided /dev/sdb as the only PV, but you could also create a VG from multiple PVs at once:

sudo vgcreate mypool /dev/sdb /dev/sdc /dev/sdd

A short list of all available VGs can be viewed with vgs:

sudo vgs

The output provides a brief summary of all VGs on the system:

VG     #PV #LV #SN Attr   VSize   VFree 
mypool   1   0   0 wz--n- <20.00g <20.00g

For more information about any single VG, use vgdisplay instead:

sudo vgdisplay mypool

The output contains details about the VG in question:

 --- Volume group ---
 VG Name              mypool
 System ID            
 Format               lvm2
 Metadata Areas       1
 Metadata Sequence No 1
 VG Access            read/write
 VG Status            resizable
 MAX LV               0
 Cur LV               0
 Open LV              0
 Max PV               0
 Cur PV               1
 Act PV               1
 VG Size              <20.00 GiB
 PE Size              4.00 MiB
 Total PE             5119
 Alloc PE / Size      0 / 0  
 Free PE / Size       5119 / <20.00 GiB
 VG UUID              ePY7PB-mOfL-tlw6-yGJa-n4Rm-JTE4-0cNsTD

If a VG is not needed anymore, it can also be removed:

sudo vgremove mypool

Note that removing a VG does not change any of the PVs assigned to it. They must be manually removed or assigned to a different VG.
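
For example, a leftover PV can either be wiped or reused in another pool (the second command assumes a hypothetical VG named otherpool already exists):

sudo pvremove /dev/sdb             # wipe the LVM label from the disk
sudo vgextend otherpool /dev/sdb   # or: reuse the PV in a different VG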

Adding and removing disks from volume groups

A volume group remains dynamic after creation, allowing PVs to be added or removed without downtime - as long as enough space remains to store all LVs.

Adding a disk is done through vgextend:

sudo vgextend mypool /dev/sdc

The PV is now available to the mypool VG. Removing it is done just as easily:

sudo vgreduce mypool /dev/sdc

Note that a PV can only be removed from a VG while no LV data resides on it; the data must first be moved to other PVs in the pool. This move can cause significant write overhead, and frequent resizing of VGs or LVs can lead to fragmentation within the VG. When decommissioning a disk from a VG, for example because SMART data indicates imminent failure or because it has reached its end of life, move its contents to a different PV using pvmove:

sudo pvmove /dev/sdc /dev/sdb -i 1

The -i flag instructs the command to report progress in 1-second intervals:

 /dev/sdc: Moved: 0.78%
 /dev/sdc: Moved: 11.99%
 /dev/sdc: Moved: 24.69%
 /dev/sdc: Moved: 37.23%
 /dev/sdc: Moved: 49.34%
 /dev/sdc: Moved: 61.72%
 /dev/sdc: Moved: 73.91%
 /dev/sdc: Moved: 85.90%
 /dev/sdc: Moved: 97.19%
 /dev/sdc: Moved: 100.00%

Note that both PVs need to be part of the same VG for pvmove to work. After moving the PV contents, the old one can be safely removed from the VG.
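
Put together, a sketch of the full decommissioning sequence for /dev/sdc might look like this:

sudo pvmove /dev/sdc            # migrate all data to the remaining PVs in the VG
sudo vgreduce mypool /dev/sdc   # remove the now empty PV from the VG
sudo pvremove /dev/sdc          # wipe the LVM label from the disk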

Logical Volumes (LV)

A logical volume represents a virtual disk backed by the storage of a VG. An LV can span the entire VG, being physically stored across multiple PVs without any additional configuration. It can be created from a VG with lvcreate:

sudo lvcreate --name data --size 10G mypool

The command creates an LV named data in the mypool VG, with a size of 10 GiB.
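
Instead of a fixed --size, the size can also be given in extents; for example, an LV using all remaining free space in the VG could be created like this (shown only as an illustration, not part of the running example; bulk is an arbitrary name):

sudo lvcreate --name bulk --extents 100%FREE mypool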

A list of all existing LVs can be obtained with lvs:

sudo lvs
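
To also see which PVs each LV is physically stored on, the output columns can be extended (optional, not needed for the rest of this primer):

sudo lvs -o +devices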

More details about any one LV are available through lvdisplay:

sudo lvdisplay mypool/data

LVs are typically referred to with their VG name as a prefix, e.g. mypool/data.

An LV can be used just like any other disk, for example by creating an ext4 filesystem and mounting it to a directory:

sudo mkfs.ext4 /dev/mypool/data
sudo mount /dev/mypool/data /mnt
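
To make the mount persistent across reboots, an entry can be added to /etc/fstab (a sketch; adjust the mount point and options to your setup):

/dev/mypool/data  /mnt  ext4  defaults  0  2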

When an LV is no longer needed, you can remove it:

sudo lvremove mypool/data

Running lvremove will typically prompt you to confirm the loss of data stored on the LV before removing it:

Do you really want to remove active logical volume mypool/data? [y/n]: y
Logical volume "data" successfully removed

Logical volume types

While simple LVs are already quite useful, different types of LVs can accommodate more complex needs as well. Providing a --type to the lvcreate command changes how the LV handles its data:

sudo lvcreate --type mirror --size 1G --name data2 mypool

The most common types are:

  • linear: The default when --type is omitted. Simply writes data linearly across all PVs in the VG
  • mirror: Keeps an identical copy of the data on a second PV, so the LV survives the failure of one PV
  • striped: Writes data across PVs in stripes, distributing read/write operations among multiple PVs to combine their I/O speeds (see the example after this list)
  • raid5, raid6, raid10: Common RAID levels. May require a minimum number of PVs in the VG
  • cache, writecache: Turns the LV into a read or write cache for other LVs in the VG. Usually used to speed up LVs backed by slow disks (HDDs) by adding one or more SSDs as cache
  • vdo: Uses the Virtual Data Optimizer to provide inline compression and deduplication for the LV
  • snapshot: A copy-on-write snapshot storing changes made to the origin LV
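
As an example of a non-default type, a striped LV spread across two PVs could be created like this (a sketch, assuming the VG contains at least two PVs with enough free space; fastdata is an arbitrary name):

sudo lvcreate --type striped --stripes 2 --size 10G --name fastdata mypool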

Using redundant types (mirror, raid5/6) comes with additional restrictions: they work best with equally sized underlying PVs and may not use all available space on larger PVs when mixed with smaller ones, reducing the usable storage space. Additionally, raid5 needs at least 3 PVs and raid6 at least 4. Using partitions instead of whole disks for the PVs backing RAID volumes may undo their data protection capabilities: if a RAID5 LV is created from 3 PVs that are partitions on the same physical disk, a failure of that disk will destroy the entirety of the RAID5 data, rendering the RAID setup useless.

More advanced LV types may use multiple LVs internally. These internal LVs are hidden from lvs by default, but can be shown with the -a flag:

sudo lvs -a

The output marks internal LVs by enclosing their names in square brackets:

LV               VG      Attr        LSize  Pool Origin Data% Meta% Move Log          Cpy%Sync Convert
data2            mypool  mwi-a-m---  1.00g                               [data2_mlog] 84.38          
[data2_mimage_0] mypool  Iwi-aom---  1.00g                                                            
[data2_mimage_1] mypool  Iwi-aom---  1.00g                                                            
[data2_mlog]     mypool  lwi-aom---  4.00m                                                        

The example above shows data2 as the actual usable LV, then provides a list of the internal LVs needed to support it. Since data2 is a mirrored LV, it consists of two mirror images ("mimage") and a mirror log ("mlog") to keep track of syncing progress between the mirrors.

Recovering from a failed disk

For LVM types that provide redundancy (like mirror or raid5/6), a logical volume can be repaired using LVM tools. Let's assume a server has a volume group mypool consisting of 3 physical volumes. The mypool volume group is used to create a logical volume called data that uses RAID 5 for redundancy:

sudo pvcreate /dev/sd{b,c,d}
sudo vgcreate mypool /dev/sd{b,c,d}
sudo lvcreate --type raid5 --name data --size 30g mypool

If a disk in this volume group fails, the RAID5 algorithm will keep the data available, but the disk still needs to be replaced. Disk failure can take many forms; for our example, we simulate a failure by destroying data on one of the disks:

sudo dd if=/dev/zero of=/dev/sdd bs=4M count=1000

LVM will immediately pick up on this error when the next command is run. Check the status of the logical volume:

sudo lvdisplay -m mypool/data

The output should start with several error lines indicating that the disk is damaged:

 WARNING: Couldn't find device with uuid 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633.
 WARNING: VG mypool is missing PV 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633 (last written to /dev/sdd).
 WARNING: Couldn't find all devices for LV mypool/data_rimage_2 while checking used and assumed devices.
 WARNING: Couldn't find all devices for LV mypool/data_rmeta_2 while checking used and assumed devices.

Look for the line starting with "LV Status":

LV Status             available (partial)

If the status is "NOT available", you will have to activate the LV before repairing it:

sudo lvchange -ay mypool/data

This may not be necessary, depending on how the disk failed. Our simulated failure should not need this step, but a physical disk failure typically will.

Now that the logical volume is available again, we can start repairs by adding a new disk /dev/sde to the mypool volume group:

sudo pvcreate /dev/sde
sudo vgextend mypool /dev/sde

If you don't want to replace the failed disk and have enough disk space, you may be able to skip this step.

Once enough disks and space are available to the mypool volume group, the mypool/data logical volume can be repaired:

sudo lvconvert --repair mypool/data

The warnings in the output are expected; the command worked as long as the last message indicates success:

 WARNING: Couldn't find device with uuid 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633.
 WARNING: VG mypool is missing PV 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633 (last written to [unknown]).
 WARNING: Couldn't find device with uuid 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633.
Attempt to replace failed RAID images (requires full device resync)? [y/n]: y
 WARNING: Couldn't find device with uuid 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633.
 WARNING: Couldn't find device with uuid 0fEfvu-QJ7m-oarc-Kma9-Qnoj-6lDT-d8x633.
 Faulty devices in mypool/data successfully replaced.

Looking at the logical volume again:

sudo lvdisplay -m mypool/data

the status should be "available" again:

LV Status             available

Finally, remove the failed device from the volume group:

sudo vgreduce --removemissing mypool

The last line of the output should read:

 Wrote out consistent volume group mypool.

You may still see warnings when running LVM commands after all the recovery steps. This is likely caused by a stale entry for the failed disk in the LVM devices file. List all devices known to LVM:

sudo lvmdevices

The output should list all devices used by LVM:

Device /dev/sdb IDTYPE=sys_wwid IDNAME=t10.ATA_VBOX_HARDDISK_VB6c60d6e4-f66d3928 DEVNAME=/dev/sdb PVID=wK5s6SO9DB4UfAYzMrDkd2cs8uEnxgQ6
Device /dev/sdc IDTYPE=sys_wwid IDNAME=t10.ATA_VBOX_HARDDISK_VB602df9bf-7b938418 DEVNAME=/dev/sdc PVID=W63QkONHeYPlc5gCHVnar4fqzyJ3wGaH
Device /dev/sdd IDTYPE=sys_wwid IDNAME=t10.ATA_VBOX_HARDDISK_VB1b058113-2773e437 DEVNAME=/dev/sdd PVID=none
Device /dev/sde IDTYPE=sys_wwid IDNAME=t10.ATA_VBOX_HARDDISK_VB3ccacebc-3c8c610c DEVNAME=/dev/sde PVID=mjvAs52GbIL0TX73l9njpfZa4YDdjGdw

Compare the devices in the output to find the faulty disk; its entry will typically contain PVID=none at the end of the line. The sample above clearly shows /dev/sdd as the problematic device, so we remove it from the devices file:

sudo lvmdevices --deldev /dev/sdd

The errors should now be gone from all LVM commands.

Snapshots

Storage managed through LVM offers copy-on-write snapshots to support consistent backups or point-in-time recovery. A snapshot is treated like an LV, but only stores the changes made to the origin LV since its creation, so it can be significantly smaller than the LV it is created from. A snapshot of the mypool/data LV is created with lvcreate:

sudo lvcreate --type snapshot --name snap --size 100m mypool/data

Snapshots behave like any other LV when listing:

sudo lvs

The output includes the new mypool/snap snapshot LV:

LV    VG      Attr       LSize   Pool Origin Data% Meta% Move Log         Cpy%Sync Convert
data  mypool  owi-a-s---  10.00g                                                           
snap  mypool  swi-a-s--- 100.00m      data   0.00                                           

Note that the snapshot shows a value in the Data% column. This value indicates how much of the snapshot's space is used (for storing changes to the origin data). Once the usage reaches 100%, the snapshot becomes unusable.
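
If a snapshot gets close to that limit, it can be grown like any other LV, assuming the VG still has free space:

sudo lvextend --size +100M mypool/snap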

The snapshot can be accessed just like the origin LV, giving backup programs a consistent view of the data as of the time of the snapshot, without running into issues from data changing during the backup process.
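
For example, the snapshot could be mounted read-only at a temporary location (the path is just an example) and handed to a backup tool:

sudo mkdir -p /mnt/snap
sudo mount -o ro /dev/mypool/snap /mnt/snap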

An LV can also be reverted back to the state of a snapshot:

sudo lvconvert --merge /dev/mypool/snap

The origin LV's contents will be reverted to the state they were in at the time mypool/snap was created. This point-in-time recovery feature can be useful during system upgrades or when making large changes to the system configuration, providing the ability to undo the operations in case of errors.
