Programster's Blog

Tutorials focusing on Linux, programming, and open source

LVM Snapshots

Snapshots and COW

A snapshot is a point-in-time "image" of your filesystem or partition. LVM, BTRFS, and ZFS all use a copy-on-write (COW) concept for creating snapshots, so creating one is instant. Instant snapshotting lets you create backups without having to stop your server or its processes in order to ensure data integrity and consistency.

Although ZFS, BTRFS, and LVM all use copy-on-write, each has its own implementation. BTRFS and ZFS are filesystems that also manage your volumes, whereas LVM only works at the volume level, on top of which you put the filesystem of your choice. Thus they are quite different. With LVM, you cannot rely on the snapshot being the backup because it may eventually disappear (more on that later). It just allows you to take a consistent backup easily.

LVM Snapshot Concept

LVM is able to take a snapshot instantly because whenever an original block is about to be changed for the first time, it is copied over to the snapshot partition before the change is applied. Subsequent changes to the same block do not cause this additional write. Thus, the snapshot partition starts out requiring no space, but will grow as more changes are written to the origin. Unfortunately, you need to specify the size of the snapshot partition at the point of taking the snapshot, although you can always extend it later. If your snapshot partition runs out of space as more of the original blocks are copied across, the snapshot is "dropped" as it becomes unusable. Thus, you want to specify a size that will accommodate the changes that are likely to be made during the time you need the snapshot. Use the snapshot to perform a backup using rsync/duplicity and then remove it.
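To put a number on "a size that will accommodate the changes", here is a rough sizing sketch. The helper estimate_snapshot_mb is my own hypothetical function, not part of LVM, and the write rate, lifetime, and safety factor are guesses you would have to supply for your own system.

```shell
# Rough sizing helper (hypothetical, not an LVM tool).
# $1 = expected write rate to the origin, in MB per minute
# $2 = how long the snapshot needs to live, in minutes
# $3 = safety factor to leave some headroom
estimate_snapshot_mb() {
    rate_mb_per_min=$1
    lifetime_min=$2
    safety_factor=$3
    echo $(( rate_mb_per_min * lifetime_min * safety_factor ))
}

# Example: 50 MB/min of writes, a 30 minute backup, doubled for headroom.
estimate_snapshot_mb 50 30 2   # prints 3000
```

This deliberately overestimates, since repeated writes to the same block only cost space once.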

Filesystem Utilization Confusion

LVM snapshots can cause confusion because utilization at the volume level is not the same as the utilization of the filesystem. Tools like df and pydf report the filesystem utilization, not the volume utilization. The snapshot partition will grow in usage and may run out of space, but the filesystem it presents stays static.
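Since df will not tell you how full the snapshot is at the volume level, you have to ask LVM itself. The sketch below assumes you read the percentage from lvs (recent LVM releases call the field data_percent, older ones may call it snap_percent) and only shows the threshold check. The function name snapshot_needs_extending and the 80% threshold are my own choices for illustration.

```shell
# Decide whether a snapshot is getting dangerously full.
# $1 = percentage used, as reported by lvs (e.g. "83.27")
# $2 = threshold percentage (e.g. 80)
snapshot_needs_extending() {
    # awk handles the decimal comparison; exit status 0 means "yes, extend it"
    awk -v used="$1" -v limit="$2" 'BEGIN { exit !(used + 0 >= limit + 0) }'
}

# On a real system the percentage would come from something like:
#   sudo lvs --noheadings -o data_percent /dev/[VG_NAME]/[SNAPSHOT_NAME]
# and the snapshot could then be grown with:
#   sudo lvextend -L +500m /dev/[VG_NAME]/[SNAPSHOT_NAME]
if snapshot_needs_extending "83.27" 80; then
    echo "snapshot is above the threshold, extend it"
fi
```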

Getting Started With Installation

This section is for users who want to be able to snapshot their root filesystem and are performing a fresh Ubuntu Server 12 installation. If you don't need to do this, skip ahead.

At the partitioning screen, it is a good idea to select "guided partitioning with LVM". You could also use encrypted LVM; it's up to you. I spent over 30 minutes trying to figure out how to manually configure an LVM setup with the root partition on LVM, but gave up. One day I may revisit this. This is the only time I have found CentOS much easier than Ubuntu.

The important part is ensuring that you do NOT use the default maximum allocation size for guided partitioning. In fact, it is a good idea to enter the minimum that you feel is reasonable and then grow your filesystem later as you need it! It is incredibly difficult to shrink a root filesystem (it has to be done "offline"), but very easy to grow it. By not using the entire disk, you leave free space in the volume group for creating other logical volumes later, which will act as your snapshot volume(s).

Do NOT use the default of the maximum possible size as shown on the above screenshot!

Taking a Snapshot!

Now that you have an LVM filesystem set up with some free space on your volume group, you can take a snapshot. This is done by creating a new logical volume of a specified size and name. I will refer to this as the "snapshot volume" even though in my mind, the snapshot volume is the original that is never going to be touched now.

sudo lvcreate -L[SIZE_IN_MB]m -s -n [SNAPSHOT_NAME] /dev/[VG_NAME]/[ORIGINAL_LVM]

Don't skimp on the allocation size. If the original filesystem changes by more than this amount, the snapshot volume will be invalidated and stop working. The -s switch tells LVM that this is a snapshot, and -n sets the name of the snapshot.
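As a concrete illustration of the template above, here is a small wrapper that fills in the placeholders. The names vg0, root, and root_snap are hypothetical, as are the create_snapshot function and its DRY_RUN switch. By default it only prints the command so you can check it before running anything as root.

```shell
# Build (and optionally run) the lvcreate command for a snapshot.
# All names used here (vg0, root, root_snap) are made-up examples.
# Set DRY_RUN=0 to actually execute; by default the command is only printed.
create_snapshot() {
    vg_name=$1
    origin_lv=$2
    snapshot_name=$3
    size_mb=$4
    cmd="lvcreate -L${size_mb}m -s -n ${snapshot_name} /dev/${vg_name}/${origin_lv}"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        # Unquoted on purpose so the string splits into arguments;
        # this assumes the names contain no spaces.
        sudo $cmd
    fi
}

# Running "sudo vgs vg0" first shows whether the volume group has enough free space.
create_snapshot vg0 root root_snap 2048
# with DRY_RUN unset this prints:
# lvcreate -L2048m -s -n root_snap /dev/vg0/root
```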

Mount The Snapshot

If you want to look at the original files, then all you need to do is mount the snapshot just like any other filesystem.

sudo mkdir -p /my/mount/point
sudo mount /dev/[VG_NAME]/[SNAPSHOT_NAME] /my/mount/point

Now if you run a tool such as df or pydf, it will appear that you have two filesystems of almost exactly the same size, which may look physically impossible because together they add up to more than your drive's capacity. This is expected: the snapshot only stores the blocks that have changed on the origin.

It's at this point that I would run my backup program (duplicity/scp/rsync etc) on that mounted volume before deleting the snapshot. This means that you can use incredibly small volumes for your snapshots as you only need enough to hold the changes that will take place whilst the backup is running.
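Putting the steps together, a backup run might look roughly like the sketch below. This is not a hardened script. The names (vg0, root, /backup/root/), the run and backup_via_snapshot helpers, and the DRY_RUN switch are all my own assumptions, and rsync is just one of the backup tools mentioned above. By default the commands are only printed.

```shell
# run() prints each command when DRY_RUN=1 (the default) and executes it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = volume group, $2 = origin logical volume,
# $3 = snapshot size in MB, $4 = backup destination directory
backup_via_snapshot() {
    vg_name=$1; origin_lv=$2; size_mb=$3; dest=$4
    snap="${origin_lv}_snap"
    mnt="/mnt/${snap}"

    run sudo lvcreate -L"${size_mb}m" -s -n "$snap" "/dev/${vg_name}/${origin_lv}" || return 1
    run sudo mkdir -p "$mnt"
    # Mount read-only so the backup cannot modify the snapshot.
    if run sudo mount -o ro "/dev/${vg_name}/${snap}" "$mnt"; then
        run sudo rsync -a "${mnt}/" "$dest"
        status=$?
        run sudo umount "$mnt"
    else
        status=1
    fi
    # Always remove the snapshot so it stops consuming space.
    run sudo lvremove -f "/dev/${vg_name}/${snap}"
    return $status
}

backup_via_snapshot vg0 root 1024 /backup/root/
```

The snapshot is removed even when the copy fails, so a broken backup run does not leave a snapshot behind slowly filling up.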