Posts Tagged ‘openfiler’

Backup KVM Virtual Machines on OCFS2 Filesystems on an OpenFiler SAN Appliance

April 13th, 2010

We’re using a large OCFS2 partition shared across multiple hosts to provide shared storage for our KVM virtual machines.  We chose OCFS2 because it has a relatively easy setup when compared to GFS (especially since we use Ubuntu as the host OS). The main drawback however is you can’t mount a snapshot of an OCFS2 volume on the same host as the machine that has the live filesystem mounted.

For a long time now we’ve been backing up our OpenFiler-based SAN by taking snapshots using the OpenFiler web interface, and then on a separate machine using a combination of dd and ssh to take a block-level image of the SAN and archive it off for possible disaster recovery.

It works quite well but restoring anything from that image is a total pain. I compress the image using gzip too so firstly I have to find enough disk space on a different machine to extract the archive and then a significant amount of hassle to get OCFS2 setup on a standalone PC and to mount the file image. It follows then that recovering one VM from the image can take days to complete – which isn’t very satisfactory.

Another option would be to share the snapshot devices on the SAN as a new target and then mount them remotely on a server outside the cluster. That’s ideal, except that when you modify iSCSI targets or luns on OpenFiler it reloads the ietd daemon which has a nasty habit of causing the two Ubuntu hosts to stop seeing the target and taking the disk offline.

After some work I discovered you can manually tell iet to start sharing a new target or lun from the command line without needing to restart the service.

Here then is the command history I used:

  1. Take a manual snapshot on OpenFiler (I called it “backup”)
  2. From the command line on the OpenFiler appliance, add a new iSCSI target: ietadm –op new –tid=0 –params Name=iqn.2008-05.com.example:backup
  3. Find the target ID for the new target we just created. There’s a list in the /proc/net/iet/volume file. In my case I got tid:3
  4. Add the snapshot to the target. I’m using blockio with my iscsi targets. The default is fileio:  ietadm –op new –tid=3 –lun=0 –params Type=blockio,Path=/dev/vm/of.snapshot.vm1.backup
  5. Check the volume file again. You should see the export complete with the path to the snapshot.
  6. Connect to, mount and backup the snapshot filesystem using a different PC.
  7. Unmount the filesystem and disconnect from the iSCSI target
  8. Remove the lun: ietadm –op delete –tid=3 –lun=0
  9. Remove the target: ietadm –op delete –tid=3
  10. Remove the snapsh

You can now connect to the iSCSI target from a remote machine, mount the OCFS2 filesystem as needed and backup the files from the filesystem. There’s no guarantee that the images will be consistent but my experience is the machines normally come up fine after an fsck. Any lost data can be restored from the nightly backups of the machines that we take using BackupPC.

My next move is to script all this so that the OpenFiler appliance can kick off a full backup on a schedule.

Update

Here’s the finished script:

#!/bin/bash

# Create a snapshot
echo Creating Backup Snapshot
lvcreate -L100000M -s -n backup /dev/vm/vm1

# Export the snapshot
echo Exporting Snapshot over iSCSI
ietadm –op new –tid=3 –params Name=iqn.2008-05.com.example:backup
ietadm –op new –tid=3 –lun=0 –params Type=blockio,Path=/dev/vm/backup

# Sleep a few seconds to make sure iet is sharing the device
sleep 2

# Connect to nas-backup and connect to the share
echo Connecting to iSCSI target
ssh root@backup iscsiadm -m discovery -t sendtargets -p 192.168.0.10
ssh root@backup iscsiadm -m node –targetname iqn.2008-05.com.example:backup –portal 192.168.0.10:3260 –login
sleep 30

# Mount the share
echo Mounting filesystem
ssh root@backup mount /dev/sde1 /mnt/vm

# Backup
echo Backup Starting
read -p “Press a key when your backup is complete!”

# Unmount the share
echo Unmounting filesystem
ssh root@backup umount /mnt/vm
sleep 2

# Disconnect
echo Disconnecting from iSCSI share
ssh root@nas-backup iscsiadm -m node –targetname iqn.2008-05.com.example:backup –portal 192.168.0.10 –logout
sleep 10

# Remove the iscsi lun and target
echo Removing iSCSI share
ietadm –op delete –tid=3 –lun=0
ietadm –op delete –tid=3

# Remove the snapshot
echo Removing Snapshot
lvremove -f /dev/vm/backup

# Finished