We’re running an OpenFiler-based iSCSI SAN at work and have been very pleased with both its speed and reliability. However, I was copying some VM images around the other day and it seemed to me that the disk IO was a little laboured.
A quick Google for lvm2 snapshot performance brought me to an article by Dennis van Dok which prompted me to do a little testing of my own.
Here are the raw results of the tests Dennis proposes, run on our system:
Without Snapshot
root@virgil:/mnt/vm# sync; time sh -c "dd if=/dev/zero of=/mnt/vm/test bs=1M count=1000; sync"
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.79423 s, 584 MB/s
real 0m9.697s
user 0m0.000s
sys 0m2.530s
With Snapshot
root@virgil:/mnt/vm# rm test
root@virgil:/mnt/vm# sync; time sh -c "dd if=/dev/zero of=/mnt/vm/test bs=1M count=1000; sync"
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.67537 s, 626 MB/s
real 0m49.165s
user 0m0.000s
sys 0m2.020s
root@virgil:/mnt/vm#
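Now I didn’t carry that out in optimal test conditions. The filesystem is mounted over iSCSI, and there’s caching and multipathing in the mix too, so this is far from scientific. It’s quite possible that some real traffic from one of the VMs skewed the result somewhat. Note too that the throughput dd reports (584 and 626 MB/s) largely reflects writes landing in cache, because dd prints its figure before the final sync runs; the wall-clock "real" time, which includes the sync, is the number to compare. On that measure, 49.2 s against 9.7 s is nonetheless a 5x slowdown on write with a snapshot compared to without.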
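To put numbers on that, here is a rough sketch that turns the wall-clock times above into effective throughput and a slowdown factor. The `throughput` and `slowdown` helper names are my own invention for illustration, and it assumes a POSIX shell with awk available; the byte count and timings are simply the ones from the runs above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of any tool): names are illustrative only.

# Effective throughput in MB/s (1 MB = 10^6 bytes, the same unit dd uses),
# based on wall-clock time so that the final sync is included.
throughput() {
    # $1 = bytes written, $2 = elapsed wall-clock seconds
    awk -v b="$1" -v t="$2" 'BEGIN { printf "%.1f\n", b / t / 1000000 }'
}

# Slowdown factor between two wall-clock times.
slowdown() {
    # $1 = baseline seconds, $2 = seconds with a snapshot active
    awk -v base="$1" -v snap="$2" 'BEGIN { printf "%.1f\n", snap / base }'
}

throughput 1048576000 9.697     # without snapshot
throughput 1048576000 49.165    # with snapshot
slowdown 9.697 49.165
```

With the figures from my runs this works out to roughly 108 MB/s without a snapshot and 21 MB/s with one, a factor of about 5.1. Those are much less flattering than the 584 and 626 MB/s that dd printed, but they are closer to what actually reached the disks.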
So it’s a tricky dilemma. Snapshots are so convenient, yet they appear to impose a heavy write penalty. We’re going away to rethink how we “do” snapshots.