[aklug] Creepy fast backups using block level snapshots.. a bit like CDP but slower and cheaper.

From: Shane Spencer <shane@bogomip.com>
Date: Thu Aug 02 2012 - 17:14:56 AKDT

I'm putting this out there as a brainstorming brainfart. I've been
playing a bit with accessing the COW device of an LVM snapshot
directly and reading it in order to offload changed blocks to another
host quickly. It removes the need to do file system scans completely
when attempting to back up a drive using a full and incremental backup
plan. Instead of focusing on files it focuses on the block device
underneath the filesystem. It is nowhere near OS or filesystem
agnostic.. my testing has been using LVM and XFS.

So here's my quick log of commands that I'm running right now in
order to do the following:

  make a sparse file on a dedicated "snapshot" disk for consistently
    running snapshots.
  make a loopback out of the file
  add the loopbacked file as an LVM PV
  extend my main VG
  make a snapshot using the entire size of the loopback file as the
    COW backing device
  back up the original block device (costly.. left out for testing reasons)
  back up the current COW layer (fast!)
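The first step leans on sparse files: the image claims its full size up front but only eats disk as the COW layer fills. Here is a minimal Python sketch of the same idea as the dd call in the log below, assuming a filesystem that supports holes. The helper names are made up for illustration.

```python
import os
import tempfile

def make_sparse_file(path, size_bytes):
    """Create a file of size_bytes without writing any data.

    Any existing file at path is replaced.
    """
    with open(path, "wb") as f:
        f.truncate(size_bytes)  # extends the file; nothing is allocated yet

def allocated_bytes(path):
    """Bytes actually allocated on disk (st_blocks counts 512-byte units)."""
    return os.stat(path).st_blocks * 512

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        image = os.path.join(d, "snaptest.image")
        make_sparse_file(image, 1024 * 1000000)  # same size as bs=1024 seek=1000000
        print(os.path.getsize(image), allocated_bytes(image))
```

Comparing the apparent size with the allocated size is the same check that `du` does in the log.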

---
# dd if=/dev/zero of=snaptest.image bs=1024 seek=1000000 count=0
# losetup /dev/loop0 ./snaptest.image
# pvcreate /dev/loop0
  Writing physical volume data to disk "/dev/loop0"
  Physical volume "/dev/loop0" successfully created
# vgextend buckaroobanzai /dev/loop0
  Volume group "buckaroobanzai" successfully extended
# du -h -c -s snaptest.image
8.0K	snaptest.image
8.0K	total
# xfs_freeze -f /
# lvcreate --permission r -n snaptest -l 100%PVS --snapshot \
    buckaroobanzai/root /dev/loop0
  Logical volume "snaptest" created
# xfs_freeze -u /
## Here's the LV info
  --- Logical volume ---
  LV Path                /dev/buckaroobanzai/snaptest
  LV Name                snaptest
  VG Name                buckaroobanzai
  LV UUID                wVuODo-m9zA-C2st-5ZMH-KJGW-XCL4-HEgnym
  LV Write Access        read only
  LV Creation host, time buckaroobanzai, 2012-08-02 16:48:11 -0800
  LV snapshot status     active destination for root
  LV Status              available
  # open                 0
  LV Size                18.62 GiB
  Current LE             4768
  COW-table size         972.00 MiB
  COW-table LE           243
  Allocated to snapshot  0.00%
  Snapshot chunk size    4.00 KiB
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:6
  --- Segments ---
  Logical extent 0 to 242:
    Type		linear
    Physical volume	/dev/loop0
    Physical extents	0 to 242
## Let's copy some data around!
# rsync -avPHSh /boot/ /root/.deletemejusttesting/
  ...
sent 21.32M bytes  received 4.36K bytes  947.81K bytes/sec
total size is 21.31M  speedup is 1.00
# du -h -s snaptest.image
20M	snaptest.image
## Here's the tricky part where I force XFS to freeze a bit.
# xfs_freeze -f /
## Just testing
# md5sum snaptest.image
615b0d57989004074b06a6c4b03af37b  snaptest.image
# ddrescue -S snaptest.image snaptest.image.backup
# xfs_freeze -u /
# md5sum snaptest.image.backup
615b0d57989004074b06a6c4b03af37b  snaptest.image.backup
---
Tada!  In a real run you would also have backed up
/dev/buckaroobanzai/snaptest itself, using ddrescue + some
compression, as well as the physical volume backing it. You would have
a before and an after, where the before is gigantic and the after is
incremental and ultimately restorable.. plus it's 100% ready to ship
off to a storage server elsewhere.
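For anyone without ddrescue handy, the sparse copy plus checksum check from the log can be roughed out in Python. This is only a sketch of the idea behind `ddrescue -S` (skip writing runs of zeros), not a replacement for it. The function names and the 1 MiB chunk size are arbitrary choices of mine.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # copy granularity; arbitrary for this sketch

def sparse_copy(src, dst, chunk=CHUNK):
    """Copy src to dst, seeking over all-zero chunks instead of writing them."""
    zeros = bytes(chunk)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk)
            if not buf:
                break
            if buf == zeros[:len(buf)]:
                fout.seek(len(buf), os.SEEK_CUR)  # leave a hole
            else:
                fout.write(buf)
        # If the file ends in a hole, the seek alone does not grow it.
        fout.truncate(fout.tell())

def md5_of(path, chunk=CHUNK):
    """Hex MD5 of a file, read in chunks so large images fit in memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for buf in iter(lambda: f.read(chunk), b""):
            h.update(buf)
    return h.hexdigest()
```

Calling `sparse_copy("snaptest.image", "snaptest.image.backup")` and comparing `md5_of()` on both files mirrors the md5sum check in the log.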

The cons..

  Tools for this suck and mostly don't exist.
  It's pretty slow copying old blocks over to the backing device as
    blocks are overwritten.. but not horribly slow.
  Restoring is HAAARRDDDDD

The pros..

  It's REALLY FAST TO BACK UP INCREMENTALLY!

The fuuutuuure..

  Block level deduplication and compression since the COW format is
    incredibly easy to read and recreate.
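To back up the "easy to read" claim, here is a rough Python sketch of a reader for the COW device. It assumes the persistent exception store layout as I understand it from the kernel's dm-snap-persistent.c: a 16-byte little-endian header in chunk 0, then metadata chunks full of 16-byte (old_chunk, new_chunk) records, each metadata chunk followed by the data chunks it describes. Treat the layout as an assumption and check it against your kernel before trusting it. The function name is mine.

```python
import struct

SNAP_MAGIC = 0x70416E53  # "SnAp" read as a little-endian u32
SECTOR = 512

def read_cow_exceptions(f):
    """Yield (old_chunk, new_chunk) pairs from a seekable COW image.

    old_chunk is the chunk number on the origin that was overwritten;
    new_chunk is where the COW device keeps the origin's old contents.
    """
    f.seek(0)
    magic, valid, version, chunk_sectors = struct.unpack("<IIII", f.read(16))
    if magic != SNAP_MAGIC:
        raise ValueError("not a persistent snapshot store")
    if not valid or chunk_sectors == 0:
        raise ValueError("snapshot is invalid or has no chunk size")
    chunk_bytes = chunk_sectors * SECTOR
    per_area = chunk_bytes // 16  # 16-byte records per metadata chunk
    area = 0
    while True:
        # Chunk 0 is the header. Each metadata chunk is followed by
        # per_area data chunks, then the next metadata chunk.
        f.seek((1 + area * (per_area + 1)) * chunk_bytes)
        table = f.read(chunk_bytes)
        if len(table) < chunk_bytes:
            return
        for old, new in struct.iter_unpack("<QQ", table):
            if new == 0:  # an empty slot ends the list
                return
            yield old, new
        area += 1
```

The list of `old_chunk` values is effectively a changed-block map for the origin since the snapshot was taken, which is the part a dedup or compression tool would care about.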

So to get something like this into a position where it could be
considered awesome.. during the XFS freeze for each incremental you
would create a new snapshot, then remove the old snapshot once it has
been offloaded.
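That rotation could be scripted along these lines. This is a dry-run sketch only: it builds the command list using the same tools and flags as the log above, and assumes a second PV (a hypothetical /dev/loop1) is available for the new snapshot, since the old COW is still occupied until it has been offloaded. The function and parameter names are mine.

```python
import subprocess

def rotation_plan(vg, origin, old_snap, new_snap, new_pv,
                  old_image, backup_path, mountpoint="/"):
    """Return the commands for one snapshot rotation, in order."""
    return [
        ["xfs_freeze", "-f", mountpoint],
        ["lvcreate", "--permission", "r", "-n", new_snap, "-l", "100%PVS",
         "--snapshot", f"{vg}/{origin}", new_pv],
        # backup_path must not live on the frozen filesystem
        ["ddrescue", "-S", old_image, backup_path],
        ["xfs_freeze", "-u", mountpoint],
        ["lvremove", "-f", f"{vg}/{old_snap}"],
    ]

def run_plan(plan, dry_run=True):
    """Print each command; only execute them when dry_run is False."""
    for argv in plan:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)

if __name__ == "__main__":
    run_plan(rotation_plan("buckaroobanzai", "root", "snaptest", "snaptest2",
                           "/dev/loop1", "snaptest.image",
                           "snaptest.image.backup"))
```

A real tool would also need to thaw the filesystem if any step fails between the freeze and the unfreeze, otherwise the box stays frozen.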

I'm seriously toying with the idea of just using the network block
device (NBD) and MD or LVM in order to send changes directly off to
backup land.. with a cheeky Python server on a backup system somewhere
emulating nbd-server :)

Also.. you'd have to zero out free space on your filesystems quite a
bit in order to keep backups sparse... or have a good backup system
that stores blocks individually by hash and then deletes unused ones
as fulls are redone. (Keeping 2 fulls at any time sounds like a good
plan to me.)
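The store-blocks-by-hash idea fits in a few lines. Here is a toy in-memory Python sketch, assuming 4 KiB blocks and SHA-256 for the hash (both my choices). A real system would keep blocks on disk and worry about crashes halfway through a cleanup. The class and method names are made up.

```python
import hashlib

class BlockStore:
    """Toy content-addressed store: each unique block is kept once."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}   # digest -> block bytes
        self.backups = {}  # name -> (length, [digest or None for zero block])

    def add_backup(self, name, data):
        zero = bytes(self.block_size)
        digests = []
        for off in range(0, len(data), self.block_size):
            block = data[off:off + self.block_size]
            if block == zero[:len(block)]:
                digests.append(None)  # zero blocks cost nothing to store
                continue
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.backups[name] = (len(data), digests)

    def restore(self, name):
        length, digests = self.backups[name]
        zero = bytes(self.block_size)
        data = b"".join(zero if d is None else self.blocks[d] for d in digests)
        return data[:length]  # trim a trailing partial zero block

    def drop_backup(self, name):
        """Forget a backup and delete blocks no remaining backup uses."""
        del self.backups[name]
        live = {d for _, ds in self.backups.values() for d in ds if d}
        for digest in list(self.blocks):
            if digest not in live:
                del self.blocks[digest]
```

With two fulls held at once, dropping the older one after the newer one lands only frees the blocks the newer full no longer references.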
Cheers!
- Shane