Description of problem:
Running a virsh migrate --copy-storage-all will take a 'sparse' image on the source and create a bloated image on the destination. A walkthrough of my steps is below.

How reproducible: 100%

Steps to Reproduce:
Migration from iron3 -> iron4 without using shared storage.

# first, see my sparse file
[root@iron3 images]# ls -lAhs
[...snip]
7.6G -rwxr-xr-x. 1 root root 500G Apr 29 18:51 ncr.raw

# apparently you need to "prepare" the destination; if not, the migrate fails with:
error: unable to set user and group to '107:107' on '/var/lib/libvirt/images/ncr.raw': No such file or directory

# should I be doing this, or is there a better way? Please note, this new image *is* sparse (and empty).
[root@iron4 images]# time qemu-img create -f raw ncr.raw 500G
Formatting 'ncr.raw', fmt=raw size=536870912000

# okay, go time!
[root@iron3 ~]# time virsh migrate --verbose --live --copy-storage-all ncr qemu+ssh://iron4/system
Migration: [ 0 %]
...
[...finishes successfully after 78 minutes!]

# note: this took so long because it didn't copy the image in a "sparse" way (as rsync -S can)
[root@iron4 images]# ls -lAhs
total 501G
501G -rw-r--r-- 1 qemu qemu 500G Apr 29 21:04 ncr.raw

[root@iron4 images]# df -h
Filesystem                 Size  Used  Avail Use% Mounted on
/dev/mapper/vg_00-lv_root  621G  1009M 588G  1%   /
tmpfs                      24G   0     24G   0%   /dev/shm
/dev/sda1                  504M  42M   437M  9%   /boot
/dev/mapper/vg_00-lv_home  504M  17M   462M  4%   /home
/dev/mapper/vg_00-lv_var   1008G 501G  457G  53%  /var

# I can "fix" the image with qemu-img, but ideally it should stay sparse to begin with! What have I done wrong? Maybe this is a bug.
[root@iron4 images]# mv ncr.raw ncr.raw.original
[root@iron4 images]# time qemu-img convert ncr.raw.original ncr.raw

Actual results: Destination file is not sparse.

Expected results: Destination file should also be sparse.
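For anyone reproducing this: the 7.6G vs 500G discrepancy in the ls output above is the allocated size vs the apparent size of a sparse file. A minimal sketch, assuming GNU coreutils and a hypothetical test.img, to check both sizes:

```shell
# create a 1G sparse file (no data blocks are allocated yet)
truncate -s 1G test.img

# apparent size: what ls -l reports (the full 1G)
du -h --apparent-size test.img

# allocated size: blocks actually on disk (near zero for a fresh sparse file)
du -h test.img
```

If the two numbers are close to each other after a migration, the destination image has lost its sparseness.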
As an intermediate hack, it could even make sense to pipe the incoming image through the library that qemu-img uses for its 'convert', so that the output comes out "clean" (i.e. sparse).

Additional info: Thank you for looking into this.
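Until the transfer itself preserves holes, the bloated destination image can be re-sparsified after migration. A minimal sketch, assuming GNU cp (qemu-img convert, as shown in the report, works equally well for raw images); bloated.img is a hypothetical stand-in for the migrated file:

```shell
# simulate the post-migration state: an image full of literal zero blocks
dd if=/dev/zero of=bloated.img bs=1M count=8

# cp --sparse=always scans the data for runs of zero bytes and punches
# holes instead of writing them, restoring sparseness
cp --sparse=always bloated.img resparsed.img

# same apparent size, far fewer allocated blocks
du -h --apparent-size bloated.img resparsed.img
du -h bloated.img resparsed.img
```

Note this only makes sense while the guest is shut off, since swapping the file out from under a running qemu process would corrupt the disk.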
Another workaround is to use rsync rather than --copy-storage-all. The following example assumes:
- a single image file in the default pool
- ssh transport

#!/bin/bash
VM=$1
TARGET=$2
vm_path=/var/lib/libvirt/images/$VM.img

rsync -S --progress $vm_path root@$TARGET:$vm_path && \
virsh migrate --live --suspend --verbose $VM qemu+ssh://$TARGET/system && \
rsync -S --progress $vm_path root@$TARGET:$vm_path && \
virsh -c qemu+ssh://$TARGET/system resume $VM
# end script

This results in a slightly longer suspended period, but saves disk space.
The same ballooning of sparse images occurs when sparse qcow2 images are used as well.
This issue is still under investigation.
Patch proposed upstream: https://www.redhat.com/archives/libvir-list/2015-April/msg00130.html
During review it was pointed out that this can hardly be a libvirt issue, since the user pre-creates the storage themselves. I suspect qemu does not preserve sparse files during storage migration. Switching this over to qemu, then.
I think the issue is that drive-mirror (on the source) and run-time NBD server (on the destination) don't have anything like the has_zero_init logic that qemu-img convert uses to preserve sparseness. The legacy block migration (migrate -b) feature preserved sparseness. It checked for zeroes on the source host and sent a special flag to the destination host instead of the full zero data.
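The effect of that zero-detection can be mimicked outside of qemu: GNU dd's conv=sparse performs the same per-block check, seeking past all-zero output blocks instead of writing them. A minimal sketch (not part of the migration path, just to illustrate the technique drive-mirror/NBD lack):

```shell
# build a source file whose contents are entirely zeros
dd if=/dev/zero of=src.img bs=1M count=8

# a plain copy would write every zero block, allocating the full 8M;
# conv=sparse detects all-zero input blocks and seeks over them in the
# output, leaving holes behind
dd if=src.img of=dst.img bs=1M conv=sparse

# compare allocated sizes: dst.img should use almost no disk space
du -h src.img dst.img
```

This is essentially what the legacy migrate -b path did over the wire: detect zeros on the source and tell the destination to skip them.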
This is a workaround to quickly migrate a live VM while keeping the qcow2 file sparse at the destination (by the way, tar -S copies it very fast):

#!/bin/bash
VM=$1
TARGET=$2
STOR="/home/guest_images/"
cd $STOR
virsh snapshot-create-as $VM mig --disk-only --atomic
virsh snapshot-delete $VM mig --metadata
tar --totals --checkpoint=.8192 -Scvf - $VM.qcow2 | ssh $TARGET "tar -C $STOR -xf -"
virsh suspend $VM
tar --totals -Scvf - $VM.mig | ssh $TARGET "tar -C $STOR -xf -"
virsh migrate --live --undefinesource --persistent --verbose $VM qemu+ssh://$TARGET/system
virsh -c qemu+ssh://$TARGET/system blockcommit $VM vda --active --pivot --verbose
virsh -c qemu+ssh://$TARGET/system resume $VM
ssh $TARGET "cd $STOR; rm -f $VM.mig"
Since this is essentially a feature request, tracking this bug in the Fedora tracker isn't going to accomplish much. There's already a RHEL bug[1] and an upstream qemu bug[2]; those should be sufficient.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1219541
[2]: https://bugs.launchpad.net/qemu/+bug/1449687

*** This bug has been marked as a duplicate of bug 1219541 ***