Description of problem:
When a volume attached to an active instance is migrated between NFS shares, sparseness is lost.
(This occurs at the customer site with NetApp, but the same principle should apply to any cinder driver that generally respects sparseness, e.g. NFS.)
The volume migration is orchestrated by cinder, but is achieved by delegating to nova via the swap_volume operation. Where libvirt is the hypervisor, this effectively boils down to the blockRebase operation. Maintaining sparseness in this context will also require changes to libvirt & qemu: https://bugzilla.redhat.com/1221468#c9
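For readers unfamiliar with the libvirt side, the following is a rough sketch of what a swap_volume built on blockRebase looks like. It is not nova's actual code. The function name swap_volume, its parameters, and the polling loop are illustrative assumptions. The three method calls (blockRebase, blockJobInfo, blockJobAbort) are the libvirt-python virDomain methods. The flag constants are written out numerically only to keep the sketch self-contained; real code should take them from the libvirt module.

```python
import time

# Assumption: numeric values mirror libvirt's public enums. Real code should
# use libvirt.VIR_DOMAIN_BLOCK_REBASE_COPY and friends from libvirt-python.
VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT = 1 << 1
VIR_DOMAIN_BLOCK_REBASE_COPY = 1 << 3
VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT = 1 << 1


def swap_volume(dom, disk, new_path, poll_interval=0.5):
    """Mirror `disk` of a running guest onto `new_path`, then pivot to it.

    `dom` is expected to behave like a libvirt.virDomain. `new_path` must
    already exist, because REUSE_EXT tells libvirt not to create it.
    """
    flags = VIR_DOMAIN_BLOCK_REBASE_COPY | VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT
    # Start the copy job; bandwidth 0 means unlimited.
    dom.blockRebase(disk, new_path, 0, flags)

    # Wait until the mirror has caught up with the source.
    while True:
        info = dom.blockJobInfo(disk, 0)
        if not info:
            raise RuntimeError("block job disappeared before completion")
        if info.get("end", 0) > 0 and info["cur"] == info["end"]:
            break
        time.sleep(poll_interval)

    # Switch the guest over to the new volume and end the job.
    dom.blockJobAbort(disk, VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)
```

The copy job writes every block of the source to the destination, which is why a sparse source can end up fully allocated unless the lower layers detect zeroes.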
Steps to Reproduce:
1. Create a 1GB volume on NFS from the stock cirros image
2. Boot an instance from that volume, so that it's attached as the root volume
3. Check that the volume size on the disk is around 20MB
4. Migrate that volume to another NFS share
5. Check size again
Expected result: the volume remains roughly the same size on disk.
Actual result: the size on disk grows to ~1GB, which means the volume has lost its sparseness.
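The size checks in the steps above compare the apparent file size with the space actually allocated on disk. A minimal way to do that from Python, assuming a Linux host with the NFS share mounted locally (the helper names disk_usage and is_sparse are made up for this sketch):

```python
import os


def disk_usage(path):
    """Return (apparent_bytes, allocated_bytes) for the file at `path`."""
    st = os.stat(path)
    # st_blocks is counted in 512-byte units on Linux, independent of the
    # filesystem's own block size.
    return st.st_size, st.st_blocks * 512


def is_sparse(path):
    """True if the file occupies less space on disk than its apparent size."""
    apparent, allocated = disk_usage(path)
    return allocated < apparent
```

Running disk_usage on the volume file before and after the migration should show the allocated figure jumping from roughly 20MB to roughly 1GB while the apparent size stays at 1GB.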
Cleaved off from BZ 1221468 to cover the active VM case (leaving the original bug to solely represent the case where the VM is inactive).
Note bug #1219541 is tracking at least a very similar issue for libvirt and qemu.
That bug is currently against the qemu component, though there is an upstream patch proposed against libvirt (though not yet merged):
(In reply to Pádraig Brady from comment #3)
> Note bug #1219541 is tracking at least a very similar issue for libvirt and
> qemu.
> That bug is currently against the qemu component, though there is an
> upstream patch proposed against libvirt (though not yet merged):
NB volume migration != guest migration.
Volume migration is the code using the driveMirror libvirt API, but that quoted patch is for guest migration.
Yes, this bug is purely for the online *volume* migration case.
The guest stays put, but the volume needs to be moved between cinder backends in order to balance across multiple NetApps (in the customer use case).
live block volume migration has been disabled as of Kilo/RHOS 7 as per http://pad.lv/1398999
(and also in Juno if https://review.openstack.org/176768 is merged)
Enablement will require:
1. Changes to qemu to detect zeros on NFS and propagate as holes
2. API changes to libvirt to make the operation safe
3. Nova changes to re-enable the feature and use the newer libvirt APIs
(tracked in this bug)
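To illustrate what "detect zeros and propagate as holes" means in item 1, here is a file-level sketch of the idea. This is not qemu's implementation; the function name sparse_copy and the block size are assumptions chosen for illustration. Blocks that are entirely zero are skipped with a seek instead of being written, so a filesystem that supports holes can leave those ranges unallocated.

```python
import os


def sparse_copy(src, dst, block_size=64 * 1024):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zero_block = bytes(block_size)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(block_size)
            if not chunk:
                break
            if chunk == zero_block[:len(chunk)]:
                # All zeroes: advance the write position without writing,
                # which leaves a hole where the filesystem supports it.
                fout.seek(len(chunk), os.SEEK_CUR)
            else:
                fout.write(chunk)
        # If the file ends in a hole, extend it to the full length.
        fout.truncate(fout.tell())
```

The destination reads back byte-for-byte identical to the source. Whether the skipped ranges actually stay unallocated depends on the destination filesystem.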
@pbrady, once again:
(In reply to Pádraig Brady from comment #6)
> live block volume migration has been disabled as of Kilo/RHOS 7 as per
> http://pad.lv/1398999
> (and also in Juno if https://review.openstack.org/176768 is merged)
These are again about *live migration* with block storage copy, i.e. the VM is relocated from one host to another host, and the storage is copied.
This bug is about volume migration. The VM stays running on the current host, and the volume is swapped out from beneath it.
Development is continuing in qemu, with another solution proposed, which is however a bit more invasive:
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=v2.3.0-1612-g0fc9f8e is scheduled to be backported next week to address this
So this is still predicated on work in two lower layer components:
(1) [libvirt] https://bugzilla.redhat.com/show_bug.cgi?id=1297255
    add possibility to sparsify image during block copy
(2) [QEMU] https://bugzilla.redhat.com/show_bug.cgi?id=1533975
    detect-zeroes=unmap/on does not produce a sparse file on NFS v4.1
    when attempting blockdev/drive-mirror
I am closing this bug as it has not been addressed for a very long time. Please feel free to reopen if it is still relevant.