Red Hat Bugzilla – Bug 1139567
virsh cmd will hang when remove blockcopy file
Last modified: 2016-01-20 00:07:23 EST
Description of problem:
virsh cmd will hang when remove blockcopy file.

Version-Release number of selected component (if applicable):
host:
libvirt-1.2.8-1.el7.x86_64
qemu-kvm-rhev-2.1.0-3.el7.x86_64
kernel-3.10.0-147.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. prepare a guest

2. do blockcopy
# virsh blockcopy rhel6 vda /var/lib/libvirt/images/bak --verbose --wait
Block Copy: [100 %]
Now in mirroring phase

3. remove copy file
# rm /var/lib/libvirt/images/bak -f

4. do pivot
# virsh blockjob rhel6 vda --pivot

5. do other block cmd, all will hang.
# virsh blockjob rhel6 vda

6. check xml:
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none'/>
  <source file='/var/lib/libvirt/images/kvm-rhel7.0-x86_64-qcow2v3.img'/>
  <mirror type='file' file='/var/lib/libvirt/images/bak' format='qcow2' job='copy' ready='yes'>
    <format type='qcow2'/>
    <source file='/var/lib/libvirt/images/bak'/>
  </mirror>
  <target dev='vda' bus='virtio'/>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>

Actual results:
virsh will hang up.

Expected results:
better to provide error msg.

Additional info:
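When reproducing a hang like the one in step 5, it can help to run the command under a timeout so the test shell itself does not get stuck. The following is only a minimal sketch: `run_with_timeout` is a hypothetical helper name, the status strings are made up for illustration, and the demo at the bottom uses a sleeping Python child as a stand-in for a hung `virsh` process.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome as 'ok', 'error', or 'hang'.

    Hypothetical helper for illustration; not part of virsh or libvirt.
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # The child was killed after timeout_s seconds without finishing.
        return "hang", ""
    status = "ok" if proc.returncode == 0 else "error"
    return status, (proc.stdout + proc.stderr).strip()


if __name__ == "__main__":
    # Stand-in for a hung command: a child that sleeps past the timeout.
    sleeper = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(sleeper, 0.5))
    # On an affected host one might instead try (command taken from step 5):
    #   run_with_timeout(["virsh", "blockjob", "rhel6", "vda"], 10)
```

With a fixed libvirt, the same call would be expected to come back as 'error' with a message rather than 'hang'.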
Fixed upstream:

commit fe3691f66348d55e88c9811fd79ff9314e053977
Author: Erik Skultety <eskultet@redhat.com>
Date:   Wed Dec 3 13:56:47 2014 +0100

    qemu: Fix virsh freeze when blockcopy storage file is removed

    If someone removes blockcopy storage file when still in mirroring phase
    and then requesting blockjob abort using pivot, virsh cmd freezes. This
    is not an issue with older qemu versions which did not support
    asynchronous jobs (which we prefer by default). As we have reached the
    mirroring phase successfully, polling monitor for blockjob info always
    returns 1 and the loop never ends. This fix introduces a check for
    qemuDomainBlockPivot return code, possibly skipping the asynchronous
    waiting completely, if an error occurred and asynchronous waiting was
    the preferred method.

v1.2.10-265-gfe3691f
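The control flow described in the commit message can be modelled in a few lines. This is a simplified simulation, not libvirt's actual C code: `abort_with_pivot`, `wait_for_job_end`, and the callables passed in are hypothetical names, and the only facts taken from the commit are that the poll returns 1 while the job still exists and that the fix checks the pivot return code before waiting.

```python
def wait_for_job_end(poll_job_info):
    """Poll until the block job is gone.

    poll_job_info is assumed to return 1 while the job still exists
    and 0 once it has ended (modelled on the commit message).
    """
    while poll_job_info() == 1:
        pass


def abort_with_pivot(pivot, poll_job_info, prefer_async=True):
    """Request a pivot, then wait for the job to end only if it succeeded."""
    ret = pivot()
    if ret < 0:
        # The fix: a failed pivot leaves the mirror job in place, so
        # polling would return 1 forever. Report the error instead.
        return ret
    if prefer_async:
        wait_for_job_end(poll_job_info)
    return ret
```

Without the `ret < 0` check, a failed pivot followed by a poll that always returns 1 is exactly the endless loop the report describes.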
Verify this bug with libvirt-1.2.8-11.el7.x86_64

Steps:
1. Prepare a transient guest
# virsh list --transient
 Id    Name                           State
----------------------------------------------------
 14    r7                             running

2. Do blockcopy
# virsh blockcopy r7 vda /var/lib/libvirt/images/r7.clone --verbose --wait
Block Copy: [100 %]
Now in mirroring phase

# virsh dumpxml r7 |grep mirror -A 2 -B 6
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source protocol='gluster' name='gluster-vol1/r7-raw.img'>
        <host name='10.66.5.38'/>
      </source>
      <backingStore/>
      <mirror type='file' file='/var/lib/libvirt/images/r7.clone' format='raw' job='copy' ready='yes'>
        <format type='raw'/>
        <source file='/var/lib/libvirt/images/r7.clone'/>
      </mirror>
      <target dev='vda' bus='virtio'/>

3. Remove the destination file
# rm -fr /var/lib/libvirt/images/r7.clone

4. Do pivot
# virsh blockjob r7 vda --pivot
error: Cannot access storage file '/var/lib/libvirt/images/r7.clone' (as uid:107, gid:107): No such file or directory

The error is expected, so change to VERIFIED
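The mirror check in step 2 can also be done by parsing the disk XML rather than grepping it. The sketch below uses only the Python standard library; `mirror_state` is a hypothetical helper name, and the sample XML is the fragment from step 2 with a closing `</disk>` tag added (the grep output above is cut off before it) so that it parses.

```python
import xml.etree.ElementTree as ET

# Fragment from step 2; the closing </disk> is added here for parsing.
DISK_XML = """
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source protocol='gluster' name='gluster-vol1/r7-raw.img'>
    <host name='10.66.5.38'/>
  </source>
  <backingStore/>
  <mirror type='file' file='/var/lib/libvirt/images/r7.clone' format='raw' job='copy' ready='yes'>
    <format type='raw'/>
    <source file='/var/lib/libvirt/images/r7.clone'/>
  </mirror>
  <target dev='vda' bus='virtio'/>
</disk>
"""


def mirror_state(disk_xml):
    """Return the mirror job details for one <disk> element, or None."""
    disk = ET.fromstring(disk_xml)
    mirror = disk.find("mirror")
    if mirror is None:
        return None  # no block copy in progress on this disk
    return {
        "job": mirror.get("job"),
        "ready": mirror.get("ready") == "yes",
        "file": mirror.get("file"),
    }


if __name__ == "__main__":
    print(mirror_state(DISK_XML))
```

A `ready` value of True corresponds to the "Now in mirroring phase" message, which is the state in which the destination file is removed in step 3.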
*** Bug 1134294 has been marked as a duplicate of this bug. ***
The hang occurred not just when a backing file was removed, but also in the much more likely case of attempting a virDomainBlockJobAbort operation with the pivot flag before an active commit had completed (see the reproduction example in bug 1134294).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0323.html