Description of problem:
Migration hangs when the guest uses a glusterfs volume and the destination host cannot access that glusterfs volume.

Version-Release number of selected component (if applicable):
libvirt-1.1.1-10.el7.x86_64
qemu-kvm-1.5.3-10.el7.x86_64
glusterfs-3.4.0.36rhs-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a guest with the XML below on glusterfs client A:
   # virsh dumpxml r7|grep disk -A 6
       <disk type='network' device='disk'>
         <driver name='qemu' type='qcow2'/>
         <source protocol='gluster' name='gluster-vol1/rh7-qcow2.img'>
           <host name='10.66.82.251' port='24007'/>
         </source>
         <target dev='vda' bus='virtio'/>
         <alias name='virtio-disk0'/>
         <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
       </disk>

2. Do a live migration to another glusterfs client B. It succeeds.
   # virsh migrate --live --verbose r7 qemu+ssh://10.66.5.93/system --unsafe
   root.5.93's password:
   Migration: [100 %]

3. Do a live migration back to machine A. It succeeds.
   # virsh migrate --live --verbose r7 qemu+ssh://10.66.4.217/system --unsafe
   root.4.217's password:
   Migration: [100 %]

4. On glusterfs client B, uninstall the glusterfs package.
   # rpm -qa|grep gluster
   glusterfs-libs-3.4.0.36rhs-1.el7.x86_64
   glusterfs-api-3.4.0.36rhs-1.el7.x86_64

5. Redo the live migration to glusterfs client B. The migration hangs.
   # virsh migrate --live --verbose r7 qemu+ssh://10.66.5.93/system --unsafe
   root.5.93's password:

Actual results:
In step 5, the migration hangs.

Expected results:
In step 5, the virsh migrate command should return with a suitable error message.
After playing with this for some time, I don't think there's any real bug here. By default, virsh does not use the keepalive protocol to detect broken connections and relies completely on TCP timeouts, which are much longer than a keepalive timeout would be. Closing to match the RHEL 6 clone.
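
For anyone who wants a broken connection detected sooner than the TCP timeout, here is a minimal sketch of a client that turns keepalive on itself through the libvirt Python bindings. It was not part of the original analysis. The helper names, the URI argument and the 5/5 values are illustrative assumptions. The detection window is the approximation interval * (count + 1) given in the libvirtd.conf comments, not a measured value.

```python
import threading


def keepalive_window(interval, count):
    """Approximate seconds of silence before a keepalive-enabled
    connection is treated as broken: interval * (count + 1)."""
    return interval * (count + 1)


def open_with_keepalive(uri, interval=5, count=5):
    """Open a libvirt connection with keepalive enabled (hypothetical helper)."""
    import libvirt  # imported lazily so keepalive_window() works without libvirt

    # Keepalive messages are handled by an event loop, so one has to be
    # registered before the connection is opened.
    libvirt.virEventRegisterDefaultImpl()

    def run_event_loop():
        # Each call runs one iteration of the default event loop.
        while True:
            libvirt.virEventRunDefaultImpl()

    threading.Thread(target=run_event_loop, daemon=True).start()

    conn = libvirt.open(uri)
    # interval: seconds of inactivity before a keepalive message is sent
    # count: unanswered keepalive messages tolerated before giving up
    conn.setKeepAlive(interval, count)
    return conn


if __name__ == "__main__":
    print("approximate detection window:", keepalive_window(5, 5), "seconds")
```

With the assumed values of 5 and 5, a dead peer would be noticed after roughly 30 seconds of silence. Whether that would turn the hang in step 5 into an error depends on where the connection actually stalls, which this sketch does not establish.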