Red Hat Bugzilla – Bug 799478
libvirt emits inappropriate error when using domjobabort to abort stuck migration
Last modified: 2012-06-20 02:49:33 EDT
+++ This bug was initially created as a clone of Bug #725373 +++

Created attachment 515009 [details]
libvirtd log

Description of problem:
When using domjobabort to abort a stuck migration (destination host blocked during migration via iptables), the migration command reports an inappropriate error.

--- Additional comment from weizhan@redhat.com on 2012-01-09 22:31:35 EST ---

Tested with
kernel-2.6.32-223.el6.x86_64
qemu-kvm-0.12.1.2-2.213.el6.x86_64
libvirt-0.9.9-1.el6.x86_64

1. Do migration
# virsh migrate --live --p2p kvm-rhel6u2-x86_64-new qemu+tls://10.66.83.197/system
At the same time, on another console do
# iptables -A OUTPUT -d 10.66.83.197 -j REJECT
Then do
# virsh domjobabort kvm-rhel6u2-x86_64-new

The migration job does not return immediately; it waits for a while and then returns
error: An error occurred, but the cause is unknown

--- Additional comment from weizhan@redhat.com on 2012-02-29 06:49:56 EST ---

I can still reproduce the phenomenon from comment 14 on
qemu-kvm-0.12.1.2-2.232.el6.x86_64
kernel-2.6.32-225.el6.x86_64
libvirt-0.9.10-3.el6.x86_64
so I am re-assigning this bug.

--- Additional comment from jdenemar@redhat.com on 2012-02-29 10:46:32 EST ---

Oh, I see what it is. virsh domjobabort correctly cancels the migration, and the source libvirtd tries to call the Finish API on the destination libvirtd to tell it that the migration was aborted. Since all packets to the destination libvirtd are discarded, the API waits for some time (~30 seconds) until the broken connection is detected, and then the source libvirtd correctly resumes the domain and reports that the migration API finished. So far, everything worked as expected. However, it seems the error message was eaten somewhere on the way and virsh has nothing useful to report.

--- Additional comment from jdenemar@redhat.com on 2012-03-02 12:30:57 EST ---

Actually, the eaten error message is a minor issue which doesn't need to block this bug. I'm moving this bz back to ON_QA and I'll create a new bug requesting the error message to be fixed.
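The reproduction steps from the 2012-01-09 comment can be sketched as a single script, which may be handy when re-testing a new build. This is only a sketch under stated assumptions: the domain name and destination address are the ones quoted above, while the `DRY_RUN` and `BLOCK_DELAY` variables, the helper function names, and the final `iptables -D` cleanup step are illustrative additions that do not appear in the original report. By default the script only prints the commands; running it for real requires root on the source host.

```shell
#!/bin/sh
# Sketch of the reproducer: migrate, block the destination, abort the job.
# DRY_RUN and BLOCK_DELAY are hypothetical knobs added for illustration.
DOMAIN="${DOMAIN:-kvm-rhel6u2-x86_64-new}"
DEST="${DEST:-10.66.83.197}"
DRY_RUN="${DRY_RUN:-1}"          # 1 = print commands only, 0 = execute them
BLOCK_DELAY="${BLOCK_DELAY:-5}"  # assumed pause before blocking the destination

# Run a command in the foreground, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run a command in the background and remember its PID in BG_PID.
run_bg() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $* &"
        BG_PID=""
    else
        "$@" &
        BG_PID=$!
    fi
}

reproduce() {
    # 1. Start the live peer-to-peer migration (the "first console").
    run_bg virsh migrate --live --p2p "$DOMAIN" "qemu+tls://$DEST/system"
    [ "$DRY_RUN" = "1" ] || sleep "$BLOCK_DELAY"

    # 2. Reject all outgoing traffic to the destination host.
    run iptables -A OUTPUT -d "$DEST" -j REJECT

    # 3. Abort the stuck migration job.
    run virsh domjobabort "$DOMAIN"

    # 4. Wait for the migrate command so its error message and status show up.
    if [ -n "$BG_PID" ]; then
        wait "$BG_PID"
        echo "virsh migrate exit status: $?"
    fi

    # Cleanup (not part of the original steps): remove the REJECT rule again.
    run iptables -D OUTPUT -d "$DEST" -j REJECT
}

reproduce
```

With `DRY_RUN=0`, the message printed by the backgrounded `virsh migrate` is the one this bug is about, so it is worth capturing together with the libvirtd log.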
This bug was fixed as part of a patch series required to fix bug 807907. Namely by the first patch in the series: http://post-office.corp.redhat.com/archives/rhvirt-patches/2012-April/msg00856.html
Tested on
qemu-kvm-0.12.1.2-2.292.el6.x86_64
kernel-2.6.32-269.el6.x86_64
libvirt-0.9.10-20.el6.x86_64

After running migration and domjobabort once and waiting several minutes, it reports
error: operation aborted: migration job: canceled by client

But after aborting several times, libvirt hangs. Do I need to file a new bug?
Created attachment 585107 [details] the backtrace for libvirtd hang
Oh, as confirmed with another backtrace, it looks like we have a reproducer for bug 821468. I'll move the backtraces and investigation there and leave this bug for the error message fix.
The new issue will be followed up in bug 821468, so this bug is verified as PASS, as comment 3 shows.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2012-0748.html