Red Hat Bugzilla – Bug 892079
Libvirtd crash when destroying a Windows guest that is executing an S3/S4 operation
Last modified: 2014-03-25 05:59:10 EDT
After step 5, I did some further checking:

# service libvirtd status
libvirtd dead but pid file exists
# virsh list
error: Failed to reconnect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': Connection refused
# ps aux | grep qemu
root     59813  0.0  0.0 103244   856 pts/5    S+   01:29   0:00 grep qemu

Start the libvirtd service, then check the guest's status; the guest has been destroyed:

# service libvirtd start
Starting libvirtd daemon:                                  [  OK  ]
# service libvirtd status
libvirtd (pid 60193) is running...
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     win7-32                        shut off
Patch proposed upstream: https://www.redhat.com/archives/libvir-list/2013-January/msg00520.html
Moving to POST: http://post-office.corp.redhat.com/archives/rhvirt-patches/2013-January/msg00175.html
Created attachment 682311 [details] libvirtd crash log
crash log attached.
I can also still reproduce this bug with libvirt-0.10.2-16.el6.x86_64.
Okay guys, before claiming I fixed this, I've created a scratch build: https://brewweb.devel.redhat.com/taskinfo?taskID=5298609 Can you please give it a try?
Patch proposed upstream: https://www.redhat.com/archives/libvir-list/2013-January/msg01487.html
Since this is targeted for 6.5 now, and I've just pushed the patch upstream, I am moving this one to POST:

commit d960d06fc06a448f495c465caf06d3d0c74ea587
Author:     Michal Privoznik <mprivozn@redhat.com>
AuthorDate: Mon Jan 21 11:52:44 2013 +0100
Commit:     Michal Privoznik <mprivozn@redhat.com>
CommitDate: Wed Jan 23 15:35:44 2013 +0100

    qemu_agent: Ignore expected EOFs

    https://bugzilla.redhat.com/show_bug.cgi?id=892079

    One of my previous patches (f2a4e5f176c408) tried to fix libvirtd
    crashing on domain destroy. However, we need to copy the pattern from
    qemuProcessHandleMonitorEOF() instead of decrementing the reference
    counter. The rationale for this is: if the qemu process is dying
    because the domain is being destroyed, we obtain EOF on both the
    monitor and agent sockets. If the exit is expected, qemuProcessStop
    is called, which cleans up both the agent and monitor sockets. We
    want qemuAgentClose() to be called iff the EOF is not expected, so
    we don't leak an FD and memory. Moreover, there could be a race with
    qemuProcessHandleMonitorEOF(), which could have already closed the
    agent socket, in which case we don't want to do anything.

v1.0.1-401-gd960d06
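To illustrate the rationale in the commit message, here is a minimal, self-contained C sketch of the "close only on an unexpected EOF" pattern. All names (Agent, DomainPriv, handleAgentEOF, processStop) are illustrative stand-ins, not the real libvirt structures or functions; the real fix lives in qemuProcessHandleAgentEOF().

```c
#include <stdlib.h>

/* Hypothetical, simplified model of the fix: on agent EOF, close the
 * agent (releasing its fd and memory) only when the EOF is UNexpected.
 * When the guest is being destroyed, the stop path owns the cleanup,
 * so the EOF handler must do nothing. */

typedef struct {
    int open;             /* 1 while the agent socket is held */
} Agent;

typedef struct {
    Agent *agent;         /* NULL once the agent has been closed */
    int beingDestroyed;   /* set by the destroy path before qemu exits */
    int closes;           /* counts how many times agentClose() ran */
} DomainPriv;

static void agentClose(DomainPriv *priv)
{
    if (priv->agent) {
        priv->agent->open = 0;
        free(priv->agent);
        priv->agent = NULL;   /* prevents a second, double-free close */
        priv->closes++;
    }
}

/* EOF handler following the qemuProcessHandleMonitorEOF() pattern:
 * bail out early when the agent is already gone (another handler may
 * have raced us) or when the EOF is expected. */
static void handleAgentEOF(DomainPriv *priv)
{
    if (!priv->agent)          /* already closed elsewhere: nothing to do */
        return;
    if (priv->beingDestroyed)  /* expected EOF: the stop path cleans up */
        return;
    agentClose(priv);          /* unexpected EOF: avoid fd/memory leak */
}

/* Stand-in for the cleanup done on an expected exit (qemuProcessStop
 * in the real code): it closes both sockets exactly once. */
static void processStop(DomainPriv *priv)
{
    agentClose(priv);
    priv->beingDestroyed = 0;
}
```

With this shape, an expected EOF (destroy path) leaves the close to processStop(), an unexpected EOF closes immediately, and a duplicate EOF is a harmless no-op, which is exactly the behavior the refcount-decrement approach failed to guarantee.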
*** Bug 915653 has been marked as a duplicate of this bug. ***
Marking with TestBlocker since it fails our Jenkins Jobs testing RHEV
(In reply to comment #19)
> Marking with TestBlocker since it fails our Jenkins Jobs testing RHEV

Fair enough, thank you for the explanation of what's failing.
Verified this bug on libvirt-0.10.2-21.el6.x86_64. The following were my verification steps.

Package info:
kernel-2.6.32-396.el6.x86_64
libvirt-0.10.2-21.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.386.el6.x86_64
virtio-win-1.6.5-6.el6_4.noarch

Steps:
1. Prepare the test environment as in step 1 and step 2 of comment 0.
2. Execute S3 in the host; quit the command before it finishes:
# virsh dompmsuspend win7 --target mem
^C
3. Execute S4 in the host; quit the command before it finishes:
# virsh dompmsuspend win7 --target disk
^C
4. Then destroy the guest in the host:
# virsh destroy win7
Domain win7 destroyed
5. Check the libvirtd status:
# ps aux | grep libvirtd
root      5251  0.0  0.0 103244   836 pts/0    S+   18:13   0:00 grep libvirtd
root     30067  1.7  0.1 1027604 15896 ?       Sl   16:18   1:58 libvirtd --daemon
# service libvirtd status
libvirtd (pid 30067) is running...

Since libvirtd did not crash here, while I can still reproduce this bug on libvirt-0.10.2-13.el6.x86_64, I am marking this bug VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1581.html