Description of problem:
Upstream commit e6b68d7 introduced a regression where the libvirtd daemon could try to free active event handles (fds being polled or timeouts); if the event subsequently occurs, this can result in arbitrary behavior including crashes.

Version-Release number of selected component (if applicable):
libvirt-0.8.7-3.el6

How reproducible:
100%

Steps to Reproduce:
1. start 60 VMs: for i in `seq 60`; do virsh start vm$i; done
2. stop 60 VMs: for i in `seq 60`; do virsh destroy vm$i; done
3. list all known VMs: virsh list --all

Actual results:
libvirtd crashed immediately after stopping the 60th VM

Expected results:
libvirt should never crash

Additional info:
This may be the root cause of bug 670848, although the stack trace from the above test does not match that bug report, so this is opened as a separate BZ until we are sure that there aren't any other problems.

Initial upstream patch to fix this and one other bug:
https://www.redhat.com/archives/libvir-list/2011-January/msg00921.html
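For repeated runs, the three steps above could be wrapped in a small script. This is only a sketch: the variables VIRSH, COUNT and PREFIX and the function names are hypothetical conveniences, not part of the original report, and the check assumes that `virsh list --all` exits non-zero when it cannot connect to libvirtd.

```shell
#!/bin/sh
# Sketch of the reproduction steps. VIRSH, COUNT and PREFIX are hypothetical
# knobs introduced for this example; they do not come from the bug report.
VIRSH=${VIRSH:-virsh}
COUNT=${COUNT:-60}
PREFIX=${PREFIX:-vm}

# Steps 1 and 2: start every guest, then destroy every guest.
cycle_guests() {
    for i in $(seq "$COUNT"); do "$VIRSH" start "$PREFIX$i"; done
    for i in $(seq "$COUNT"); do "$VIRSH" destroy "$PREFIX$i"; done
}

# Step 3: list all guests. Assumption: if the daemon has died, virsh
# cannot connect and returns a non-zero exit status.
check_daemon() {
    if "$VIRSH" list --all >/dev/null 2>&1; then
        echo "libvirtd still responding"
    else
        echo "libvirtd not responding"
        return 1
    fi
}
```

After sourcing the file, run `cycle_guests` followed by `check_daemon`. The guests (vm1 through vm60 with the defaults) must already be defined on the host.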
ACK, we really need to get this fixed! Thanks for chasing this; hopefully it will also solve 670848!

Daniel
Verified: passed in the environment below.

# uname -a
Linux intel-5405-32-4.englab.nay.redhat.com 2.6.32-107.el6.x86_64 #1 SMP Thu Jan 27 23:11:23 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

libvirt-0.8.7-4.el6.x86_64
kernel-2.6.32-107.el6.x86_64
qemu-kvm-0.12.1.2-2.132.el6.x86_64

Steps:

1. start 60 VMs:
# for i in `seq 60`; do virsh start a$i; done
Domain a1 started
Domain a2 started
Domain a3 started
...

2. stop 60 VMs:
# for i in `seq 60`; do virsh destroy a$i; done
Domain a1 destroyed
Domain a2 destroyed
Domain a3 destroyed
....

3. list all VMs:
# virsh list --all
 Id Name                 State
----------------------------------
  - a1                   shut off
  - a10                  shut off
  - a11                  shut off
  - a12                  shut off
  - a13                  shut off
  - a14                  shut off
  - a15                  shut off
  - a16                  shut off
  - a17                  shut off
  - a18                  shut off
  - a19                  shut off
  - a2                   shut off
  - a20                  shut off
.....

--------------

I have reproduced this bug with libvirt-0.8.7-3.el6. It also reproduces with 40 guests:

# for i in `seq 40`; do virsh start a$i; done
# for i in `seq 40`; do virsh destroy a$i; done
...
Domain a38 destroyed
Domain a39 destroyed
error: cannot recv data: : Connection reset by peer
error: failed to connect to the hypervisor

# virsh list --all
error: unable to connect to '/var/run/libvirt/libvirt-sock', libvirtd may need to be started: Connection refused
error: failed to connect to the hypervisor

# service libvirtd status
libvirtd dead but pid file exists
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0596.html