Bug 875788 - Deadlock on libvirt when playing with hotplug and add/remove vm
Summary: Deadlock on libvirt when playing with hotplug and add/remove vm
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Michal Privoznik
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 876102
Depends On: 856950
Blocks:
Reported: 2012-11-12 15:17 UTC by Chris Pelland
Modified: 2012-11-22 09:40 UTC
CC List: 18 users

Fixed In Version: libvirt-0.9.10-21.el6_3.6
Doc Type: Bug Fix
Doc Text:
Cause: When libvirt tears down a qemu process, it cleans up some internal structures, frees some locks, and so on. Since users may destroy qemu processes in parallel, libvirt holds what we call the 'qemu driver lock'. This lock protects the most important internal structure, where we keep the list of domains along with their state.
Consequence: One function tried to lock the qemu driver even though it was already locked. This led to an unresolvable deadlock.
Fix: The code was rewritten and the locking was moved after unlocking the qemu driver.
Result: Libvirt no longer deadlocks.
Clone Of:
Environment:
Last Closed: 2012-11-22 09:40:32 UTC
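
The Doc Text above describes a self-deadlock: a function tries to take a non-recursive lock that its caller already holds. The following is only a minimal sketch of that pattern and of the described fix, not libvirt's actual code. All names here (`driver_lock`, `cleanup_needing_driver_lock`, `destroy_buggy`, `destroy_fixed`) are hypothetical, and a timeout is used so the buggy path reports failure instead of hanging forever.

```python
import threading

# Hypothetical stand-in for the 'qemu driver lock' (non-recursive).
driver_lock = threading.Lock()

def cleanup_needing_driver_lock():
    # Hypothetical cleanup step that takes the driver lock itself.
    # The timeout only exists so this sketch cannot hang; a plain
    # blocking acquire would wait forever if the caller holds the lock.
    acquired = driver_lock.acquire(timeout=0.1)
    if acquired:
        driver_lock.release()
    return acquired

def destroy_buggy():
    # Buggy ordering: cleanup runs while the driver lock is still held,
    # so the inner acquire can never succeed.
    with driver_lock:
        return cleanup_needing_driver_lock()

def destroy_fixed():
    # Fixed ordering: finish the work that needs the driver lock,
    # release it, and only then run the cleanup that locks it again.
    with driver_lock:
        pass  # tear down state protected by the driver lock
    return cleanup_needing_driver_lock()

if __name__ == "__main__":
    print("buggy path acquired lock:", destroy_buggy())
    print("fixed path acquired lock:", destroy_fixed())
```

In the buggy ordering the inner acquire times out, which stands in for the deadlock; in the fixed ordering it succeeds because the driver lock has already been released.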


Links
System                  ID              Priority  Status        Summary                 Last Updated
Red Hat Product Errata  RHBA-2012:1484  normal    SHIPPED_LIVE  libvirt bug fix update  2012-11-22 14:39:12 UTC

Description Chris Pelland 2012-11-12 15:17:55 UTC
This bug has been copied from bug #856950 and has been proposed
to be backported to 6.3 z-stream (EUS).

Comment 6 Michal Privoznik 2012-11-13 12:38:11 UTC
*** Bug 876102 has been marked as a duplicate of this bug. ***

Comment 7 weizhang 2012-11-15 08:34:47 UTC
I tested with the steps in https://bugzilla.redhat.com/show_bug.cgi?id=856950#c13
Versions:
qemu-kvm-rhev-0.12.1.2-2.330.el6.x86_64
kernel-2.6.32-335.el6.x86_64
libvirt-0.9.10-21.el6_3.6.x86_64

It sometimes reports errors, but after that it can sometimes succeed.
The messages below are from the attach-detach loop:

Disk attached successfully

error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot write data: Broken pipe

error: Failed to attach disk
error: operation failed: target vdb already exists

error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot write data: Broken pipe

error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot recv data: Connection reset by peer

Disk detached successfully

error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot recv data: Connection reset by peer

error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot write data: Broken pipe

error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot recv data: Connection reset by peer

error: No found disk whose source path or target is vdb

Is this still a problem?

Comment 8 weizhang 2012-11-15 08:40:21 UTC
And after 15 minutes, all messages look like this:

error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot write data: Broken pipe

error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot recv data: Connection reset by peer

Comment 9 Michal Privoznik 2012-11-15 08:58:12 UTC
No, we don't have a problem.

Just for the record, I've managed to log in to the machine and found the source of those error messages:

2012-11-15 08:40:17.614+0000: 26234: error : virNetServerDispatchNewClient:246 : Too many active clients (20), dropping connection from 127.0.0.1;0

So I think this is okay.
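
To tell this condition apart from a real hang when reading libvirtd logs, a small filter can pick out such lines. This is only a sketch: the pattern is derived from the single log line quoted above, and the function name `parse_dropped_client` is made up for illustration. The limit of 20 appears to correspond to the `max_clients` setting in libvirtd.conf, but that is an assumption not confirmed in this report.

```python
import re

# Pattern based solely on the one log line quoted in this comment.
DROPPED_CLIENT_RE = re.compile(
    r"Too many active clients \((?P<limit>\d+)\), "
    r"dropping connection from (?P<addr>\S+)"
)

def parse_dropped_client(line):
    """Return (client_limit, client_address) or None if the line does not match."""
    match = DROPPED_CLIENT_RE.search(line)
    if match is None:
        return None
    return int(match.group("limit")), match.group("addr")

if __name__ == "__main__":
    sample = (
        "2012-11-15 08:40:17.614+0000: 26234: error : "
        "virNetServerDispatchNewClient:246 : Too many active clients (20), "
        "dropping connection from 127.0.0.1;0"
    )
    print(parse_dropped_client(sample))
```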

Comment 10 weizhang 2012-11-15 12:19:34 UTC
Thanks for Michal's help.

Verification passed on
qemu-kvm-0.12.1.2-2.295.el6.x86_64
kernel-2.6.32-279.el6.x86_64
libvirt-0.9.10-21.el6_3.6.x86_64


Steps:
On one console, run:
# while true; do for i in {1..10}; do virsh create /tmp/test$i.xml; done; for i in {1..10}; do virsh destroy test$i; done; done

On another console, run:
# while true; do virsh attach-disk tt /var/lib/libvirt/images/disk.img vdb; sleep 2; virsh detach-disk tt vdb; sleep 2; done

After running for about 1.5 hours, libvirtd is still running with no errors.


The problem can be reproduced on
libvirt-0.10.1-2.el6.x86_64

After about 1.5 hours, libvirtd crashes.
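
For repeated verification runs, the create/destroy loop from the steps above could be wrapped in a bounded harness that counts failing commands. This is only a rough sketch: the function names and the injectable `runner` parameter are hypothetical, and the paths and domain names are the ones used in the steps.

```python
import subprocess

def create_destroy_cycle(count=10):
    """Commands for one pass: create test1..testN, then destroy them."""
    cmds = [["virsh", "create", f"/tmp/test{i}.xml"] for i in range(1, count + 1)]
    cmds += [["virsh", "destroy", f"test{i}"] for i in range(1, count + 1)]
    return cmds

def run_cycles(cycles, runner=subprocess.call):
    """Run a bounded number of passes; return how many commands exited non-zero."""
    failures = 0
    for _ in range(cycles):
        for cmd in create_destroy_cycle():
            if runner(cmd) != 0:
                failures += 1
    return failures

if __name__ == "__main__":
    # Requires virsh and the /tmp/test*.xml domain definitions from the steps.
    print("failed commands:", run_cycles(cycles=100))
```

The `runner` argument defaults to `subprocess.call`, so a dry run can pass a stub instead of invoking virsh.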

Comment 12 errata-xmlrpc 2012-11-22 09:40:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-1484.html

