Bug 875788
Summary: | Deadlock on libvirt when playing with hotplug and add/remove vm | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Chris Pelland <cpelland> |
Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 6.3 | CC: | acathrow, ajia, berrange, cpelland, dallan, dyasny, dyuan, gcheresh, jpallich, mavital, mprivozn, mzhan, ohochman, pm-eus, rwu, weizhan, ydu, ykaul |
Target Milestone: | rc | Keywords: | ZStream |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | libvirt-0.9.10-21.el6_3.6 | Doc Type: | Bug Fix |
Doc Text: |
Cause:
When libvirt tears down a qemu process, it cleans up some internal structures, frees some locks, and so on. Since users may destroy qemu processes in parallel, libvirt holds what we call the 'qemu driver lock'. It is the lock that protects the most important internal structure, where the list of domains is kept along with their state.
Consequence:
One function tried to lock the qemu driver even though it was already locked. This led to an unresolvable deadlock.
Fix:
The code was rewritten so that the lock in question is taken only after the qemu driver lock has been released.
Result:
libvirt no longer deadlocks.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2012-11-22 09:40:32 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 856950 | ||
Bug Blocks: |
Description
Chris Pelland
2012-11-12 15:17:55 UTC
Moving to POST: http://post-office.corp.redhat.com/archives/rhvirt-patches/2012-November/msg00108.html

*** Bug 876102 has been marked as a duplicate of this bug. ***

I tested with the steps in https://bugzilla.redhat.com/show_bug.cgi?id=856950#c13 on these versions:

qemu-kvm-rhev-0.12.1.2-2.330.el6.x86_64
kernel-2.6.32-335.el6.x86_64
libvirt-0.9.10-21.el6_3.6.x86_64

It may report errors, but after that it can sometimes succeed. The following messages are from the attach-detach loop:

Disk attached successfully
error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot write data: Broken pipe
error: Failed to attach disk
error: operation failed: target vdb already exists
error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot write data: Broken pipe
error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot recv data: Connection reset by peer
Disk detached successfully
error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot recv data: Connection reset by peer
error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot write data: Broken pipe
error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot recv data: Connection reset by peer
error: No found disk whose source path or target is vdb

Is this still a problem? And after 15 minutes, all messages look like:

error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot write data: Broken pipe
error: Failed to reconnect to the hypervisor
error: no valid connection
error: Cannot recv data: Connection reset by peer

No, we don't have a problem. Just for the record, I managed to log in to the machine and found the source of those error messages:

2012-11-15 08:40:17.614+0000: 26234: error : virNetServerDispatchNewClient:246 : Too many active clients (20), dropping connection from 127.0.0.1;0

So I think this is okay. Thanks for Michal's help.
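For the record, the "Too many active clients (20)" errors quoted above come from libvirtd's client connection limit, which defaults to 20. As an illustrative note (the value 50 below is just an example, not a recommendation from this bug), the limit can be raised via the max_clients setting in /etc/libvirt/libvirtd.conf:

```
# /etc/libvirt/libvirtd.conf
# Maximum number of concurrent client connections (default: 20).
# The attach/detach loop above opened connections faster than they
# were closed, so new connections were dropped once the limit was hit.
max_clients = 50
```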
Verification passed on:

qemu-kvm-0.12.1.2-2.295.el6.x86_64
kernel-2.6.32-279.el6.x86_64
libvirt-0.9.10-21.el6_3.6.x86_64

Steps: on one console run

# while true; do for i in {1..10}; do virsh create /tmp/test$i.xml; done; for i in {1..10}; do virsh destroy test$i; done; done

and on another console run

# while true; do virsh attach-disk tt /var/lib/libvirt/images/disk.img vdb; sleep 2; virsh detach-disk tt vdb; sleep 2; done

After running for about 1.5 hours, libvirtd was still running with no errors.

The bug can be reproduced on libvirt-0.10.1-2.el6.x86_64: after about 1.5 hours, libvirtd crashes.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-1484.html