Bug 1034807
| Summary: | deadlock in nwfilter code under heavy load | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Laine Stump <laine> |
| Component: | libvirt | Assignee: | Laine Stump <laine> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.0 | CC: | acathrow, bdpayne, berrange, dallan, dyuan, gsun, honzhang, jmiao, laine, mjg59, mzhan, veillard |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | libvirt-1.1.1-22.el7 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 929412 | Environment: | |
| Last Closed: | 2014-06-13 12:41:02 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 929412 | | |
| Bug Blocks: | 1034806 | | |
| Attachments: | | | |
Description
Laine Stump
2013-11-26 14:34:02 UTC
Created attachment 834797
backtrace from deadlocked libvirtd
Hi Laine,
Could you show the steps to reproduce this deadlock?

This bug was originally filed against upstream (Bug 929412) by Matthew Garrett. He may be able to offer more details about his test setup, but it looks like you should be able to trigger it by rapidly starting and destroying domains that have <filterref> elements while simultaneously defining and undefining nwfilter rules. The deadlock occurs if a domain operation (start/destroy, for example) happens to be holding the nwfilter lock at the time an nwfilter operation (nwfilter-define/nwfilter-undefine) tries to scan all domains to update their filters after adding/removing a filter from the list.

(In reply to Laine Stump from comment #3)
> This bug was originally filed against upstream (Bug 929412) by Matthew
> Garrett. He may be able to offer more details about his test setup, but it
> looks like you should be able to trigger it by rapidly doing a lot of
> start/destroy of domains which have <filterref> elements, while
> simultaneously defining/undefining nwfilter rules - the deadlock occurs if a
> domain operation (start/destroy, for example) happens to be holding the
> nwfilter lock at the time a nwfilter operation
> (nwfilter-define/nwfilter-undefine) tries to scan all domains to update
> their filters after adding/removing a filter from the list.

Thanks Laine, I can reproduce it with your suggestion.

Fixed upstream in these commits:

commit 6e5c79a1b5a8b3a23e7df7ffe58fb272aa17fbfb
Author: Daniel P. Berrange <berrange>
Date: Wed Jan 22 17:28:29 2014 +0000

    Push nwfilter update locking up to top level

commit c065984b58000a44c90588198d222a314ac532fd
Author: Daniel P. Berrange <berrange>
Date: Wed Jan 22 15:26:21 2014 +0000

    Add a read/write lock implementation

The backported patches have been tested on RHEL7 - the deadlock occurred easily without the patches, and was eliminated with them.
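For readers unfamiliar with this class of bug, the deadlock described above and the general shape of the upstream fix can be illustrated with a small Python model. This is only a sketch under stated assumptions: the report does not spell out which libvirt locks are involved, so the two-lock model (`domain_lock`, `filter_lock`), the `RWLock` class, and both function names are hypothetical and are not libvirt code. The first function shows two threads taking the same two locks in opposite order; the second shows how a read/write lock taken before any other lock keeps the two orderings from ever running at the same time.

```python
import threading


class RWLock:
    """Minimal read/write lock (hypothetical helper, not libvirt's implementation)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


def inverted_order_demo(timeout=0.2):
    """Two threads take two locks in opposite order; report who got the second lock."""
    domain_lock, filter_lock = threading.Lock(), threading.Lock()
    barrier = threading.Barrier(2)
    result = {}

    def worker(name, first, second):
        with first:
            barrier.wait()  # both threads now hold their first lock
            got = second.acquire(timeout=timeout)  # timeout stands in for a real hang
            result[name] = got
            barrier.wait()  # keep holding until both attempts have finished
            if got:
                second.release()

    threads = [
        threading.Thread(target=worker, args=("domain", domain_lock, filter_lock)),
        threading.Thread(target=worker, args=("filter", filter_lock, domain_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["domain"], result["filter"]


def top_level_lock_demo(rounds=50):
    """Same lock orders, but guarded by a read/write lock taken first."""
    update_lock = RWLock()
    domain_lock, filter_lock = threading.Lock(), threading.Lock()
    done = []

    def domain_op():  # models start/destroy: shared (read) side
        for _ in range(rounds):
            update_lock.acquire_read()
            try:
                with domain_lock, filter_lock:
                    pass
            finally:
                update_lock.release_read()
        done.append("domain")

    def filter_op():  # models nwfilter-define/undefine: exclusive (write) side
        for _ in range(rounds):
            update_lock.acquire_write()
            try:
                with filter_lock, domain_lock:
                    pass
            finally:
                update_lock.release_write()
        done.append("filter")

    threads = [threading.Thread(target=f, daemon=True) for f in (domain_op, filter_op)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=10)
    return sorted(done)
```

In this model `inverted_order_demo()` returns `(False, False)` because neither thread can obtain its second lock, while `top_level_lock_demo()` lets both loops run to completion. The writer excludes all readers, so the opposite lock orders never overlap.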
Posted to rhvirt-patches:
http://post-office.corp.redhat.com/archives/rhvirt-patches/2014-February/msg00079.html

Verified the bug as follows. The result is as expected, so the status is moved to VERIFIED.

Versions
libvirt-1.1.1-22.el7.x86_64

Steps
1. Prepare an nwfilter XML file and define it
# cat disallow-arp.xml
<filter name='disallow-arp' chain='arp'>
  <rule action='drop' direction='inout' priority='500'/>
</filter>
# virsh nwfilter-define disallow-arp.xml
Network filter disallow-arp defined from disallow-arp.xml

2. Modify the guest XML so that its interface references the filter
# virsh dumpxml r6.4
<domain type='kvm'>
......
  <interface type='network'>
    <mac address='52:54:00:e7:33:6c'/>
    <source network='default1'/>
    <model type='rtl8139'/>
    <filterref filter='disallow-arp'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
  </interface>
......
</domain>

3. Start and destroy the guest in a loop
# while true ; do virsh start r6.4 ; sleep 1 ; virsh destroy r6.4 ; done

4. In another terminal, undefine and define the nwfilter in a loop
# while true; do virsh nwfilter-undefine disallow-arp; sleep 1; virsh nwfilter-define disallow-arp.xml; done

Result
No deadlock occurs; libvirt works fine.

This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.
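The verification loops in steps 3 and 4 rely on someone noticing that a command has stopped returning. A small watchdog can make that check explicit. The following is a minimal sketch, not part of the original verification: the helper names `finishes_in_time` and `first_hang` and the 30-second default are assumptions, and it assumes that a deadlocked libvirtd shows up as a `virsh` call that never exits.

```python
import subprocess


def finishes_in_time(cmd, timeout):
    """Run cmd; return True if it exits within `timeout` seconds, False otherwise."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        return True
    except subprocess.TimeoutExpired:
        return False


def first_hang(commands, rounds, timeout=30):
    """Cycle through `commands` for `rounds` iterations.

    Returns the first command that failed to finish in time, or None if
    every invocation returned.
    """
    for _ in range(rounds):
        for cmd in commands:
            if not finishes_in_time(cmd, timeout):
                return cmd
    return None


# Example use against the guest and filter from the steps above
# (run one call per terminal, as in steps 3 and 4):
#   first_hang([["virsh", "start", "r6.4"], ["virsh", "destroy", "r6.4"]], rounds=100)
#   first_hang([["virsh", "nwfilter-undefine", "disallow-arp"],
#               ["virsh", "nwfilter-define", "disallow-arp.xml"]], rounds=100)
```

A return value of `None` from both loops corresponds to the "No deadlock occurs" result. Exit codes are deliberately ignored here, since `virsh start` and `virsh destroy` can fail harmlessly when the filter is momentarily undefined or the guest is not running.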