Bug 1060248

Summary: Multiple concurrent connections cause libvirtd SIGABRT
Product: [Community] Virtualization Tools
Component: libvirt
Version: unspecified
Hardware: x86_64
OS: Linux
Status: CLOSED DEFERRED
Severity: high
Priority: unspecified
Reporter: Florian Haas <florian>
Assignee: Libvirt Maintainers <libvirt-maint>
CC: andreas, crobinso, rbalakri, tuksgig
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-04-10 17:44:36 UTC

Attachments:
Log buffer dump after SIGABRT (flags: none)

Description Florian Haas 2014-01-31 15:14:12 UTC
Created attachment 857889
Log buffer dump after SIGABRT

Description of problem:

On a 48-core system, multiple concurrent connections cause libvirtd to crash with SIGABRT.


Version-Release number of selected component (if applicable):
libvirt-bin_1.1.1-0ubuntu8~cloud2_amd64.deb (from Ubuntu Cloud archive for OpenStack Havana)

How reproducible:
100%


Steps to Reproduce:
1. Define a sufficient number of domains (I use 50 to trigger the issue reliably, but have seen it happen with 10 or fewer). The domains do not have to be started.
2. Run multiple concurrent connections against libvirtd, like so:

for i in `seq 50`; do echo "test-1"; done | xargs -n1 -P 50 virsh domstate
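
To make the outcome of step 2 easier to read, the parallel calls can be wrapped in a small helper that counts how many of them succeeded and how many failed. This is only a sketch: run_concurrent is a hypothetical name, not part of virsh or libvirt, and it assumes a POSIX-like shell.

```shell
# Sketch: start COUNT copies of a command in parallel, then tally exit codes.
# run_concurrent is a hypothetical helper, not something libvirt provides.
# The function body is a subshell, so its variables do not leak to the caller.
run_concurrent() (
    count=$1
    shift
    pids=""
    ok=0
    failed=0
    i=0

    # Launch every copy in the background and remember its PID.
    while [ "$i" -lt "$count" ]; do
        "$@" >/dev/null 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done

    # Collect the exit status of each background job.
    for pid in $pids; do
        if wait "$pid"; then
            ok=$((ok + 1))
        else
            failed=$((failed + 1))
        fi
    done

    echo "ok=$ok failed=$failed"
)
```

Assuming a domain named test-1 exists, "run_concurrent 50 virsh domstate test-1" would print the tally, and "pidof libvirtd" afterwards shows whether the daemon is still running.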


Actual results:
libvirtd crashes with SIGABRT. Debug trace in attachment.


Expected results:
All virsh calls complete, unless some exceed the libvirtd max_clients limit, in which case they should be rejected with -ECONNREFUSED. This is the behavior observed after downgrading to 1.0.2-0ubuntu11.13.04.5~cloud1 (which ships in the Ubuntu Cloud Archive for Grizzly).


Additional info:
A complete script to define test domains and trigger the issue can be found at https://gist.github.com/fghaas/8705466

The issue cannot be reproduced on a system with significantly fewer cores (such as a workstation). Likewise, it cannot be reproduced when the calls are serialized, for example by executing 50 "virsh domstate" commands over the same virsh session.
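
The serialized case can be sketched as follows, as a control for the parallel run. The helper name emit_commands is hypothetical, and the sketch assumes that virsh, when started without a command, reads commands line by line from standard input.

```shell
# Sketch: print the same virsh subcommand COUNT times, one per line, so the
# lines can be fed to a single virsh session over one connection.
# emit_commands is a hypothetical helper, not part of virsh.
emit_commands() (
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$*"
        i=$((i + 1))
    done
)
```

With a domain named test-1 defined, "emit_commands 50 domstate test-1 | virsh" would issue all 50 calls one after another over a single connection.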

Comment 1 Cole Robinson 2016-04-10 17:44:36 UTC
Sorry this never received a response. I don't know if this is fixed, and I don't have a machine with that many cores to test. Given the age of the bug, I'm just going to close it as DEFERRED, but if you can still reproduce it with a more recent distro and libvirt, please reopen and provide a recent backtrace.