Bug 1303031 - libvirtd hang since fork() was called while another thread had security manager locked
Status: ASSIGNED
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.2
Hardware: x86_64 Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Ján Tomko
QA Contact: yafu
Depends On:
Blocks: 1401400
Reported: 2016-01-29 06:22 EST by yafu
Modified: 2018-01-01 20:26 EST (History)
Type: Bug


Description yafu 2016-01-29 06:22:44 EST
Description of problem:
Run test-virt-alignment-scan-guests.sh from libguestfs (https://github.com/libguestfs/libguestfs/blob/master/align/test-virt-alignment-scan-guests.sh ; the script starts many parallel libvirt instances). The libvirtd process hangs because fork() was called while another thread held the security manager lock.

Version-Release number of selected component (if applicable):
libvirt-1.2.17-13.el7_2.2.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.5.x86_64

How reproducible:
sometimes

Steps to Reproduce:
1. Run test-virt-alignment-scan-guests.sh from libguestfs in a loop:
# while true ; do ./test-virt-alignment-scan-guests.sh ; done

2. Watch the output of 'virsh list' at the same time:
# watch virsh list

3. After about 20 hours, the libvirtd daemon hung and the output of 'virsh list' no longer changed.

4. Use gdb to print the backtrace of the libvirtd process:
(gdb) bt
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fec0cafed02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x00007fec0cafec08 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x7febec1d92b0) at pthread_mutex_lock.c:64
#3  0x00007fec0f491ba5 in virMutexLock (m=m@entry=0x7febec1d92b0) at util/virthread.c:89
#4  0x00007fec0f47922e in virObjectLock (anyobj=anyobj@entry=0x7febec1d92a0) at util/virobject.c:323
#5  0x00007fec0f60db1c in virSecurityManagerSetSocketLabel (mgr=0x7febec1d92a0, vm=vm@entry=0x7febd800bb20) at security/security_manager.c:431
#6  0x00007fec0f60ade3 in virSecurityStackSetSocketLabel (mgr=<optimized out>, vm=0x7febd800bb20) at security/security_stack.c:456
#7  0x00007fec0f60db29 in virSecurityManagerSetSocketLabel (mgr=0x7febec1d91f0, vm=0x7febd800bb20) at security/security_manager.c:432
#8  0x00007febf66e896e in qemuProcessHook (data=0x7febfee740f0) at qemu/qemu_process.c:3227
#9  0x00007fec0f43f8fa in virExec (cmd=cmd@entry=0x7febd8007a70) at util/vircommand.c:692
#10 0x00007fec0f442267 in virCommandRunAsync (cmd=cmd@entry=0x7febd8007a70, pid=pid@entry=0x0) at util/vircommand.c:2429
#11 0x00007fec0f442616 in virCommandRun (cmd=cmd@entry=0x7febd8007a70, exitstatus=exitstatus@entry=0x0) at util/vircommand.c:2261
#12 0x00007febf66f053d in qemuProcessStart (conn=conn@entry=0x7febe0008110, driver=driver@entry=0x7febec11d7b0, vm=<optimized out>, asyncJob=asyncJob@entry=0, 
    migrateFrom=migrateFrom@entry=0x0, stdin_fd=stdin_fd@entry=-1, stdin_path=stdin_path@entry=0x0, snapshot=snapshot@entry=0x0, vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, 
    flags=flags@entry=5) at qemu/qemu_process.c:4859
#13 0x00007febf673d7df in qemuDomainCreateXML (conn=0x7febe0008110, xml=<optimized out>, flags=<optimized out>) at qemu/qemu_driver.c:1768
#14 0x00007fec0f520a11 in virDomainCreateXML (conn=0x7febe0008110, 
    xmlDesc=0x7febd8000cb0 "<?xml version=\"1.0\"?>\n<domain type=\"kvm\" xmlns:qemu=\"http://libvirt.org/schemas/domain/qemu/1.0\">\n  <name>guestfs-fxyf5s4hmc0iipn6</name>\n  <memory unit=\"MiB\">500</memory>\n  <currentMemory unit=\"MiB\">"..., flags=2) at libvirt-domain.c:180
#15 0x00007fec1018411a in remoteDispatchDomainCreateXML (server=0x7fec11e1bb60, msg=0x7fec11e362f0, ret=0x7febd800d0b0, args=0x7febd80087f0, rerr=0x7febfee74c30, client=0x7fec11e3e320)
    at remote_dispatch.h:3754
#16 remoteDispatchDomainCreateXMLHelper (server=0x7fec11e1bb60, client=0x7fec11e3e320, msg=0x7fec11e362f0, rerr=0x7febfee74c30, args=0x7febd80087f0, ret=0x7febd800d0b0)
    at remote_dispatch.h:3732
#17 0x00007fec0f59c3c2 in virNetServerProgramDispatchCall (msg=0x7fec11e362f0, client=0x7fec11e3e320, server=0x7fec11e1bb60, prog=0x7fec11e30000) at rpc/virnetserverprogram.c:437
#18 virNetServerProgramDispatch (prog=0x7fec11e30000, server=server@entry=0x7fec11e1bb60, client=0x7fec11e3e320, msg=0x7fec11e362f0) at rpc/virnetserverprogram.c:307
#19 0x00007fec0f59763d in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x7fec11e1bb60) at rpc/virnetserver.c:135
#20 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x7fec11e1bb60) at rpc/virnetserver.c:156
#21 0x00007fec0f4924f5 in virThreadPoolWorker (opaque=opaque@entry=0x7fec11e10de0) at util/virthreadpool.c:145
#22 0x00007fec0f491a18 in virThreadHelper (data=<optimized out>) at util/virthread.c:206
#23 0x00007fec0cafcdc5 in start_thread (arg=0x7febfee75700) at pthread_create.c:308
#24 0x00007fec0c82a1cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
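
For readers unfamiliar with this failure mode: the backtrace above appears consistent with a forked child inheriting a mutex that some other thread held at the moment of fork(). Only the forking thread exists in the child, so nothing there can ever release the lock. The following is a minimal sketch of that pattern and is not libvirt code. The names `mgr_lock`, `holder` and `child_inherits_locked_mutex` are hypothetical stand-ins, and the child uses pthread_mutex_trylock() rather than pthread_mutex_lock() only so that the demonstration reports the problem instead of hanging.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the security manager's object lock. */
static pthread_mutex_t mgr_lock = PTHREAD_MUTEX_INITIALIZER;
static int ready_pipe[2];    /* holder -> forking thread: "lock is held" */
static int release_pipe[2];  /* forking thread -> holder: "you may unlock" */

/* Second thread: takes the lock and keeps it across the fork(). */
static void *holder(void *arg)
{
    char c = 'x';

    (void)arg;
    pthread_mutex_lock(&mgr_lock);
    if (write(ready_pipe[1], &c, 1) == 1) {
        /* Hold the lock until the forking thread says it has forked. */
        while (read(release_pipe[0], &c, 1) < 0 && errno == EINTR)
            ;
    }
    pthread_mutex_unlock(&mgr_lock);
    return NULL;
}

/*
 * Returns 1 if the child found the mutex already locked,
 * 0 if the child could take it, and -1 on a setup error.
 */
int child_inherits_locked_mutex(void)
{
    pthread_t tid;
    pid_t pid;
    int status;
    char c;

    if (pipe(ready_pipe) != 0 || pipe(release_pipe) != 0)
        return -1;
    if (pthread_create(&tid, NULL, holder, NULL) != 0)
        return -1;
    /* Wait until the holder thread really owns the lock. */
    if (read(ready_pipe[0], &c, 1) != 1)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /*
         * Child: the holder thread was not copied, but the locked
         * state of mgr_lock was. A blocking pthread_mutex_lock()
         * here would wait forever; trylock reports EBUSY instead.
         */
        _exit(pthread_mutex_trylock(&mgr_lock) == EBUSY ? 1 : 0);
    }

    /* Parent: let the holder unlock, then collect the child's verdict. */
    if (write(release_pipe[1], &c, 1) != 1)
        return -1;
    pthread_join(tid, NULL);
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;

    close(ready_pipe[0]);
    close(ready_pipe[1]);
    close(release_pipe[0]);
    close(release_pipe[1]);
    return WEXITSTATUS(status);
}
```

Calling child_inherits_locked_mutex() from a small main() and building with -pthread should show the child observing the lock as held. Replacing the trylock with a plain pthread_mutex_lock() would be expected to leave the child blocked in a lock wait, similar to frames #0 to #3 above.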

Actual results:


Expected results:
libvirtd should not hang when many threads try to lock the security manager at the same time.

Additional info:
Because the libvirtd log_level was 3 when the bug happened, I cannot provide the libvirtd debug log now. I will try to reproduce the bug and will provide the debug log if I succeed.
Comment 1 Richard W.M. Jones 2017-09-26 06:17:39 EDT
We haven't seen any hangs recently, and this was tested against
a very old version of libvirt.

Is it possible to retest this using RHEL 7.4 / 7.5 libvirt to see
if it still happens (even rarely after 20+ hours)?

Otherwise I suggest closing this and if we find problems with locking
we can reopen or open a new bug.
Comment 2 yafu 2017-10-11 02:12:21 EDT
(In reply to Richard W.M. Jones from comment #1)
> We haven't seen any hangs recently, and this was tested against
> a very old version of libvirt.
> 
> Is it possible to retest this using RHEL 7.4 / 7.5 libvirt to see
> if it still happens (even rarely after 20+ hours)?
> 
> Otherwise I suggest closing this and if we find problems with locking
> we can reopen or open a new bug.

I retested the bug with libvirt-3.8.0-1.el7.x86_64. libvirtd hangs within about 5 minutes. The libvirtd backtrace is the same as in comment 0.
