Bug 875741

Summary: libvirtd segfaults in libnl3 (nl_object_put)
Product: [Community] Virtualization Tools
Reporter: Richard W.M. Jones <rjones>
Component: libvirt
Assignee: Laine Stump <laine>
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Version: unspecified
CC: berrange, dallan, dyasny, laine, tgraf
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2012-12-13 13:56:18 UTC
Type: Bug
Attachments:
core.1352922701.16713.libvirtd.xz (flags: none)

Description Richard W.M. Jones 2012-11-12 14:13:18 UTC
Description of problem:

could not destroy libvirt domain: Cannot recv data: Connection reset by peer [code=38 domain=7]

Happened randomly during a run of the libguestfs test suite
(t/guestfs_500_mount_local.opt).

Version-Release number of selected component (if applicable):

libvirt-0.10.2-3.fc18.x86_64
libguestfs 1.19.58

How reproducible:

Rare.

Steps to Reproduce:
1. Unknown - but happened when running the libguestfs test suite.

Comment 1 Richard W.M. Jones 2012-11-12 14:15:32 UTC
There is no other error anywhere that I can see.  Nothing
in libvirtd.log nor in the qemu stderr output.

Comment 2 Richard W.M. Jones 2012-11-12 14:17:49 UTC
Oh that is bizarre.

The same error happened at the same time in another
libguestfs test run:

*stdin*:2: libguestfs: error: could not create appliance through libvirt: Cannot
recv data: Connection reset by peer [code=38 domain=7]

In a completely different part of the test suite, but they
are probably sharing a single user libvirtd instance.

Comment 3 Richard W.M. Jones 2012-11-13 14:30:39 UTC
I wrote a test in the libguestfs test suite which is
explicitly designed to trigger this:

/home/rjones/d/libguestfs/run --test ./test-parallel
libguestfs: error: could not get libvirt capabilities: End of file while reading data: Input/output error [code=38 domain=7]
libguestfs: error: could not create appliance through libvirt: Cannot recv data: Connection reset by peer [code=38 domain=7]
libvir: XML-RPC error : End of file while reading data: Input/output error
libguestfs: error: could not get libvirt capabilities: End of file while reading data: Input/output error [code=38 domain=7]
libguestfs: error: could not connect to libvirt (URI = NULL): End of file while reading data: Input/output error [code=38 domain=7]
libguestfs: error: could not create appliance through libvirt: Cannot recv data: Connection reset by peer [code=38 domain=7]
0: thread returned an error
1: thread returned an error
2: thread returned an error
3: thread returned an error
4: thread returned an error
/home/rjones/d/libguestfs/run: command failed with exit code 1
FAIL: test-parallel

All the test does is to repeatedly stop and start transient
domains in a loop in 5 threads.

Comment 4 Michal Privoznik 2012-11-14 15:29:41 UTC
Rich, were you able to get any core dump and/or debug logs? If so, can you please attach them? Thanks.

Comment 5 Richard W.M. Jones 2012-11-14 20:20:00 UTC
Yup, I got two:

-rw-------. 1 rjones rjones 986365952 Nov 14 19:50 /tmp/core.1352922603.8925.libvirtd
-rw-------. 1 rjones rjones 158580736 Nov 14 19:51 /tmp/core.1352922701.16713.libvirtd

I'll upload stack traces when I'm back at work tomorrow.

Comment 6 Richard W.M. Jones 2012-11-14 20:27:16 UTC
From the first core:

Core was generated by `/usr/sbin/libvirtd --timeout=30'.
Program terminated with signal 6, Aborted.
#0  0x0000003577835ba5 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:63
63	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
Missing separate debuginfos, use: debuginfo-install cryptopp-5.6.1-8.fc18.x86_64 libpciaccess-0.13.1-2.fc18.x86_64 netcf-libs-0.2.2-1.fc19.x86_64
(gdb) t a a bt

Thread 11 (Thread 0x7f3062a2f700 (LWP 8927)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x1139c10, 
    m=m@entry=0x1139be8) at util/threads-pthread.c:117
#2  0x0000003e0266fb9b in virThreadPoolWorker (opaque=opaque@entry=0x112ac30)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f3062a2f700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 10 (Thread 0x7f306222e700 (LWP 8928)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x1139c10, 
    m=m@entry=0x1139be8) at util/threads-pthread.c:117
#2  0x0000003e0266fb9b in virThreadPoolWorker (opaque=opaque@entry=0x112aa10)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f306222e700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 9 (Thread 0x7f3061a2d700 (LWP 8929)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x1139c10, 
    m=m@entry=0x1139be8) at util/threads-pthread.c:117
#2  0x0000003e0266fb9b in virThreadPoolWorker (opaque=opaque@entry=0x112ac30)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f3061a2d700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 8 (Thread 0x7f306122c700 (LWP 8930)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x1139c10, 
    m=m@entry=0x1139be8) at util/threads-pthread.c:117
#2  0x0000003e0266fb9b in virThreadPoolWorker (opaque=opaque@entry=0x112aa10)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f306122c700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 7 (Thread 0x7f3060a2b700 (LWP 8931)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x1139ca8, 
    m=m@entry=0x1139be8) at util/threads-pthread.c:117
#2  0x0000003e0266fbbb in virThreadPoolWorker (opaque=opaque@entry=0x112ac30)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f3060a2b700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 6 (Thread 0x7f305ea27700 (LWP 8935)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x1139ca8, 
    m=m@entry=0x1139be8) at util/threads-pthread.c:117
#2  0x0000003e0266fbbb in virThreadPoolWorker (opaque=opaque@entry=0x112ac30)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f305ea27700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 5 (Thread 0x7f305f228700 (LWP 8934)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x1139ca8, 
    m=m@entry=0x1139be8) at util/threads-pthread.c:117
#2  0x0000003e0266fbbb in virThreadPoolWorker (opaque=opaque@entry=0x112aa10)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f305f228700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 4 (Thread 0x7f3063230700 (LWP 8926)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x1139c10, 
    m=m@entry=0x1139be8) at util/threads-pthread.c:117
#2  0x0000003e0266fb9b in virThreadPoolWorker (opaque=opaque@entry=0x112aa10)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f3063230700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 3 (Thread 0x7f305fa29700 (LWP 8933)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x1139ca8, 
    m=m@entry=0x1139be8) at util/threads-pthread.c:117
#2  0x0000003e0266fbbb in virThreadPoolWorker (opaque=opaque@entry=0x112ac30)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f305fa29700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 2 (Thread 0x7f306022a700 (LWP 8932)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x1139ca8, 
    m=m@entry=0x1139be8) at util/threads-pthread.c:117
#2  0x0000003e0266fbbb in virThreadPoolWorker (opaque=opaque@entry=0x112aa10)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f306022a700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 1 (Thread 0x7f3063652840 (LWP 8925)):
#0  0x0000003577835ba5 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:63
#1  0x0000003577837358 in __GI_abort () at abort.c:90
#2  0x000000357782e972 in __assert_fail_base (
    fmt=0x3577979248 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=assertion@entry=0x3595413580 "0", 
    file=file@entry=0x3595413110 "object.c", line=line@entry=197, 
    function=function@entry=0x3595413228 <__PRETTY_FUNCTION__.12164> "nl_object_put") at assert.c:92
#3  0x000000357782ea22 in __GI___assert_fail (
    assertion=assertion@entry=0x3595413580 "0", 
    file=file@entry=0x3595413110 "object.c", line=line@entry=197, 
    function=function@entry=0x3595413228 <__PRETTY_FUNCTION__.12164> "nl_object_put") at assert.c:101
#4  0x000000359540fb0a in nl_object_put (obj=0x7f2f933c3530) at object.c:197
#5  0x000000359540f855 in nl_object_free (obj=0x7f2faaf65e70) at object.c:158
#6  0x000000359540a432 in nl_cache_remove (obj=0x7f2faaf65e70) at cache.c:484
#7  0x000000359540a5a5 in nl_cache_clear (cache=cache@entry=0x7f2faaf4b4c0)
    at cache.c:347
#8  0x000000359540a5ce in nl_cache_free (cache=0x7f2faaf4b4c0) at cache.c:364
#9  0x000000357a807f56 in netlink_close () from /lib64/libnetcf.so.1
#10 0x000000357a808fe0 in drv_close () from /lib64/libnetcf.so.1
#11 0x000000357a804497 in ncf_close () from /lib64/libnetcf.so.1
#12 0x00007f305c251fef in interfaceCloseInterface (conn=0x7f2faaf32ca0)
    at interface/interface_backend_netcf.c:170
#13 0x0000003e026e3f84 in virConnectDispose (obj=0x7f2faaf32ca0)
    at datatypes.c:134
#14 0x0000003e026774d3 in virObjectUnref (anyobj=anyobj@entry=0x7f2faaf32ca0)
    at util/virobject.c:139
#15 0x0000003e026ec798 in virConnectClose (conn=0x7f2faaf32ca0)
    at libvirt.c:1466
#16 0x0000000000429957 in remoteClientFreeFunc (data=<optimized out>)
    at remote.c:654
#17 0x0000003e0274b4ae in virNetServerClientDispose (obj=0x1654b10)
    at rpc/virnetserverclient.c:590
#18 0x0000003e026774d3 in virObjectUnref (anyobj=<optimized out>)
    at util/virobject.c:139
#19 0x0000003e02752a8a in virNetSocketEventFree (
    opaque=opaque@entry=0x1164e80) at rpc/virnetsocket.c:1520
#20 0x0000003e0265e838 in virEventPollCleanupHandles ()
    at util/event_poll.c:567
#21 0x0000003e0265f373 in virEventPollRunOnce () at util/event_poll.c:603
#22 0x0000003e0265e3c7 in virEventRunDefaultImpl () at util/event.c:247
#23 0x0000003e0274b10d in virNetServerRun (srv=srv@entry=0x1139a60)
    at rpc/virnetserver.c:748
#24 0x000000000040c2c3 in main (argc=<optimized out>, argv=<optimized out>)
    at libvirtd.c:1339

Comment 7 Richard W.M. Jones 2012-11-14 20:28:04 UTC
From the second core:

Core was generated by `/usr/sbin/libvirtd --timeout=30'.
Program terminated with signal 6, Aborted.
#0  0x0000003577835ba5 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:63
63	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
Missing separate debuginfos, use: debuginfo-install cryptopp-5.6.1-8.fc18.x86_64 libpciaccess-0.13.1-2.fc18.x86_64 netcf-libs-0.2.2-1.fc19.x86_64
(gdb) t a a bt

Thread 11 (Thread 0x7f2e4b5a4700 (LWP 16722)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x91bca8, m=m@entry=0x91bbe8)
    at util/threads-pthread.c:117
#2  0x0000003e0266fbbb in virThreadPoolWorker (opaque=opaque@entry=0x90ce70)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f2e4b5a4700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 10 (Thread 0x7f2e4f5ac700 (LWP 16714)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x91bc10, m=m@entry=0x91bbe8)
    at util/threads-pthread.c:117
#2  0x0000003e0266fb9b in virThreadPoolWorker (opaque=opaque@entry=0x90ca10)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f2e4f5ac700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 9 (Thread 0x7f2e4ada3700 (LWP 16723)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x91bca8, m=m@entry=0x91bbe8)
    at util/threads-pthread.c:117
#2  0x0000003e0266fbbb in virThreadPoolWorker (opaque=opaque@entry=0x90ca10)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f2e4ada3700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 8 (Thread 0x7f2e4e5aa700 (LWP 16716)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x91bc10, m=m@entry=0x91bbe8)
    at util/threads-pthread.c:117
#2  0x0000003e0266fb9b in virThreadPoolWorker (opaque=opaque@entry=0x90cd50)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f2e4e5aa700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 7 (Thread 0x7f2e4bda5700 (LWP 16721)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x91bca8, m=m@entry=0x91bbe8)
    at util/threads-pthread.c:117
#2  0x0000003e0266fbbb in virThreadPoolWorker (opaque=opaque@entry=0x90ca10)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f2e4bda5700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 6 (Thread 0x7f2e4cda7700 (LWP 16719)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x91bca8, m=m@entry=0x91bbe8)
    at util/threads-pthread.c:117
#2  0x0000003e0266fbbb in virThreadPoolWorker (opaque=opaque@entry=0x90cc30)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f2e4cda7700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 5 (Thread 0x7f2e4c5a6700 (LWP 16720)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x91bca8, m=m@entry=0x91bbe8)
    at util/threads-pthread.c:117
#2  0x0000003e0266fbbb in virThreadPoolWorker (opaque=opaque@entry=0x90ce70)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f2e4c5a6700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 4 (Thread 0x7f2e4dda9700 (LWP 16717)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x91bc10, m=m@entry=0x91bbe8)
    at util/threads-pthread.c:117
#2  0x0000003e0266fb9b in virThreadPoolWorker (opaque=opaque@entry=0x90ce70)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f2e4dda9700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 3 (Thread 0x7f2e4edab700 (LWP 16715)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x91bc10, m=m@entry=0x91bbe8)
    at util/threads-pthread.c:117
#2  0x0000003e0266fb9b in virThreadPoolWorker (opaque=opaque@entry=0x90cc30)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f2e4edab700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 2 (Thread 0x7f2e4d5a8700 (LWP 16718)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003e0266f756 in virCondWait (c=c@entry=0x91bc10, m=m@entry=0x91bbe8)
    at util/threads-pthread.c:117
#2  0x0000003e0266fb9b in virThreadPoolWorker (opaque=opaque@entry=0x90ca10)
    at util/threadpool.c:103
#3  0x0000003e0266f589 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003578407d15 in start_thread (arg=0x7f2e4d5a8700)
    at pthread_create.c:308
#5  0x00000035778f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 1 (Thread 0x7f2e4f9ce840 (LWP 16713)):
#0  0x0000003577835ba5 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:63
#1  0x0000003577837358 in __GI_abort () at abort.c:90
#2  0x000000357782e972 in __assert_fail_base (
    fmt=0x3577979248 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=assertion@entry=0x3595413580 "0", 
    file=file@entry=0x3595413110 "object.c", line=line@entry=197, 
    function=function@entry=0x3595413228 <__PRETTY_FUNCTION__.12164> "nl_object_put") at assert.c:92
#3  0x000000357782ea22 in __GI___assert_fail (
    assertion=assertion@entry=0x3595413580 "0", 
    file=file@entry=0x3595413110 "object.c", line=line@entry=197, 
    function=function@entry=0x3595413228 <__PRETTY_FUNCTION__.12164> "nl_object_put") at assert.c:101
#4  0x000000359540fb0a in nl_object_put (obj=0x7f2e3480e6b0) at object.c:197
#5  0x000000359540a432 in nl_cache_remove (obj=obj@entry=0x7f2e3480e6b0)
    at cache.c:484
#6  0x000000359540f847 in nl_object_free (obj=0x7f2e3480e6b0) at object.c:155
#7  0x000000359540f855 in nl_object_free (obj=0x7f2e3480faa0) at object.c:158
#8  0x000000359540a432 in nl_cache_remove (obj=0x7f2e3480faa0) at cache.c:484
#9  0x000000359540a5a5 in nl_cache_clear (cache=cache@entry=0x7f2e3480c940)
    at cache.c:347
#10 0x000000359540a5ce in nl_cache_free (cache=0x7f2e3480c940) at cache.c:364
#11 0x000000357a807f56 in netlink_close () from /lib64/libnetcf.so.1
#12 0x000000357a808fe0 in drv_close () from /lib64/libnetcf.so.1
#13 0x000000357a804497 in ncf_close () from /lib64/libnetcf.so.1
#14 0x00007f2e485cdfef in interfaceCloseInterface (conn=0x7f2e346f39d0)
    at interface/interface_backend_netcf.c:170
#15 0x0000003e026e3f84 in virConnectDispose (obj=0x7f2e346f39d0)
    at datatypes.c:134
#16 0x0000003e026774d3 in virObjectUnref (anyobj=anyobj@entry=0x7f2e346f39d0)
    at util/virobject.c:139
#17 0x0000003e026ec798 in virConnectClose (conn=0x7f2e346f39d0)
    at libvirt.c:1466
#18 0x0000000000429957 in remoteClientFreeFunc (data=<optimized out>)
    at remote.c:654
#19 0x0000003e0274b4ae in virNetServerClientDispose (obj=0x944960)
    at rpc/virnetserverclient.c:590
#20 0x0000003e026774d3 in virObjectUnref (anyobj=<optimized out>)
    at util/virobject.c:139
#21 0x0000003e02752a8a in virNetSocketEventFree (opaque=opaque@entry=0x944710)
    at rpc/virnetsocket.c:1520
#22 0x0000003e0265e838 in virEventPollCleanupHandles ()
    at util/event_poll.c:567
#23 0x0000003e0265f373 in virEventPollRunOnce () at util/event_poll.c:603
#24 0x0000003e0265e3c7 in virEventRunDefaultImpl () at util/event.c:247
#25 0x0000003e0274b10d in virNetServerRun (srv=srv@entry=0x91ba60)
    at rpc/virnetserver.c:748
#26 0x000000000040c2c3 in main (argc=<optimized out>, argv=<optimized out>)
    at libvirtd.c:1339

Comment 8 Richard W.M. Jones 2012-11-14 20:31:56 UTC
The netcf package installed is netcf-libs-0.2.2-1.fc19.x86_64.
I mention it since that seems to be where the crash is occurring.

I tried installing the corresponding netcf-debuginfo, but
for some reason I don't understand I didn't get any more
info out of gdb.

Comment 9 Richard W.M. Jones 2012-11-14 20:37:32 UTC
192		obj->ce_refcnt--;
193		NL_DBG(4, "Returned object reference %p, %d remaining\n",
194		       obj, obj->ce_refcnt);
195	
196		if (obj->ce_refcnt < 0)
197			BUG();             <---- fails here
198	
199		if (obj->ce_refcnt <= 0)
200			nl_object_free(obj);

Obviously a ref-counting bug.
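
To make the failure mode concrete, here is a minimal sketch of how a counter like ce_refcnt can end up at -1. This is not libnl's code: struct demo_obj, demo_obj_get, demo_obj_put and the PUT_* results are hypothetical names invented for the illustration, and the sketch only marks the object as released instead of freeing it, so the extra put stays well-defined. It shows one way to hit the check (one put too many); it does not claim to be the actual cause of this crash.

```c
/* Hypothetical stand-in for a reference-counted object; NOT libnl's struct nl_object. */
struct demo_obj {
    int refcnt;    /* number of outstanding references */
    int released;  /* set at the point where real code would free the object */
};

enum put_result { PUT_HELD, PUT_RELEASED, PUT_UNDERFLOW };

/* Initialise an object holding one reference for the caller. */
static void demo_obj_init(struct demo_obj *obj)
{
    obj->refcnt = 1;
    obj->released = 0;
}

/* Take an additional reference. */
static void demo_obj_get(struct demo_obj *obj)
{
    obj->refcnt++;
}

/* Drop one reference, following the same order of checks as the excerpt above. */
static enum put_result demo_obj_put(struct demo_obj *obj)
{
    obj->refcnt--;
    if (obj->refcnt < 0)
        return PUT_UNDERFLOW;   /* the quoted code calls BUG() at this point */
    if (obj->refcnt == 0) {
        obj->released = 1;      /* real code would free the object here */
        return PUT_RELEASED;
    }
    return PUT_HELD;
}

/* Two references are taken in total, but three are dropped. */
static enum put_result demo_unbalanced_puts(void)
{
    struct demo_obj obj;

    demo_obj_init(&obj);           /* refcnt == 1 */
    demo_obj_get(&obj);            /* refcnt == 2 */
    (void)demo_obj_put(&obj);      /* refcnt == 1 */
    (void)demo_obj_put(&obj);      /* refcnt == 0, object released */
    return demo_obj_put(&obj);     /* refcnt == -1, underflow detected */
}
```

In real code the third put would also be touching memory that has already been freed, so the -1 it observes is only as reliable as the freed memory's contents.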

Comment 10 Laine Stump 2012-11-14 20:40:49 UTC
Adding tgraf since the assert is in libnl.

Comment 11 Richard W.M. Jones 2012-11-14 21:58:59 UTC
This is the libvirt connection object:

(gdb) frame 17
#17 0x0000003e026ec798 in virConnectClose (conn=0x7f2e346f39d0)
    at libvirt.c:1466
1466	    if (!virObjectUnref(conn))
(gdb) print *conn
$2 = {
  object = {
    magic = 3405643784, 
    refs = 0, 
    klass = 0x7f2e400070b0
  }, 
  flags = 0, 
  uri = 0x7f2e346f07c0, 
  driver = 0x7f2e478e1960 <qemuDriver>, 
  networkDriver = 0x7f2e4a5a1720 <networkDriver>, 
  interfaceDriver = 0x7f2e487d83a0 <interfaceDriver>, 
  storageDriver = 0x7f2e4a386900 <storageDriver>, 
  deviceMonitor = 0x7f2e49079560 <udevDeviceMonitor>, 
  secretDriver = 0x7f2e48e653c0 <secretDriver>, 
  nwfilterDriver = 0x7f2e48c54840 <nwfilterDriver>, 
  privateData = 0x7f2e400af5e0, 
  networkPrivateData = 0x0, 
  interfacePrivateData = 0x7f2e346f35a0, 
  storagePrivateData = 0x7f2e400012c0, 
  devMonPrivateData = 0x7f2e40002250, 
  secretPrivateData = 0x7f2e40022bb0, 
  nwfilterPrivateData = 0x7f2e400b83a0, 
  lock = {
    lock = {
      __data = {
        __lock = 0, 
        __count = 0, 
        __owner = 0, 
        __nusers = 0, 
        __kind = 0, 
        __spins = 0, 
        __list = {
          __prev = 0x0, 
          __next = 0x0
        }
      }, 
      __size = '\000' <repeats 39 times>, 
      __align = 0
    }
  }, 
  err = {
    code = 0, 
    domain = 0, 
    message = 0x0, 
    level = VIR_ERR_NONE, 
    conn = 0x0, 
    dom = 0x0, 
    str1 = 0x0, 
    str2 = 0x0, 
    str3 = 0x0, 
    int1 = 0, 
    int2 = 0, 
    net = 0x0
  }, 
  handler = 0x0, 
  userData = 0x0, 
  closeCallback = 0x0, 
  closeOpaque = 0x0, 
  closeFreeCallback = 0x0, 
  closeDispatch = false, 
  closeUnregisterCount = 0
}

(gdb) frame 14
#14 0x00007f2e485cdfef in interfaceCloseInterface (conn=0x7f2e346f39d0)
    at interface/interface_backend_netcf.c:170
170	        ncf_close(driver->netcf);
(gdb) print *driver
$3 = {
  lock = {
    lock = {
      __data = {
        __lock = 0, 
        __count = 0, 
        __owner = 0, 
        __nusers = 0, 
        __kind = 0, 
        __spins = 0, 
        __list = {
          __prev = 0x0, 
          __next = 0x0
        }
      }, 
      __size = '\000' <repeats 39 times>, 
      __align = 0
    }
  }, 
  netcf = 0x7f2e346f0970
}

This is the libnl object, reached through netcf, which triggers the
refcounting failure:

(gdb) frame 4
#4  0x000000359540fb0a in nl_object_put (obj=0x7f2e3480e6b0) at object.c:197
197			BUG();
(gdb) print *obj
$1 = {
  ce_refcnt = -1, 
  ce_ops = 0x3595244160 <link_obj_ops>, 
  ce_cache = 0x0, 
  ce_list = {
    next = 0x7f2e3480eb58, 
    prev = 0x7f2e345bb1e0
  }, 
  ce_msgtype = 16, 
  ce_flags = 0, 
  ce_mask = 262117
}

Comment 12 Richard W.M. Jones 2012-11-14 22:02:04 UTC
This is the test we use:

https://github.com/libguestfs/libguestfs/blob/master/tests/parallel/test-parallel.c

It's very simple: it creates 5 threads, then in each
thread it keeps creating and destroying libvirt transient
domains (using qemu:///session).

To give you an idea of the libvirt code we use to create
a transient domain, see:
https://github.com/libguestfs/libguestfs/blob/master/src/launch-libvirt.c#L117
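
For readers who do not want to follow the links, the shape of the test is roughly the following. This is only a sketch of the threading pattern, not the code from test-parallel.c: run_parallel, worker, cycle_fn and stub_cycle are hypothetical names, and the stub just counts cycles where the real test would create and destroy a transient domain through libvirt.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 5   /* the test described above uses five threads */

/* Hypothetical callback: one create-and-destroy cycle, returns 0 on success. */
typedef int (*cycle_fn)(void *opaque);

struct worker_args {
    cycle_fn cycle;
    void *opaque;
    int iterations;
    int failed;         /* set by the worker if any cycle fails */
};

/* Thread body: repeat the cycle until done or until the first failure. */
static void *worker(void *data)
{
    struct worker_args *args = data;

    for (int i = 0; i < args->iterations; i++) {
        if (args->cycle(args->opaque) != 0) {
            args->failed = 1;
            break;
        }
    }
    return NULL;
}

/*
 * Run the cycle concurrently in NUM_THREADS threads.
 * Returns the number of threads that reported a failure, or -1 if a
 * thread could not be created or joined.
 */
static int run_parallel(cycle_fn cycle, void *opaque, int iterations)
{
    pthread_t threads[NUM_THREADS];
    struct worker_args args[NUM_THREADS];
    int started = 0, errors = 0, rc = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        args[i] = (struct worker_args){ cycle, opaque, iterations, 0 };
        if (pthread_create(&threads[i], NULL, worker, &args[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        if (pthread_join(threads[i], NULL) != 0)
            rc = -1;
        else if (args[i].failed)
            errors++;
    }
    return rc < 0 ? rc : errors;
}

/* Stub cycle (assumption): counts calls under a mutex instead of touching libvirt. */
struct stub_state {
    pthread_mutex_t lock;
    int cycles;
};

static int stub_cycle(void *opaque)
{
    struct stub_state *st = opaque;

    pthread_mutex_lock(&st->lock);
    st->cycles++;
    pthread_mutex_unlock(&st->lock);
    return 0;
}
```

Calling run_parallel(stub_cycle, &state, n) with a stub_state whose lock is PTHREAD_MUTEX_INITIALIZER exercises the pattern without a hypervisor. Swapping the stub for a function that really starts and destroys a domain gives the kind of load that provokes the crash in this bug.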

Comment 13 Richard W.M. Jones 2012-11-14 22:29:03 UTC
Created attachment 645196
core.1352922701.16713.libvirtd.xz

I'm not making much headway understanding this bug, so I've
attached the core file itself (which xz managed to compress
rather excellently).

Note this core file corresponds to the following packages:

libvirt-0.10.2.1-2.fc18.x86_64
netcf-libs-0.2.2-1.fc19.x86_64

(and probably other precise versions are needed too, but that
should be enough to get a meaningful stack trace).

Comment 14 Richard W.M. Jones 2012-11-15 17:18:33 UTC
I updated to libnl3-3.2.14-1.fc18.x86_64, and it still crashes.

Here is the stack trace (the other threads are all blocked in
pthread_cond_wait so I didn't include them).  Notice it seems
subtly different from the others.

Thread 1 (Thread 0x7fcd5152b840 (LWP 19393)):
#0  0x0000003577835ba5 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:63
#1  0x0000003577837358 in __GI_abort () at abort.c:90
#2  0x000000357782e972 in __assert_fail_base (
    fmt=0x3577979248 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=assertion@entry=0x3595413580 "el6", 
    file=file@entry=0x3595413110 "DBG<3>: recvmsgs(%p): Read %d bytes\n", 
    line=line@entry=197, function=function@entry=0x3595413228 ".\n")
    at assert.c:92
#3  0x000000357782ea22 in __GI___assert_fail (assertion=0x3595413580 "el6", 
    file=0x3595413110 "DBG<3>: recvmsgs(%p): Read %d bytes\n", line=197, 
    function=0x3595413228 ".\n") at assert.c:101
#4  0x000000359540fb0a in nl_object_put (obj=0x7fccde794150) at object.c:201
#5  0x00007fccf266c520 in ?? ()
#6  0x000000359540f855 in nl_object_alloc_name (kind=<optimized out>, 
    result=<optimized out>) at object.c:96
#7  0x00007fccf266c520 in ?? ()
#8  0x00007fccf2668ec0 in ?? ()
#9  0x0000000000000015 in ?? ()
#10 0x000000359540a432 in fprintf (__fmt=0x3595411aa7 "BUG: %s:%d\n", 
    __stream=<error reading variable: Cannot access memory at address 0x0>)
    at /usr/include/bits/stdio2.h:97
#11 nl_cache_subset (orig=0x4bc1, filter=0x4bc1) at cache.c:283
#12 0x00007fccf2668ec0 in ?? ()
#13 0x000000000000000d in ?? ()
#14 0x000000359540a5ce in nl_cache_name (cache=<optimized out>)
    at ../include/netlink-local.h:184
#15 nl_cache_clear (cache=0x4bc1) at cache.c:344
#16 0x000000357a808fe0 in drv_close (ncf=0x357a807f56 <netlink_close+22>, 
    ncf@entry=0x7fccf25a9910) at drv_redhat.c:384
#17 0x000000357a804497 in ncf_close (ncf=0x7fccf25a9910) at netcf.c:101
#18 0x00007fcd4a12afef in interfaceCloseInterface (conn=0x7fccf25a9f60)
    at interface/interface_backend_netcf.c:170
#19 0x0000003e026e3f84 in virConnectDispose (obj=0x7fccf25a9f60)
    at datatypes.c:134
#20 0x0000003e026774d3 in virObjectUnref (anyobj=anyobj@entry=0x7fccf25a9f60)
    at util/virobject.c:139
#21 0x0000003e026ec798 in virConnectClose (conn=0x7fccf25a9f60)
    at libvirt.c:1466
#22 0x0000000000429957 in remoteClientFreeFunc (data=<optimized out>)
    at remote.c:654
#23 0x0000003e0274b4ae in virNetServerClientDispose (obj=0x10e29f0)
    at rpc/virnetserverclient.c:590
#24 0x0000003e026774d3 in virObjectUnref (anyobj=<optimized out>)
    at util/virobject.c:139
#25 0x0000003e02752a8a in virNetSocketEventFree (opaque=opaque@entry=0x10dd7b0)
    at rpc/virnetsocket.c:1520
#26 0x0000003e0265e838 in virEventPollCleanupHandles ()
    at util/event_poll.c:567
#27 0x0000003e0265f373 in virEventPollRunOnce () at util/event_poll.c:603
#28 0x0000003e0265e3c7 in virEventRunDefaultImpl () at util/event.c:247
#29 0x0000003e0274b10d in virNetServerRun (srv=srv@entry=0xf05a60)
    at rpc/virnetserver.c:748
#30 0x000000000040c2c3 in main (argc=<optimized out>, argv=<optimized out>)
    at libvirtd.c:1339

Comment 15 Richard W.M. Jones 2012-11-16 09:43:59 UTC
I tried setting NLDBG=4.  However, this just makes the crashes
appear elsewhere (e.g. bug 877312).  I couldn't get this crash
to happen with that setting.  With NLDBG=4 removed, this crash
is the most frequent one.

Changing the summary to distinguish this crash from the other
crashes we are seeing.

Comment 16 Richard W.M. Jones 2012-11-16 14:16:43 UTC
I posted a simple reproducer on libvir-list:

https://www.redhat.com/archives/libvir-list/2012-November/msg00745.html

Comment 17 Richard W.M. Jones 2012-11-16 16:52:44 UTC
This is a core dump from a second Intel machine, using the
self-contained test program from here:
https://www.redhat.com/archives/libvir-list/2012-November/msg00756.html

Note that although it appears to be the same bug, the stack trace
is very slightly different, because this time the error happens
in nl_object_free instead of nl_object_put.

Core was generated by `/home/rjones/d/libvirt/daemon/.libs/lt-libvirtd --timeout=30'.
Program terminated with signal 6, Aborted.
#0  0x0000003a48835ba5 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:63
63	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
Missing separate debuginfos, use: debuginfo-install ceph-devel-0.46-1.fc18.x86_64 ceph-libs-0.46-1.fc18.x86_64 cryptopp-5.6.1-8.fc18.x86_64 gnome-keyring-3.6.1-1.fc18.x86_64 hal-libs-0.5.14-6.fc15.x86_64 libnl3-3.2.14-1.fc18.x86_64 libpciaccess-0.13.1-2.fc18.x86_64

(gdb) t a a bt

Thread 11 (Thread 0x7f7f5802c700 (LWP 5503)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f7f5a173d6a in virCondWait (c=c@entry=0x1dbf060, 
    m=m@entry=0x1dbf038) at util/threads-pthread.c:117
#2  0x00007f7f5a1743eb in virThreadPoolWorker (opaque=opaque@entry=0x1daeba0)
    at util/threadpool.c:103
#3  0x00007f7f5a173a06 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003a48c07d15 in start_thread (arg=0x7f7f5802c700)
    at pthread_create.c:308
#5  0x0000003a488f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 10 (Thread 0x7f7f5902e700 (LWP 5501)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f7f5a173d6a in virCondWait (c=c@entry=0x1dbf060, 
    m=m@entry=0x1dbf038) at util/threads-pthread.c:117
#2  0x00007f7f5a1743eb in virThreadPoolWorker (opaque=opaque@entry=0x1daeba0)
    at util/threadpool.c:103
#3  0x00007f7f5a173a06 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003a48c07d15 in start_thread (arg=0x7f7f5902e700)
    at pthread_create.c:308
#5  0x0000003a488f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 9 (Thread 0x7f7f5882d700 (LWP 5502)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f7f5a173d6a in virCondWait (c=c@entry=0x1dbf060, 
    m=m@entry=0x1dbf038) at util/threads-pthread.c:117
#2  0x00007f7f5a1743eb in virThreadPoolWorker (opaque=opaque@entry=0x1dbe3b0)
    at util/threadpool.c:103
#3  0x00007f7f5a173a06 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003a48c07d15 in start_thread (arg=0x7f7f5882d700)
    at pthread_create.c:308
#5  0x0000003a488f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 8 (Thread 0x7f7f55827700 (LWP 5508)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f7f5a173d6a in virCondWait (c=c@entry=0x1dbf0f8, 
    m=m@entry=0x1dbf038) at util/threads-pthread.c:117
#2  0x00007f7f5a17440b in virThreadPoolWorker (opaque=opaque@entry=0x1daecc0)
    at util/threadpool.c:103
#3  0x00007f7f5a173a06 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003a48c07d15 in start_thread (arg=0x7f7f55827700)
    at pthread_create.c:308
#5  0x0000003a488f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 7 (Thread 0x7f7f5982f700 (LWP 5500)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f7f5a173d6a in virCondWait (c=c@entry=0x1dbf060, 
    m=m@entry=0x1dbf038) at util/threads-pthread.c:117
#2  0x00007f7f5a1743eb in virThreadPoolWorker (opaque=opaque@entry=0x1dbe3b0)
    at util/threadpool.c:103
#3  0x00007f7f5a173a06 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003a48c07d15 in start_thread (arg=0x7f7f5982f700)
    at pthread_create.c:308
#5  0x0000003a488f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 6 (Thread 0x7f7f56028700 (LWP 5507)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f7f5a173d6a in virCondWait (c=c@entry=0x1dbf0f8, 
    m=m@entry=0x1dbf038) at util/threads-pthread.c:117
#2  0x00007f7f5a17440b in virThreadPoolWorker (opaque=opaque@entry=0x1daeba0)
    at util/threadpool.c:103
#3  0x00007f7f5a173a06 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003a48c07d15 in start_thread (arg=0x7f7f56028700)
    at pthread_create.c:308
#5  0x0000003a488f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 5 (Thread 0x7f7f55026700 (LWP 5509)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f7f5a173d6a in virCondWait (c=c@entry=0x1dbf0f8, 
    m=m@entry=0x1dbf038) at util/threads-pthread.c:117
#2  0x00007f7f5a17440b in virThreadPoolWorker (opaque=opaque@entry=0x1daeba0)
    at util/threadpool.c:103
#3  0x00007f7f5a173a06 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003a48c07d15 in start_thread (arg=0x7f7f55026700)
    at pthread_create.c:308
#5  0x0000003a488f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 4 (Thread 0x7f7f56829700 (LWP 5506)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f7f5a173d6a in virCondWait (c=c@entry=0x1dbf0f8, 
    m=m@entry=0x1dbf038) at util/threads-pthread.c:117
#2  0x00007f7f5a17440b in virThreadPoolWorker (opaque=opaque@entry=0x1daecc0)
    at util/threadpool.c:103
#3  0x00007f7f5a173a06 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003a48c07d15 in start_thread (arg=0x7f7f56829700)
    at pthread_create.c:308
#5  0x0000003a488f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 3 (Thread 0x7f7f5702a700 (LWP 5505)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f7f5a173d6a in virCondWait (c=c@entry=0x1dbf0f8, 
    m=m@entry=0x1dbf038) at util/threads-pthread.c:117
#2  0x00007f7f5a17440b in virThreadPoolWorker (opaque=opaque@entry=0x1daeba0)
    at util/threadpool.c:103
#3  0x00007f7f5a173a06 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003a48c07d15 in start_thread (arg=0x7f7f5702a700)
    at pthread_create.c:308
#5  0x0000003a488f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 2 (Thread 0x7f7f5782b700 (LWP 5504)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f7f5a173d6a in virCondWait (c=c@entry=0x1dbf060, 
    m=m@entry=0x1dbf038) at util/threads-pthread.c:117
#2  0x00007f7f5a1743eb in virThreadPoolWorker (opaque=opaque@entry=0x1dbe3b0)
    at util/threadpool.c:103
#3  0x00007f7f5a173a06 in virThreadHelper (data=<optimized out>)
    at util/threads-pthread.c:161
#4  0x0000003a48c07d15 in start_thread (arg=0x7f7f5782b700)
    at pthread_create.c:308
#5  0x0000003a488f22cd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Thread 1 (Thread 0x7f7f59c51840 (LWP 5453)):
#0  0x0000003a48835ba5 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:63
#1  0x0000003a48837358 in __GI_abort () at abort.c:90
#2  0x0000003a488754ab in __libc_message (do_abort=do_abort@entry=2, 
    fmt=fmt@entry=0x3a489799e8 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:197
#3  0x0000003a4887b646 in malloc_printerr (action=3, 
    str=0x3a48977828 "corrupted double-linked list", ptr=<optimized out>)
    at malloc.c:4969
#4  0x0000003a4887cc5b in _int_free (av=0x7f7f3c000020, p=0x7f7f3c129bc0, 
    have_lock=0) at malloc.c:3973
#5  0x00007f7f59c7b8dd in nl_object_free () from /lib64/libnl-3.so.200
#6  0x00007f7f59c7b8d5 in nl_object_free () from /lib64/libnl-3.so.200
#7  0x00007f7f59c764a2 in nl_cache_remove () from /lib64/libnl-3.so.200
#8  0x00007f7f59c76615 in nl_cache_clear () from /lib64/libnl-3.so.200
#9  0x00007f7f59c7663e in nl_cache_free () from /lib64/libnl-3.so.200
#10 0x00007f7f52928f56 in netlink_close (ncf=ncf@entry=0x7f7f40006730)
    at dutil_linux.c:864
#11 0x00007f7f52929fe0 in drv_close (ncf=ncf@entry=0x7f7f40006730)
    at drv_redhat.c:384
#12 0x00007f7f52925497 in ncf_close (ncf=0x7f7f40006730) at netcf.c:101
#13 0x00007f7f52b66dbe in interfaceCloseInterface (conn=0x7f7f40140d30)
    at interface/interface_backend_netcf.c:170
#14 0x00007f7f5a1fb394 in virConnectDispose (obj=0x7f7f40140d30)
    at datatypes.c:134
#15 0x00007f7f5a17d565 in virObjectUnref (anyobj=anyobj@entry=0x7f7f40140d30)
    at util/virobject.c:139
#16 0x00007f7f5a2049bd in virConnectClose (conn=0x7f7f40140d30)
    at libvirt.c:1469
#17 0x000000000042d02d in remoteClientFreeFunc (data=<optimized out>)
    at remote.c:679
#18 0x00007f7f5a26f502 in virNetServerClientDispose (obj=0x1de90b0)
    at rpc/virnetserverclient.c:725
#19 0x00007f7f5a17d565 in virObjectUnref (anyobj=<optimized out>)
    at util/virobject.c:139
#20 0x00007f7f5a278cca in virNetSocketEventFree (opaque=opaque@entry=0x1de6e90)
    at rpc/virnetsocket.c:1628
#21 0x00007f7f5a15ebd2 in virEventPollCleanupHandles ()
    at util/event_poll.c:567
#22 0x00007f7f5a15f963 in virEventPollRunOnce () at util/event_poll.c:603
#23 0x00007f7f5a15e66b in virEventRunDefaultImpl () at util/event.c:247
#24 0x00007f7f5a26edf5 in virNetServerRun (srv=srv@entry=0x1dbeef0)
    at rpc/virnetserver.c:1004
#25 0x000000000040c462 in main (argc=<optimized out>, argv=<optimized out>)
    at libvirtd.c:1355

Comment 18 Thomas Graf 2012-11-20 12:55:01 UTC
I have pushed several thread synchronization fixes to the libnl git tree. I doubt that any of them fix this issue as I can't see any parallelism in the above backtraces but it might be worth a try to rule out the possibility.
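
One way to check the "no parallelism" observation against a gdb "t a a bt" dump is to list the threads whose frame #0 is not an idle wait. The following is a minimal sketch only: top_frames and busy_threads are hypothetical helper names, not part of gdb, libvirt or libnl, and the sample text is abbreviated from the backtrace in comment 17.

```python
import re

# Matches gdb thread headers such as "Thread 1 (Thread 0x... (LWP 5453)):"
THREAD_RE = re.compile(r"^Thread (\d+) \(")
# Matches frame lines, with or without a leading "0x... in " address
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?([^\s(]+)")


def top_frames(backtrace_text):
    """Map each thread number to the function name in its frame #0."""
    result = {}
    current = None
    for line in backtrace_text.splitlines():
        m = THREAD_RE.match(line)
        if m:
            current = int(m.group(1))
            continue
        m = FRAME_RE.match(line)
        if m and current is not None and m.group(1) == "0":
            result[current] = m.group(2)
    return result


def busy_threads(backtrace_text, idle=("pthread_cond_wait",)):
    """Return only the threads whose top frame is not an idle wait."""
    return {t: f for t, f in top_frames(backtrace_text).items()
            if not f.startswith(idle)}


# Abbreviated sample: one idle worker thread and the crashing main thread.
sample = """\
Thread 2 (Thread 0x7f7f5782b700 (LWP 5504)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x00007f7f5a173d6a in virCondWait (c=c@entry=0x1dbf060,
    m=m@entry=0x1dbf038) at util/threads-pthread.c:117

Thread 1 (Thread 0x7f7f59c51840 (LWP 5453)):
#0  0x0000003a48835ba5 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:63
#1  0x0000003a48837358 in __GI_abort () at abort.c:90
"""

print(busy_threads(sample))
```

A snapshot like this only shows what each thread was doing when the core was written, so it cannot rule out an earlier race that corrupted the heap before the worker threads went idle.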

Comment 19 Thomas Graf 2012-11-27 11:30:57 UTC
(In reply to comment #18)
> I have pushed several thread synchronization fixes to the libnl git tree. I
> doubt that any of them fix this issue as I can't see any parallelism in the
> above backtraces but it might be worth a try to rule out the possibility.

These fixes are now released as 3.2.16 if you want to try them.

Comment 20 Daniel Berrangé 2012-12-13 13:56:18 UTC

*** This bug has been marked as a duplicate of bug 886454 ***