Created attachment 478179 [details]
Core files from when libvirtd died.

Description of problem:
The libvirtd daemon is killed by a segmentation fault about 30 minutes after it is started.

Version-Release number of selected component (if applicable):
OS     : Centos 5.3
Kernel : 2.6.27.29-0.1.1
Arch   : x86_64
xen    : 3.4.2
daemon : libvirtd 0.8.7
library:
  libdevmapper 1.02
  libhal 1.0.0
  libdbus-1 3.4.0
  libaudit 0.0.0
  libnuma 1
  libgnutls 13.0.6
  libcrypt 11.2.3
  libsasl2 2.0.22
  libxenstore 3.0.0
  libxml2 2.7.6
  libavahi-common 3.4.3
  libavahi-client 3.2.1
  libpthread 2.5
  libc 2.5
  libselinux 1
  libsepol 1
  libcap 1.10
  ld 2.5
  libz 1.2.3
  libgpg-error 0.3.0
  libnsl 2.5
  libdl 2.5
  libresolv 2.5
  libcrypt 2.5
  libm 2.5

How reproducible:
I do not know the exact trigger. I start libvirtd (service libvirtd start), and about 30 minutes later it dies with a segmentation fault.

Additional info:
I examined the core dump file. Here is the result:

(gdb) where
#0 0x0000000000416b9b in virNodeDeviceDefFree ()
#1 0x0000000000416e3a in virNodeDeviceDefFree ()
#2 0x000000000041aa24 in virNodeDeviceDefFree ()
#3 0x00007f43dbf0873d in start_thread () from /lib64/libpthread.so.0
#4 0x00007f43dbc7ef6d in clone () from /lib64/libc.so.6
(gdb) thread apply all bt

Thread 7 (Thread 11811):
#0 0x00007f43dbf0cee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f43ddc1b4f6 in virCondWait () from /usr/lib64/libvirt.so.0
#2 0x000000000041c9bd in virNodeDeviceDefFree ()
#3 0x00007f43dbf0873d in start_thread () from /lib64/libpthread.so.0
#4 0x00007f43dbc7ef6d in clone () from /lib64/libc.so.6

Thread 6 (Thread 3648):
#0 0x00007f43dbf0cee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f43ddc1b4f6 in virCondWait () from /usr/lib64/libvirt.so.0
#2 0x000000000041c9bd in virNodeDeviceDefFree ()
#3 0x00007f43dbf0873d in start_thread () from /lib64/libpthread.so.0
#4 0x00007f43dbc7ef6d in clone () from /lib64/libc.so.6

Thread 5 (Thread 2931):
#0 0x00007f43dbf09b35 in pthread_join () from /lib64/libpthread.so.0
#1 0x000000000041dcc2 in virNodeDeviceDefFree ()
#2 0x00007f43dbbc8994 in __libc_start_main () from /lib64/libc.so.6
#3 0x00000000004164e9 in virNodeDeviceDefFree ()
#4 0x00007fff8e5fe5c8 in ?? ()
#5 0x0000000000000000 in ?? ()

Thread 4 (Thread 3736):
#0 0x00007f43dbf0cee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f43ddc1b4f6 in virCondWait () from /usr/lib64/libvirt.so.0
#2 0x000000000041c9bd in virNodeDeviceDefFree ()
#3 0x00007f43dbf0873d in start_thread () from /lib64/libpthread.so.0
#4 0x00007f43dbc7ef6d in clone () from /lib64/libc.so.6

Thread 3 (Thread 11801):
#0 0x00007f43dbf0cee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f43ddc1b4f6 in virCondWait () from /usr/lib64/libvirt.so.0
#2 0x000000000041c9bd in virNodeDeviceDefFree ()
#3 0x00007f43dbf0873d in start_thread () from /lib64/libpthread.so.0
#4 0x00007f43dbc7ef6d in clone () from /lib64/libc.so.6

Thread 2 (Thread 11796):
#0 0x00007f43dbf0cee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f43ddc1b4f6 in virCondWait () from /usr/lib64/libvirt.so.0
#2 0x000000000041c9bd in virNodeDeviceDefFree ()
#3 0x00007f43dbf0873d in start_thread () from /lib64/libpthread.so.0
#4 0x00007f43dbc7ef6d in clone () from /lib64/libc.so.6

Thread 1 (Thread 2932):
#0 0x0000000000416b9b in virNodeDeviceDefFree ()
#1 0x0000000000416e3a in virNodeDeviceDefFree ()
#2 0x000000000041aa24 in virNodeDeviceDefFree ()
#3 0x00007f43dbf0873d in start_thread () from /lib64/libpthread.so.0
#4 0x00007f43dbc7ef6d in clone () from /lib64/libc.so.6
(gdb)

I have attached the core files.
I debugged the core file with gdb. The results follow.

(gdb) frame 0
#0 0x000000000041854b in virEventCleanupHandles () at event.c:528
528 in event.c
(gdb) p i
$1 = 0
(gdb) p eventLoop
$2 = {lock = {lock = {__data = {__lock = 1, __count = 0, __owner = 19517, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\001\000\000\000\000\000\000\000=L\000\000\001", '\000' <repeats 26 times>, __align = 1}}, running = 1, leader = {thread = 1096337728}, wakeupfd = {3, 5}, handlesCount = 9, handlesAlloc = 0, handles = 0x0, timeoutsCount = 2, timeoutsAlloc = 10, timeouts = 0x7059c0}
(gdb) p eventLoop.handles
$3 = (struct virEventHandle *) 0x0
(gdb) p eventLoop.handlesCount
$4 = 9
(gdb) p eventLoop.handles[0]
Cannot access memory at address 0x0
(gdb) p eventLoop.handles
$5 = (struct virEventHandle *) 0x0
(gdb) p *eventLoop.handles
Cannot access memory at address 0x0
(gdb) p eventLoop.handlesAlloc
$6 = 0
(gdb)

Something looks abnormal in event.c, in the virEventCleanupHandles function.
==================================================================
static int virEventCleanupHandles(void) {
    int i;
    DEBUG("Cleanup %zu", eventLoop.handlesCount);

    /* Remove deleted entries, shuffling down remaining
     * entries as needed to form contiguous series
     */
    for (i = 0 ; i < eventLoop.handlesCount ; ) {
        if (!eventLoop.handles[i].deleted) {
            i++;
            continue;
        }

        if (eventLoop.handles[i].ff)
            (eventLoop.handles[i].ff)(eventLoop.handles[i].opaque);

        if ((i+1) < eventLoop.handlesCount) {
            memmove(eventLoop.handles+i,
                    eventLoop.handles+i+1,
                    sizeof(struct virEventHandle)*(eventLoop.handlesCount-(i+1)));
        }
        eventLoop.handlesCount--;
    }

    /* Release some memory if we've got a big chunk free */
    if ((eventLoop.handlesAlloc - EVENT_ALLOC_EXTENT) > eventLoop.handlesCount) {
        EVENT_DEBUG("Releasing %zu out of %zu handles slots used, releasing %d",
                    eventLoop.handlesCount, eventLoop.handlesAlloc, EVENT_ALLOC_EXTENT);
        VIR_SHRINK_N(eventLoop.handles, eventLoop.handlesAlloc, EVENT_ALLOC_EXTENT);
    }
    return 0;
}
====================================================================

At the first if statement, eventLoop.handles is NULL, but eventLoop.handlesCount is 9, so eventLoop.handles[i] dereferences a NULL pointer on the first iteration (i = 0).

Also, when I checked eventLoop, it showed:
handlesCount = 9, handlesAlloc = 0, handles = 0x0, timeoutsCount = 2, timeoutsAlloc = 10

How can handlesCount be 9 when handlesAlloc is 0?
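One possible explanation, which this report does not confirm, is the shrink check at the end of the quoted function. The %zu format strings suggest the counters are size_t. If so, handlesAlloc - EVENT_ALLOC_EXTENT wraps around to a huge value whenever handlesAlloc is smaller than EVENT_ALLOC_EXTENT, so the check passes even though nothing is free. The sketch below is a standalone model, not libvirt code: ALLOC_EXTENT, shrink_n and both should_shrink_* helpers are hypothetical names, the value 10 is assumed, and shrink_n only models a helper that frees the whole array when asked to remove at least as many slots as are allocated.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define ALLOC_EXTENT 10  /* stand-in for EVENT_ALLOC_EXTENT; the value is assumed */

/* The shrink check as written in the quoted code, assuming size_t operands.
 * When alloc < ALLOC_EXTENT the subtraction wraps to a very large value. */
static bool should_shrink_as_written(size_t alloc, size_t count)
{
    return (alloc - ALLOC_EXTENT) > count;
}

/* A guarded variant (hypothetical): only subtract when it cannot wrap. */
static bool should_shrink_guarded(size_t alloc, size_t count)
{
    return alloc > ALLOC_EXTENT && (alloc - ALLOC_EXTENT) > count;
}

/* Hypothetical model of a shrink helper: drops `remove` slots, and frees
 * the whole array (pointer NULL, alloc 0) when no slots would remain. */
static void shrink_n(int **ptr, size_t *alloc, size_t remove)
{
    if (remove < *alloc) {
        int *tmp = realloc(*ptr, (*alloc - remove) * sizeof(**ptr));
        if (tmp == NULL)
            return;            /* keep the old, larger buffer on failure */
        *ptr = tmp;
        *alloc -= remove;
    } else {
        free(*ptr);
        *ptr = NULL;
        *alloc = 0;
    }
}
```

Under these assumptions, an array with 9 allocated and 9 used slots passes the unguarded check, the shrink frees it, and the loop is left with handles = NULL and handlesAlloc = 0 while handlesCount is still 9, which matches the state seen in gdb.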
This bug is fixed in 8.8. Thanks anyway.