Description of problem:
virt-admin always crashes during init

Version-Release number of selected component (if applicable):
v2.0.0-rc1-26-g0b4645a

How reproducible:
100%

Steps to Reproduce:
1.
# virt-admin srv-list
Segmentation fault (core dumped)

2. back trace

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff45d7c4c in free () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff45d7c4c in free () from /lib64/libc.so.6
#1  0x00007ffff7485d0a in virFree (ptrptr=ptrptr@entry=0x5555557693c8) at util/viralloc.c:582
#2  0x00007ffff74a27fb in virResetError (err=0x5555557693a0) at util/virerror.c:384
#3  0x00007ffff74a2c9a in virResetLastError () at util/virerror.c:417
#4  0x00007ffff74a3b4d in virEventRegisterDefaultImpl () at util/virevent.c:269
#5  0x0000555555559460 in vshAdmInit (ctl=0x7fffffffdd40) at virt-admin.c:1009
#6  main (argc=1, argv=<optimized out>) at virt-admin.c:1380

3.

Actual results:
virt-admin always crashes during init

Expected results:
virt-admin should not crash

Additional info:
It looks like virt-admin does not call virInitialize() to initialize virLastErr.
Thanks for the report and the suggested fix, I sent a patch: http://www.redhat.com/archives/libvir-list/2016-June/msg02083.html
Could you run this through valgrind like this: valgrind --tool=memcheck --leak-check=full virt-admin srv-list And post the output? Thanks.
(In reply to Martin Kletzander from comment #2)
> Could you run this through valgrind like this:
>
> valgrind --tool=memcheck --leak-check=full virt-admin srv-list
>
> And post the output? Thanks.

Hi Martin,

This is valgrind output:

# valgrind --tool=memcheck --leak-check=full virt-admin srv-list
==8337== Memcheck, a memory error detector
==8337== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==8337== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==8337== Command: virt-admin srv-list
==8337==
==8337== Invalid read of size 8
==8337==    at 0x52F1CFC: virFree (viralloc.c:582)
==8337==    by 0x530E7FA: virResetError (virerror.c:384)
==8337==    by 0x530FB4C: virEventRegisterDefaultImpl (virevent.c:269)
==8337==    by 0x10D45F: vshAdmInit (virt-admin.c:1009)
==8337==    by 0x10D45F: main (virt-admin.c:1380)
==8337==  Address 0xe422238 is 8 bytes after a block of size 32 alloc'd
==8337==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8337==    by 0x7ED868F: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==8337==    by 0x7ED80C0: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==8337==    by 0xAA088EC: ??? (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xAA08D2D: FIPS_module_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA92086B: FIPS_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA91D292: OPENSSL_init_library (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0x400F3A2: _dl_init (in /usr/lib64/ld-2.17.so)
==8337==    by 0x4001469: ??? (in /usr/lib64/ld-2.17.so)
==8337==    by 0x1: ???
==8337==    by 0xFFF00020A: ???
==8337==    by 0xFFF000215: ???
==8337==
==8337== Invalid write of size 8
==8337==    at 0x52F1D0A: virFree (viralloc.c:583)
==8337==    by 0x530E7FA: virResetError (virerror.c:384)
==8337==    by 0x530FB4C: virEventRegisterDefaultImpl (virevent.c:269)
==8337==    by 0x10D45F: vshAdmInit (virt-admin.c:1009)
==8337==    by 0x10D45F: main (virt-admin.c:1380)
==8337==  Address 0xe422238 is 8 bytes after a block of size 32 alloc'd
==8337==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8337==    by 0x7ED868F: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==8337==    by 0x7ED80C0: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==8337==    by 0xAA088EC: ??? (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xAA08D2D: FIPS_module_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA92086B: FIPS_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA91D292: OPENSSL_init_library (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0x400F3A2: _dl_init (in /usr/lib64/ld-2.17.so)
==8337==    by 0x4001469: ??? (in /usr/lib64/ld-2.17.so)
==8337==    by 0x1: ???
==8337==    by 0xFFF00020A: ???
==8337==    by 0xFFF000215: ???
==8337==
==8337== Invalid read of size 8
==8337==    at 0x52F1CFC: virFree (viralloc.c:582)
==8337==    by 0x530E803: virResetError (virerror.c:385)
==8337==    by 0x530FB4C: virEventRegisterDefaultImpl (virevent.c:269)
==8337==    by 0x10D45F: vshAdmInit (virt-admin.c:1009)
==8337==    by 0x10D45F: main (virt-admin.c:1380)
==8337==  Address 0xe422240 is 16 bytes after a block of size 32 alloc'd
==8337==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8337==    by 0x7ED868F: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==8337==    by 0x7ED80C0: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==8337==    by 0xAA088EC: ??? (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xAA08D2D: FIPS_module_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA92086B: FIPS_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA91D292: OPENSSL_init_library (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0x400F3A2: _dl_init (in /usr/lib64/ld-2.17.so)
==8337==    by 0x4001469: ??? (in /usr/lib64/ld-2.17.so)
==8337==    by 0x1: ???
==8337==    by 0xFFF00020A: ???
==8337==    by 0xFFF000215: ???
==8337==
==8337== Invalid write of size 8
==8337==    at 0x52F1D0A: virFree (viralloc.c:583)
==8337==    by 0x530E803: virResetError (virerror.c:385)
==8337==    by 0x530FB4C: virEventRegisterDefaultImpl (virevent.c:269)
==8337==    by 0x10D45F: vshAdmInit (virt-admin.c:1009)
==8337==    by 0x10D45F: main (virt-admin.c:1380)
==8337==  Address 0xe422240 is 16 bytes after a block of size 32 alloc'd
==8337==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8337==    by 0x7ED868F: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==8337==    by 0x7ED80C0: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==8337==    by 0xAA088EC: ??? (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xAA08D2D: FIPS_module_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA92086B: FIPS_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA91D292: OPENSSL_init_library (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0x400F3A2: _dl_init (in /usr/lib64/ld-2.17.so)
==8337==    by 0x4001469: ??? (in /usr/lib64/ld-2.17.so)
==8337==    by 0x1: ???
==8337==    by 0xFFF00020A: ???
==8337==    by 0xFFF000215: ???
==8337==
==8337== Invalid read of size 8
==8337==    at 0x52F1CFC: virFree (viralloc.c:582)
==8337==    by 0x530E80C: virResetError (virerror.c:386)
==8337==    by 0x530FB4C: virEventRegisterDefaultImpl (virevent.c:269)
==8337==    by 0x10D45F: vshAdmInit (virt-admin.c:1009)
==8337==    by 0x10D45F: main (virt-admin.c:1380)
==8337==  Address 0xe422248 is 24 bytes after a block of size 32 in arena "client"
==8337==
==8337== Invalid free() / delete / delete[] / realloc()
==8337==    at 0x4C2AD17: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8337==    by 0x52F1D09: virFree (viralloc.c:582)
==8337==    by 0x530E80C: virResetError (virerror.c:386)
==8337==    by 0x530FB4C: virEventRegisterDefaultImpl (virevent.c:269)
==8337==    by 0x10D45F: vshAdmInit (virt-admin.c:1009)
==8337==    by 0x10D45F: main (virt-admin.c:1380)
==8337==  Address 0x60 is not stack'd, malloc'd or (recently) free'd
==8337==
==8337== Invalid write of size 8
==8337==    at 0x52F1D0A: virFree (viralloc.c:583)
==8337==    by 0x530E80C: virResetError (virerror.c:386)
==8337==    by 0x530FB4C: virEventRegisterDefaultImpl (virevent.c:269)
==8337==    by 0x10D45F: vshAdmInit (virt-admin.c:1009)
==8337==    by 0x10D45F: main (virt-admin.c:1380)
==8337==  Address 0xe422248 is 24 bytes after a block of size 32 in arena "client"
==8337==
==8337== Invalid write of size 8
==8337==    at 0x530E834: UnknownInlinedFun (string3.h:84)
==8337==    by 0x530E834: virResetError (virerror.c:387)
==8337==    by 0x530FB4C: virEventRegisterDefaultImpl (virevent.c:269)
==8337==    by 0x10D45F: vshAdmInit (virt-admin.c:1009)
==8337==    by 0x10D45F: main (virt-admin.c:1380)
==8337==  Address 0xe422230 is 0 bytes after a block of size 32 alloc'd
==8337==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8337==    by 0x7ED868F: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==8337==    by 0x7ED80C0: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==8337==    by 0xAA088EC: ??? (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xAA08D2D: FIPS_module_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA92086B: FIPS_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA91D292: OPENSSL_init_library (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0x400F3A2: _dl_init (in /usr/lib64/ld-2.17.so)
==8337==    by 0x4001469: ??? (in /usr/lib64/ld-2.17.so)
==8337==    by 0x1: ???
==8337==    by 0xFFF00020A: ???
==8337==    by 0xFFF000215: ???
==8337==
==8337== Invalid read of size 8
==8337==    at 0x52F1CFC: virFree (viralloc.c:582)
==8337==    by 0x530E7FA: virResetError (virerror.c:384)
==8337==    by 0x4E38666: virAdmConnectIsAlive (libvirt-admin.c:374)
==8337==    by 0x10F0D7: vshAdmConnectionHandler (virt-admin.c:979)
==8337==    by 0x11121A: vshCommandRun (vsh.c:1264)
==8337==    by 0x10D5CB: main (virt-admin.c:1386)
==8337==  Address 0xe422238 is 8 bytes after a block of size 32 alloc'd
==8337==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8337==    by 0x7ED868F: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==8337==    by 0x7ED80C0: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==8337==    by 0xAA088EC: ??? (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xAA08D2D: FIPS_module_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA92086B: FIPS_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA91D292: OPENSSL_init_library (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0x400F3A2: _dl_init (in /usr/lib64/ld-2.17.so)
==8337==    by 0x4001469: ??? (in /usr/lib64/ld-2.17.so)
==8337==    by 0x1: ???
==8337==    by 0xFFF00020A: ???
==8337==    by 0xFFF000215: ???
==8337==
==8337== Invalid write of size 8
==8337==    at 0x52F1D0A: virFree (viralloc.c:583)
==8337==    by 0x530E7FA: virResetError (virerror.c:384)
==8337==    by 0x4E38666: virAdmConnectIsAlive (libvirt-admin.c:374)
==8337==    by 0x10F0D7: vshAdmConnectionHandler (virt-admin.c:979)
==8337==    by 0x11121A: vshCommandRun (vsh.c:1264)
==8337==    by 0x10D5CB: main (virt-admin.c:1386)
==8337==  Address 0xe422238 is 8 bytes after a block of size 32 alloc'd
==8337==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8337==    by 0x7ED868F: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==8337==    by 0x7ED80C0: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==8337==    by 0xAA088EC: ??? (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xAA08D2D: FIPS_module_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA92086B: FIPS_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA91D292: OPENSSL_init_library (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0x400F3A2: _dl_init (in /usr/lib64/ld-2.17.so)
==8337==    by 0x4001469: ??? (in /usr/lib64/ld-2.17.so)
==8337==    by 0x1: ???
==8337==    by 0xFFF00020A: ???
==8337==    by 0xFFF000215: ???
==8337==
==8337== Invalid read of size 8
==8337==    at 0x52F1CFC: virFree (viralloc.c:582)
==8337==    by 0x530E803: virResetError (virerror.c:385)
==8337==    by 0x4E38666: virAdmConnectIsAlive (libvirt-admin.c:374)
==8337==    by 0x10F0D7: vshAdmConnectionHandler (virt-admin.c:979)
==8337==    by 0x11121A: vshCommandRun (vsh.c:1264)
==8337==    by 0x10D5CB: main (virt-admin.c:1386)
==8337==  Address 0xe422240 is 16 bytes after a block of size 32 alloc'd
==8337==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8337==    by 0x7ED868F: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==8337==    by 0x7ED80C0: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==8337==    by 0xAA088EC: ??? (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xAA08D2D: FIPS_module_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA92086B: FIPS_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA91D292: OPENSSL_init_library (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0x400F3A2: _dl_init (in /usr/lib64/ld-2.17.so)
==8337==    by 0x4001469: ??? (in /usr/lib64/ld-2.17.so)
==8337==    by 0x1: ???
==8337==    by 0xFFF00020A: ???
==8337==    by 0xFFF000215: ???
==8337==
==8337== Invalid write of size 8
==8337==    at 0x52F1D0A: virFree (viralloc.c:583)
==8337==    by 0x530E803: virResetError (virerror.c:385)
==8337==    by 0x4E38666: virAdmConnectIsAlive (libvirt-admin.c:374)
==8337==    by 0x10F0D7: vshAdmConnectionHandler (virt-admin.c:979)
==8337==    by 0x11121A: vshCommandRun (vsh.c:1264)
==8337==    by 0x10D5CB: main (virt-admin.c:1386)
==8337==  Address 0xe422240 is 16 bytes after a block of size 32 alloc'd
==8337==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8337==    by 0x7ED868F: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==8337==    by 0x7ED80C0: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==8337==    by 0xAA088EC: ??? (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xAA08D2D: FIPS_module_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA92086B: FIPS_mode_set (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0xA91D292: OPENSSL_init_library (in /usr/lib64/libcrypto.so.1.0.1e)
==8337==    by 0x400F3A2: _dl_init (in /usr/lib64/ld-2.17.so)
==8337==    by 0x4001469: ??? (in /usr/lib64/ld-2.17.so)
==8337==    by 0x1: ???
==8337==    by 0xFFF00020A: ???
==8337==    by 0xFFF000215: ???
==8337==

valgrind: m_mallocfree.c:304 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi' failed.
valgrind: Heap block lo/hi size mismatch: lo = 96, hi = 0.
This is probably caused by your program erroneously writing past the
end of a heap block and corrupting heap metadata. If you fix any
invalid writes reported by Memcheck, this assertion failure will
probably go away. Please try that before reporting this as a bug.

host stacktrace:
==8337==    at 0x3805DC06: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==8337==    by 0x3805DD14: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==8337==    by 0x3805DE96: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==8337==    by 0x3806AC73: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==8337==    by 0x380572DB: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==8337==    by 0x38055DBB: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==8337==    by 0x38059C2B: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==8337==    by 0x380553B7: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==8337==    by 0x38000AD9: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==8337==    by 0x8032651A7: ???
==8337==    by 0x802B9DEEF: ???

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==8337==    at 0x52F1CFC: virFree (viralloc.c:582)
==8337==    by 0x530E80C: virResetError (virerror.c:386)
==8337==    by 0x4E38666: virAdmConnectIsAlive (libvirt-admin.c:374)
==8337==    by 0x10F0D7: vshAdmConnectionHandler (virt-admin.c:979)
==8337==    by 0x11121A: vshCommandRun (vsh.c:1264)
==8337==    by 0x10D5CB: main (virt-admin.c:1386)

Thread 2: status = VgTs_Init
==8337==    at 0x83ED191: clone (in /usr/lib64/libc-2.17.so)
==8337==    by 0x80E2CFF: ??? (in /usr/lib64/libpthread-2.17.so)
==8337==    by 0x154DC6FF: ???

Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using. Thanks.
Created attachment 1173491 patch for virt-admin crash due to uninitialized threadlocal storage
Created attachment 1173492 patch to fix virt-admin crash caused by uninitialized threadlocal storage
I wasn't able to reproduce this on any machine (Fedora 23, RHEL 7), not even on a host with plenty of uptime (where lots of applications are running and memory is likely to be reused). Could you reproduce this after restarting the host you're testing on?

Anyway, I dug into our code (and into glibc's pthreads as well), and the problem is most likely caused, as you suspected in #c0, by not calling virAdmInitialize. Without that call we never invoke pthread_key_create to obtain a unique thread-local storage key, so the key keeps the value 0 (it is a global variable). In your case, looking up key 0 evidently returned someone else's thread-local data, whereas I always got NULL, which does not trigger the crash.

Since you can reproduce the issue with ease, does the patch I attached (#c5), a slightly modified version of the patch Cole proposed, fix the issue for you?
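The mechanism described in this comment can be sketched with plain pthreads. This is only an illustration under stated assumptions, not libvirt's code: every demo_* name is hypothetical and merely stands in for virAdmInitialize and the virLastErr handling. The point is that a pthread_key_t stored in a zero-initialized global is not a valid key until pthread_key_create has filled it in. On glibc, as far as I can tell, keys are handed out starting at 0, so another component in the same process may already own that value (the valgrind log points at a block allocated by _dlerror_run), and a lookup through the uninitialized key can return a pointer to unrelated data.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-thread error object. */
struct demo_error {
    char *message;
};

/* Zero-initialized global: NOT a valid key until pthread_key_create runs.
 * Passing it to pthread_getspecific before that is undefined behaviour. */
static pthread_key_t demo_last_err;
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;
static int demo_init_rc = -1;

/* Destructor run for each thread that exits via pthread_exit. */
static void demo_free_error(void *opaque)
{
    struct demo_error *err = opaque;
    if (err) {
        free(err->message);
        free(err);
    }
}

static void demo_init_once(void)
{
    demo_init_rc = pthread_key_create(&demo_last_err, demo_free_error);
}

/* Hypothetical analogue of an explicit library initializer:
 * safe to call repeatedly, the key is created exactly once. */
int demo_initialize(void)
{
    pthread_once(&demo_once, demo_init_once);
    return demo_init_rc == 0 ? 0 : -1;
}

/* Return the calling thread's error object, allocating it on first use.
 * The key is guaranteed to exist before it is ever looked up. */
struct demo_error *demo_get_error(void)
{
    if (demo_initialize() < 0)
        return NULL;

    struct demo_error *err = pthread_getspecific(demo_last_err);
    if (!err) {
        err = calloc(1, sizeof(*err));
        if (err && pthread_setspecific(demo_last_err, err) != 0) {
            free(err);
            err = NULL;
        }
    }
    return err;
}

/* Store a copy of msg in the calling thread's error object. */
int demo_set_error(const char *msg)
{
    assert(msg != NULL);

    struct demo_error *err = demo_get_error();
    if (!err)
        return -1;

    char *copy = malloc(strlen(msg) + 1);
    if (!copy)
        return -1;
    strcpy(copy, msg);

    free(err->message);
    err->message = copy;
    return 0;
}

/* Free the stored message; this is the step that crashes when the
 * object behind the key is really someone else's data. */
void demo_reset_error(void)
{
    struct demo_error *err = demo_get_error();
    if (!err)
        return;
    free(err->message);
    err->message = NULL;
}
```

A caller would invoke demo_initialize() once at startup, before anything that may touch the error object. The sketch also initializes lazily inside demo_get_error() as a second line of defence; whether libvirt does the same on every path is not something this example claims.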
(In reply to Erik Skultety from comment #6)
> I wasn't able to reproduce this on any machine (fedora 23, rhel7), not even
> on a host with plenty of uptime (where there's lots of apps and the memory
> is likely to be reused). Could you reproduce this after restarting the host
> you're testing it on? Anyway, I dug into our code (and into glibc's pthreads
> as well) and the problem is most likely (as you suspected in #c0) caused by
> not calling virAdmInitialize, because what happens then is that we do not
> call pthread_key_create to get a unique thread local storage identifier
> which leaves us with value 0 (global variable...) which obviously returned
> some other process's threadlocal data in your case (whereas I always got NULL
> which does not reveal the crash).
> Since you can reproduce the issue with ease, does the patch (slightly
> modified version of Cole's patch that he proposed) I added to the attachment
> (#c5) fix the issue for you?

Hi Erik,

Thanks for your patch, it fixes the issue on my machine. I wonder why I am so lucky (or unlucky) that I can always reproduce this issue on my machine :)
Fixed upstream now:

commit c924965b240a7689af888132ae5b38aaabff6d46
Author: Erik Skultety <eskultet>
Date:   Wed Jun 29 16:12:58 2016 +0200

    admin: fix virt-admin startup crash by calling virAdmInitialize

    Similarly to what virsh and virt-login-shell do, call virAdmInitialize
    prior to initializing an event loop and initializing the error handler.
    Commit 97973ebb7 described and fixed an identical issue for libvirt_lxc.
    Since virAdmInitialize becomes a public API after applying this patch,
    the symbol is also added to public syms and the doc string of the
    method is slightly enhanced analogically to virInitialize.

    Signed-off-by: Erik Skultety <eskultet>