Bug 1860222

Summary: virt-qemu-run coredump occasionally when trying to start guest
Product: Red Hat Enterprise Linux Advanced Virtualization Reporter: yafu <yafu>
Component: libvirtAssignee: Martin Kletzander <mkletzan>
Status: CLOSED ERRATA QA Contact: yafu <yafu>
Severity: high Docs Contact:
Priority: medium    
Version: 8.3CC: berrange, chhu, dyuan, jdenemar, jsuchane, lmen, pkrempa, virt-maint, xuzhang
Target Milestone: rcKeywords: Triaged
Target Release: 8.3   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-7.3.0-1.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-11-16 07:50:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version: 7.1.0
Embargoed:

Description yafu 2020-07-24 03:29:00 UTC
Description of problem:
virt-qemu-run coredump occasionally when trying to start guest 

Version-Release number of selected component (if applicable):
libvirt-6.5.0-1.el8.x86_64
qemu-kvm-5.0.0-2.module+el8.3.0+7379+0505d6ca.x86_64

How reproducible:
occasionally

Steps to Reproduce:
1.stack trace is as follows:
Process 13651 (virt-qemu-run) of user 0 dumped core.
Stack trace of thread 13651:  
0x00007fd19726b7ff raise (libc.so.6)
0x00007fd197255c35 abort (libc.so.6)  
0x00007fd1980fe823 g_mutex_clear (libglib-2.0.so.0)  
0x00007fd1980b5a84 g_main_context_unref (libglib-2.0.so.0) 
0x00007fd19b2d3abb vir_event_thread_finalize (libvirt.so.0)
0x00007fd1983988a9 g_object_unref (libgobject-2.0.so.0)  
0x00007fd190211311 qemuProcessQMPFree (libvirt_driver_qemu.so)  
0x00007fd190164f58 virQEMUCapsInitQMPSingle (libvirt_driver_qemu.so)  
0x00007fd19016508f virQEMUCapsInitQMP (libvirt_driver_qemu.so)  
0x00007fd190165373 virQEMUCapsNewData (libvirt_driver_qemu.so) 
0x00007fd19b3772b1 virFileCacheNewData (libvirt.so.0) 
0x00007fd19b3777ea virFileCacheLookup (libvirt.so.0) 
0x00007fd190165627 virQEMUCapsCacheLookup (libvirt_driver_qemu.so) 
0x00007fd1901974c3 qemuDomainPostParseDataAlloc (libvirt_driver_qemu.so)
0x00007fd19b408051 virDomainDefPostParse (libvirt.so.0)
0x00007fd19b41a4bb virDomainDefParseNode (libvirt.so.0)
0x00007fd19b41a69b virDomainDefParse (libvirt.so.0)
0x00007fd19b41a7ae virDomainDefParseString (libvirt.so.0) 
0x00007fd1902a2112 qemuDomainCreateXML (libvirt_driver_qemu.so)
0x00007fd19b5fa8ac virDomainCreateXML (libvirt.so.0)
0x000055a7f3f0ee7b main (virt-qemu-run)
0x00007fd1972577b3 __libc_start_main (libc.so.6)
0x000055a7f3f0d74e _start (virt-qemu-run)

Stack trace of thread 13655:
0x00007fd1972b6656 malloc_consolidate (libc.so.6)
0x00007fd1972b8b08 _int_malloc (libc.so.6)  
0x00007fd1972ba412 malloc (libc.so.6) 
0x00007fd1980bd25e g_realloc (libglib-2.0.so.0)  
0x00007fd1980d9a57 g_string_maybe_expand (libglib-2.0.so.0) 
0x00007fd1980d9ab6 g_string_sized_new (libglib-2.0.so.0)
0x00007fd1980dcc5e g_test_log (libglib-2.0.so.0)
0x00007fd1980def2a g_assertion_message (libglib-2.0.so.0)  
0x00007fd1980def8e g_assertion_message_expr (libglib-2.0.so.0)  
0x00007fd1980a597e g_hash_table_remove_internal (libglib-2.0.so.0)
0x00007fd1980b4664 g_source_unref_internal (libglib-2.0.so.0)
0x00007fd19b2d384c glib_autoptr_cleanup_GSource (libvirt.so.0)
0x00007fd1980dfd4a g_thread_proxy (libglib-2.0.so.0)
0x00007fd197a1b14a start_thread (libpthread.so.0)
0x00007fd197330f23 __clone (libc.so.6)
Stack trace of thread 13652:
0x00007fd197325ca1 __poll (libc.so.6) 
0x00007fd1980b79b6 g_main_context_iterate.isra.21 (libglib-2.0.so.0)  
0x00007fd1980b7ae0 g_main_context_iteration (libglib-2.0.so.0) 
0x00007fd19b2d34a9 virEventGLibRunOnce (libvirt.so.0)
0x00007fd19b2d20a9 virEventRunDefaultImpl (libvirt.so.0)  
0x000055a7f3f0d91d qemuShimEventLoop (virt-qemu-run)  
0x00007fd1980dfd4a g_thread_proxy (libglib-2.0.so.0)  
0x00007fd197a1b14a start_thread (libpthread.so.0)  
0x00007fd197330f23 __clone (libc.so.6)

Actual results:


Expected results:


Additional info:

Comment 1 Jaroslav Suchanek 2020-07-24 13:07:15 UTC
Can you please attach full coredump, domain configuration and command line options used? Thanks.

Comment 2 yafu 2020-07-27 09:40:58 UTC
(In reply to Jaroslav Suchanek from comment #1)
> Can you please attach full coredump, domain configuration and command line
> options used? Thanks.

Coredump can be reproduced with the cmd:
#for i in {1..1000}; do virt-qemu-run -d -v /tmp/guest.xml & sleep 10; pkill -9 qemu-kvm; done

And the full backtrace is as follows:
Core was generated by `virt-qemu-run -d -v /tmp/guest.xml'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	  return ret;
[Current thread is 1 (Thread 0x7f50a1380c00 (LWP 222802))]
Missing separate debuginfos, use: yum debuginfo-install brotli-1.0.6-1.el8.x86_64 dbus-libs-1.12.8-10.el8.x86_64 device-mapper-libs-1.02.171-2.el8.x86_64 gnutls-3.6.14-3.el8.x86_64 krb5-libs-1.18.2-2.el8.x86_64 libblkid-2.32.1-22.el8.x86_64 libcom_err-1.45.4-3.el8.x86_64 libcurl-7.61.1-12.el8.x86_64 libgcrypt-1.8.5-3.el8.x86_64 libmount-2.32.1-22.el8.x86_64 libnghttp2-1.33.0-1.el8_0.1.x86_64 libssh-0.9.4-1.el8.x86_64 libuuid-2.32.1-22.el8.x86_64 openldap-2.4.46-11.el8.x86_64 openssl-libs-1.1.1g-5.el8.x86_64 sssd-client-2.2.3-20.el8.x86_64 zlib-1.2.11-14.el8.x86_64
(gdb) t a a bt

Thread 3 (Thread 0x7f5096429700 (LWP 222804)):
#0  0x00007f509c951ca1 in __GI___poll (fds=0x55b70acd20c0, nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007f509d6e39b6 in g_main_context_poll (priority=<optimized out>, n_fds=2, fds=0x55b70acd20c0, timeout=<optimized out>, context=0x55b70acf3530) at gmain.c:4203
#2  0x00007f509d6e39b6 in g_main_context_iterate (context=context@entry=0x55b70acf3530, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>)
    at gmain.c:3897
#3  0x00007f509d6e3ae0 in g_main_context_iteration (context=0x55b70acf3530, context@entry=0x0, may_block=may_block@entry=1) at gmain.c:3963
#4  0x00007f50a08ff4a9 in virEventGLibRunOnce () at ../../src/util/vireventglib.c:496
#5  0x00007f50a08fe0a9 in virEventRunDefaultImpl () at ../../src/util/virevent.c:343
#6  0x000055b70954691d in qemuShimEventLoop (opaque=<optimized out>) at ../../src/qemu/qemu_shim.c:41
#7  0x00007f509d70bd4a in g_thread_proxy (data=0x55b70acf3800) at gthread.c:784
#8  0x00007f509d04714a in start_thread (arg=<optimized out>) at pthread_create.c:479
#9  0x00007f509c95cf23 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7f508ffff700 (LWP 222813)):
#0  0x00007f509c8e26b0 in malloc_consolidate (av=av@entry=0x7f5088000020) at malloc.c:4488
#1  0x00007f509c8e4b08 in _int_malloc (av=av@entry=0x7f5088000020, bytes=bytes@entry=2048) at malloc.c:3711
#2  0x00007f509c8e6412 in __GI___libc_malloc (bytes=2048) at malloc.c:3073
#3  0x00007f509d6e925e in g_realloc (mem=0x0, n_bytes=2048) at gmem.c:164
#4  0x00007f509d705a57 in g_string_maybe_expand (string=string@entry=0x7f5088001f20, len=len@entry=1024) at gstring.c:102
#5  0x00007f509d705ab6 in g_string_sized_new (dfl_size=dfl_size@entry=1024) at gstring.c:127
#6  0x00007f509d708c5e in g_test_log_dump (len=<synthetic pointer>, msg=<synthetic pointer>) at gtestutils.c:3330
#7  0x00007f509d708c5e in g_test_log
    (lbit=G_TEST_LOG_ERROR, string1=0x7f508800fcb0 "GLib:ERROR:ghash.c:377:g_hash_table_lookup_node: assertion failed: (hash_table->ref_count > 0)", string2=<optimized out>, n_args=0, largs=0x0) at gtestutils.c:975
#8  0x00007f509d70af2a in g_assertion_message
    (domain=<optimized out>, file=0x7f509d7324a2 "ghash.c", line=<optimized out>, func=0x7f509d732750 <__func__.11348> "g_hash_table_lookup_node", message=<optimized out>)
    at gtestutils.c:2504
#9  0x00007f509d70af8e in g_assertion_message_expr
    (domain=domain@entry=0x7f509d72d76e "GLib", file=file@entry=0x7f509d7324a2 "ghash.c", line=line@entry=377, func=func@entry=0x7f509d732750 <__func__.11348> "g_hash_table_lookup_node", expr=expr@entry=0x7f509d732488 "hash_table->ref_count > 0") at gtestutils.c:2555
#10 0x00007f509d6d197e in g_hash_table_lookup_node (hash_table=0x55b70ace1760, key=<optimized out>, hash_return=<synthetic pointer>) at ghash.c:377
#11 0x00007f509d6d197e in g_hash_table_lookup_node (hash_return=<synthetic pointer>, key=<optimized out>, hash_table=0x55b70ace1760) at ghash.c:361
#12 0x00007f509d6d197e in g_hash_table_remove_internal (hash_table=0x55b70ace1760, key=<optimized out>, notify=1) at ghash.c:1371
#13 0x00007f509d6e0664 in g_source_unref_internal (source=0x7f5088000b60, context=0x55b70ad87e00, have_lock=0) at gmain.c:2103
#14 0x00007f509d6e1f64 in g_source_unref (source=<optimized out>) at gmain.c:2176
#15 0x00007f50a08ff84c in glib_autoptr_cleanup_GSource (_ptr=<synthetic pointer>) at /usr/include/glib-2.0/glib/glib-autocleanups.h:58
#16 0x00007f50a08ff84c in virEventThreadWorker (opaque=0x55b70ad87f80) at ../../src/util/vireventthread.c:114
#17 0x00007f509d70bd4a in g_thread_proxy (data=0x55b70acf3850) at gthread.c:784
#18 0x00007f509d04714a in start_thread (arg=<optimized out>) at pthread_create.c:479
--Type <RET> for more, q to quit, c to continue without paging--
#19 0x00007f509c95cf23 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f50a1380c00 (LWP 222802)):
#0  0x00007f509c8977ff in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f509c881c35 in __GI_abort () at abort.c:79
#2  0x00007f509d72a823 in g_mutex_clear (mutex=0x55b70ad87e00) at gthread-posix.c:1307
#3  0x00007f509d72a823 in g_mutex_clear (mutex=mutex@entry=0x55b70ad87e00) at gthread-posix.c:1302
#4  0x00007f509d6e1a84 in g_main_context_unref (context=0x55b70ad87e00) at gmain.c:582
#5  0x00007f509d6e1a84 in g_main_context_unref (context=0x55b70ad87e00) at gmain.c:541
#6  0x00007f50a08ffabb in vir_event_thread_finalize (object=0x55b70ad83180 [virEventThread]) at ../../src/util/vireventthread.c:50
#7  0x00007f509d9c48a9 in g_object_unref (_object=<optimized out>) at gobject.c:3340
#8  0x00007f509d9c48a9 in g_object_unref (_object=0x55b70ad83180) at gobject.c:3232

#9  0x00007f509583d311 in qemuProcessQMPFree (proc=proc@entry=0x55b70ad87b90) at ../../src/qemu/qemu_process.c:8355
#10 0x00007f5095790f58 in virQEMUCapsInitQMPSingle
    (qemuCaps=qemuCaps@entry=0x55b70ad88010, libDir=libDir@entry=0x55b70ad049e0 "/tmp/virt-qemu-run-VZC9N0/lib/qemu", runUid=runUid@entry=107, runGid=runGid@entry=107, onlyTCG=onlyTCG@entry=false) at ../../src/qemu/qemu_capabilities.c:5409
#11 0x00007f509579108f in virQEMUCapsInitQMP (runGid=107, runUid=107, libDir=0x55b70ad049e0 "/tmp/virt-qemu-run-VZC9N0/lib/qemu", qemuCaps=0x55b70ad88010)
    at ../../src/qemu/qemu_capabilities.c:5420
#12 0x00007f509579108f in virQEMUCapsNewForBinaryInternal
    (hostArch=VIR_ARCH_X86_64, binary=binary@entry=0x55b70ad7dc40 "/usr/libexec/qemu-kvm", libDir=0x55b70ad049e0 "/tmp/virt-qemu-run-VZC9N0/lib/qemu", runUid=107, runGid=107, hostCPUSignature=0x55b70ad01320 "GenuineIntel, Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz, family: 6, model: 85, stepping: 7", microcodeVersion=83898113, kernelVersion=0x55b70ad00d60 "4.18.0-211.el8.x86_64 #1 SMP Thu Jun 4 08:08:16 UTC 2020") at ../../src/qemu/qemu_capabilities.c:5472
#13 0x00007f5095791373 in virQEMUCapsNewData (binary=0x55b70ad7dc40 "/usr/libexec/qemu-kvm", privData=0x55b70ad5b8f0) at ../../src/qemu/qemu_capabilities.c:5505
#14 0x00007f50a09a32b1 in virFileCacheNewData (name=0x55b70ad7dc40 "/usr/libexec/qemu-kvm", cache=<optimized out>) at ../../src/util/virfilecache.c:208
#15 0x00007f50a09a32b1 in virFileCacheValidate (cache=cache@entry=0x55b70ad5c030, name=name@entry=0x55b70ad7dc40 "/usr/libexec/qemu-kvm", data=data@entry=0x7ffca39ffd90)
    at ../../src/util/virfilecache.c:277
#16 0x00007f50a09a37ea in virFileCacheLookup (cache=cache@entry=0x55b70ad5c030, name=name@entry=0x55b70ad7dc40 "/usr/libexec/qemu-kvm") at ../../src/util/virfilecache.c:310
#17 0x00007f5095791627 in virQEMUCapsCacheLookup (cache=0x55b70ad5c030, binary=0x55b70ad7dc40 "/usr/libexec/qemu-kvm") at ../../src/qemu/qemu_capabilities.c:5647
#18 0x00007f50957c34c3 in qemuDomainPostParseDataAlloc (def=<optimized out>, parseFlags=<optimized out>, opaque=<optimized out>, parseOpaque=0x7ffca39ffe18)
    at ../../src/qemu/qemu_domain.c:5470
#19 0x00007f50a0a34051 in virDomainDefPostParse
    (def=def@entry=0x55b70ad7d200, parseFlags=parseFlags@entry=258, xmlopt=xmlopt@entry=0x55b70ad5d010, parseOpaque=parseOpaque@entry=0x0)
    at ../../src/conf/domain_conf.c:5970
#20 0x00007f50a0a464bb in virDomainDefParseNode
    (xml=xml@entry=0x55b70aced140, root=root@entry=0x55b70ad5f020, xmlopt=xmlopt@entry=0x55b70ad5d010, parseOpaque=parseOpaque@entry=0x0, flags=flags@entry=258)
    at ../../src/conf/domain_conf.c:22520
#21 0x00007f50a0a4669b in virDomainDefParse
    (xmlStr=xmlStr@entry=0x55b70ad5f9e0 "<domain type='kvm'>\n  <name>83</name>\n  <uuid>9350639d-1c8a-4f51-a4a6-4eaf8eabe83e</uuid>\n  <metadata>\n    <libosinfo:libosinfo xmlns:libosinfo=\"http://libosinfo.org/xmlns/libvirt/domain/1.0\">\n      <"..., filename=filename@entry=0x0, xmlopt=0x55b70ad5d010, parseOpaque=parseOpaque@entry=0x0, flags=flags@entry=258) at ../../src/conf/domain_conf.c:22474
--Type <RET> for more, q to quit, c to continue without paging--
#22 0x00007f50a0a467ae in virDomainDefParseString
    (xmlStr=xmlStr@entry=0x55b70ad5f9e0 "<domain type='kvm'>\n  <name>83</name>\n  <uuid>9350639d-1c8a-4f51-a4a6-4eaf8eabe83e</uuid>\n  <metadata>\n    <libosinfo:libosinfo xmlns:libosinfo=\"http://libosinfo.org/xmlns/libvirt/domain/1.0\">\n      <"..., xmlopt=<optimized out>, parseOpaque=parseOpaque@entry=0x0, flags=flags@entry=258)
    at ../../src/conf/domain_conf.c:22488
#23 0x00007f50958ce112 in qemuDomainCreateXML
    (conn=0x55b70acf9090, xml=0x55b70ad5f9e0 "<domain type='kvm'>\n  <name>83</name>\n  <uuid>9350639d-1c8a-4f51-a4a6-4eaf8eabe83e</uuid>\n  <metadata>\n    <libosinfo:libosinfo xmlns:libosinfo=\"http://libosinfo.org/xmlns/libvirt/domain/1.0\">\n      <"..., flags=0) at ../../src/qemu/qemu_driver.c:1744
#24 0x00007f50a0c268ac in virDomainCreateXML
    (conn=0x55b70acf9090, xmlDesc=0x55b70ad5f9e0 "<domain type='kvm'>\n  <name>83</name>\n  <uuid>9350639d-1c8a-4f51-a4a6-4eaf8eabe83e</uuid>\n  <metadata>\n    <libosinfo:libosinfo xmlns:libosinfo=\"http://libosinfo.org/xmlns/libvirt/domain/1.0\">\n      <"..., flags=0) at ../../src/libvirt-domain.c:176
#25 0x000055b709547e7b in main (argc=<optimized out>, argv=<optimized out>) at ../../src/qemu/qemu_shim.c:289

Comment 3 Daniel Berrangé 2020-07-29 16:57:41 UTC
I've not been successful in reproducing the problem, but the stack trace is reasonably clear in what's wrong, so I've done this fix:

https://www.redhat.com/archives/libvir-list/2020-July/msg01950.html

Comment 5 Peter Krempa 2021-03-10 17:33:36 UTC
This was fixed in https://bugzilla.redhat.com/show_bug.cgi?id=1894045 but here the reproducer is a bit different since it involves virt-qemu-run. I'll move this to post so that it's tested in the next cycle.

Comment 8 yafu 2021-05-27 01:39:42 UTC
Verified with libvirt-daemon-7.3.0-1.module+el8.5.0+11004+f4810536.x86_64.

Test steps:
1.Execute 'virt-qemu-run' to start guest 1000 times and no coredump happend:
# for i in {1..1000}; do virt-qemu-run -d -v /root/vm1.xml & sleep 5; pkill -9 qemu-kvm; done

Comment 10 errata-xmlrpc 2021-11-16 07:50:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4684