Bug 1845468
| Summary: | [backport] libvirt crashes when stopping daemon after virsh command | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | smitterl |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| Status: | CLOSED ERRATA | QA Contact: | yafu <yafu> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 8.3 | CC: | dzheng, jdenemar, jsuchane, lmen, mprivozn, pkrempa, smitterl, virt-bugs, virt-maint, yalzhang |
| Target Milestone: | rc | Keywords: | Automation, Regression, Triaged, Upstream |
| Target Release: | 8.0 | Flags: | pm-rhel: mirror+ |
| Hardware: | All | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | libvirt-7.10.0-1.module+el8.6.0+13502+4f24a11d | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1836865 | Environment: | |
| Last Closed: | 2022-05-10 13:18:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | 7.10.0 |
| Embargoed: | |||
| Bug Depends On: | 1836865, 1949342 | ||
| Bug Blocks: | | | |
Comment 2
Peter Krempa
2020-07-13 08:05:35 UTC
There were a number of changes/improvements in the daemon stopping code:
[jarda@jsrh libvirt]$ git log --author=nshirokovskiy
commit 9b648cb83eeb86293ed3471098fa21f6c44b5e31
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Jul 9 11:13:44 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:34:00 2020 +0300
util: remove unused virThreadPoolNew macro
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel P. Berrangé <berrange>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
commit 61845fbf4226118a0cf868974cc100b852e0a282
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Jul 9 11:13:12 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:33:59 2020 +0300
rpc: cleanup virNetDaemonClose method
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel P. Berrangé <berrange>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
commit 399039a6b1d07610294cc810ac7a01b51ff6b5cf
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Jul 9 11:12:26 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:33:59 2020 +0300
qemu: implement driver's shutdown/shutdown wait methods
On shutdown we just stop accepting new jobs for the worker thread so that
on shutdown wait we can exit the worker thread faster. This effectively
stops processing of events for VMs, but we are going to do that anyway on
daemon shutdown.
At the same time, synchronous event processing that some API calls may
require is still possible, because each per-VM event loop is still running
and synchronous event processing does not need the worker thread.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel P. Berrangé <berrange>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
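
For context, a hedged sketch of the prepare/wait split this commit message describes, using a hypothetical stop/drain worker-pool pair (the pool side is sketched after the thread-pool commit further down). All names here are stand-ins, not the actual qemu driver code:

/* Hypothetical names; not the real libvirt code. "Prepare" is quick and
 * non-blocking: the worker pool stops accepting jobs, but the per-VM event
 * loops keep running, so synchronous event processing still works.
 * "Wait" then joins the worker threads. */
typedef struct WorkerPool WorkerPool;    /* opaque stand-in */
void workerPoolStop(WorkerPool *pool);   /* stop taking new jobs */
void workerPoolDrain(WorkerPool *pool);  /* join worker threads */

static WorkerPool *driverWorkerPool;     /* the driver's job pool */

static int stateShutdownPrepare(void)
{
    workerPoolStop(driverWorkerPool);
    return 0;
}

static int stateShutdownWait(void)
{
    workerPoolDrain(driverWorkerPool);
    return 0;
}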
commit 860a999802d3c82538373bb3f314f92a2e258754
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Jul 23 11:02:59 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:33:59 2020 +0300
qemu: avoid deadlock in qemuDomainObjStopWorker
We are dropping the only reference here so that the event loop thread
exits synchronously. To avoid deadlocks we need to unlock the VM so that
any handler being called can finish execution, and thus the event loop
thread can finish too.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
Reviewed-by: Daniel P. Berrangé <berrange>
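
The deadlock shape that message describes can be shown with plain pthreads; a minimal self-contained sketch, assuming a generic object whose lock a handler on the event-loop thread may need:

#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_t event_thread;
} Obj;

/* Called with obj->lock held. Joining the event-loop thread while still
 * holding the lock can deadlock: a handler running on that thread may be
 * blocked on obj->lock and never finish. So drop the lock around the join. */
static void stop_event_thread(Obj *obj)
{
    pthread_mutex_unlock(&obj->lock);       /* let in-flight handlers lock obj */
    pthread_join(obj->event_thread, NULL);  /* now the join cannot deadlock */
    pthread_mutex_lock(&obj->lock);         /* restore the caller's locking state */
}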
commit f4fc3db9204407874181117085756c9ced78adad
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Jul 23 10:23:00 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:33:59 2020 +0300
vireventthread: exit thread synchronously on finalize
It is useful to be sure no thread is running after we drop all references
to virEventThread. Otherwise, to avoid crashes, we would need to synchronize
some other way, or take extra references in event handler callbacks to all
the objects in use, and some of them are not prepared to be refcounted.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
Reviewed-by: Daniel P. Berrangé <berrange>
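
A minimal GLib-flavoured sketch of the idea, assuming a simplified ref-counted wrapper (not the actual vireventthread code): dropping the last reference quits the loop and joins the thread, so nothing can run afterwards.

#include <glib.h>

typedef struct {
    gint refs;
    GThread *thread;   /* runs g_main_loop_run(loop) */
    GMainLoop *loop;
} EvThread;

static void ev_thread_unref(EvThread *evt)
{
    if (g_atomic_int_dec_and_test(&evt->refs)) {
        g_main_loop_quit(evt->loop);  /* safe to call from any thread */
        g_thread_join(evt->thread);   /* synchronous: no handler runs after this */
        g_main_loop_unref(evt->loop);
        g_free(evt);
    }
}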
commit 5c0cd375d1d1659ae3a5db0ce3e26e5570123dff
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Jul 23 10:10:26 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:33:59 2020 +0300
qemu: don't shutdown event thread in monitor EOF callback
This hunk was introduced in [1] to avoid losing events from the
monitor when stopping the qemu process. But as explained in [2],
on destroy we get neither EOF nor any other events, as the monitor
is simply closed. In case of crash/shutdown we won't get any more
events either, and qemuDomainObjStopWorker will eventually be called
by qemuProcessStop. Thus let's remove qemuDomainObjStopWorker from
qemuProcessHandleMonitorEOF, as it is not useful anymore.
[1] e6afacb0f: qemu: start/stop an event loop thread for domains
[2] d2954c072: qemu: ensure domain event thread is always stopped
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
Reviewed-by: Daniel P. Berrangé <berrange>
commit 94e45d1042e21e03a15ce993f90fbef626f1ae41
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Jul 23 09:53:04 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:33:59 2020 +0300
rpc: finish all threads before exiting main loop
Currently we have issues like [1] on libvirtd shutdown because we clean up
while RPC and other threads are still running. Let's finish all threads
other than the main one before cleanup.
The approach to finishing threads is suggested in [2]. To finish RPC
threads serving API calls we let the event loop run but stop accepting new
API calls and block processing of any pending API calls. We also inform all
drivers of the shutdown so they can prepare too. Then we wait for all RPC
threads and the drivers' background threads to finish. If finishing takes
more than 15s we just exit, as we can't safely clean up in time.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1828207
[2] https://www.redhat.com/archives/libvir-list/2020-April/msg01328.html
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
Reviewed-by: Daniel P. Berrangé <berrange>
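
The 15-second budget can be implemented with a timed condition wait; a self-contained sketch under assumed names (here, "finished" would be set by the shutdown-wait worker once all threads are joined):

#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool finished;  /* set when all RPC/driver threads have been joined */

/* Returns true if shutdown completed in time; on false the caller just
 * _exit()s, since cleanup can no longer be done safely. */
static bool wait_for_shutdown(void)
{
    struct timespec deadline;
    bool ok;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += 15;  /* the 15s budget from the commit message */

    pthread_mutex_lock(&lock);
    while (!finished &&
           pthread_cond_timedwait(&cond, &lock, &deadline) != ETIMEDOUT)
        ;  /* retry on spurious wakeups until done or timed out */
    ok = finished;
    pthread_mutex_unlock(&lock);
    return ok;
}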
commit b776dfa8e881c868dc554c5c245f15c49332ce80
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Jul 23 09:50:25 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:33:59 2020 +0300
rpc: add shutdown facilities to netserver
virNetServerClose and virNetServerShutdownWait are used, respectively, to
start shutting down the net server threads and to wait for those threads
to actually finish during the net daemon shutdown procedure.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel P. Berrangé <berrange>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
commit 0f38dedd8929dcb1473fc64773be4b941526ee1d
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Jul 23 09:43:46 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:33:59 2020 +0300
rpc: add virNetDaemonSetShutdownCallbacks
The function is used to set the shutdown prepare and wait callbacks. The
prepare callback informs the daemon's other threads that the daemon will be
closed soon, so they can start to shut down. The wait callback waits for
those threads to actually finish.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
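
In shape, the registration is just a pair of function pointers on the daemon object; a sketch with assumed signatures, not the real virNetDaemon layout:

typedef int (*ShutdownPrepare)(void);  /* tell threads to start exiting */
typedef int (*ShutdownWait)(void);     /* block until they have exited */

typedef struct {
    ShutdownPrepare prepare;  /* invoked when shutdown is requested */
    ShutdownWait wait;        /* invoked before cleanup, within the time budget */
} NetDaemonSketch;

static void
net_daemon_set_shutdown_callbacks(NetDaemonSketch *dmn,
                                  ShutdownPrepare prepare,
                                  ShutdownWait wait)
{
    dmn->prepare = prepare;
    dmn->wait = wait;
}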
commit 1eae52b9f1f2c0232d14e0effa47e8e6e5cce28d
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Jul 9 10:59:33 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:33:58 2020 +0300
rpc: don't unref service ref on socket behalf twice
The second unref was added in [1]. We don't actually need it: we pass a
free callback to virNetSocketAddIOCallback, so when we call
virNetSocketRemoveIOCallback the extra reference held for the callback is
dropped without any extra effort.
[1] 355d8f470f9: virNetServerServiceClose: Don't leak sockets
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel P. Berrangé <berrange>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
commit 255437eeb710d8135136af11b37ceae674d483ce
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Jul 9 10:58:02 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:33:58 2020 +0300
util: add stop/drain functions to thread pool
Stop just signals the threads to exit once they finish their current task.
Drain waits until all threads have finished.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel P. Berrangé <berrange>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
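
A self-contained pthread sketch of that stop/drain split (illustrative only; the real virThreadPool carries more state, such as job queues and priority workers):

#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;  /* signalled on new work and on stop */
    bool quit;
    pthread_t *workers;
    size_t nworkers;
} Pool;

static void *worker(void *opaque)
{
    Pool *pool = opaque;

    pthread_mutex_lock(&pool->lock);
    while (!pool->quit) {
        /* ...dequeue and run one job here, or sleep while idle... */
        pthread_cond_wait(&pool->cond, &pool->lock);
    }
    pthread_mutex_unlock(&pool->lock);
    return NULL;  /* exit only after finishing the current task */
}

/* Stop: flag the workers to exit; returns without waiting. */
static void pool_stop(Pool *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->quit = true;
    pthread_cond_broadcast(&pool->cond);  /* wake idle workers to see the flag */
    pthread_mutex_unlock(&pool->lock);
}

/* Drain: wait until every worker has actually exited. */
static void pool_drain(Pool *pool)
{
    size_t i;

    pool_stop(pool);
    for (i = 0; i < pool->nworkers; i++)
        pthread_join(pool->workers[i], NULL);
}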
commit 018e213f5d1bbf5a68b7b7d46c8bacec06d97d49
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Fri Jul 10 14:36:54 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:33:58 2020 +0300
util: always initialize priority condition
Even if we have no priority threads at pool creation, we can add them later
through virThreadPoolSetParameters.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel P. Berrangé <berrange>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
commit c5bf40bfa6afa4ec27052120e268a55213709ca4
Author: Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Jul 9 10:46:34 2020 +0300
Commit: Nikolay Shirokovskiy <nshirokovskiy>
CommitDate: Mon Sep 7 09:33:58 2020 +0300
libvirt: add stateShutdownPrepare/stateShutdownWait to drivers
stateShutdownPrepare is supposed to inform the driver that it will be
closed soon, so that the driver can prepare to finish all background
threads quickly on the stateShutdownWait call.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy>
Reviewed-by: Daniel P. Berrangé <berrange>
Reviewed-by: Daniel Henrique Barboza <danielhb413>
Would you be able to re-test this with current master? Thanks.
Couldn't reproduce with latest master.

Reproduced with: libvirt-daemon-6.6.0-4.module+el8.3.0+7883+3d717aa8.s390x
How often: 8/10
Command: for i in $(seq 10); do dmesg -C; systemctl start libvirtd; virsh list; systemctl stop libvirtd; dmesg; sleep 2; done

Verified with: libvirt@master HEAD@8c16f81eb97bbd576a009f64f13773200171704b (Thu Sep 10 15:24:02 2020 +0200)
# virsh --version
6.8.0
How often: 0/50
Command: for i in $(seq 50); do dmesg -C; systemctl start libvirtd; virsh list; systemctl stop libvirtd; dmesg; sleep 2; done

Moving to POST per comments 3 and 4.

Hi Michal,
I met another libvirtd crash while trying to verify this bug. Could you help to check it please? Thanks in advance.
Test steps:
1.# for i in {1..10}; do systemctl restart libvirtd; virsh list ; systemctl restart libvirtd ; sleep 3; done
2.# coredumpctl debug
PID: 1656719 (libvirtd)
UID: 0 (root)
GID: 0 (root)
Signal: 11 (SEGV)
Timestamp: Tue 2021-10-26 03:59:46 EDT (4min 49s ago)
Command Line: /usr/sbin/libvirtd --timeout 120
Executable: /usr/sbin/libvirtd
Control Group: /
Slice: -.slice
Boot ID: 914f188bc9df437ca6ca52fcf196857d
Machine ID: a7d52dfbef0b4c86a4823ec28ac04cf1
Hostname: dell-per730-30.lab.eng.pek2.redhat.com
Storage: /var/lib/systemd/coredump/core.libvirtd.0.914f188bc9df437ca6ca52fcf196857d.1656719.1635235186000000.lz4
Message: Process 1656719 (libvirtd) of user 0 dumped core.
Stack trace of thread 1656794:
#0 0x00007f836a26ac24 virEventThreadGetContext (libvirt.so.0)
#1 0x00007f832090e547 qemuConnectAgent (libvirt_driver_qemu.so)
#2 0x00007f832091aa7f qemuProcessReconnect (libvirt_driver_qemu.so)
#3 0x00007f836a2be31b virThreadHelper (libvirt.so.0)
#4 0x00007f83665b117a start_thread (libpthread.so.0)
#5 0x00007f8368d5fdc3 __clone (libc.so.6)
Stack trace of thread 1656719:
#0 0x00007f8368cb3167 vfprintf (libc.so.6)
#1 0x00007f8368d6fc8c __vasprintf_chk (libc.so.6)
#2 0x00007f8369b03d7d g_vasprintf (libglib-2.0.so.0)
#3 0x00007f8369adcf51 g_strdup_vprintf (libglib-2.0.so.0)
#4 0x00007f836a2485ed vir_g_strdup_vprintf (libvirt.so.0)
#5 0x00007f836a288717 virLogMessage (libvirt.so.0)
#6 0x00007f836a2a1d8a virObjectUnref (libvirt.so.0)
#7 0x00007f836a3e9f4c virSecurityStackClose (libvirt.so.0)
#8 0x00007f836a3e64f7 virSecurityManagerDispose (libvirt.so.0)
#9 0x00007f836a2a18de vir_object_finalize (libvirt.so.0)
#10 0x00007f83698328a9 g_object_unref (libgobject-2.0.so.0)
#11 0x00007f836a2a1d58 virObjectUnref (libvirt.so.0)
#12 0x00007f832088b5dc qemuStateCleanup (libvirt_driver_qemu.so)
#13 0x00007f836a4721d0 virStateCleanup (libvirt.so.0)
#14 0x000055b37089866d main (libvirtd)
#15 0x00007f8368c86493 __libc_start_main (libc.so.6)
#16 0x000055b370898e7e _start (libvirtd)
Stack trace of thread 1656736:
#0 0x00007f8368d54a41 __poll (libc.so.6)
#1 0x00007f8369abdc86 g_main_context_iterate.isra.21 (libglib-2.0.so.0)
#2 0x00007f8369abddb0 g_main_context_iteration (libglib-2.0.so.0)
#3 0x00007f8369abde01 glib_worker_main (libglib-2.0.so.0)
#4 0x00007f8369ae5fca g_thread_proxy (libglib-2.0.so.0)
#5 0x00007f83665b117a start_thread (libpthread.so.0)
#6 0x00007f8368d5fdc3 __clone (libc.so.6)
Stack trace of thread 1656788:
#0 0x00007f8368d506eb llseek.5 (libc.so.6)
#1 0x00007f836a2a26a4 virPCIDeviceRead.isra.1 (libvirt.so.0)
#2 0x00007f836a2a2790 virPCIDeviceFindCapabilityOffset (libvirt.so.0)
#3 0x00007f836a2a44ed virPCIDeviceInit (libvirt.so.0)
#4 0x00007f836a2a6e90 virPCIDeviceIsPCIExpress (libvirt.so.0)
#5 0x00007f8321231f6d udevAddOneDevice (libvirt_driver_nodedev.so)
#6 0x00007f83212326ee nodeStateInitializeEnumerate (libvirt_driver_nodedev.so)
#7 0x00007f836a2be31b virThreadHelper (libvirt.so.0)
#8 0x00007f83665b117a start_thread (libpthread.so.0)
#9 0x00007f8368d5fdc3 __clone (libc.so.6)
Stack trace of thread 1656738:
#0 0x00007f8368d54a41 __poll (libc.so.6)
#1 0x00007f8369abdc86 g_main_context_iterate.isra.21 (libglib-2.0.so.0)
#2 0x00007f8369abe042 g_main_loop_run (libglib-2.0.so.0)
#3 0x00007f836954a5da gdbus_shared_thread_func (libgio-2.0.so.0)
#4 0x00007f8369ae5fca g_thread_proxy (libglib-2.0.so.0)
#5 0x00007f83665b117a start_thread (libpthread.so.0)
#6 0x00007f8368d5fdc3 __clone (libc.so.6)
Stack trace of thread 1656787:
#0 0x00007f83665b73fc pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00007f836a2be1ca virCondWait (libvirt.so.0)
#2 0x00007f83212328ac udevEventHandleThread (libvirt_driver_nodedev.so)
#3 0x00007f836a2be31b virThreadHelper (libvirt.so.0)
#4 0x00007f83665b117a start_thread (libpthread.so.0)
#5 0x00007f8368d5fdc3 __clone (libc.so.6)
Stack trace of thread 1656793:
#0 0x00007f83665b73fc pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00007f836a2be1ca virCondWait (libvirt.so.0)
#2 0x00007f83208e5654 qemuMonitorSend (libvirt_driver_qemu.so)
#3 0x00007f83208f4c35 qemuMonitorJSONCommandWithFd (libvirt_driver_qemu.so)
#4 0x00007f83208ff658 qemuMonitorJSONGetObjectListPaths (libvirt_driver_qemu.so)
#5 0x00007f8320901d4b qemuMonitorJSONGetDeviceAliases (libvirt_driver_qemu.so)
#6 0x00007f832087a3e3 qemuDomainUpdateDeviceList (libvirt_driver_qemu.so)
#7 0x00007f832091a6c7 qemuProcessReconnect (libvirt_driver_qemu.so)
#8 0x00007f836a2be31b virThreadHelper (libvirt.so.0)
#9 0x00007f83665b117a start_thread (libpthread.so.0)
#10 0x00007f8368d5fdc3 __clone (libc.so.6)
Stack trace of thread 1656792:
#0 0x00007f83665b73fc pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00007f836a2be1ca virCondWait (libvirt.so.0)
#2 0x00007f83208e5654 qemuMonitorSend (libvirt_driver_qemu.so)
#3 0x00007f83208f4c35 qemuMonitorJSONCommandWithFd (libvirt_driver_qemu.so)
#4 0x00007f83208ff658 qemuMonitorJSONGetObjectListPaths (libvirt_driver_qemu.so)
#5 0x00007f8320901d4b qemuMonitorJSONGetDeviceAliases (libvirt_driver_qemu.so)
#6 0x00007f832087a3e3 qemuDomainUpdateDeviceList (libvirt_driver_qemu.so)
#7 0x00007f832091a6c7 qemuProcessReconnect (libvirt_driver_qemu.so)
#8 0x00007f836a2be31b virThreadHelper (libvirt.so.0)
#9 0x00007f83665b117a start_thread (libpthread.so.0)
#10 0x00007f8368d5fdc3 __clone (libc.so.6)
I did the test with libvirt-7.8.0-1.module+el8.6.0+12982+5e169f40.x86_64 for the issue in comment #9.

I'm sorry but from the stack trace it's unclear to me what is wrong. I suspect the second thread is the one that crashed, but it's not obvious why. Do you have a coredump available please?

(In reply to Michal Privoznik from comment #11)
> I'm sorry but from the stack trace it's unclear to me what is wrong. I
> suspect the second thread is the one that crashed, but it's not obvious why.
> Do you have a coredump available please?

Sorry, I did not install the debug packages for the backtrace in comment #9. How about this backtrace?

(gdb) t a a bt

Thread 8 (Thread 0x7f8305ffb700 (LWP 1656792)):
#0  0x00007f83665b73fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f836a2be1ca in virCondWait (c=c@entry=0x7f82e4005630, m=m@entry=0x7f82e4005608) at ../src/util/virthread.c:156
#2  0x00007f83208e5654 in qemuMonitorSend (mon=mon@entry=0x7f82e40055f0, msg=msg@entry=0x7f8305ffa780) at ../src/qemu/qemu_monitor.c:961
#3  0x00007f83208f4c35 in qemuMonitorJSONCommandWithFd (mon=0x7f82e40055f0, cmd=0x7f82f0000c60, scm_fd=-1, reply=0x7f8305ffa810) at ../src/qemu/qemu_monitor_json.c:327
#4  0x00007f83208ff658 in qemuMonitorJSONCommand (reply=0x7f8305ffa810, cmd=0x7f82f0000c60, mon=0x7f82e40055f0) at ../src/qemu/qemu_monitor_json.c:6283
#5  qemuMonitorJSONGetObjectListPaths (mon=0x7f82e40055f0, path=path@entry=0x7f832095c873 "/machine/peripheral", paths=paths@entry=0x7f8305ffa860) at ../src/qemu/qemu_monitor_json.c:6283
#6  0x00007f8320901d4b in qemuMonitorJSONGetDeviceAliases (mon=<optimized out>, aliases=aliases@entry=0x7f8305ffa890) at ../src/qemu/qemu_monitor_json.c:7421
#7  0x00007f83208eefc6 in qemuMonitorGetDeviceAliases (mon=<optimized out>, aliases=aliases@entry=0x7f8305ffa890) at ../src/qemu/qemu_monitor.c:3920
#8  0x00007f832087a3e3 in qemuDomainUpdateDeviceList (driver=driver@entry=0x7f8300147890, vm=vm@entry=0x7f83001d4020, asyncJob=asyncJob@entry=0) at ../src/qemu/qemu_domain.c:8163
#9  0x00007f832091a6c7 in qemuProcessUpdateDevices (vm=0x7f83001d4020, driver=0x7f8300147890) at ../src/qemu/qemu_process.c:3726
#10 qemuProcessReconnect (opaque=<optimized out>) at ../src/qemu/qemu_process.c:8719
#11 0x00007f836a2be31b in virThreadHelper (data=<optimized out>) at ../src/util/virthread.c:241
#12 0x00007f83665b117a in start_thread () from /lib64/libpthread.so.0
#13 0x00007f8368d5fdc3 in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x7f83057fa700 (LWP 1656793)):
#0  0x00007f83665b73fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f836a2be1ca in virCondWait (c=c@entry=0x7f82e40050b0, m=m@entry=0x7f82e4005088) at ../src/util/virthread.c:156
#2  0x00007f83208e5654 in qemuMonitorSend (mon=mon@entry=0x7f82e4005070, msg=msg@entry=0x7f83057f9780) at ../src/qemu/qemu_monitor.c:961
#3  0x00007f83208f4c35 in qemuMonitorJSONCommandWithFd (mon=0x7f82e4005070, cmd=0x7f82e4010340, scm_fd=-1, reply=0x7f83057f9810) at ../src/qemu/qemu_monitor_json.c:327
#4  0x00007f83208ff658 in qemuMonitorJSONCommand (reply=0x7f83057f9810, cmd=0x7f82e4010340, mon=0x7f82e4005070) at ../src/qemu/qemu_monitor_json.c:6283
#5  qemuMonitorJSONGetObjectListPaths (mon=0x7f82e4005070, path=path@entry=0x7f832095c873 "/machine/peripheral", paths=paths@entry=0x7f83057f9860) at ../src/qemu/qemu_monitor_json.c:6283
#6  0x00007f8320901d4b in qemuMonitorJSONGetDeviceAliases (mon=<optimized out>, aliases=aliases@entry=0x7f83057f9890) at ../src/qemu/qemu_monitor_json.c:7421
#7  0x00007f83208eefc6 in qemuMonitorGetDeviceAliases (mon=<optimized out>, aliases=aliases@entry=0x7f83057f9890) at ../src/qemu/qemu_monitor.c:3920
#8  0x00007f832087a3e3 in qemuDomainUpdateDeviceList (driver=driver@entry=0x7f8300147890, vm=vm@entry=0x7f83001d4110, asyncJob=asyncJob@entry=0) at ../src/qemu/qemu_domain.c:8163
#9  0x00007f832091a6c7 in qemuProcessUpdateDevices (vm=0x7f83001d4110, driver=0x7f8300147890) at ../src/qemu/qemu_process.c:3726
#10 qemuProcessReconnect (opaque=<optimized out>) at ../src/qemu/qemu_process.c:8719
#11 0x00007f836a2be31b in virThreadHelper (data=<optimized out>) at ../src/util/virthread.c:241
#12 0x00007f83665b117a in start_thread () from /lib64/libpthread.so.0
#13 0x00007f8368d5fdc3 in clone () from /lib64/libc.so.6

Thread 6 (Thread 0x7f8306ffd700 (LWP 1656787)):
#0  0x00007f83665b73fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f836a2be1ca in virCondWait (c=c@entry=0x55b370dd26f8, m=m@entry=0x55b370dd26b8) at ../src/util/virthread.c:156
#2  0x00007f83212328ac in udevEventHandleThread (opaque=<optimized out>) at ../src/node_device/node_device_udev.c:1802
#3  0x00007f836a2be31b in virThreadHelper (data=<optimized out>) at ../src/util/virthread.c:241
#4  0x00007f83665b117a in start_thread () from /lib64/libpthread.so.0
#5  0x00007f8368d5fdc3 in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7f83077fe700 (LWP 1656738)):
#0  0x00007f8368d54a41 in poll () from /lib64/libc.so.6
#1  0x00007f8369abdc86 in g_main_context_iterate.isra () from /lib64/libglib-2.0.so.0
#2  0x00007f8369abe042 in g_main_loop_run () from /lib64/libglib-2.0.so.0
#3  0x00007f836954a5da in gdbus_shared_thread_func () from /lib64/libgio-2.0.so.0
#4  0x00007f8369ae5fca in g_thread_proxy () from /lib64/libglib-2.0.so.0
#5  0x00007f83665b117a in start_thread () from /lib64/libpthread.so.0
#6  0x00007f8368d5fdc3 in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7f83067fc700 (LWP 1656788)):
#0  0x00007f8368d506eb in lseek64 () from /lib64/libc.so.6
#1  0x00007f836a2a26a4 in virPCIDeviceRead (cfgfd=cfgfd@entry=3, pos=pos@entry=6, buf=buf@entry=0x7f83067fb936 "", buflen=buflen@entry=2, dev=<optimized out>) at ../src/util/virpci.c:341
#2  0x00007f836a2a2790 in virPCIDeviceRead16 (pos=6, cfgfd=3, dev=<optimized out>) at ../src/util/virpci.c:540
#3  virPCIDeviceFindCapabilityOffset (dev=dev@entry=0x7f82ec081160, cfgfd=cfgfd@entry=3, capability=capability@entry=1, offset=offset@entry=0x7f82ec0811bc) at ../src/util/virpci.c:540
#4  0x00007f836a2a44ed in virPCIDeviceInit (dev=0x7f82ec081160, cfgfd=3) at ../src/util/virpci.c:995
#5  0x00007f836a2a6e90 in virPCIDeviceIsPCIExpress (dev=dev@entry=0x7f82ec081160) at ../src/util/virpci.c:2727
#6  0x00007f8321231f6d in udevProcessPCI (def=0x7f82ec04eca0, device=<optimized out>) at ../src/node_device/node_device_udev.c:434
#7  udevGetDeviceDetails (def=0x7f82ec04eca0, device=<optimized out>) at ../src/node_device/node_device_udev.c:1366
#8  udevAddOneDevice (device=<optimized out>) at ../src/node_device/node_device_udev.c:1524
#9  0x00007f83212326ee in udevProcessDeviceListEntry (list_entry=0x7f82ec01e6b0, udev=0x7f830010dda0) at ../src/node_device/node_device_udev.c:1594
#10 udevEnumerateDevices (udev=0x7f830010dda0) at ../src/node_device/node_device_udev.c:1648
#11 nodeStateInitializeEnumerate (opaque=0x7f830010dda0) at ../src/node_device/node_device_udev.c:1979
#12 0x00007f836a2be31b in virThreadHelper (data=<optimized out>) at ../src/util/virthread.c:241
#13 0x00007f83665b117a in start_thread () from /lib64/libpthread.so.0
#14 0x00007f8368d5fdc3 in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f8307fff700 (LWP 1656736)):
#0  0x00007f8368d54a41 in poll () from /lib64/libc.so.6
#1  0x00007f8369abdc86 in g_main_context_iterate.isra () from /lib64/libglib-2.0.so.0
#2  0x00007f8369abddb0 in g_main_context_iteration () from /lib64/libglib-2.0.so.0
#3  0x00007f8369abde01 in glib_worker_main () from /lib64/libglib-2.0.so.0
#4  0x00007f8369ae5fca in g_thread_proxy () from /lib64/libglib-2.0.so.0
#5  0x00007f83665b117a in start_thread () from /lib64/libpthread.so.0
#6  0x00007f8368d5fdc3 in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f836aa08b00 (LWP 1656719)):
#0  0x00007f8368cb3167 in vfprintf () from /lib64/libc.so.6
#1  0x00007f8368d6fc8c in __vasprintf_chk () from /lib64/libc.so.6
#2  0x00007f8369b03d7d in g_vasprintf () from /lib64/libglib-2.0.so.0
#3  0x00007f8369adcf51 in g_strdup_vprintf () from /lib64/libglib-2.0.so.0
#4  0x00007f836a2485ed in vir_g_strdup_vprintf (msg=msg@entry=0x7f836a4c8ba9 "OBJECT_UNREF: obj=%p", args=args@entry=0x7ffefc860cf8) at ../src/util/glibcompat.c:209
#5  0x00007f836a288717 in virLogVMessage (vargs=0x7ffefc860cf8, fmt=0x7f836a4c8ba9 "OBJECT_UNREF: obj=%p", metadata=0x0, funcname=0x7f836a4c8d68 <__func__.19256> "virObjectUnref", linenr=380, filename=0x7f836a4c8b2f "../src/util/virobject.c", priority=VIR_LOG_INFO, source=0x7f836a7ff4b0 <virLogSelf>) at ../src/util/virlog.c:536
#6  virLogMessage (source=source@entry=0x7f836a7ff4b0 <virLogSelf>, priority=priority@entry=VIR_LOG_INFO, filename=filename@entry=0x7f836a4c8b2f "../src/util/virobject.c", linenr=linenr@entry=380, funcname=funcname@entry=0x7f836a4c8d68 <__func__.19256> "virObjectUnref", metadata=metadata@entry=0x0, fmt=0x7f836a4c8ba9 "OBJECT_UNREF: obj=%p") at ../src/util/virlog.c:635
#7  0x00007f836a2a1d8a in virObjectUnref (anyobj=0x7f8300128420) at ../src/util/virobject.c:380
#8  0x00007f836a3e9f4c in virSecurityStackClose (mgr=<optimized out>) at ../src/security/security_stack.c:96
#9  0x00007f836a3e64f7 in virSecurityManagerDispose (obj=0x7f8300128490) at ../src/security/security_manager.c:57
#10 0x00007f836a2a18de in vir_object_finalize (gobj=0x7f8300128490) at ../src/util/virobject.c:325
#11 0x00007f83698328a9 in g_object_unref () from /lib64/libgobject-2.0.so.0
#12 0x00007f836a2a1d58 in virObjectUnref (anyobj=0x7f8300128490) at ../src/util/virobject.c:379
#13 0x00007f832088b5dc in qemuStateCleanup () at ../src/qemu/qemu_driver.c:1085
#14 0x00007f836a4721d0 in virStateCleanup () at ../src/libvirt.c:733
#15 0x000055b37089866d in main (argc=<optimized out>, argv=<optimized out>) at ../src/remote/remote_daemon.c:1227

Thread 1 (Thread 0x7f8304ff9700 (LWP 1656794)):
#0  virEventThreadGetContext (evt=0x0) at ../src/util/vireventthread.c:194
#1  0x00007f832090e547 in qemuConnectAgent (driver=driver@entry=0x7f8300147890, vm=0x7f83001d4200) at ../src/qemu/qemu_process.c:243
#2  0x00007f832091aa7f in qemuProcessReconnect (opaque=<optimized out>) at ../src/qemu/qemu_process.c:8727
#3  0x00007f836a2be31b in virThreadHelper (data=<optimized out>) at ../src/util/virthread.c:241
#4  0x00007f83665b117a in start_thread () from /lib64/libpthread.so.0
#5  0x00007f8368d5fdc3 in clone () from /lib64/libc.so.6

Thanks, I've posted patches here:

https://listman.redhat.com/archives/libvir-list/2021-October/msg01114.html

Let me move back to ASSIGNED so that the patch can be picked up by rebase (once pushed upstream).
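
The crashing thread is thread 1 (LWP 1656794): virEventThreadGetContext is called with evt=0x0, because qemuProcessReconnect was still running while the main thread, already inside qemuStateCleanup, had torn down the per-domain event thread. A hedged sketch of the shape of the race, with a hypothetical helper name (the actual fix, listed in the next comment, reworks the domain object refcounting instead):

/* Hypothetical sketch, not libvirt code. The reconnect thread does
 * roughly:
 *
 *     ctx = virEventThreadGetContext(priv->eventThread);
 *
 * while the main thread, in qemuStateCleanup(), has already dropped the
 * objects keeping priv->eventThread alive, leaving it NULL. A NULL check
 * alone would only narrow the window; holding a domain object reference
 * for the whole reconnect is what actually fixes it. */
static GMainContext *
getAgentContextSketch(qemuDomainObjPrivate *priv)  /* hypothetical helper */
{
    if (!priv->eventThread)
        return NULL;  /* daemon is shutting down; give up cleanly */
    return virEventThreadGetContext(priv->eventThread);
}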
Patches from comment 13 pushed upstream:

3640731ed5 qemuMonitorOpen: Rework domain object refcounting
e812213bc1 qemu_agent: Drop destroy callback
0a9cb29ba2 qemuAgentOpen: Rework domain object refcounting
108e131a3d qemu_agent: Rework domain object locking when opening agent

v7.9.0-130-g3640731ed5

Verified with libvirt-7.10.0-1.module+el8.6.0+13502+4f24a11d.x86_64.
Test steps:
1.#for i in {1..50}; do systemctl restart libvirtd; virsh list ; systemctl restart libvirtd ; sleep 5; done
2.Check coredump info:
#coredumpctl list
No coredumps found.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759