Bug 1402921

Summary: qemu monitor event handlers requiring VM lock should not be run from daemon event loop
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: Peter Krempa <pkrempa>
Component: libvirt
Assignee: Daniel Berrangé <berrange>
Status: CLOSED CURRENTRELEASE
QA Contact: Han Han <hhan>
Severity: high
Docs Contact:
Priority: unspecified
Version: 8.0
CC: dm, dyuan, jdenemar, lmen, mprivozn, pkrempa, xuzhang
Target Milestone: rc
Keywords: TestOnly, Triaged
Target Release: 8.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-11-19 16:19:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Peter Krempa 2016-12-08 16:03:13 UTC
Description of problem:
Asynchronous qemu events are handled by the event loop handler in the daemon. The handler code calls qemuMonitorEmitEvent, which attempts to lock the VM object. If the VM object is already locked, the event loop gets stuck.
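
To illustrate the pattern, here is a minimal standalone sketch using plain pthreads (the names vm_lock, io_done, worker_thread and event_loop_thread are hypothetical; this is not libvirt code): a worker holds the VM lock while waiting for I/O that only the event loop can complete, and the event handler then blocks on the VM lock, wedging the loop.

/* Hypothetical illustration of the deadlock pattern, not libvirt code.
 * Build with: gcc -pthread deadlock.c -o deadlock */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t vm_lock = PTHREAD_MUTEX_INITIALIZER; /* per-VM object lock */
static sem_t io_done;  /* stands in for work only the event loop can complete */

/* Worker (RPC) thread: holds the VM lock while waiting for I/O that only
 * the event loop can finish, like the stuck glfs_init() in the trace below. */
static void *worker_thread(void *arg)
{
    pthread_mutex_lock(&vm_lock);
    printf("worker: holding VM lock, waiting for the event loop\n");
    sem_wait(&io_done);                  /* never signalled: event loop is stuck */
    pthread_mutex_unlock(&vm_lock);
    return NULL;
}

/* Event loop thread: a monitor event arrives and the handler takes the VM
 * lock inline, like qemuProcessHandleEvent() in the backtrace below. */
static void *event_loop_thread(void *arg)
{
    printf("event loop: SHUTDOWN event received, locking VM object\n");
    pthread_mutex_lock(&vm_lock);        /* blocks forever -> loop is wedged */
    sem_post(&io_done);                  /* would have unblocked the worker */
    pthread_mutex_unlock(&vm_lock);
    return NULL;
}

int main(void)
{
    pthread_t w, e;
    sem_init(&io_done, 0, 0);
    pthread_create(&w, NULL, worker_thread, NULL);
    sleep(1);                            /* let the worker grab the VM lock first */
    pthread_create(&e, NULL, event_loop_thread, NULL);
    pthread_join(w, NULL);               /* never returns: classic deadlock */
    pthread_join(e, NULL);
    return 0;
}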

Version-Release number of selected component (if applicable):
Present upstream for quite a while now (all RHEL 7 releases).

How reproducible:
100%

Steps to Reproduce:
1. Destroy a VM with a gluster disk whose volume has many offline servers (so gluster initialization blocks for a long time)
2. Observe that the daemon event loop is stuck

Actual results:
The daemon event loop gets stuck trying to lock the VM object, so libvirtd stops processing further monitor and client events.

Expected results:
Monitor event handlers that need the VM lock do not block the daemon event loop.

Additional info:

Event loop is stuck:
Thread 1 (Thread 0x7f024edbb880 (LWP 50009)):
#0  0x00007f024b463f1c in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f024b45db75 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x00007f024e0a6e19 in virMutexLock (m=0x7f01dc137a30) at util/virthread.c:89
#3  0x00007f024e0826e4 in virObjectLock (anyobj=0x7f01dc137a20) at util/virobject.c:323
#4  0x00007f0204a7d460 in qemuProcessHandleEvent (mon=0x7f0228c18ef0, vm=0x7f01dc137a20, 
    eventName=0x565338e1cc90 "SHUTDOWN", seconds=1481212149, micros=367587, details=0x0, 
    opaque=0x7f01dc079e20) at qemu/qemu_process.c:618
#5  0x00007f0204aa15eb in qemuMonitorEmitEvent (mon=0x7f0228c18ef0, event=0x565338e1cc90 "SHUTDOWN", 
    seconds=1481212149, micros=367587, details=0x0) at qemu/qemu_monitor.c:1274
#6  0x00007f0204ab4a47 in qemuMonitorJSONIOProcessEvent (mon=0x7f0228c18ef0, obj=0x565338e1c710)
    at qemu/qemu_monitor_json.c:170
#7  0x00007f0204ab4c67 in qemuMonitorJSONIOProcessLine (mon=0x7f0228c18ef0, 
    line=0x565338e54480 "{\"timestamp\": {\"seconds\": 1481212149, \"microseconds\": 367587}, \"event\": \"SHUTDOWN\"}", msg=0x0) at qemu/qemu_monitor_json.c:207
#8  0x00007f0204ab4eb9 in qemuMonitorJSONIOProcess (mon=0x7f0228c18ef0, 
    data=0x565338e22a30 "{\"timestamp\": {\"seconds\": 1481212149, \"microseconds\": 367587}, \"event\": \"SHUTDOWN\"}\r\n", len=85, msg=0x0) at qemu/qemu_monitor_json.c:249
#9  0x00007f0204a9f35e in qemuMonitorIOProcess (mon=0x7f0228c18ef0) at qemu/qemu_monitor.c:432
#10 0x00007f0204a9fca7 in qemuMonitorIO (watch=28, fd=24, events=0, opaque=0x7f0228c18ef0)
    at qemu/qemu_monitor.c:686
#11 0x00007f024e0467d8 in virEventPollDispatchHandles (nfds=11, fds=0x565338e551f0)
    at util/vireventpoll.c:508
#12 0x00007f024e046f8d in virEventPollRunOnce () at util/vireventpoll.c:657
#13 0x00007f024e04507c in virEventRunDefaultImpl () at util/virevent.c:314
#14 0x00007f024e1f645b in virNetDaemonRun (dmn=0x565338e21de0) at rpc/virnetdaemon.c:818
#15 0x0000565338173230 in main (argc=1, argv=0x7ffe87606f18) at libvirtd.c:1617


While another thread holds the VM lock and is blocked in gluster initialization:

Thread 5 (Thread 0x7f023f921700 (LWP 50026)):
#0  0x00007f024b46112f in pthread_cond_wait () from /lib64/libpthread.so.0
#1  0x00007f023fbd8293 in glfs_lock (fs=fs@entry=0x7f0230094220)
    at /var/tmp/portage/sys-cluster/glusterfs-3.8.6/work/glusterfs-3.8.6/api/src/glfs-internal.h:338
#2  glfs_init_wait (fs=fs@entry=0x7f0230094220)
    at /var/tmp/portage/sys-cluster/glusterfs-3.8.6/work/glusterfs-3.8.6/api/src/glfs.c:887
#3  0x00007f023fbd8708 in pub_glfs_init (fs=0x7f0230094220)
    at /var/tmp/portage/sys-cluster/glusterfs-3.8.6/work/glusterfs-3.8.6/api/src/glfs.c:997
#4  0x00007f02442a7b00 in virStorageFileBackendGlusterInit (src=0x7f02300b0720)
    at storage/storage_backend_gluster.c:643
#5  0x00007f024428ce2d in virStorageFileInitAs (src=0x7f02300b0720, uid=4294967295, gid=4294967295)
    at storage/storage_driver.c:2947
#6  0x00007f024428ce7b in virStorageFileInit (src=0x7f02300b0720) at storage/storage_driver.c:2967
#7  0x00007f0204ac833b in qemuSecurityChownCallback (src=0x7f02300b0720, uid=0, gid=0)
    at qemu/qemu_driver.c:359
#8  0x00007f024e29fef7 in virSecurityDACSetOwnershipInternal (priv=0x7f01dc015100, src=0x7f02300b0720, 
    path=0x0, uid=0, gid=0) at security/security_dac.c:294
#9  0x00007f024e2a04e2 in virSecurityDACRestoreFileLabelInternal (priv=0x7f01dc015100, src=0x7f02300b0720, 
    path=0x0) at security/security_dac.c:403
#10 0x00007f024e2a0862 in virSecurityDACRestoreImageLabelInt (mgr=0x7f01dc2110d0, def=0x7f02300afb90, 
    src=0x7f02300b0720, migrated=false) at security/security_dac.c:518
#11 0x00007f024e2a1b50 in virSecurityDACRestoreAllLabel (mgr=0x7f01dc2110d0, def=0x7f02300afb90, 
    migrated=false) at security/security_dac.c:1157
#12 0x00007f024e2a440d in virSecurityManagerRestoreAllLabel (mgr=0x7f01dc2110d0, vm=0x7f02300afb90, 
    migrated=false) at security/security_manager.c:814
#13 0x00007f024e29f1a1 in virSecurityStackRestoreAllLabel (mgr=0x7f01dc211070, vm=0x7f02300afb90, 
    migrated=false) at security/security_stack.c:333
#14 0x00007f024e2a440d in virSecurityManagerRestoreAllLabel (mgr=0x7f01dc211070, vm=0x7f02300afb90, 
    migrated=false) at security/security_manager.c:814
#15 0x00007f0204a8b329 in qemuProcessStop (driver=0x7f01dc079e20, vm=0x7f01dc137a20, 
    reason=VIR_DOMAIN_SHUTOFF_DESTROYED, asyncJob=QEMU_ASYNC_JOB_NONE, flags=0) at qemu/qemu_process.c:6052
#16 0x00007f0204acd5a9 in qemuDomainDestroyFlags (dom=0x7f02300877c0, flags=0) at qemu/qemu_driver.c:2256
#17 0x00007f0204acd664 in qemuDomainDestroy (dom=0x7f02300877c0) at qemu/qemu_driver.c:2278
#18 0x00007f024e16af5e in virDomainDestroy (domain=0x7f02300877c0) at libvirt-domain.c:479
#19 0x000056533817c545 in remoteDispatchDomainDestroy (server=0x565338e21be0, client=0x565338e00660, 
    msg=0x565338e525c0, rerr=0x7f023f920b80, args=0x7f0230087800) at remote_dispatch.h:4562
#20 0x000056533817c45e in remoteDispatchDomainDestroyHelper (server=0x565338e21be0, client=0x565338e00660, 
    msg=0x565338e525c0, rerr=0x7f023f920b80, args=0x7f0230087800, ret=0x7f023007e900)

Comment 2 Peter Krempa 2017-09-01 13:45:48 UTC
Stopping of the qemu process is handled via the event thread now, so the event loop should no longer be blocked:

commit 8c9ff9960b29d4703a99efdd1cadcf6f48799cc0
Author: Jiri Denemark <jdenemar>
Date:   Thu Feb 11 15:32:48 2016 +0100

    qemu: Process monitor EOF in a job
    
    Stopping a domain without a job risks a race condition with another
    thread which started a job a which does not expect anyone else to be
    messing around with the same domain object.
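
A minimal sketch of the general idea (hypothetical names, plain pthreads; not the actual libvirt implementation): the event-loop callback only queues the event, and a dedicated per-VM worker thread takes the VM lock and does the expensive processing, so the daemon event loop never waits on the VM lock.

/* Sketch only: hypothetical names, not the libvirt source. */
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct vm_event {
    char *name;                    /* e.g. "SHUTDOWN" */
    struct vm_event *next;
};

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER; /* tiny lock, never held long */
static pthread_cond_t  queue_cond = PTHREAD_COND_INITIALIZER;
static struct vm_event *queue_head;

static pthread_mutex_t vm_lock = PTHREAD_MUTEX_INITIALIZER;    /* the contended VM object lock */

/* Called from the daemon event loop: must never block on vm_lock. */
void handle_monitor_event(const char *name)
{
    struct vm_event *ev = calloc(1, sizeof(*ev));
    ev->name = strdup(name);

    pthread_mutex_lock(&queue_lock);
    ev->next = queue_head;         /* push the event */
    queue_head = ev;
    pthread_cond_signal(&queue_cond);
    pthread_mutex_unlock(&queue_lock);
}

/* Runs in a dedicated per-VM thread: free to wait for vm_lock as long as
 * necessary without stalling the daemon event loop. */
void *vm_event_worker(void *arg)
{
    for (;;) {
        pthread_mutex_lock(&queue_lock);
        while (!queue_head)
            pthread_cond_wait(&queue_cond, &queue_lock);
        struct vm_event *ev = queue_head;
        queue_head = ev->next;
        pthread_mutex_unlock(&queue_lock);

        pthread_mutex_lock(&vm_lock);  /* may wait, but only this thread waits */
        /* ... process the event, e.g. tear down the qemu process ... */
        pthread_mutex_unlock(&vm_lock);

        free(ev->name);
        free(ev);
    }
    return NULL;
}

int main(void)
{
    pthread_t worker;
    pthread_create(&worker, NULL, vm_event_worker, NULL);
    handle_monitor_event("SHUTDOWN");  /* as the event loop callback would do */
    sleep(1);                          /* let the worker pick it up */
    return 0;                          /* sketch: no clean shutdown of the worker */
}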

Comment 3 Jiri Denemark 2017-09-05 07:57:25 UTC
This should already be fixed since RHEL-7.3.

Comment 4 jiyan 2018-01-23 03:08:24 UTC
Hi Peter, I do not fully understand the bug description. Could you please provide more detailed steps to verify this bug? Thank you very much.

Comment 5 Peter Krempa 2018-01-26 12:23:06 UTC
According to deeper investigation from QE, it is not enough to just move the handlers out of the event loop; we also need to guarantee that the callbacks do not get stuck on a lock:

Thread 21 (Thread 0x7f6ac5ba1700 (LWP 6658)):
    #0  0x00007f6ad53f48f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1  0x00007f6ab68ea893 in glfs_init_wait () from /lib64/libgfapi.so.0
    #2  0x00007f6ab68ead58 in glfs_init () from /lib64/libgfapi.so.0
    #3  0x00007f6ab6b0ffad in virStorageFileBackendGlusterInit (src=0x7f6a80002200) at storage/storage_backend_gluster.c:660
    #4  0x00007f6ac2247140 in virStorageFileInitAs (src=src@entry=0x7f6a80002200, uid=uid@entry=4294967295, gid=gid@entry=4294967295) at storage/storage_source.c:153
    #5  0x00007f6ac22471f8 in virStorageFileInit (src=src@entry=0x7f6a80002200) at storage/storage_source.c:173
    #6  0x00007f6ab53c3b08 in qemuSecurityChownCallback (src=<optimized out>, uid=0, gid=0) at qemu/qemu_driver.c:353
    #7  0x00007f6ad82e8046 in virSecurityDACSetOwnershipInternal (priv=priv@entry=0x7f6aac1ad700, src=src@entry=0x7f6aac1b6570, path=path@entry=0x0, uid=uid@entry=0, gid=gid@entry=0) at security/security_dac.c:569
    #8  0x00007f6ad82e849b in virSecurityDACRestoreFileLabelInternal (priv=priv@entry=0x7f6aac1ad700, src=src@entry=0x7f6aac1b6570, path=path@entry=0x0) at security/security_dac.c:678
    #9  0x00007f6ad82ea315 in virSecurityDACRestoreImageLabelInt (mgr=mgr@entry=0x7f6aac1ad730, def=def@entry=0x7f6aac1b5e50, src=0x7f6aac1b6570, migrated=migrated@entry=false) at security/security_dac.c:793
    #10 0x00007f6ad82ea5b9 in virSecurityDACRestoreAllLabel (mgr=0x7f6aac1ad730, def=0x7f6aac1b5e50, migrated=<optimized out>, chardevStdioLogd=<optimized out>) at security/security_dac.c:1553
    #11 0x00007f6ad82ed2b1 in virSecurityManagerRestoreAllLabel (mgr=0x7f6aac1ad730, vm=vm@entry=0x7f6aac1b5e50, migrated=migrated@entry=false, chardevStdioLogd=chardevStdioLogd@entry=true) at security/security_manager.c:889
    #12 0x00007f6ad82e6c22 in virSecurityStackRestoreAllLabel (mgr=<optimized out>, vm=0x7f6aac1b5e50, migrated=<optimized out>, chardevStdioLogd=<optimized out>) at security/security_stack.c:381
    #13 0x00007f6ad82ed2b1 in virSecurityManagerRestoreAllLabel (mgr=0x7f6aac0fb7f0, vm=0x7f6aac1b5e50, migrated=migrated@entry=false, chardevStdioLogd=<optimized out>) at security/security_manager.c:889
    #14 0x00007f6ab540642d in qemuSecurityRestoreAllLabel (driver=driver@entry=0x7f6aac1042e0, vm=vm@entry=0x7f6aac1b6af0, migrated=migrated@entry=false) at qemu/qemu_security.c:78
    #15 0x00007f6ab5371b11 in qemuProcessStop (driver=driver@entry=0x7f6aac1042e0, vm=0x7f6aac1b6af0, reason=reason@entry=VIR_DOMAIN_SHUTOFF_DESTROYED, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE, flags=0) at qemu/qemu_process.c:6453
    #16 0x00007f6ab53e4aa2 in qemuDomainDestroyFlags (dom=<optimized out>, flags=flags@entry=0) at qemu/qemu_driver.c:2300
    #17 0x00007f6ab53e4da3 in qemuDomainDestroy (dom=<optimized out>) at qemu/qemu_driver.c:2322
    #18 0x00007f6ad819d2d0 in virDomainDestroy (domain=domain@entry=0x7f6a800025e0) at libvirt-domain.c:475
    #19 0x000055a25ec4cb7f in remoteDispatchDomainDestroy (server=0x55a2604b9f90, msg=0x55a2605088d0, args=<optimized out>, rerr=0x7f6ac5ba0bc0, client=0x55a260508a60) at remote_dispatch.h:4615
    #20 remoteDispatchDomainDestroyHelper (server=0x55a2604b9f90, client=0x55a260508a60, msg=0x55a2605088d0, rerr=0x7f6ac5ba0bc0, args=<optimized out>, ret=0x7f6a80002510) at remote_dispatch.h:4591
    #21 0x00007f6ad824bd5c in virNetServerProgramDispatchCall (msg=0x55a2605088d0, client=0x55a260508a60, server=0x55a2604b9f90, prog=0x55a260505b70) at rpc/virnetserverprogram.c:437
    #22 virNetServerProgramDispatch (prog=0x55a260505b70, server=server@entry=0x55a2604b9f90, client=client@entry=0x55a260508a60, msg=msg@entry=0x55a2605088d0) at rpc/virnetserverprogram.c:307
    #23 0x000055a25ec670da in virNetServerProcessMsg (srv=srv@entry=0x55a2604b9f90, client=0x55a260508a60, prog=<optimized out>, msg=0x55a2605088d0) at rpc/virnetserver.c:148
    #24 0x000055a25ec674e8 in virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x55a2604b9f90) at rpc/virnetserver.c:169
    #25 0x00007f6ad80b4aa1 in virThreadPoolWorker (opaque=opaque@entry=0x55a2604ae4c0) at util/virthreadpool.c:167
    #26 0x00007f6ad80b3870 in virThreadHelper (data=<optimized out>) at util/virthread.c:206
    #27 0x00007f6ad53f0dd5 in start_thread () from /lib64/libpthread.so.0
    #28 0x00007f6ad511a94d in clone () from /lib64/libc.so.6
    ...
    Thread 1 (Thread 0x7f6ad8fb48c0 (LWP 6650)):
    #0  0x00007f6ad53f747d in __lll_lock_wait () from /lib64/libpthread.so.0
    #1  0x00007f6ad53f2d7b in _L_lock_812 () from /lib64/libpthread.so.0
    #2  0x00007f6ad53f2c48 in pthread_mutex_lock () from /lib64/libpthread.so.0
    #3  0x00007f6ad80b3ac1 in virMutexLock (m=<optimized out>) at util/virthread.c:89
    #4  0x00007f6ad8085fdf in virObjectLock (anyobj=anyobj@entry=0x7f6aac1b6af0) at util/virobject.c:435
    #5  0x00007f6ab536652c in qemuProcessHandleEvent (mon=<optimized out>, vm=0x7f6aac1b6af0, eventName=0x55a2604bbf90 "SHUTDOWN", seconds=1516949172, micros=523889, details=0x55a2604bbdd0 "{\"guest\":false}", opaque=0x7f6aac1042e0) at qemu/qemu_process.c:640
    #6  0x00007f6ab538f9ce in qemuMonitorEmitEvent (mon=mon@entry=0x7f6a980230f0, event=event@entry=0x55a2604bbf90 "SHUTDOWN", seconds=1516949172, micros=523889, details=0x55a2604bbdd0 "{\"guest\":false}") at qemu/qemu_monitor.c:1326
    #7  0x00007f6ab53aac43 in qemuMonitorJSONIOProcessEvent (obj=0x55a260507520, mon=0x7f6a980230f0) at qemu/qemu_monitor_json.c:172
    #8  qemuMonitorJSONIOProcessLine (mon=mon@entry=0x7f6a980230f0, line=<optimized out>, msg=msg@entry=0x0) at qemu/qemu_monitor_json.c:209
    #9  0x00007f6ab53aafa0 in qemuMonitorJSONIOProcess (mon=mon@entry=0x7f6a980230f0, data=0x55a260509000 "{\"timestamp\": {\"seconds\": 1516949172, \"microseconds\": 523889}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, msg=msg@entry=0x0) at qemu/qemu_monitor_json.c:251
    #10 0x00007f6ab538dc0e in qemuMonitorIOProcess (mon=0x7f6a980230f0) at qemu/qemu_monitor.c:441
    #11 qemuMonitorIO (watch=watch@entry=15, fd=<optimized out>, events=0, events@entry=1, opaque=opaque@entry=0x7f6a980230f0) at qemu/qemu_monitor.c:695
    #12 0x00007f6ad803827f in virEventPollDispatchHandles (fds=<optimized out>, nfds=<optimized out>) at util/vireventpoll.c:508
    #13 virEventPollRunOnce () at util/vireventpoll.c:657
    #14 0x00007f6ad8035f3a in virEventRunDefaultImpl () at util/virevent.c:327
    #15 0x00007f6ad8243235 in virNetDaemonRun (dmn=dmn@entry=0x55a2604bbd40) at rpc/virnetdaemon.c:837
    #16 0x000055a25ec1357c in main (argc=<optimized out>, argv=<optimized out>) at libvirtd.c:1494

Moving back to assigned.

Comment 7 Dmitry Melekhov 2018-10-10 04:39:53 UTC
Hello!

Looks like we hit this yesterday;
could you tell me whether it is already fixed?

Thank you!

Comment 8 Dmitry Melekhov 2018-10-10 04:42:36 UTC
btw, I think this https://bugzilla.redhat.com/show_bug.cgi?id=1559722
is the same bug.

Comment 9 Peter Krempa 2018-10-10 07:24:10 UTC
(In reply to Dmitry Melekhov from comment #8)
> btw, I think this https://bugzilla.redhat.com/show_bug.cgi?id=1559722
> is the same bug.

The above bug tracks the stuck gluster connection, while here we track that a second libvirtd thread can get stuck too.

Comment 10 Jaroslav Suchanek 2019-04-24 12:26:53 UTC
This bug is going to be addressed in the next major release.

Comment 13 Michal Privoznik 2021-05-31 11:00:18 UTC
I think this was fixed upstream by the following series:

a18f2c52ac qemu: convert agent to use the per-VM event loop
0d62faf62a qemu: fix variable naming in agent code
436a56e37d qemu: convert monitor to use the per-VM event loop
ba906ab1c0 tests: start/stop an event thread for QEMU monitor/agent tests
92890fbfa1 qemu: start/stop an event thread for QMP probing
e6afacb0fe qemu: start/stop an event loop thread for domains
5c146f6e57 src: introduce an abstraction for running event loops

v6.2.0-rc1~236
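
For reference, a rough standalone sketch of the per-VM event thread pattern (as I understand it, a GLib main context running in its own thread; hypothetical names, this is not the libvirt source):

/* Sketch only. Build: gcc evthread.c $(pkg-config --cflags --libs glib-2.0) */
#include <glib.h>

typedef struct {
    GMainContext *ctx;   /* context private to this VM's thread */
    GMainLoop *loop;
    GThread *thread;
} VmEventThread;

static gpointer vm_event_thread_run(gpointer data)
{
    VmEventThread *evt = data;
    g_main_context_push_thread_default(evt->ctx);
    g_main_loop_run(evt->loop);          /* monitor/agent I/O is dispatched here */
    g_main_context_pop_thread_default(evt->ctx);
    return NULL;
}

static VmEventThread *vm_event_thread_new(const char *vmname)
{
    VmEventThread *evt = g_new0(VmEventThread, 1);
    evt->ctx = g_main_context_new();
    evt->loop = g_main_loop_new(evt->ctx, FALSE);
    evt->thread = g_thread_new(vmname, vm_event_thread_run, evt);
    return evt;
}

/* Example callback: runs in the VM's own thread, so it can take the VM
 * lock without ever blocking the daemon's main event loop. */
static gboolean handle_shutdown_event(gpointer data)
{
    g_print("handling SHUTDOWN for %s in its own event thread\n", (char *)data);
    return G_SOURCE_REMOVE;
}

int main(void)
{
    VmEventThread *evt = vm_event_thread_new("vm-fedora");

    GSource *src = g_idle_source_new();  /* stand-in for monitor socket I/O */
    g_source_set_callback(src, handle_shutdown_event, (gpointer) "vm-fedora", NULL);
    g_source_attach(src, evt->ctx);
    g_source_unref(src);

    g_usleep(G_USEC_PER_SEC / 10);       /* crude wait so the callback can run */
    g_main_loop_quit(evt->loop);
    g_thread_join(evt->thread);
    g_main_loop_unref(evt->loop);
    g_main_context_unref(evt->ctx);
    g_free(evt);
    return 0;
}

In the sketch, the thread name passed to g_thread_new() is what would show up in 'info threads', matching the "vm-$domname" threads seen in the comments below.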

Comment 14 Michal Privoznik 2021-06-28 09:03:02 UTC
Moving to POST per comment 13. Libvirt now creates a per-VM thread for talking on the monitor and agent sockets:

# gdb -p $(pgrep libvirtd)
...
(gdb) info threads
  Id   Target Id                                          Frame 
* 1    Thread 0x7ffacade2d80 (LWP 9466) "libvirtd"        0x00007ffacde26fff in poll () from /lib64/libc.so.6
  2    Thread 0x7ffaca70b640 (LWP 9467) "rpc-worker"      0x00007ffacd673732 in __futex_abstimed_wait_common64 () from /lib64/libpthread.so.0
  3    Thread 0x7ffac9f0a640 (LWP 9468) "rpc-worker"      0x00007ffacd673732 in __futex_abstimed_wait_common64 () from /lib64/libpthread.so.0
  4    Thread 0x7ffac9709640 (LWP 9469) "prio-rpc-worker" 0x00007ffacd673732 in __futex_abstimed_wait_common64 () from /lib64/libpthread.so.0
  5    Thread 0x7ffac3fff640 (LWP 9470) "rpc-worker"      0x00007ffacd673732 in __futex_abstimed_wait_common64 () from /lib64/libpthread.so.0
  6    Thread 0x7ffac2ffd640 (LWP 9472) "gmain"           0x00007ffacde26fff in poll () from /lib64/libc.so.6
  7    Thread 0x7ffac27fc640 (LWP 9473) "gdbus"           0x00007ffacde26fff in poll () from /lib64/libc.so.6
  8    Thread 0x7ffac1ffb640 (LWP 9522) "udev-event"      0x00007ffacd673732 in __futex_abstimed_wait_common64 () from /lib64/libpthread.so.0
  9    Thread 0x7ffac37fe640 (LWP 9724) "vm-fedora"       0x00007ffacde26fff in poll () from /lib64/libc.so.6

(thread 9, "vm-fedora", is the per-VM thread processing QMP for the guest "fedora")

Comment 15 John Ferlan 2021-06-30 19:11:19 UTC
Dan - since the referenced patches from Michal in comment 13 are yours, I'm assigning this to you under the premise that no bug in POST with ITR/ITM set should be assigned to virt-maint.

Comment 16 Jiri Denemark 2021-07-01 14:34:20 UTC
According to the commits in comment #13, this was already released in RHEL-AV
8.3.0. Marking as TestOnly for 8.5.0.

Comment 17 Han Han 2021-07-29 10:51:38 UTC
Unable to reproduce the bug as reported on libvirt-6.0.0-36.module+el8.5.0+11222+c889b3f3.x86_64 with these steps:
1. Start a VM with a gluster network disk
2. Stop the gluster cluster (2 nodes)
3. Destroy the VM

The VM does not get stuck.


I have checked the libvirtd threads on libvirt-6.0.0-36.module+el8.5.0+11222+c889b3f3.x86_64 and libvirt-7.5.0-1.module+el8.5.0+11664+59f87560.x86_64.

VM xml:
<domain>
...
  <devices>
    <disk type="network" device="disk">
      <driver name="qemu" type="qcow2"/>
      <source protocol="gluster" name="gv0/hhan.qcow2" index="1">
        <host name="10.0.148.14" port="24007"/>
      </source>
      <backingStore/>
      <target dev="vda" bus="virtio"/>
      <alias name="virtio-disk0"/>
      <address type="pci" domain="0x0000" bus="0x06" slot="0x00" function="0x0"/>
    </disk>
    ...
    <channel type="unix">
      <target type="virtio" name="org.qemu.guest_agent.0"/>
      <address type="virtio-serial" controller="0" bus="0" port="1"/>
    </channel>
  </devices>
</domain>


After the VM started, check the threads of libvirtd:
Results of libvirt-6.0.0-36.module+el8.5.0+11222+c889b3f3.x86_64
(gdb) info threads
  Id   Target Id                                        Frame 
* 1    Thread 0x7fc8d62f4c00 (LWP 506213) "libvirtd"    0x00007fc8d1881a41 in poll () from /lib64/libc.so.6
  2    Thread 0x7fc8be617700 (LWP 506214) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3    Thread 0x7fc8bde16700 (LWP 506215) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4    Thread 0x7fc8bd615700 (LWP 506216) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5    Thread 0x7fc8bce14700 (LWP 506217) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6    Thread 0x7fc8bc613700 (LWP 506218) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  7    Thread 0x7fc8bbe12700 (LWP 506219) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  8    Thread 0x7fc8bb611700 (LWP 506220) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  9    Thread 0x7fc8bae10700 (LWP 506221) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  10   Thread 0x7fc8ba60f700 (LWP 506222) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  11   Thread 0x7fc8b9e0e700 (LWP 506223) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  12   Thread 0x7fc87efac700 (LWP 506224) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  13   Thread 0x7fc87e7ab700 (LWP 506225) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  14   Thread 0x7fc87dfaa700 (LWP 506226) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  15   Thread 0x7fc87d7a9700 (LWP 506227) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  16   Thread 0x7fc87cfa8700 (LWP 506228) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  17   Thread 0x7fc86688b700 (LWP 506276) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  18   Thread 0x7fc86608a700 (LWP 506308) "glfs_sproc0" 0x00007fc8d1b6364a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  19   Thread 0x7fc863b92700 (LWP 506309) "glfs_sproc1" 0x00007fc8d1b6364a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  20   Thread 0x7fc867fff700 (LWP 506353) "libvirtd"    0x00007fc8d1b632fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

Results of libvirt-7.5.0-1.module+el8.5.0+11664+59f87560.x86_64:
(gdb) info threads
  Id   Target Id                                            Frame 
* 1    Thread 0x7f04af9a7b00 (LWP 503626) "libvirtd"        0x00007f04adcfda41 in poll () from /lib64/libc.so.6
  2    Thread 0x7f04995cb700 (LWP 503627) "rpc-worker"      0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3    Thread 0x7f0498dca700 (LWP 503628) "rpc-worker"      0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4    Thread 0x7f04985c9700 (LWP 503629) "rpc-worker"      0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5    Thread 0x7f0497dc8700 (LWP 503630) "rpc-worker"      0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6    Thread 0x7f048f5c7700 (LWP 503631) "rpc-worker"      0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  7    Thread 0x7f04975c7700 (LWP 503632) "prio-rpc-worker" 0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  8    Thread 0x7f0496dc6700 (LWP 503633) "prio-rpc-worker" 0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  9    Thread 0x7f04965c5700 (LWP 503634) "prio-rpc-worker" 0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  10   Thread 0x7f0495dc4700 (LWP 503635) "prio-rpc-worker" 0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  11   Thread 0x7f04955c3700 (LWP 503636) "prio-rpc-worker" 0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  12   Thread 0x7f0469a00700 (LWP 503637) "rpc-worker"      0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  13   Thread 0x7f04691ff700 (LWP 503638) "rpc-worker"      0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  14   Thread 0x7f04689fe700 (LWP 503639) "rpc-worker"      0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  15   Thread 0x7f04532f6700 (LWP 503640) "rpc-worker"      0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  16   Thread 0x7f0452af5700 (LWP 503641) "rpc-worker"      0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  17   Thread 0x7f0443fff700 (LWP 503643) "gmain"           0x00007f04adcfda41 in poll () from /lib64/libc.so.6
  18   Thread 0x7f04437fe700 (LWP 503644) "gdbus"           0x00007f04adcfda41 in poll () from /lib64/libc.so.6
  19   Thread 0x7f0442ffd700 (LWP 503691) "udev-event"      0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  20   Thread 0x7f04408ee700 (LWP 503718) "glfs_sproc0"     0x00007f04ab56064a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  21   Thread 0x7f043bfff700 (LWP 503719) "glfs_sproc1"     0x00007f04ab56064a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  22   Thread 0x7f04522f4700 (LWP 503921) "vm-hhan"         0x00007f04adcfda41 in poll () from /lib64/libc.so.6
  23   Thread 0x7f0437fd6700 (LWP 503947) "qemu-event"      0x00007f04ab5602fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

Hi Peter, are there any other methods to reproduce this bug?

Comment 18 Peter Krempa 2021-07-30 08:50:59 UTC
I'm sorry, I haven't been following this bug too closely any more, so I don't have any suggestions for better testing.

Comment 19 Han Han 2021-08-09 02:07:47 UTC
Michal, do you have any idea to reproduce this bug?

Comment 20 Michal Privoznik 2021-08-16 12:49:44 UTC
I'm sorry, I don't have any clever idea either. I think attaching gdb to a daemon that has a domain running and typing 'info threads' to observe the 'vm-$domname' thread is enough. That is the per-domain thread that handles the events.

Comment 21 Han Han 2021-08-20 08:56:39 UTC
Verified per https://bugzilla.redhat.com/show_bug.cgi?id=1402921#c17