Description of problem:

The VDSM service randomly shuts down on different oVirt nodes with:

(MainThread) [vds] Received signal 15, shutting down (vdsmd:73)

At the same time I see the following events from oVirt Engine:

- VDSM ovirt-host-02.local command Get Host Statistics failed: Broken pipe
- Invalid status on Data Center ClusterName. Setting Data Center status to Non Responsive (On host ovirt-host-02.local, Error: Network error during communication with the Host.).
- VDSM command GetStoragePoolInfoVDS failed: Connection timeout for host ' ovirt-host-02.local', last response arrived 22501 ms ago
- Host ovirt-host-02.local is non responsive

Within a couple of minutes the oVirt node restarts VDSM and the server reconnects to the oVirt cluster, but this unexpected VDSM crash triggers VM migrations.

Version-Release number of selected component (if applicable):

ovirt-engine version 4.4.4.7-1.el8

# rpm -qa | grep vdsm
vdsm-network-4.40.40-1.el8.x86_64
vdsm-client-4.40.40-1.el8.noarch
vdsm-4.40.40-1.el8.x86_64
vdsm-hook-ethtool-options-4.40.40-1.el8.noarch
vdsm-common-4.40.40-1.el8.noarch
vdsm-hook-fcoe-4.40.40-1.el8.noarch
vdsm-http-4.40.40-1.el8.noarch
vdsm-hook-openstacknet-4.40.40-1.el8.noarch
vdsm-yajsonrpc-4.40.40-1.el8.noarch
vdsm-jsonrpc-4.40.40-1.el8.noarch
vdsm-hook-vmfex-dev-4.40.40-1.el8.noarch
vdsm-python-4.40.40-1.el8.noarch
vdsm-gluster-4.40.40-1.el8.x86_64
vdsm-hook-vhostmd-4.40.40-1.el8.noarch
vdsm-api-4.40.40-1.el8.noarch

# rpm -qa | grep ovirt
ovirt-ansible-collection-1.2.4-1.el8.noarch
ovirt-host-dependencies-4.4.1-4.el8.x86_64
ovirt-host-4.4.1-4.el8.x86_64
ovirt-imageio-common-2.1.1-1.el8.x86_64
ovirt-openvswitch-ovn-common-2.11-0.2020061801.el8.noarch
ovirt-imageio-client-2.1.1-1.el8.x86_64
ovirt-python-openvswitch-2.11-0.2020061801.el8.noarch
python3-ovirt-setup-lib-1.3.2-1.el8.noarch
cockpit-ovirt-dashboard-0.14.17-1.el8.noarch
ovirt-vmconsole-host-1.0.9-1.el8.noarch
ovirt-vmconsole-1.0.9-1.el8.noarch
ovirt-openvswitch-ovn-host-2.11-0.2020061801.el8.noarch
ovirt-provider-ovn-driver-1.2.33-1.el8.noarch
ovirt-release44-4.4.4.1-1.el8.noarch
ovirt-hosted-engine-ha-2.4.5-1.el8.noarch
ovirt-openvswitch-ovn-2.11-0.2020061801.el8.noarch
ovirt-openvswitch-2.11-0.2020061801.el8.noarch
ovirt-imageio-daemon-2.1.1-1.el8.x86_64
ovirt-hosted-engine-setup-2.4.9-1.el8.noarch
python3-ovirt-engine-sdk4-4.4.8-1.el8.x86_64

How reproducible:

I don't have reproduction steps. The issue occurs on an oVirt host without any action from oVirt users.

Actual results:

The VDSM service crashes on the oVirt node.

Expected results:

The SuperVDSM server shouldn't terminate.

Additional info:

vdsm.log:

2021-02-23 14:38:27,587+0000 INFO (jsonrpc/0) [api] FINISH getStats error=Virtual machine does not exist: {'vmId': '935bc35f-8cf8-4d42-944e-14a6ccfe1154'} (api:129)
2021-02-23 14:38:27,587+0000 INFO (jsonrpc/0) [api.virt] FINISH getStats return={'status': {'code': 1, 'message': "Virtual machine does not exist: {'vmId': '935bc35f-8cf8-4d42-944e-14a6ccfe1154'}"}} from=::1,60484, vmId=935bc35f-8cf8-4d42-944e-14a6ccfe1154 (api:54)
2021-02-23 14:38:27,587+0000 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call VM.getStats failed (error 1) in 0.00 seconds (__init__:312)
2021-02-23 14:38:29,305+0000 WARN (qgapoller/1) [root] connection to libvirt broken. ecode: 1 edom: 7 (libvirtconnection:156)
2021-02-23 14:38:29,306+0000 CRIT (qgapoller/1) [root] taking calling process down. (libvirtconnection:158)
2021-02-23 14:38:29,306+0000 WARN (qgapoller/1) [root] connection to libvirt broken. ecode: 1 edom: 7 (libvirtconnection:156)
2021-02-23 14:38:29,307+0000 CRIT (qgapoller/1) [root] taking calling process down. (libvirtconnection:158)
2021-02-23 14:38:29,307+0000 INFO (qgapoller/1) [vds] Failed to get guest info for vm=d95e4196-2b5e-487b-97aa-daf9ad8b3272, error: internal error: client socket is closed (qemuguestagent:414)
2021-02-23 14:38:29,310+0000 INFO (MainThread) [vds] Received signal 15, shutting down (vdsmd:73)
2021-02-23 14:38:29,311+0000 INFO (MainThread) [root] Stopping DHCP monitor. (dhcp_monitor:106)
2021-02-23 14:38:29,541+0000 INFO (MainThread) [root] Stopping Bond monitor. (bond_monitor:53)
2021-02-23 14:38:29,543+0000 INFO (MainThread) [health] Stopping health monitor (health:71)
2021-02-23 14:38:29,556+0000 INFO (MainThread) [vds] Stopping http server (http:85)
2021-02-23 14:38:29,556+0000 INFO (http) [vds] Server stopped (http:75)
2021-02-23 14:38:29,557+0000 INFO (MainThread) [jsonrpc.JsonRpcServer] Stopping JsonRPC Server (__init__:446)
2021-02-23 14:38:29,565+0000 INFO (MainThread) [root] Unregistering all secrets (secret:94)
2021-02-23 14:38:29,565+0000 WARN (MainThread) [root] connection to libvirt broken. ecode: 1 edom: 7 (libvirtconnection:156)
2021-02-23 14:38:29,565+0000 CRIT (MainThread) [root] taking calling process down. (libvirtconnection:158)
2021-02-23 14:38:29,566+0000 INFO (MainThread) [vds] Received signal 15, shutting down (vdsmd:73)
2021-02-23 14:38:29,566+0000 ERROR (MainThread) [vds] Exception raised (vdsmd:164)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/vdsmd.py", line 162, in run
    serve_clients(log)
  File "/usr/lib/python3.6/site-packages/vdsm/vdsmd.py", line 129, in serve_clients
    cif.prepareForShutdown()
  File "/usr/lib/python3.6/site-packages/vdsm/clientIF.py", line 391, in prepareForShutdown
    secret.clear()
  File "/usr/lib/python3.6/site-packages/vdsm/virt/secret.py", line 96, in clear
    for virsecret in con.listAllSecrets():
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 5692, in listAllSecrets
    raise libvirtError("virConnectListAllSecrets() failed", conn=self)
libvirt.libvirtError: internal error: client socket is closed
2021-02-23 14:38:29,570+0000 INFO (MainThread) [vds] Stopping threads (vdsmd:166)
2021-02-23 14:38:29,570+0000 INFO (MainThread) [vds] Exiting (vdsmd:177)
2021-02-23 14:40:01,004+0000 INFO (MainThread) [vds] (PID: 3722708) I am the actual vdsm 4.40.40.1 ovirt-host-02.local (4.18.0-240.10.1.el8_3.x86_64) (vdsmd:155)
2021-02-23 14:40:01,004+0000 INFO (MainThread) [vds] VDSM will run with cpu affinity: frozenset({1}) (vdsmd:274)
2021-02-23 14:40:01,016+0000 INFO (MainThread) [storage.HSM] START HSM init (hsm:376)
2021-02-23 14:40:01,021+0000 INFO (MainThread) [storage.HSM] Creating data-center mount directory '/rhev/data-center/mnt' (hsm:381)
2021-02-23 14:40:01,021+0000 INFO (MainThread) [storage.fileUtils] Creating directory: /rhev/data-center/mnt mode: None (fileUtils:201)
2021-02-23 14:40:01,528+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/19822b30-5d5b-4b5b-a003-49a6537bde9f' (hsm:532)
2021-02-23 14:40:01,529+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/24d3498c-9ddf-4b0c-a5ab-591c90daf695' (hsm:532)
2021-02-23 14:40:01,529+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/35b65f82-7b9f-4304-ab09-e1ddd5d13fa3' (hsm:532) 2021-02-23 14:40:01,530+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/de7fec7c-f7d3-4f9c-b378-0442a873fa69' (hsm:532) 2021-02-23 14:40:01,530+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/4c76052f-9281-474c-9af7-c742ae8b2592' (hsm:532) 2021-02-23 14:40:01,530+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/dc474146-14ca-4883-a042-72f8543e001c' (hsm:532) 2021-02-23 14:40:01,531+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/f98a9d77-2e28-41c1-940a-da4a06ed2bab' (hsm:532) 2021-02-23 14:40:01,531+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/7453f6ce-d98e-4378-b424-4de420eb1d47' (hsm:532) 2021-02-23 14:40:01,531+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/mastersd' (hsm:532) 2021-02-23 14:40:01,532+0000 INFO (MainThread) [storage.HSM] Removing directory '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af' (hsm:546) 2021-02-23 14:40:01,534+0000 INFO (MainThread) [storage.check] Starting check service (check:91) 2021-02-23 14:40:01,535+0000 INFO (hsm/init) [storage.StorageDomainCache] Refreshing storage domain cache (resize=True) (sdc:80) 2021-02-23 14:40:01,536+0000 INFO (hsm/init) [storage.ISCSI] Scanning iSCSI devices (iscsi:442) 2021-02-23 14:40:01,546+0000 INFO (MainThread) [storage.Dispatcher] Starting StorageDispatcher... (dispatcher:47) 2021-02-23 14:40:01,549+0000 INFO (check/loop) [storage.asyncevent] Starting <EventLoop running=True closed=False at 0x140547172350944> (asyncevent:126) 2021-02-23 14:40:01,652+0000 INFO (MainThread) [vdsm.api] START registerDomainStateChangeCallback(callbackFunc=functools.partial(<function clientIF.contEIOVms at 0x7fd3ac0a9b70>, <weakproxy at 0x7fd3ac424d68 to clientIF at 0x7fd3ac438588>)) from=internal, task_id=5568047 d-91fd-48c3-8bb4-61844a3167ff (api:48) 2021-02-23 14:40:01,653+0000 INFO (MainThread) [vdsm.api] FINISH registerDomainStateChangeCallback return=None from=internal, task_id=5568047d-91fd-48c3-8bb4-61844a3167ff (api:54) 2021-02-23 14:40:01,654+0000 INFO (MainThread) [vds] Using libvirt for querying QEMU-GA (qemuguestagent:162) 2021-02-23 14:40:01,655+0000 INFO (MainThread) [MOM] Preparing MOM interface (momIF:85) 2021-02-23 14:40:01,655+0000 INFO (MainThread) [MOM] MOM: Using named unix socket: /run/vdsm/mom-vdsm.sock (momIF:94) 2021-02-23 14:40:01,656+0000 INFO (MainThread) [root] Unregistering all secrets (secret:94) 2021-02-23 14:40:01,683+0000 INFO (MainThread) [vds] Setting channels' timeout to 30 seconds. 
(vmchannels:224) 2021-02-23 14:40:01,684+0000 INFO (MainThread) [vds] Starting QEMU-GA poller (qemuguestagent:176) 2021-02-23 14:40:01,687+0000 INFO (MainThread) [vds.MultiProtocolAcceptor] Listening at :::54321 (protocoldetector:183) 2021-02-23 14:40:01,822+0000 INFO (http) [vds] Server running (http:64) supervdsm.log: MainThread::DEBUG::2021-02-23 14:38:29,977::supervdsm_server::314::SuperVdsm.Server::(main) Terminated normally Thread-1::DEBUG::2021-02-23 14:38:29,979::concurrent::265::root::(run) FINISH thread <Thread(Thread-1, stopped daemon 139762298169088)> (0) MainThread::DEBUG::2021-02-23 14:38:40,079::__init__::64::blivet::(register_device_format) registered device format class DeviceFormat as None MainThread::DEBUG::2021-02-23 14:38:40,092::__init__::64::blivet::(register_device_format) registered device format class BIOSBoot as biosboot MainThread::DEBUG::2021-02-23 14:38:40,101::__init__::64::blivet::(register_device_format) registered device format class DiskLabel as disklabel MainThread::DEBUG::2021-02-23 14:38:40,101::__init__::64::blivet::(register_device_format) registered device format class DMRaidMember as dmraidmember MainThread::DEBUG::2021-02-23 14:38:40,111::__init__::64::blivet::(register_device_format) registered device format class Ext2FS as ext2 MainThread::DEBUG::2021-02-23 14:38:40,111::__init__::64::blivet::(register_device_format) registered device format class Ext3FS as ext3 MainThread::DEBUG::2021-02-23 14:38:40,111::__init__::64::blivet::(register_device_format) registered device format class Ext4FS as ext4 MainThread::DEBUG::2021-02-23 14:38:40,111::__init__::64::blivet::(register_device_format) registered device format class FATFS as vfat MainThread::DEBUG::2021-02-23 14:38:40,112::__init__::64::blivet::(register_device_format) registered device format class EFIFS as efi MainThread::DEBUG::2021-02-23 14:38:40,112::__init__::64::blivet::(register_device_format) registered device format class BTRFS as btrfs MainThread::DEBUG::2021-02-23 14:38:40,112::__init__::64::blivet::(register_device_format) registered device format class GFS2 as gfs2 MainThread::DEBUG::2021-02-23 14:38:40,112::__init__::64::blivet::(register_device_format) registered device format class JFS as jfs MainThread::DEBUG::2021-02-23 14:38:40,112::__init__::64::blivet::(register_device_format) registered device format class ReiserFS as reiserfs MainThread::DEBUG::2021-02-23 14:38:40,113::__init__::64::blivet::(register_device_format) registered device format class XFS as xfs MainThread::DEBUG::2021-02-23 14:38:40,113::__init__::64::blivet::(register_device_format) registered device format class HFS as hfs MainThread::DEBUG::2021-02-23 14:38:40,113::__init__::64::blivet::(register_device_format) registered device format class AppleBootstrapFS as appleboot MainThread::DEBUG::2021-02-23 14:38:40,113::__init__::64::blivet::(register_device_format) registered device format class HFSPlus as hfs+ MainThread::DEBUG::2021-02-23 14:38:40,113::__init__::64::blivet::(register_device_format) registered device format class MacEFIFS as macefi MainThread::DEBUG::2021-02-23 14:38:40,114::__init__::64::blivet::(register_device_format) registered device format class NTFS as ntfs MainThread::DEBUG::2021-02-23 14:38:40,114::__init__::64::blivet::(register_device_format) registered device format class NFS as nfs MainThread::DEBUG::2021-02-23 14:38:40,114::__init__::64::blivet::(register_device_format) registered device format class NFSv4 as nfs4 MainThread::DEBUG::2021-02-23 
14:38:40,114::__init__::64::blivet::(register_device_format) registered device format class Iso9660FS as iso9660 MainThread::DEBUG::2021-02-23 14:38:40,114::__init__::64::blivet::(register_device_format) registered device format class NoDevFS as nodev MainThread::DEBUG::2021-02-23 14:38:40,114::__init__::64::blivet::(register_device_format) registered device format class DevPtsFS as devpts MainThread::DEBUG::2021-02-23 14:38:40,114::__init__::64::blivet::(register_device_format) registered device format class ProcFS as proc MainThread::DEBUG::2021-02-23 14:38:40,114::__init__::64::blivet::(register_device_format) registered device format class SysFS as sysfs MainThread::DEBUG::2021-02-23 14:38:40,114::__init__::64::blivet::(register_device_format) registered device format class TmpFS as tmpfs MainThread::DEBUG::2021-02-23 14:38:40,114::__init__::64::blivet::(register_device_format) registered device format class BindFS as bind MainThread::DEBUG::2021-02-23 14:38:40,114::__init__::64::blivet::(register_device_format) registered device format class SELinuxFS as selinuxfs MainThread::DEBUG::2021-02-23 14:38:40,115::__init__::64::blivet::(register_device_format) registered device format class USBFS as usbfs MainThread::DEBUG::2021-02-23 14:38:40,115::__init__::64::blivet::(register_device_format) registered device format class EFIVarFS as efivarfs MainThread::INFO::2021-02-23 14:38:40,134::__init__::55::program::(log_bd_message) Running [1] lvm version ... MainThread::INFO::2021-02-23 14:38:40,144::__init__::55::program::(log_bd_message) stdout[1]: LVM version: 2.03.09(2)-RHEL8 (2020-05-28) Library version: 1.02.171-RHEL8 (2020-05-28) Driver version: 4.42.0 Configuration: ./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/in clude --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-default-dm-run-dir=/run --with-default-run-dir=/run/lvm --with-default-pid-dir=/run --with-default-locking-dir=/run/ lock/lvm --with-usrlibdir=/usr/lib64 --enable-fsadm --enable-write_install --with-user= --with-group= --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 --enable-pkgconfig --enable-cmdlib --enable-dmeventd --enable-blkid_wiping --with-cluster=internal --enabl e-cmirrord --with-udevdir=/usr/lib/udev/rules.d --enable-udev_sync --with-thin=internal --with-cache=internal --enable-lvmpolld --enable-lvmlockd-dlm --enable-lvmlockd-dlmcontrol --enable-lvmlockd-sanlock --enable-dbus-service --enable-notify-dbus --enable-dmfilemapd --wi th-writecache=internal --with-vdo=internal --with-vdo-format=/usr/bin/vdoformat --with-integrity=internal --disable-silent-rules MainThread::INFO::2021-02-23 14:38:40,145::__init__::55::program::(log_bd_message) stderr[1]: MainThread::INFO::2021-02-23 14:38:40,145::__init__::55::program::(log_bd_message) ...done [1] (exit code: 0) MainThread::DEBUG::2021-02-23 14:38:40,162::__init__::64::blivet::(register_device_format) registered device format class LUKS as luks MainThread::DEBUG::2021-02-23 14:38:40,162::__init__::64::blivet::(register_device_format) registered device format class Integrity as integrity MainThread::DEBUG::2021-02-23 14:38:40,164::__init__::64::blivet::(register_device_format) registered device format class LVMPhysicalVolume as lvmpv 
MainThread::DEBUG::2021-02-23 14:38:40,165::__init__::64::blivet::(register_device_format) registered device format class MDRaidMember as mdmember MainThread::DEBUG::2021-02-23 14:38:40,165::__init__::64::blivet::(register_device_format) registered device format class MultipathMember as multipath_member MainThread::DEBUG::2021-02-23 14:38:40,166::__init__::64::blivet::(register_device_format) registered device format class PPCPRePBoot as prepboot MainThread::DEBUG::2021-02-23 14:38:40,167::__init__::64::blivet::(register_device_format) registered device format class SwapSpace as swap MainThread::DEBUG::2021-02-23 14:38:40,446::supervdsm_server::274::SuperVdsm.Server::(main) Making sure I'm root - SuperVdsm MainThread::DEBUG::2021-02-23 14:38:40,446::supervdsm_server::283::SuperVdsm.Server::(main) Parsing cmd args MainThread::DEBUG::2021-02-23 14:38:40,446::supervdsm_server::286::SuperVdsm.Server::(main) Cleaning old socket /run/vdsm/svdsm.sock MainThread::DEBUG::2021-02-23 14:38:40,446::supervdsm_server::290::SuperVdsm.Server::(main) Setting up keep alive thread MainThread::DEBUG::2021-02-23 14:38:40,446::supervdsm_server::296::SuperVdsm.Server::(main) Creating remote object manager Thread-1::DEBUG::2021-02-23 14:38:40,447::concurrent::258::root::(run) START thread <Thread(Thread-1, started daemon 140413524825856)> (func=<bound method Server.serve_forever of <multiprocessing.managers.Server object at 0x7fb4a45fb208>>, args=(), kwargs={}) MainThread::INFO::2021-02-23 14:38:40,448::fileUtils::252::storage.fileUtils::(chown) Changing owner for /run/vdsm/svdsm.sock, to (36:36) MainThread::DEBUG::2021-02-23 14:38:40,448::cmdutils::130::root::(exec_cmd) /usr/bin/systemctl status lldpad.service (cwd None) MainThread::DEBUG::2021-02-23 14:38:40,472::cmdutils::138::root::(exec_cmd) SUCCESS: <err> = b''; <rc> = 0 MainThread::DEBUG::2021-02-23 14:38:40,472::cmdutils::130::root::(exec_cmd) /usr/sbin/lldptool -ping (cwd None) MainThread::DEBUG::2021-02-23 14:38:40,524::cmdutils::138::root::(exec_cmd) SUCCESS: <err> = b''; <rc> = 0 MainThread::DEBUG::2021-02-23 14:38:41,135::cmdutils::130::root::(exec_cmd) /usr/sbin/lldptool get-lldp -i eno1 adminStatus (cwd None) MainThread::DEBUG::2021-02-23 14:38:41,283::cmdutils::138::root::(exec_cmd) SUCCESS: <err> = b''; <rc> = 0 MainThread::DEBUG::2021-02-23 14:38:41,290::cmdutils::130::root::(exec_cmd) /usr/sbin/lldptool get-lldp -i enp59s0f0 adminStatus (cwd None) MainThread::DEBUG::2021-02-23 14:38:41,298::cmdutils::138::root::(exec_cmd) SUCCESS: <err> = b''; <rc> = 0 MainThread::DEBUG::2021-02-23 14:38:41,337::cmdutils::130::root::(exec_cmd) /usr/sbin/lldptool get-lldp -i eno2 adminStatus (cwd None) MainThread::DEBUG::2021-02-23 14:38:41,415::cmdutils::138::root::(exec_cmd) SUCCESS: <err> = b''; <rc> = 0 MainThread::DEBUG::2021-02-23 14:38:41,422::cmdutils::130::root::(exec_cmd) /usr/sbin/lldptool get-lldp -i enp59s0f1 adminStatus (cwd None) MainThread::DEBUG::2021-02-23 14:38:41,443::cmdutils::138::root::(exec_cmd) SUCCESS: <err> = b''; <rc> = 0 MainThread::DEBUG::2021-02-23 14:38:41,451::cmdutils::130::root::(exec_cmd) /usr/sbin/lldptool get-lldp -i eno3 adminStatus (cwd None) MainThread::DEBUG::2021-02-23 14:38:41,504::cmdutils::138::root::(exec_cmd) SUCCESS: <err> = b''; <rc> = 0 MainThread::DEBUG::2021-02-23 14:38:41,523::cmdutils::130::root::(exec_cmd) /usr/sbin/lldptool get-lldp -i enp175s0 adminStatus (cwd None) MainThread::DEBUG::2021-02-23 14:38:41,765::cmdutils::138::root::(exec_cmd) SUCCESS: <err> = b''; <rc> = 0 MainThread::DEBUG::2021-02-23 
14:38:41,772::cmdutils::130::root::(exec_cmd) /usr/sbin/lldptool get-lldp -i enp175s1 adminStatus (cwd None) MainThread::DEBUG::2021-02-23 14:38:41,932::cmdutils::138::root::(exec_cmd) SUCCESS: <err> = b''; <rc> = 0 MainThread::DEBUG::2021-02-23 14:38:41,976::cmdutils::130::root::(exec_cmd) /usr/sbin/lldptool get-lldp -i eno4 adminStatus (cwd None) MainThread::DEBUG::2021-02-23 14:38:42,117::cmdutils::138::root::(exec_cmd) SUCCESS: <err> = b''; <rc> = 0 MainThread::DEBUG::2021-02-23 14:38:42,127::supervdsm_server::309::SuperVdsm.Server::(main) Started serving super vdsm object restore-net::INFO::2021-02-23 14:38:43,161::restore_net_config::286::root::(restore) networks already restored. doing nothing.
Please attach libvirt logs / journal. Apparently libvirt crashed/exited and in that case we restart vdsm
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.
(In reply to Michal Skrivanek from comment #1)
> Please attach libvirt logs / journal. Apparently libvirt crashed/exited and
> in that case we restart vdsm

Hi Michal, which libvirt log file or journal exactly do you need?
I can't provide the whole content of /var/log/libvirt/qemu/ because this is a production cluster with private VM names.

Based on the logs, the source of the issue is a request for the stats of a VM that does not exist on this oVirt node. This causes the crash:

[api.virt] FINISH getStats return={'status': {'code': 1, 'message': "Virtual machine does not exist: {'vmId': '935bc35f-8cf8-4d42-944e-14a6ccfe1154'}"}}
I have found the name of the VM that did not exist at the time the issue occurred on node-02: it is the HostedEngine VM. It turns out that an unexpected request for the stats of the HostedEngine VM, sent to a node that does not currently run this VM, precedes the crash.

<vm href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154" id="935bc35f-8cf8-4d42-944e-14a6ccfe1154">
  <actions>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/shutdown" rel="shutdown"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/start" rel="start"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/stop" rel="stop"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/suspend" rel="suspend"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/detach" rel="detach"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/ticket" rel="ticket"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/previewsnapshot" rel="previewsnapshot"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/reordermacaddresses" rel="reordermacaddresses"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/thawfilesystems" rel="thawfilesystems"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/undosnapshot" rel="undosnapshot"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/reboot" rel="reboot"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/cancelmigration" rel="cancelmigration"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/commitsnapshot" rel="commitsnapshot"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/clone" rel="clone"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/freezefilesystems" rel="freezefilesystems"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/logon" rel="logon"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/maintenance" rel="maintenance"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/migrate" rel="migrate"/>
    <link href="/ovirt-engine/api/vms/935bc35f-8cf8-4d42-944e-14a6ccfe1154/export" rel="export"/>
  </actions>
  <name>HostedEngine</name>
  <description>Hosted engine VM</description>
  <comment/>
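For reference, a minimal sketch of how such a stats query can be issued by hand on a host, assuming vdsm-client is installed there (the UUID is the HostedEngine VM id shown above; Host getVMList shows which VMs the host actually knows about):

# vdsm-client Host getVMList
# vdsm-client VM getStats vmID=935bc35f-8cf8-4d42-944e-14a6ccfe1154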
(In reply to Pavel Zinchuk from comment #3)
> (In reply to Michal Skrivanek from comment #1)
> > Please attach libvirt logs / journal. Apparently libvirt crashed/exited and
> > in that case we restart vdsm
>
> Hi Michal, which libvirt log file or journal exactly do you need?
> I can't provide the whole content of /var/log/libvirt/qemu/ because this is
> a production cluster with private VM names.

We don't need the content of /var/log/libvirt/qemu at the moment. Sending the content of the journal should be good enough to give us a starting point (the libvirt logs we require are part of the journal).
# journalctl -u libvirtd.service --since "2021-02-23 14:00:00"
-- Logs begin at Tue 2021-02-16 18:51:27 GMT, end at Wed 2021-02-24 14:06:21 GMT. --
Feb 23 14:35:58 ovirt-host-02.local libvirtd[2909]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:36:10 ovirt-host-02.local libvirtd[2909]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:36:11 ovirt-host-02.local libvirtd[2909]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:37:13 ovirt-host-02.local libvirtd[2909]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:37:15 ovirt-host-02.local libvirtd[2909]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:38:17 ovirt-host-02.local libvirtd[2909]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:38:23 ovirt-host-02.local libvirtd[2909]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:38:29 ovirt-host-02.local systemd[1]: libvirtd.service: Main process exited, code=killed, status=11/SEGV
Feb 23 14:38:29 ovirt-host-02.local systemd[1]: libvirtd.service: Failed with result 'signal'.
Feb 23 14:38:29 ovirt-host-02.local systemd[1]: libvirtd.service: Service RestartSec=100ms expired, scheduling restart.
Feb 23 14:38:29 ovirt-host-02.local systemd[1]: libvirtd.service: Scheduled restart job, restart counter is at 1.
Feb 23 14:38:30 ovirt-host-02.local systemd[1]: Stopped Virtualization daemon.
Feb 23 14:38:30 ovirt-host-02.local systemd[1]: Starting Virtualization daemon...
Feb 23 14:38:30 ovirt-host-02.local systemd[1]: Started Virtualization daemon.
Feb 23 14:39:56 ovirt-host-02.local libvirtd[3710529]: libvirt version: 6.6.0, package: 7.3.el8 (CBS <cbs>, 2021-01-27-15:50:28, )
Feb 23 14:39:56 ovirt-host-02.local libvirtd[3710529]: hostname: ovirt-host-02.local
Feb 23 14:39:56 ovirt-host-02.local libvirtd[3710529]: Failed to open file '/proc/3713202/stat': No such file or directory
Feb 23 14:40:12 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:40:39 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:40:52 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:41:06 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:43:26 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:43:40 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:44:07 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:44:23 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:44:36 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:44:50 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:45:10 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:45:31 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:46:28 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:48:12 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:48:19 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
Feb 23 14:49:21 ovirt-host-02.local libvirtd[3710529]: Guest agent is not responding: Guest agent not available for now
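A hedged aside: if systemd-coredump (rather than abrt) happens to be handling core dumps on these hosts, the following would show whether a core for the libvirtd SIGSEGV above was captured:

# coredumpctl list /usr/sbin/libvirtd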
This confirms our suspicion that vdsm is restarted because libvirtd crashed:

> Feb 23 14:38:29 ovirt-host-02.local systemd[1]: libvirtd.service: Main process exited, code=killed, status=11/SEGV

Unfortunately it looks like the core dump was not generated. Could you please check if 'abrt-cli list' shows anything? If yes, could you provide the backtrace from libvirtd?
Yes, 'abrt-cli list' show next: # abrt-cli list id a683eea721ef2c5f010ee406f635958712817c1f reason: stop_all_submonitors(): monitor.py:119:stop_all_submonitors:RuntimeError: dictionary changed size during iteration time: Thu Jul 2 13:30:34 2020 cmdline: /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker package: ovirt-hosted-engine-ha-2.4.3-1.el8 uid: 36 (vdsm) count: 32 Directory: /var/tmp/abrt/Python3-2020-07-02-13:30:34-69333 id 0ee3f1e103a4fe31468deebe9a4526886fe62146 reason: connect(): decoder.py:355:raw_decode:json.decoder.JSONDecodeError: Invalid control character at: line 1 column 51 (char 50) time: Fri Dec 11 21:21:46 2020 cmdline: /usr/bin/python3 -m ovirt_hosted_engine_setup.vm_status --json package: ovirt-hosted-engine-setup-2.4.8-1.el8 uid: 0 (root) count: 2 Directory: /var/tmp/abrt/Python3-2020-12-11-21:21:46-10054 id 7b2a37152b2b98f7e097402ecf08c5471e0696b8 reason: libvirtd killed by SIGSEGV time: Tue Feb 23 14:38:24 2021 cmdline: /usr/sbin/libvirtd package: libvirt-daemon-6.6.0-7.3.el8 uid: 0 (root) count: 1 Directory: /var/tmp/abrt/ccpp-2021-02-23-14:38:24-2909 id 7932d2460de0726bb49bb2ea7894e3f7f0703f31 reason: print_json(): display_status.py:133:print_json:BrokenPipeError: [Errno 32] Broken pipe time: Sat Nov 7 00:53:10 2020 cmdline: /bin/python3 /bin/gstatus -a -o json uid: 0 (root) count: 2 Directory: /var/tmp/abrt/Python3-2020-11-07-00:53:10-136647 id 390a6a6f038b73df29746e1266027e7da1b44b3e reason: XML(): ElementTree.py:1315:XML: File "<string>", line None time: Tue Feb 16 18:32:49 2021 cmdline: /bin/python3 /bin/gstatus -a -o json uid: 0 (root) count: 1 Directory: /var/tmp/abrt/Python3-2021-02-16-18:32:49-1143953 id 6b067a1f3461666bb8731843339226a52ae26c1c reason: connect_storage_server(): broker.py:114:_handle_quit:AttributeError: 'NoneType' object has no attribute 'close_connections' time: Tue Jul 7 06:21:55 2020 cmdline: /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker package: ovirt-hosted-engine-ha-2.4.3-1.el8 uid: 36 (vdsm) count: 34 Directory: /var/tmp/abrt/Python3-2020-07-07-06:21:55-1458 id 55b09ddf81526217f6df5b08b9005cb05a99388a reason: __vdsm_json_rpc_connect(): broker.py:114:_handle_quit:AttributeError: 'NoneType' object has no attribute 'close_connections' time: Sun Aug 2 08:31:22 2020 cmdline: /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker package: ovirt-hosted-engine-ha-2.4.4-1.el8 uid: 36 (vdsm) count: 25 Directory: /var/tmp/abrt/Python3-2020-08-02-08:31:22-1608 id 5fa179db944148c7e360b3c2a8caf2d8f2489b02 reason: ecp22_start.cold.10(): lldpad killed by SIGSEGV time: Fri Jul 3 16:55:38 2020 cmdline: /usr/sbin/lldpad -t package: lldpad-1.0.1-13.git036e314.el8 uid: 0 (root) count: 16 Directory: /var/tmp/abrt/ccpp-2020-07-03-16:55:38-73653 id f335d16e8d567091cf0bbbad239e2b65e0bec06c reason: broker.py:114:_handle_quit:AttributeError: 'NoneType' object has no attribute 'close_connections' time: Sat Nov 7 01:51:49 2020 cmdline: /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker uid: 36 Directory: /var/tmp/abrt/Python3-2020-11-07-01:51:49-9138 Reported: cannot be reported id efb1daf0d9a333a6944cbdd4a8d79769f7da1a8c reason: broker.py:114:_handle_quit:AttributeError: 'NoneType' object has no attribute 'close_connections' time: Sat Nov 7 01:39:56 2020 cmdline: /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker uid: 36 Directory: /var/tmp/abrt/Python3-2020-11-07-01:39:56-10853 Reported: cannot be reported id 
242b25636aa5c61d73a4299de965284bd2aa950f reason: reload_service(): peer_eventsapi.py:142:reload_service:PermissionError: [Errno 13] Permission denied time: Fri Aug 21 14:46:37 2020 cmdline: /usr/libexec/platform-python /usr/libexec/glusterfs/peer_eventsapi.py node-reload package: glusterfs-events-7.7-1.el8 uid: 0 (root) count: 1 Directory: /var/tmp/abrt/Python3-2020-08-21-14:46:37-1254122 id 7315cc9665ab61d4088a900d56140200372a81f6 reason: define_volumes(): gstatus:134:main:TypeError: catching classes that do not inherit from BaseException is not allowed time: Fri Aug 14 08:00:32 2020 cmdline: /bin/python3.6 /bin/gstatus -a -o json uid: 0 (root) count: 1508 Directory: /var/tmp/abrt/Python3-2020-08-14-08:00:32-50758 id 075f0debff7de3dc02f3b5bdc77d2c74c5a06956 reason: run(): subprocess.py:438:run:subprocess.CalledProcessError: Command 'gstatus -a -o json ' returned non-zero exit status 1. time: Fri Aug 14 08:06:19 2020 cmdline: /usr/bin/python3 /var/lib/zabbix/gstatus_discovery.py uid: 0 (root) count: 43 Directory: /var/tmp/abrt/Python3-2020-08-14-08:06:19-55709 id 9dca7c942dedc97cf413c84eb99d409d87e6de19 reason: update_state(): gstatus:144:main:TypeError: catching classes that do not inherit from BaseException is not allowed time: Wed Aug 5 01:08:22 2020 cmdline: /bin/python3.6 /bin/gstatus -a -o json uid: 0 (root) count: 3 Directory: /var/tmp/abrt/Python3-2020-08-05-01:08:22-2506497 id f3caced4d854517cf79e61a4a1092d18375d6d2b reason: connect(): unixrpc.py:74:connect:FileNotFoundError: [Errno 2] No such file or directory time: Thu Jul 2 13:12:29 2020 cmdline: /usr/bin/python3 -m ovirt_hosted_engine_setup.reinitialize_lockspace --force package: ovirt-hosted-engine-setup-2.4.4-1.el8 uid: 0 (root) count: 1 Directory: /var/tmp/abrt/Python3-2020-07-02-13:12:29-69342
Content of directory /var/tmp/abrt/ccpp-2021-02-23-14:38:24-2909 I can't provide in public. This is production ovirt node with private hostnames. If you need some specific information from /var/tmp/abrt/ccpp-2021-02-23-14:38:24-290 please specify required file or information. Thanks.
During last day, we had 3 incidents with te same issue, described previously. All time the same error "connection to libvirt broken. ecode: 1 edom: 7 (libvirtconnection:156)" in the logs. After update to the latest oVirt version 4.4.4.7-1.el8 the product has become unstable. Can i ask to increase priority of this ticket to speed up solving the issue? vdsm.log from latest incident 2021-02-25 04:54:09,912+0000 INFO (jsonrpc/0) [api.virt] FINISH getStats return={'status': {'code': 1, 'message': "Virtual machine does not exist: {'vmId': '935bc35f-8cf8-4d42-944e-14a6ccfe1154'}"}} from=::1,49970, vmId=935bc35f-8cf8-4d42-944e-14a6ccfe1154 (api:54) 2021-02-25 04:54:09,912+0000 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call VM.getStats failed (error 1) in 0.00 seconds (__init__:312) 2021-02-25 04:54:10,317+0000 INFO (jsonrpc/7) [api.host] START getAllVmStats() from=::ffff:192.168.45..51,44854 (api:48) 2021-02-25 04:54:10,341+0000 INFO (jsonrpc/7) [api.host] FINISH getAllVmStats return={'status': {'code': 0, 'message': 'Done'}, 'statsList': (suppressed)} from=::ffff:192.168.45..51,44854 (api:54) 2021-02-25 04:54:11,091+0000 INFO (jsonrpc/6) [api.host] START getStats() from=::ffff:192.168.45..51,44854 (api:48) 2021-02-25 04:54:11,309+0000 INFO (jsonrpc/6) [vdsm.api] START repoStats(domains=()) from=::ffff:192.168.45..51,44854, task_id=cba96e56-37d8-4bf4-82de-dc8b44d2baa4 (api:48) 2021-02-25 04:54:11,310+0000 INFO (jsonrpc/6) [vdsm.api] FINISH repoStats return={'7453f6ce-d98e-4378-b424-4de420eb1d47': {'code': 0, 'lastCheck': '5.8', 'delay': '0.000927147', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}, 'dc474146-14ca-4883-a042-72f8 543e001c': {'code': 0, 'lastCheck': '4.6', 'delay': '0.000855514', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}, 'f98a9d77-2e28-41c1-940a-da4a06ed2bab': {'code': 0, 'lastCheck': '4.6', 'delay': '0.000876128', 'valid': True, 'version': 0, 'acquired': True , 'actual': True}, '24d3498c-9ddf-4b0c-a5ab-591c90daf695': {'code': 0, 'lastCheck': '3.1', 'delay': '0.000582697', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}, '35b65f82-7b9f-4304-ab09-e1ddd5d13fa3': {'code': 0, 'lastCheck': '1.7', 'delay': '0.0005198', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}, '4c76052f-9281-474c-9af7-c742ae8b2592': {'code': 0, 'lastCheck': '3.6', 'delay': '0.0011351', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}, '19822b30-5d5b-4b5b-a003-49a6537bde9f': {'code': 0, 'lastCheck': '2.4', 'delay': '0.000368789', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}, 'de7fec7c-f7d3-4f9c-b378-0442a873fa69': {'code': 0, 'lastCheck': '1.7', 'delay': '0.000478712', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}} f rom=::ffff:192.168.45..51,44854, task_id=cba96e56-37d8-4bf4-82de-dc8b44d2baa4 (api:54) 2021-02-25 04:54:11,310+0000 INFO (jsonrpc/6) [vdsm.api] START multipath_health() from=::ffff:192.168.45..51,44854, task_id=dac513a3-c3b5-4c7f-bca7-630bc1f9347c (api:48) 2021-02-25 04:54:11,310+0000 INFO (jsonrpc/6) [vdsm.api] FINISH multipath_health return={} from=::ffff:192.168.45..51,44854, task_id=dac513a3-c3b5-4c7f-bca7-630bc1f9347c (api:54) 2021-02-25 04:54:11,357+0000 INFO (jsonrpc/6) [api.host] FINISH getStats return={'status': {'code': 0, 'message': 'Done'}, 'info': (suppressed)} from=::ffff:192.168.45..51,44854 (api:54) 2021-02-25 04:54:13,109+0000 INFO (jsonrpc/4) [vdsm.api] START repoStats(domains=['7453f6ce-d98e-4378-b424-4de420eb1d47']) from=::1,49970, 
task_id=e345ec5e-4ec1-4887-a6f8-37f9d2efcb4e (api:48) 2021-02-25 04:54:13,110+0000 INFO (jsonrpc/4) [vdsm.api] FINISH repoStats return={'7453f6ce-d98e-4378-b424-4de420eb1d47': {'code': 0, 'lastCheck': '1.6', 'delay': '0.000927147', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}} from=::1,49970, task_id=e345e c5e-4ec1-4887-a6f8-37f9d2efcb4e (api:54) 2021-02-25 04:54:15,732+0000 WARN (qgapoller/2) [root] connection to libvirt broken. ecode: 1 edom: 7 (libvirtconnection:156) 2021-02-25 04:54:15,733+0000 CRIT (qgapoller/2) [root] taking calling process down. (libvirtconnection:158) 2021-02-25 04:54:15,734+0000 WARN (qgapoller/2) [root] connection to libvirt broken. ecode: 1 edom: 7 (libvirtconnection:156) 2021-02-25 04:54:15,734+0000 CRIT (qgapoller/2) [root] taking calling process down. (libvirtconnection:158) 2021-02-25 04:54:15,734+0000 INFO (qgapoller/2) [vds] Failed to get guest info for vm=648b771d-ef30-4fdf-a57e-aec187445eeb, error: internal error: client socket is closed (qemuguestagent:414) 2021-02-25 04:54:15,734+0000 WARN (qgapoller/2) [root] connection to libvirt broken. ecode: 1 edom: 7 (libvirtconnection:156) 2021-02-25 04:54:15,734+0000 CRIT (qgapoller/2) [root] taking calling process down. (libvirtconnection:158) 2021-02-25 04:54:15,734+0000 ERROR (qgapoller/2) [virt.periodic.Operation] <bound method QemuGuestAgentPoller._poller of <vdsm.virt.qemuguestagent.QemuGuestAgentPoller object at 0x7ff9581bbb70>> operation failed (periodic:188) Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/vdsm/virt/periodic.py", line 186, in __call__ self._func() File "/usr/lib/python3.6/site-packages/vdsm/virt/qemuguestagent.py", line 346, in _poller if not self._runnable_on_vm(vm_obj): File "/usr/lib/python3.6/site-packages/vdsm/virt/qemuguestagent.py", line 547, in _runnable_on_vm if not vm.isDomainRunning(): File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 2426, in isDomainRunning status = self._dom.info() File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f ret = attr(*args, **kwargs) File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper ret = f(*args, **kwargs) File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper return func(inst, *args, **kwargs) File "/usr/lib64/python3.6/site-packages/libvirt.py", line 1518, in info if ret is None: raise libvirtError ('virDomainGetInfo() failed', dom=self) libvirt.libvirtError: internal error: client socket is closed 2021-02-25 04:54:15,741+0000 INFO (MainThread) [vds] Received signal 15, shutting down (vdsmd:73) 2021-02-25 04:54:15,742+0000 INFO (MainThread) [root] Stopping DHCP monitor. (dhcp_monitor:106) 2021-02-25 04:54:15,842+0000 INFO (MainThread) [root] Stopping Bond monitor. (bond_monitor:53) 2021-02-25 04:54:15,843+0000 INFO (MainThread) [health] Stopping health monitor (health:71) 2021-02-25 04:54:15,853+0000 INFO (MainThread) [vds] Stopping http server (http:85) 2021-02-25 04:54:15,853+0000 INFO (http) [vds] Server stopped (http:75) 2021-02-25 04:54:15,853+0000 INFO (MainThread) [jsonrpc.JsonRpcServer] Stopping JsonRPC Server (__init__:446) 2021-02-25 04:54:16,068+0000 INFO (MainThread) [root] Unregistering all secrets (secret:94) 2021-02-25 04:54:16,068+0000 WARN (MainThread) [root] connection to libvirt broken. ecode: 1 edom: 7 (libvirtconnection:156) 2021-02-25 04:54:16,068+0000 CRIT (MainThread) [root] taking calling process down. 
(libvirtconnection:158) 2021-02-25 04:54:16,068+0000 INFO (MainThread) [vds] Received signal 15, shutting down (vdsmd:73) 2021-02-25 04:54:16,068+0000 ERROR (MainThread) [vds] Exception raised (vdsmd:164) Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/vdsm/vdsmd.py", line 162, in run serve_clients(log) File "/usr/lib/python3.6/site-packages/vdsm/vdsmd.py", line 129, in serve_clients cif.prepareForShutdown() File "/usr/lib/python3.6/site-packages/vdsm/clientIF.py", line 391, in prepareForShutdown secret.clear() File "/usr/lib/python3.6/site-packages/vdsm/virt/secret.py", line 96, in clear for virsecret in con.listAllSecrets(): File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper ret = f(*args, **kwargs) File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper return func(inst, *args, **kwargs) File "/usr/lib64/python3.6/site-packages/libvirt.py", line 5692, in listAllSecrets raise libvirtError("virConnectListAllSecrets() failed", conn=self) libvirt.libvirtError: internal error: client socket is closed 2021-02-25 04:54:16,070+0000 INFO (MainThread) [vds] Stopping threads (vdsmd:166) 2021-02-25 04:54:16,070+0000 INFO (MainThread) [vds] Exiting (vdsmd:177) 2021-02-25 04:55:28,038+0000 INFO (MainThread) [vds] (PID: 3511517) I am the actual vdsm 4.40.40.1 ovirt-host-05.local (4.18.0-240.10.1.el8_3.x86_64) (vdsmd:155) 2021-02-25 04:55:28,039+0000 INFO (MainThread) [vds] VDSM will run with cpu affinity: frozenset({1}) (vdsmd:274) 2021-02-25 04:55:28,049+0000 INFO (MainThread) [storage.HSM] START HSM init (hsm:376) 2021-02-25 04:55:28,052+0000 INFO (MainThread) [storage.HSM] Creating data-center mount directory '/rhev/data-center/mnt' (hsm:381) 2021-02-25 04:55:28,053+0000 INFO (MainThread) [storage.fileUtils] Creating directory: /rhev/data-center/mnt mode: None (fileUtils:201) 2021-02-25 04:55:28,408+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/19822b30-5d5b-4b5b-a003-49a6537bde9f' (hsm:532) 2021-02-25 04:55:28,409+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/24d3498c-9ddf-4b0c-a5ab-591c90daf695' (hsm:532) 2021-02-25 04:55:28,409+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/35b65f82-7b9f-4304-ab09-e1ddd5d13fa3' (hsm:532) 2021-02-25 04:55:28,409+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/de7fec7c-f7d3-4f9c-b378-0442a873fa69' (hsm:532) 2021-02-25 04:55:28,410+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/4c76052f-9281-474c-9af7-c742ae8b2592' (hsm:532) 2021-02-25 04:55:28,410+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/dc474146-14ca-4883-a042-72f8543e001c' (hsm:532) 2021-02-25 04:55:28,410+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/f98a9d77-2e28-41c1-940a-da4a06ed2bab' (hsm:532) 2021-02-25 04:55:28,410+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/7453f6ce-d98e-4378-b424-4de420eb1d47' (hsm:532) 2021-02-25 04:55:28,410+0000 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/9eaab007-b755-4563-93a0-5776944327af/mastersd' (hsm:532) 2021-02-25 04:55:28,412+0000 INFO (MainThread) [storage.check] 
Starting check service (check:91) 2021-02-25 04:55:28,412+0000 INFO (hsm/init) [storage.StorageDomainCache] Refreshing storage domain cache (resize=True) (sdc:80) 2021-02-25 04:55:28,413+0000 INFO (hsm/init) [storage.ISCSI] Scanning iSCSI devices (iscsi:442) 2021-02-25 04:55:28,419+0000 INFO (MainThread) [storage.Dispatcher] Starting StorageDispatcher... (dispatcher:47) 2021-02-25 04:55:28,421+0000 INFO (check/loop) [storage.asyncevent] Starting <EventLoop running=True closed=False at 0x139714285661264> (asyncevent:126) 2021-02-25 04:55:28,469+0000 INFO (MainThread) [vdsm.api] START registerDomainStateChangeCallback(callbackFunc=functools.partial(<function clientIF.contEIOVms at 0x7f11a05efb70>, <weakproxy at 0x7f11c057fd68 to clientIF at 0x7f11c0593748>)) from=internal, task_id=eb2b388 e-d432-45eb-97e6-5a69ded28db9 (api:48) 2021-02-25 04:55:28,470+0000 INFO (MainThread) [vdsm.api] FINISH registerDomainStateChangeCallback return=None from=internal, task_id=eb2b388e-d432-45eb-97e6-5a69ded28db9 (api:54) 2021-02-25 04:55:28,470+0000 INFO (MainThread) [vds] Using libvirt for querying QEMU-GA (qemuguestagent:162) 2021-02-25 04:55:28,471+0000 INFO (MainThread) [MOM] Preparing MOM interface (momIF:85) 2021-02-25 04:55:28,471+0000 INFO (MainThread) [MOM] MOM: Using named unix socket: /run/vdsm/mom-vdsm.sock (momIF:94) 2021-02-25 04:55:28,471+0000 INFO (MainThread) [root] Unregistering all secrets (secret:94) 2021-02-25 04:55:28,491+0000 INFO (MainThread) [vds] Setting channels' timeout to 30 seconds. (vmchannels:224) 2021-02-25 04:55:28,491+0000 INFO (MainThread) [vds] Starting QEMU-GA poller (qemuguestagent:176) 2021-02-25 04:55:28,493+0000 INFO (MainThread) [vds.MultiProtocolAcceptor] Listening at :::54321 (protocoldetector:183) 2021-02-25 04:55:28,547+0000 INFO (http) [vds] Server running (http:64) # abrt-cli list id 745c9c5f0430e3b6df6a2f14e6832bf96c926cc0 reason: g_main_context_check(): libvirtd killed by SIGSEGV time: Thu Feb 25 04:54:11 2021 cmdline: /usr/sbin/libvirtd package: libvirt-daemon-6.6.0-7.3.el8 uid: 0 (root) count: 1 Directory: /var/tmp/abrt/ccpp-2021-02-25-04:54:11-3986073 id 26740be4fd85d4c55dc0e001b3505ece4ca72f46 reason: stop_all_submonitors(): monitor.py:119:stop_all_submonitors:RuntimeError: dictionary changed size during iteration time: Fri Aug 21 07:29:05 2020 cmdline: /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker package: ovirt-hosted-engine-ha-2.4.4-1.el8 uid: 36 (vdsm) count: 8 Directory: /var/tmp/abrt/Python3-2020-08-21-07:29:05-3019 id 5c16b5b21a0c603453a3006c874b6ad30d9b5685 reason: connect_storage_server(): broker.py:114:_handle_quit:AttributeError: 'NoneType' object has no attribute 'close_connections' time: Thu Aug 20 08:49:53 2020 cmdline: /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker package: ovirt-hosted-engine-ha-2.4.4-1.el8 uid: 36 (vdsm) count: 129 Directory: /var/tmp/abrt/Python3-2020-08-20-08:49:53-31179 id f4c5ce95d41d55b15df8f4f5ad6b16244e374792 reason: __vdsm_json_rpc_connect(): broker.py:114:_handle_quit:AttributeError: 'NoneType' object has no attribute 'close_connections' time: Thu Aug 20 09:59:33 2020 cmdline: /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker package: ovirt-hosted-engine-ha-2.4.4-1.el8 uid: 36 (vdsm) count: 147 Directory: /var/tmp/abrt/Python3-2020-08-20-09:59:33-12749 id 61417beee541dcb3f8df8b1f73b022c43408e8cb reason: get_ovf_from_dir(): query_ova.py:47:get_ovf_from_dir:FileNotFoundError: [Errno 2] No such file or directory: 
'/tmp/*.ova' time: Thu Sep 3 16:41:17 2020 cmdline: /usr/bin/python3 /root/.ansible/tmp/ansible-tmp-1599151276.8783715-2813736-253499112333134/query_ova.py vm /tmp/*.ova True uid: 0 (root) count: 1 Directory: /var/tmp/abrt/Python3-2020-09-03-16:41:17-3496217 id 93b942fdd204e3e5acbe21051b7f06a0b88abdbf reason: get_ovf_from_dir(): query_ova.py:47:get_ovf_from_dir:FileNotFoundError: [Errno 2] No such file or directory: 'll /tmp/vcp_20.2R1.10.ova' time: Thu Sep 3 11:35:32 2020 cmdline: /usr/bin/python3 /root/.ansible/tmp/ansible-tmp-1599132932.4139793-2779162-216986455300300/query_ova.py vm 'll /tmp/vcp_20.2R1.10.ova' True uid: 0 (root) count: 1 Directory: /var/tmp/abrt/Python3-2020-09-03-11:35:32-3230539 id 979832e576b63a9e36febfe3ec27e65b208e920a reason: execute_in_peers(): cliutils.py:134:execute_in_peers:gluster.cliutils.cliutils.GlusterCmdException: (1, '', 'Commit failed on prd-ovirt-host-07.estore.rot01.kwebbl.cloud. Error: Unable to end. Error : Success\nCommit failed on prd-ovirt-host-03.estore.rot01.kwebbl.cloud. Error: Unable to end. Error : Success\nCommit failed on prd-ovirt-host-06.estore.rot01.kwebbl.cloud. Error: Unable to end. Error : Success\nCommit failed on prd-ovirt-host-02.estore.rot01.kwebbl.cloud. Error: Unable to end. Error : Success\nCommit failed on prd-ovirt-host-01.estore.rot01.kwebbl.cloud. Error: Unable to end. Error : Success\n', 'gluster system:: execute eventsapi.py node-reload') time: Fri Aug 21 14:46:37 2020 cmdline: /usr/libexec/platform-python /sbin/gluster-eventsapi webhook-add http://prd-ovirt-engine-01.live.rot01.kwebbl.cloud:80/ovirt-engine/services/glusterevents package: glusterfs-events-7.7-1.el8 uid: 0 (root) count: 1 Directory: /var/tmp/abrt/Python3-2020-08-21-14:46:37-113908 id 16d49298f5f821ba48d1bf66072e416077903304 reason: glusterd_friend_rpc_create(): glusterfsd killed by SIGSEGV time: Fri Aug 21 11:22:21 2020 cmdline: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO package: glusterfs-fuse-7.7-1.el8 uid: 0 (root) count: 1 Directory: /var/tmp/abrt/ccpp-2020-08-21-11:22:21-1802 id b2bc77f543e5fd6d5630ad787fc18c05317b86b8 reason: ecp22_start.cold.10(): lldpad killed by SIGSEGV time: Fri Aug 21 09:04:03 2020 cmdline: /usr/sbin/lldpad -t package: lldpad-1.0.1-13.git036e314.el8 uid: 0 (root) count: 1 Directory: /var/tmp/abrt/ccpp-2020-08-21-09:04:03-12794 id f6e48ce88537d1b60b0729394d5d5b7d9dfb0471 reason: broker.py:114:_handle_quit:AttributeError: 'NoneType' object has no attribute 'close_connections' time: Fri Aug 21 08:33:40 2020 cmdline: /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker uid: 36 Directory: /var/tmp/abrt/Python3-2020-08-21-08:33:40-7300 Reported: cannot be reported id 6309a503b0d6e015d2fbc05ea9f4c499ff0f9f19 reason: _call(): broker.py:114:_handle_quit:AttributeError: 'NoneType' object has no attribute 'close_connections' time: Thu Aug 20 09:42:21 2020 cmdline: /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker package: ovirt-hosted-engine-ha-2.4.4-1.el8 uid: 36 (vdsm) count: 11 Directory: /var/tmp/abrt/Python3-2020-08-20-09:42:21-7101 id de46d38de6f21dfdfe55e34be3a44876c0494e3d reason: _handle_quit(): broker.py:114:_handle_quit:AttributeError: 'NoneType' object has no attribute 'close_connections' time: Thu Aug 20 09:09:06 2020 cmdline: /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker package: ovirt-hosted-engine-ha-2.4.4-1.el8 uid: 36 (vdsm) count: 3 Directory: /var/tmp/abrt/Python3-2020-08-20-09:09:06-19319
(In reply to Pavel Zinchuk from comment #9)
> Content of directory /var/tmp/abrt/ccpp-2021-02-23-14:38:24-2909 I can't
> provide in public.
> This is production ovirt node with private hostnames.

That's understandable.

>
> If you need some specific information from
> /var/tmp/abrt/ccpp-2021-02-23-14:38:24-290 please specify required file or
> information.

Yes, having the coredump would be best, but that file will be huge. So for now just files 'core_backtrace' and 'maps'. Those two should not contain any sensitive information, but feel free to send me the files by email if you are concerned.

Also please report what versions of libvirt RPMs are installed in your node.
Created attachment 1759339 [details] maps file from issue with date 2021-02-23 14:38
Created attachment 1759340 [details] core_backtrace file from issue with date 2021-02-23 14:38
(In reply to Tomáš Golembiovský from comment #11)
> (In reply to Pavel Zinchuk from comment #9)
> > Content of directory /var/tmp/abrt/ccpp-2021-02-23-14:38:24-2909 I can't
> > provide in public.
> > This is production ovirt node with private hostnames.
>
> That's understandable.
>
> >
> > If you need some specific information from
> > /var/tmp/abrt/ccpp-2021-02-23-14:38:24-290 please specify required file or
> > information.
>
> Yes, having the coredump would be best, but that file will be huge. So for
> now just files 'core_backtrace' and 'maps'. Those two should not contain any
> sensitive information, but feel free to send me the files by email if you
> are concerned.
>
> Also please report what versions of libvirt RPMs are installed in your node.

I have attached the required files.

List of installed libvirt packages:

# rpm -qa | grep libvirt
libvirt-admin-6.6.0-7.3.el8.x86_64
libvirt-daemon-config-nwfilter-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-storage-scsi-6.6.0-7.3.el8.x86_64
libvirt-bash-completion-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-nodedev-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-storage-iscsi-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-secret-6.6.0-7.3.el8.x86_64
libvirt-daemon-kvm-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-storage-mpath-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-network-6.6.0-7.3.el8.x86_64
libvirt-libs-6.6.0-7.3.el8.x86_64
libvirt-daemon-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-storage-logical-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-storage-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-qemu-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-storage-gluster-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-storage-iscsi-direct-6.6.0-7.3.el8.x86_64
libvirt-lock-sanlock-6.6.0-7.3.el8.x86_64
python3-libvirt-6.6.0-1.el8.x86_64
libvirt-daemon-driver-nwfilter-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-storage-rbd-6.6.0-7.3.el8.x86_64
libvirt-daemon-config-network-6.6.0-7.3.el8.x86_64
libvirt-client-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-storage-core-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-storage-disk-6.6.0-7.3.el8.x86_64
libvirt-daemon-driver-interface-6.6.0-7.3.el8.x86_64
This is rather bad. The stack of the crashed thread is smashed (other threads are fine waiting in poll or pthread_cond_*wait). That is unfortunate as one needs to first figure out what exactly leads to the crash (and how to reproduce it) before anyone can do anything with the problem. You will need to enable debug logging in libvirtd on your machine and hopefully this will give enough clues to what is happening.
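A minimal sketch of one way to enable libvirtd debug logging (the filter string below is only an example; the recommended filters vary between libvirt versions, so check the libvirt debug-logging documentation, and the path assumes the stock /etc/libvirt/libvirtd.conf):

# cat >> /etc/libvirt/libvirtd.conf <<'EOF'
log_filters="3:remote 4:event 3:util.json 3:rpc 1:*"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
EOF
# systemctl restart libvirtd

On a production host the same settings can usually be applied at runtime, without restarting the daemon, via 'virt-admin daemon-log-filters' and 'virt-admin daemon-log-outputs'.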
I've enabled libvirt debug logging on one production oVirt node. Waiting for the issue to occur on this node.
Also since libvirtd seems to have crashed, please also attach the backtrace from abrt.
Umm, nevermind I looked into the attachments now:

, { "frames":
  [ { "address": 140074265750689, "build_id": "ade58d86662aceee2210a9ef12018705e978965d", "build_id_offset": 990369, "function_name": "__poll", "file_name": "/lib64/libc.so.6" }
  , { "address": 140074277673398, "build_id": "58f989a97b7d84e636e5885db27a42281581baef", "build_id_offset": 317878, "function_name": "g_main_context_iterate.isra.21", "file_name": "/lib64/libglib-2.0.so.0" }
  , { "address": 140074277674354, "build_id": "58f989a97b7d84e636e5885db27a42281581baef", "build_id_offset": 318834, "function_name": "g_main_loop_run", "file_name": "/lib64/libglib-2.0.so.0" }
  , { "address": 140074331771774, "build_id": "3032f94742c49f5235533c10277403b2009a9a97", "build_id_offset": 881534, "function_name": "virEventThreadWorker", "file_name": "/lib64/libvirt.so.0" }
  , { "address": 140074277838154, "build_id": "58f989a97b7d84e636e5885db27a42281581baef", "build_id_offset": 482634, "function_name": "g_thread_proxy", "file_name": "/lib64/libglib-2.0.so.0" }
  , { "address": 140074273046858, "build_id": "634e9ea2594f8c82ddde20ab179bda7daee7e18f", "build_id_offset": 33098, "function_name": "start_thread", "file_name": "/lib64/libpthread.so.0" }
  , { "address": 140074265796387, "build_id": "ade58d86662aceee2210a9ef12018705e978965d", "build_id_offset": 1036067, "function_name": "__clone", "file_name": "/lib64/libc.so.6" }
  ] } ] }

This looks like the same problem as https://bugzilla.redhat.com/show_bug.cgi?id=1894045
Umm, nevermind (2), that thread is okay; indeed the crashed thread seems to have its stack corrupted. There's one more thread waiting for an agent response, all other threads are idle, so we indeed might need to look into the logs to see what was happening.
@pkrempa - looks like you agree that further investigation is required. Can I assign the bug to you?
(In reply to Peter Krempa from comment #18)
> Also since libvirtd seems to have crashed, please also attach the backtrace
> from abrt.

Hi, we still see the issue on random servers. We can't enable debug on all servers, so we are waiting for the issue to occur again on the servers with libvirtd debug enabled.

Peter, do you still need the backtrace from abrt? If yes, please tell me how to get this information.
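One possible way to extract a full backtrace from the existing abrt crash directory mentioned earlier (a sketch, assuming gdb and the matching libvirt/glib2 debuginfo packages can be installed on the host; the 'coredump' file inside the abrt problem directory is the core gdb needs):

# dnf debuginfo-install libvirt-daemon glib2
# gdb /usr/sbin/libvirtd /var/tmp/abrt/ccpp-2021-02-23-14:38:24-2909/coredump -batch -ex 'thread apply all bt full' > libvirtd-backtrace.txt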
Based on the backtrace, and more closely on other race-condition induced crashes, I'm fairly certain that the root cause for this bug is the same as in https://bugzilla.redhat.com/show_bug.cgi?id=1931331 . The only difference here is that in this instance it happened inside a guest-agent call, whereas in 1931331 it happened in a monitor command. Since both actually share the same backend code for accessing the monitor socket and are fixed by the same commit, we can assume that the same can happen here too.

The discrepancy in the ABRT backtrace not showing the crashed thread is probably a fluke. The backtrace of the crashed thread should look like:

Thread 1 (Thread 0x7f6576ffd700 (LWP 31672)):
#0 0x00007f66500000f0 in ?? ()
#1 0x00007f6677c9e7bd in g_source_unref_internal () from /lib64/libglib-2.0.so.0
#2 0x00007f6677c9e90e in g_source_iter_next () from /lib64/libglib-2.0.so.0
#3 0x00007f6677ca0e23 in g_main_context_prepare () from /lib64/libglib-2.0.so.0
#4 0x00007f6677ca18eb in g_main_context_iterate.isra () from /lib64/libglib-2.0.so.0
#5 0x00007f6677ca1d72 in g_main_loop_run () from /lib64/libglib-2.0.so.0
#6 0x00007f667b03937e in virEventThreadWorker (opaque=0x7f6650269980) at ../../src/util/vireventthread.c:120
#7 0x00007f6677cc9d4a in g_thread_proxy () from /lib64/libglib-2.0.so.0
#8 0x00007f667783814a in start_thread () from /lib64/libpthread.so.0
#9 0x00007f667714df23 in clone () from /lib64/libc.so.6

For reference the waiting thread from the other bug:

Thread 2 (Thread 0x7f666f413700 (LWP 2838)):
#0 0x00007f667783e2fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f667b093aea in virCondWait (c=<optimized out>, m=<optimized out>) at ../../src/util/virthread.c:148
#2 0x00007f6631402104 in qemuMonitorSend () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#3 0x00007f6631411ca5 in qemuMonitorJSONCommandWithFd () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#4 0x00007f6631414813 in qemuMonitorJSONGetMemoryStats () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#5 0x00007f663142a79a in qemuDomainMemoryStatsInternal () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#6 0x00007f663142a8f4 in qemuDomainGetStatsBalloon () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
[truncated]

As such, I'll close it as a duplicate of 1931331 for now. If you happen to collect a better backtrace and it contradicts my assumption, please reopen this bug.

Another possibility to try to produce a better coredump of the existing crash would be to use 'coredumpctl dump /usr/sbin/libvirtd > dump', but I'm not sure whether that will produce a better result. Possibly if you happen to encounter another crash the backtrace may be better.

*** This bug has been marked as a duplicate of bug 1931331 ***