Bug 1945940

Summary: RHV4.4.3 libvirtd is getting killed with a segfault (qemuAgentSend)
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: Frank DeLorey <fdelorey>
Component: libvirt
Assignee: Martin Kletzander <mkletzan>
Status: CLOSED DUPLICATE
QA Contact: Lili Zhu <lizhu>
Severity: high
Docs Contact:
Priority: unspecified
Version: 8.3
CC: jsuchane, lmen, mprivozn, usurse, virt-maint, xuzhang
Target Milestone: rc
Keywords: Triaged
Target Release: 8.4
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-05-04 07:32:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                      Flags
VDSM log from time of failure    none
Dump from libvirt                none

Comment 1 Frank DeLorey 2021-04-02 19:19:33 UTC
Created attachment 1768643
VDSM log from time of failure

Comment 2 Frank DeLorey 2021-04-02 19:22:09 UTC
Created attachment 1768644
Dump from libvirt

Comment 4 Jaroslav Suchanek 2021-04-06 07:42:55 UTC
Martin, can you please check this? Thanks.

Comment 5 Martin Kletzander 2021-04-07 18:22:01 UTC
It is difficult to tell whether this is a duplicate of the glib-related crashes fixed recently, as the dump is not very helpful.  I hope this is fixed by Bug 1942010 or a similar fix.  Would it be possible to test with a package that contains the fix for that bug, to make sure we are not chasing something that is already fixed?  So that we can also look into other possibilities, it would be nice to get a full backtrace of the crashed libvirt.  In the meantime I'll try to load the coredump in gdb to see whether that's it or not.

Comment 6 Martin Kletzander 2021-04-07 23:38:17 UTC
I finally got the full backtrace and it is not the previously mentioned bug.  I also noticed that you are hitting this with a libvirt version that already has all the fixes in, so this needs further investigation on our side.  I'll see what I can do and will keep you posted.  How often (very roughly) do you hit this?

Comment 7 Frank DeLorey 2021-04-08 15:19:23 UTC
From the customer:

"Quite often, about once or twice a day. We have multiple RHV clusters in our environment; it happened on the big cluster (with high VM load) and even on a small cluster (with low VM load). We started seeing this after updating to RHV 4.4."

Comment 8 Martin Kletzander 2021-04-14 12:10:57 UTC
Thank you for the info.  I am going through the code and only now noticed one thing.  Your reported version is libvirt-6.6.0-13, whereas the latest workaround for the glib-related issues was shipped in libvirt-6.6.0-13.2, in Bug 1942010.  Would you mind checking whether this happens with that fix as well?
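
For illustration only, the version check described in Comment 8 can be sketched as below. This is a simplified comparison that assumes purely numeric, dot-separated version and release fields; it is not RPM's actual version-comparison algorithm, and it does not handle dist tags or module suffixes that real package names carry. The helper names `parse_nvr` and `has_fix` are hypothetical.

```python
def parse_nvr(nvr):
    """Split 'name-version-release' into (name, version tuple, release tuple).

    Assumption: version and release contain only dot-separated integers,
    e.g. 'libvirt-6.6.0-13.2'. Real RPM releases usually carry extra tags.
    """
    name, version, release = nvr.rsplit("-", 2)

    def to_ints(field):
        return tuple(int(part) for part in field.split("."))

    return name, to_ints(version), to_ints(release)


def has_fix(installed, fixed_in):
    """Return True if the installed build is at least the build carrying the fix."""
    name_i, ver_i, rel_i = parse_nvr(installed)
    name_f, ver_f, rel_f = parse_nvr(fixed_in)
    if name_i != name_f:
        raise ValueError("cannot compare builds of different packages")
    # Tuples compare element by element, so (13,) sorts before (13, 2).
    return (ver_i, rel_i) >= (ver_f, rel_f)


reported = "libvirt-6.6.0-13"    # version from the report (Comment 8)
fixed_in = "libvirt-6.6.0-13.2"  # build with the workaround (Comment 8)
needs_retest = not has_fix(reported, fixed_in)
```

With the two builds named in Comment 8, `needs_retest` evaluates to true, which matches the request to retry with the newer package.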