Bug 2129239

Summary: virtqemud coredump after restart virtqemud with a vm running with a hostdev interface
Product: Red Hat Enterprise Linux 9 Reporter: yalzhang <yalzhang>
Component: libvirt Assignee: Michal Privoznik <mprivozn>
libvirt sub component: General QA Contact: yalzhang <yalzhang>
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: jdenemar, lmen, mprivozn, pkrempa, virt-maint, xuzhang, yicui
Version: 9.1 Keywords: Automation, AutomationTriaged, Regression, Upstream
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-8.8.0-1.el9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-05-09 07:27:15 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version: 8.8.0
Embargoed:
Attachments:
Description Flags
Coredump file for PID 2382 none

Description yalzhang@redhat.com 2022-09-23 03:15:33 UTC
Created attachment 1913661
Coredump file for PID 2382

Description of problem:
virtqemud crashes with SIGSEGV and dumps core when it is restarted while a VM with a hostdev interface is running.

Version-Release number of selected component (if applicable):
libvirt-8.7.0-1.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a VM with a hostdev interface:
# virsh attach-interface rhel hostdev --managed 0000:82:10.2 --config
# virsh dumpxml rhel --xpath //interface
<interface type="hostdev" managed="yes">
  <mac address="52:54:00:94:d1:ba"/>
  <source>
    <address type="pci" domain="0x0000" bus="0x82" slot="0x10" function="0x2"/>
  </source>
  <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
</interface>

# virsh start rhel
Domain 'rhel' started

2. Restart virtqemud and check for coredumps:
# systemctl restart virtqemud

# coredumpctl list
TIME                         PID UID GID SIG     COREFILE EXE                   SIZE
Thu 2022-09-22 22:46:08 EDT 2382   0   0 SIGSEGV present  /usr/sbin/virtqemud 637.7K
Thu 2022-09-22 22:46:09 EDT 2409   0   0 SIGSEGV present  /usr/sbin/virtqemud 637.2K
Thu 2022-09-22 22:46:09 EDT 2436   0   0 SIGSEGV present  /usr/sbin/virtqemud 638.4K
Thu 2022-09-22 22:46:10 EDT 2463   0   0 SIGSEGV present  /usr/sbin/virtqemud 638.9K
Thu 2022-09-22 22:46:11 EDT 2492   0   0 SIGSEGV present  /usr/sbin/virtqemud 636.4K

Actual results:
virtqemud crashes with SIGSEGV and dumps core repeatedly after it is restarted while a VM with a hostdev interface is running.

Expected results:
virtqemud should not coredump

Additional info:
1) Cannot reproduce on RHEL 9.1 with:
libvirt-8.5.0-6.el9.x86_64
qemu-kvm-7.0.0-13.el9.x86_64
2) Can reproduce with libvirt-8.7.0:
libvirt-8.7.0-1.el9.x86_64
qemu-kvm-7.0.0-13.el9.x86_64

Comment 1 Peter Krempa 2022-09-23 06:46:48 UTC
Looking through the backtraces, a common theme seems to be:

#0  qemuDomainDeviceHostdevDefPostParseRestoreBackendAlias (parseFlags=<optimized out>, hostdev=0x7f4024094210) at ../src/qemu/qemu_domain.c:5596
#1  qemuDomainHostdevDefPostParse (parseFlags=<optimized out>, qemuCaps=0x7f40240d00a0, hostdev=0x7f4024094210) at ../src/qemu/qemu_domain.c:5628
#2  qemuDomainDeviceDefPostParse (dev=<optimized out>, def=<optimized out>, parseFlags=<optimized out>, opaque=0x7f4024023250, parseOpaque=0x7f40240d00a0) at ../src/qemu/qemu_domain.c:5784

Comment 2 Michal Privoznik 2022-09-23 13:15:31 UTC
Patch posted on the list:

https://listman.redhat.com/archives/libvir-list/2022-September/234447.html

Comment 3 Michal Privoznik 2022-09-23 13:41:40 UTC
Merged upstream:

commit a8947db1a4efc6fc53dabb67b74ba1c01c7fbc8b
Author:     Michal Prívozník <mprivozn>
AuthorDate: Fri Sep 23 15:06:19 2022 +0200
Commit:     Michal Prívozník <mprivozn>
CommitDate: Fri Sep 23 15:28:34 2022 +0200

    qemu_domain: Ignore all but SCSI hostdevs in qemuDomainDeviceHostdevDefPostParseRestoreBackendAlias()
    
    When retiring QEMU_CAPS_BLOCKDEV_HOSTDEV_SCSI capability the
    commit removed a bit too much. Previously, all other devices than
    VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI were ignored in
    qemuDomainDeviceHostdevDefPostParseRestoreBackendAlias(). But the
    commit in question removed not only the capability check but also
    this return early statement. Restore it back.
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2129239
    Fixes: dc8dbb27d40968c9d9bfad2c6181bccc20c0e44e
    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: Peter Krempa <pkrempa>
    Reviewed-by: Martin Kletzander <mkletzan>

v8.7.0-130-ga8947db1a4
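
The early return described in the commit message can be illustrated with a small, self-contained C model. This is only a sketch under stated assumptions, not the libvirt code: the types `Hostdev`, `PciSource`, `ScsiSource`, the enum values, and the function `restore_backend_alias()` are all hypothetical stand-ins invented for this example. It shows the general pattern of checking a type tag before touching the SCSI member of a union.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for libvirt's hostdev definitions;
 * none of these names come from the libvirt source tree. */
typedef enum {
    HOSTDEV_SUBSYS_PCI,
    HOSTDEV_SUBSYS_SCSI
} HostdevSubsysType;

typedef struct {
    unsigned int domain;
    unsigned int bus;
    unsigned int slot;
    unsigned int function;
} PciSource;

typedef struct {
    const char *backend_alias;  /* only meaningful for SCSI hostdevs */
} ScsiSource;

typedef struct {
    HostdevSubsysType type;     /* tag: says which union member is valid */
    union {
        PciSource pci;
        ScsiSource scsi;
    } u;
} Hostdev;

/* Restore the backend alias of a SCSI hostdev.
 * Returns 1 if the alias was restored, 0 if the device was ignored. */
int restore_backend_alias(Hostdev *hostdev, const char *alias)
{
    assert(hostdev != NULL);

    /* The early return: anything that is not a SCSI hostdev (for example
     * the PCI device behind a hostdev interface) is left untouched. */
    if (hostdev->type != HOSTDEV_SUBSYS_SCSI)
        return 0;

    hostdev->u.scsi.backend_alias = alias;
    return 1;
}
```

In this model, calling `restore_backend_alias()` on a PCI-typed `Hostdev` returns 0 and leaves its address fields intact. Without the guard, the model would write through the wrong union member and clobber the PCI address; in the real daemon the analogous mistake presumably corresponds to the SIGSEGV seen in the backtrace in comment 1.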

Comment 4 yalzhang@redhat.com 2022-10-10 05:05:30 UTC
Reproduced the bug on libvirt-8.7.0-1.el9.x86_64 with the steps in comment 0, then updated libvirt to libvirt-8.8.0-1.el9.x86_64 and re-tested with the same steps; no coredump was found.

Comment 7 yalzhang@redhat.com 2022-11-06 13:57:05 UTC
Tested on libvirt-8.8.0-1.el9.x86_64 with the steps in comment 0; the issue is fixed.

Comment 9 errata-xmlrpc 2023-05-09 07:27:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2171