Bug 2059809

Summary: [virtual network][rhel9][vdpa]Booting the guest with the vdpa device, hot unplugging it and then hot plugging it again, rebooting the guest causes qemu core dump
Product: Red Hat Enterprise Linux 9
Reporter: Lei Yang <leiyang>
Component: qemu-kvm
Assignee: lulu <lulu>
qemu-kvm sub component: Networking
QA Contact: Lei Yang <leiyang>
Status: CLOSED CURRENTRELEASE
Docs Contact:
Severity: high
Priority: unspecified
CC: aadam, chayang, coli, jasowang, jinzhao, juzhang, lulu, pezhang, virt-maint, wquan
Version: 9.0
Keywords: Triaged
Target Milestone: rc
Flags: pm-rhel: mirror+
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-03-09 01:41:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Lei Yang 2022-03-02 04:54:20 UTC
Description of problem:
Booting the guest with the vdpa device, hot unplugging it and then hot plugging it again, rebooting the guest causes qemu core dump

Version-Release number of selected component (if applicable):
qemu-kvm-6.2.0-10.el9.x86_64
kernel-5.14.0-68.mr552_220223_1400.el9.x86_64
git clone https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git
vdpa-Add-support-to-configure-max-number-of-VQs.patch
vdpa-Remove-unsupported-command-line-option.patch
virtio-Define-bit-numbers-for-device-independent-fea.patch

# flint -d 0000:3b:00.0 q
Image type:            FS4
FW Version:            22.32.2004
FW Release Date:       13.1.2022
Product Version:       22.32.2004
Rom Info:              type=UEFI version=14.25.18 cpu=AMD64,AARCH64
                       type=PXE version=3.6.502 cpu=AMD64
Description:           UID                GuidsNumber
Base GUID:             b8cef603000a110c        4
Base MAC:              b8cef60a110c            4
Image VSD:             N/A
Device VSD:            N/A
PSID:                  MT_0000000359
Security Attributes:   N/A

How reproducible:
Only once so far

Unfortunately, I encountered it only once, and no qemu core dump information was captured. While trying to reproduce it, I hit a different issue: Bug 2059799. Although the test steps for the two bugs are exactly the same, they produce different results.

When the qemu core dump occurred, qemu printed:
qemu-kvm: ../hw/virtio/vhost-vdpa.c:560: int vhost_vdpa_get_vq_index(struct vhost_dev *, int): Assertion `idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs' failed.
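
For readers unfamiliar with the check, the failed assertion is a half-open range test: the requested virtqueue index must lie in [vq_index, vq_index + nvqs) for the vhost device. The following is only a minimal Python model of that condition, not the qemu code; the class name VhostDev, the helper names, and the choice to return the index unchanged are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VhostDev:
    """Hypothetical stand-in for the two fields the assertion reads."""
    vq_index: int  # first virtqueue index owned by this vhost device
    nvqs: int      # number of virtqueues owned by this vhost device


def vq_index_in_range(dev: VhostDev, idx: int) -> bool:
    # Same condition as the assertion text:
    # idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs
    return dev.vq_index <= idx < dev.vq_index + dev.nvqs


def get_vq_index(dev: VhostDev, idx: int) -> int:
    # Mirrors the assert-then-return shape; returning idx unchanged
    # is an assumption of this sketch.
    assert vq_index_in_range(dev, idx), "idx outside [vq_index, vq_index + nvqs)"
    return idx
```

With vq_index=2 and nvqs=2, indexes 2 and 3 pass while 1 and 4 would trip the assertion, which is the kind of mismatch the message above reports.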

Please feel free to contact me if you need any tests to be done.

Steps to Reproduce:
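1. Boot the guest with the vdpa device.
2. Hot unplug the vdpa device.
3. Hot plug the vdpa device again.
4. Reboot the guest.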
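
The report does not say how the hot unplug, hot plug, and reboot were driven. One possible way is over QMP, sketched below as a list of command objects. The QMP command names (device_del, netdev_del, netdev_add, device_add, system_reset) are real, but the device ids, the /dev/vhost-vdpa-0 path, the virtio-net-pci driver, and the function name are assumptions, and system_reset is a hard reset that may differ from a reboot issued inside the guest.

```python
import json


def hotplug_cycle_commands(dev_id="net0", netdev_id="hostnet0",
                           vhostdev="/dev/vhost-vdpa-0"):
    """Build the QMP commands for unplug, replug, and reset (hypothetical ids)."""
    return [
        # Hot unplug the guest-visible NIC; a real client should wait for
        # the DEVICE_DELETED event before removing the backend.
        {"execute": "device_del", "arguments": {"id": dev_id}},
        {"execute": "netdev_del", "arguments": {"id": netdev_id}},
        # Hot plug the vhost-vdpa backend and the NIC again.
        {"execute": "netdev_add",
         "arguments": {"type": "vhost-vdpa", "id": netdev_id,
                       "vhostdev": vhostdev}},
        {"execute": "device_add",
         "arguments": {"driver": "virtio-net-pci", "netdev": netdev_id,
                       "id": dev_id}},
        # Reset the guest; the reporter may instead have rebooted from
        # inside the guest.
        {"execute": "system_reset"},
    ]


if __name__ == "__main__":
    for cmd in hotplug_cycle_commands():
        print(json.dumps(cmd))
```

Each printed line is one JSON command that could be sent on a QMP socket after the usual qmp_capabilities handshake.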

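Actual results:
qemu core dumps with the assertion failure quoted above.

Expected results:
The guest reboots and qemu keeps running without a core dump.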

Additional info:

Comment 2 lulu@redhat.com 2022-03-07 06:03:27 UTC
Hi Lei,
Could you try to reproduce this issue on kernel-5.14.0-68.el9.x86_64?
I wonder whether this is an issue in the mlx driver.
Here is the corresponding bug in RHEL 8.6:
https://bugzilla.redhat.com/show_bug.cgi?id=2048060
That bug hits the same assertion and was fixed by the MR that syncs the mlx driver:
https://gitlab.com/redhat/rhel/src/kernel/rhel-8/-/merge_requests/1974

thanks
Cindy

Comment 3 Lei Yang 2022-03-07 07:58:47 UTC
(In reply to lulu from comment #2)
> Hi Lei,
> Could you try to reproduce this issue on kernel-5.14.0-68.el9.x86_64?
> I wonder whether this is an issue in the mlx driver.
> Here is the corresponding bug in RHEL 8.6:
> https://bugzilla.redhat.com/show_bug.cgi?id=2048060
> That bug hits the same assertion and was fixed by the MR that syncs the mlx
> driver:
> https://gitlab.com/redhat/rhel/src/kernel/rhel-8/-/merge_requests/1974
> 
> thanks
> Cindy

Hello Cindy

I tested on kernel-5.14.0-68.el9.x86_64 and could not reproduce this bug. Therefore, the current bug should be the same issue as bug 2048060.

Test Version:
kernel-5.14.0-68.el9.x86_64
qemu-kvm-6.2.0-10.el9.x86_64

Best Regards
Lei

Comment 4 Lei Yang 2022-03-09 01:41:50 UTC
This bug is a Mellanox firmware issue, and Nvidia's team is syncing the latest driver, so I am setting the bug to "CURRENTRELEASE". Please correct me if I'm wrong.