Bug 2082782

Summary: vm with vdpa interface set with page_per_vq will crash after start
Product: Red Hat Enterprise Linux 9
Reporter: yalzhang <yalzhang>
Component: qemu-kvm
Assignee: Laurent Vivier <lvivier>
qemu-kvm sub component: Networking
QA Contact: Lei Yang <leiyang>
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
CC: hhan, jinzhao, leiyang, lmen, lvivier, pkrempa, virt-maint, xuzhang
Version: 9.1
Keywords: Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2022-05-10 09:26:04 UTC
Type: Bug

Description yalzhang@redhat.com 2022-05-07 07:15:46 UTC
Description of problem:
A VM with a vDPA interface crashes shortly after it starts.

Version-Release number of selected component (if applicable):
# rpm -q libvirt  qemu-kvm
libvirt-8.2.0-1.el9.x86_64
qemu-kvm-7.0.0-1.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set up the vDPA environment and define the VM with the following interface:
    <interface type='vdpa'>
      <mac address='00:11:22:33:44:00'/>
      <source dev='/dev/vhost-vdpa-0'/>
      <model type='virtio'/>
      <driver queues='8' page_per_vq='on'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
2. Start the VM. It starts successfully but crashes a few seconds later:
# virsh start test ; virsh domstate test; sleep 10; virsh domstate test
Domain 'test' started

running

shut off

3. Check the VM log:
# cat /var/log/libvirt/qemu/test.log
......
qemu-kvm: ../hw/virtio/vhost-vdpa.c:716: int vhost_vdpa_get_vq_index(struct vhost_dev *, int): Assertion `idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs' failed.
2022-05-07 07:01:08.053+0000: shutting down, reason=crashed
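
Step 2 relies on a fixed `sleep 10` to catch the crash. As a rough sketch only (not part of the original report), the same check can be done by polling `virsh domstate` until the state changes. The function names, the `timeout` and `interval` defaults, and the injectable `get_state` parameter are all assumptions made for illustration.

```python
import subprocess
import time


def virsh_domstate(domain):
    """Return the state string printed by `virsh domstate <domain>`."""
    out = subprocess.run(
        ["virsh", "domstate", domain],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()


def wait_for_state_change(domain, get_state=virsh_domstate,
                          timeout=30.0, interval=1.0, sleep=time.sleep):
    """Poll until the state differs from the first one seen.

    Returns (first_state, last_state). If nothing changes before the
    timeout expires, both values are equal.
    """
    first = get_state(domain)
    last = first
    deadline = time.monotonic() + timeout
    while last == first and time.monotonic() < deadline:
        sleep(interval)
        last = get_state(domain)
    return first, last


# Hypothetical usage, run right after `virsh start test`:
#   first, last = wait_for_state_change("test")
#   print(first, "->", last)
```

On the affected setup the expectation would be a change from `running` to `shut off`; the helper only reports the transition and does not explain it.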

Actual results:
The VM with the vDPA interface crashes shortly after it starts.

Expected results:
The VM should not crash.

Additional info:
The QEMU command line generated for the interface changed after the libvirt upgrade:
<interface type='vdpa'>
  <mac address='00:11:22:33:44:00'/>
  <source dev='/dev/vhost-vdpa-0'/>
  <model type='virtio'/>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>

# rpm -q libvirt qemu-kvm
libvirt-8.0.0-8.1.el9_0.x86_64
qemu-kvm-6.2.0-11.el9_0.2.x86_64
 
-add-fd set=1,fd=23,opaque=/dev/vhost-vdpa-0 
-netdev vhost-vdpa,vhostdev=/dev/fdset/1,id=hostnet0 
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:11:22:33:44:00,bus=pci.1,addr=0x0
 
 
# rpm -q libvirt qemu-kvm
libvirt-8.2.0-1.el9.x86_64
qemu-kvm-6.2.0-11.el9_0.2.x86_64
 
 -add-fd set=0,fd=23,opaque=net0-vdpa 
-netdev vhost-vdpa,vhostdev=/dev/fdset/0,id=hostnet0 
-device {"driver":"virtio-net-pci","netdev":"hostnet0","id":"net0","mac":"00:11:22:33:44:00","bus":"pci.1","addr":"0x0"}
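
The two `-device` arguments above use different syntaxes (comma-separated key=value versus JSON). The sketch below is an illustration added for comparison, not something from the original report: it parses both forms into a dictionary to check whether they carry the same properties. `parse_device_arg` is a hypothetical helper, and its key=value parsing is deliberately simplified (it does not handle escaped commas).

```python
import json


def parse_device_arg(arg):
    """Return the -device properties as a dict of strings.

    Accepts either the JSON form or the legacy key=value form, where
    the first comma-separated token is the driver name.
    """
    arg = arg.strip()
    if arg.startswith("{"):
        return {key: str(value) for key, value in json.loads(arg).items()}
    driver, *props = arg.split(",")
    result = {"driver": driver}
    for prop in props:
        key, _, value = prop.partition("=")
        result[key] = value
    return result


# The two -device arguments quoted in the report.
old = ("virtio-net-pci,netdev=hostnet0,id=net0,"
       "mac=00:11:22:33:44:00,bus=pci.1,addr=0x0")
new = ('{"driver":"virtio-net-pci","netdev":"hostnet0","id":"net0",'
       '"mac":"00:11:22:33:44:00","bus":"pci.1","addr":"0x0"}')

print(parse_device_arg(old) == parse_device_arg(new))  # prints True
```

Under this simplified parsing the device properties are identical; the remaining differences in the quoted command lines are the fd set number and the `opaque` string passed to `-add-fd`.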

Comment 1 Han Han 2022-05-07 07:41:27 UTC
This is a candidate security issue because `virsh domstate` can be called over a read-only connection.

Comment 4 Han Han 2022-05-07 09:58:15 UTC
As the reporter said, the crash is not caused by `virsh domstate`, so it is not a security issue. Please ignore comment 1.

Comment 6 Peter Krempa 2022-05-09 07:08:38 UTC
Moving to qemu as qemu abort()s on an assertion failure.
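
For readers unfamiliar with the assertion quoted in the log, it is a range check: the virtqueue index must fall inside the window `[vq_index, vq_index + nvqs)` owned by the device. The following is a minimal Python model of that condition only. The field names come from the assertion message; the class name, function name, and example values are hypothetical, and the sketch says nothing about why QEMU ended up with an out-of-range index.

```python
from dataclasses import dataclass


@dataclass
class VhostDevModel:
    """Toy stand-in for the two fields named in the assertion."""
    vq_index: int  # first virtqueue index owned by this device
    nvqs: int      # number of virtqueues owned by this device


def owns_vq(dev, idx):
    """Mirror of: idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs."""
    return dev.vq_index <= idx < dev.vq_index + dev.nvqs


# Hypothetical layout: a device that owns virtqueues 2 and 3.
dev = VhostDevModel(vq_index=2, nvqs=2)
print(owns_vq(dev, 2), owns_vq(dev, 3), owns_vq(dev, 4))  # True True False
```

In QEMU the failing case is an `assert()`, so an index outside the window aborts the process, which matches the `reason=crashed` line in the log.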

Comment 7 Laurent Vivier 2022-05-09 10:22:22 UTC
It looks like a kernel bug, what is the host kernel version?

Comment 8 yalzhang@redhat.com 2022-05-09 10:33:26 UTC
(In reply to Laurent Vivier from comment #7)
> It looks like a kernel bug, what is the host kernel version?

It's kernel-5.14.0-85.el9.x86_64.
It only happens on UEFI guests with multiqueue set on the vDPA interface.

Comment 9 Laurent Vivier 2022-05-09 13:46:27 UTC
I think this should be fixed by:

[PATCH v4 0/7] vhost-vdpa multiqueue fixes
https://patchew.org/QEMU/1651890498-24478-1-git-send-email-si-wei.liu@oracle.com/

Comment 15 Laurent Vivier 2022-05-10 09:26:04 UTC

*** This bug has been marked as a duplicate of bug 2070804 ***