Bug 2083068 - [vhost-vdpa][rhel9.1][edk2] Boot a uefi guest with mq vhost-vdpa device occurs qemu core dump
Summary: [vhost-vdpa][rhel9.1][edk2] Boot a uefi guest with mq vhost-vdpa device occurs qemu core dump
Keywords:
Status: CLOSED DUPLICATE of bug 2070804
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: edk2
Version: 9.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
: ---
Assignee: Laurent Vivier
QA Contact: Lei Yang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-05-09 08:51 UTC by Lei Yang
Modified: 2022-05-10 09:28 UTC
CC List: 13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-10 09:25:32 UTC
Type: ---
Target Upstream Version:
Embargoed:




Links:
Red Hat Issue Tracker RHELPLAN-121371 (last updated 2022-05-09 08:56:01 UTC)

Description Lei Yang 2022-05-09 08:51:00 UTC
Description of problem:
Booting a UEFI guest with a multi-queue (mq) vhost-vdpa device results in a qemu core dump.

Version-Release number of selected component (if applicable):
kernel-5.14.0-86.el9.x86_64
qemu-kvm-7.0.0-2.el9.x86_64
edk2-ovmf-20220221gitb24306f15d-1.el9.noarch
iproute-5.15.0-2.2.el9_0.x86_64

# flint -d 0000:3b:00.0 q
Image type:            FS4
FW Version:            22.33.1048
FW Release Date:       29.4.2022
Product Version:       22.33.1048
Rom Info:              type=UEFI version=14.26.17 cpu=AMD64,AARCH64
                       type=PXE version=3.6.502 cpu=AMD64
Description:           UID                GuidsNumber
Base GUID:             b8cef603000a110c        4
Base MAC:              b8cef60a110c            4
Image VSD:             N/A
Device VSD:            N/A
PSID:                  MT_0000000359
Security Attributes:   N/A
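
For reference, the PCI function queried above is the Mellanox physical function that later backs the vdpa VF; if needed, the adapter model can be confirmed on the host (a hedged aside, not part of the original report):
# lspci -s 3b:00.0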

How reproducible:
100%

Steps to Reproduce:
1. Create the vdpa device (a verification sketch follows the commands in this step)
# echo 0 > /sys/bus/pci/devices/0000\:3b\:00.0/sriov_numvfs
# modprobe vhost_vdpa
# modprobe mlx5_vdpa
# echo 1 > /sys/bus/pci/devices/0000\:3b\:00.0/sriov_numvfs
# readlink /sys/bus/pci/devices/0000:3b:00.0/virtfn*
../0000:3b:00.2
# echo 0000:3b:00.2 >/sys/bus/pci/drivers/mlx5_core/unbind
# devlink dev eswitch set pci/0000:3b:00.0 mode switchdev
# echo 0000:3b:00.2 >/sys/bus/pci/drivers/mlx5_core/bind
# vdpa mgmtdev show | grep pci
pci/0000:3b:00.2: 
# vdpa dev add name vdpa0 mgmtdev pci/0000:3b:00.2 mac 00:11:22:33:44:03  max_vqp 8
# ovs-vsctl add-br vdpa_bridge
# ovs-vsctl set Open_vSwitch . other_config:hw-offload="true"
# ovs-vsctl add-port vdpa_bridge enp59s0f0np0
# ovs-vsctl add-port vdpa_bridge eth0
# ip link set vdpa_bridge up
# ip addr add 192.168.10.10/24 dev vdpa_bridge
# dnsmasq --strict-order --bind-interfaces --listen-address 192.168.10.10 --dhcp-range 192.168.10.20,192.168.10.254 --dhcp-lease-max=253 --dhcp-no-override --pid-file=/tmp/dnsmasq.pid --log-facility=/tmp/dnsmasq.log
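
As a quick sanity check (a hedged sketch using standard iproute2 vdpa and OVS commands, not part of the original reproducer), the device and bridge created above can be verified before starting the guest:
# vdpa dev show vdpa0              (the device added above should be listed against mgmtdev pci/0000:3b:00.2)
# vdpa dev config show vdpa0       (reports the mac and negotiated queue configuration)
# ls -l /dev/vhost-vdpa-*          (the character device passed to qemu via vhostdev= should exist)
# ovs-vsctl show                   (vdpa_bridge should contain ports enp59s0f0np0 and eth0)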

2. Boot a UEFI guest with a multi-queue vhost-vdpa device
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-blockdev node-name=file_ovmf_code,driver=file,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_code,driver=raw,read-only=on,file=file_ovmf_code \
-blockdev node-name=file_ovmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel910-64-virtio-scsi.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_vars,driver=raw,read-only=off,file=file_ovmf_vars \
-machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 25600 \
-object memory-backend-ram,size=25600M,id=mem-machine_mem  \
-smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2  \
-cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/rhel910-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:11:22:33:44:02,mq=on,vectors=18,bus=pcie-root-port-3,addr=0x0 \
-netdev vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=hostnet0,queues=8 \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
-monitor stdio \

3. A qemu core dump occurs at this point (see the note after the output below on retrieving the core)
# sh edk2.sh
QEMU 7.0.0 monitor - type 'help' for more information
(qemu) qemu-kvm: ../hw/virtio/vhost-vdpa.c:716: int vhost_vdpa_get_vq_index(struct vhost_dev *, int): Assertion `idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs' failed.
edk2.sh: line 34: 61241 Aborted    
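
If the core file is not written to the current directory, on a RHEL 9 host the dump is typically captured by systemd-coredump instead; a hedged sketch for locating and opening it (assuming systemd-coredump is active):
# coredumpctl list qemu-kvm        (lists recent qemu-kvm crashes with PID and timestamp)
# coredumpctl info 61241           (prints the signal and a short backtrace for the aborted PID)
# coredumpctl debug 61241          (opens the core in gdb, equivalent to the gdb session in step 4)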

4. core info
# gdb /usr/libexec/qemu-kvm core.qemu-kvm.61241
......
#0  0x00007fa22be6257c in __pthread_kill_implementation () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7fa222f0c640 (LWP 61246))]
(gdb) bt full
#0  0x00007fa22be6257c in __pthread_kill_implementation () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007fa22be15d56 in raise () from /lib64/libc.so.6
No symbol table info available.
#2  0x00007fa22bde8833 in abort () from /lib64/libc.so.6
No symbol table info available.
#3  0x00007fa22bde875b in __assert_fail_base.cold () from /lib64/libc.so.6
No symbol table info available.
#4  0x00007fa22be0ecd6 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#5  0x0000557833acc7a0 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:716
No locals.
#6  0x0000557833ac27e1 in vhost_virtqueue_mask (hdev=0x557835eb4900, vdev=<optimized out>, n=6, mask=<optimized out>)
    at ../hw/virtio/vhost.c:1550
        file = {index = 1, fd = 88}
        index = <optimized out>
        vvq = <optimized out>
        r = <optimized out>
#7  0x000055783397eb20 in virtio_pci_set_guest_notifier (d=<optimized out>, n=<optimized out>, assign=<optimized out>, 
    with_irqfd=<optimized out>) at ../hw/virtio/virtio-pci.c:975
        proxy = <optimized out>
        vdev = 0x55783731e6a0
        vdc = <optimized out>
        vq = <optimized out>
        notifier = 0x7fa2205b91a8
#8  0x000055783397b110 in virtio_pci_set_guest_notifiers (d=<optimized out>, nvqs=3, assign=<optimized out>)
    at ../hw/virtio/virtio-pci.c:1020
        proxy = <optimized out>
        vdev = 0x55783731e6a0
        k = 0x557835dab720
        with_irqfd = false
        n = 2
        r = <optimized out>
        notifiers_error = <optimized out>
#9  0x000055783391f443 in vhost_net_start (dev=0x55783731e6a0, ncs=<optimized out>, data_queue_pairs=<optimized out>, cvq=<optimized out>)
    at ../hw/net/vhost_net.c:361
        qbus = 0x55783731e618
--Type <RET> for more, q to quit, c to continue without paging--
        vbus = 0x55783731e618
        total_notifiers = 3
        k = 0x557835d21580
        index_end = <optimized out>
        nvhosts = 2
        n = 0x55783731e6a0
        i = <optimized out>
        peer = <optimized out>
        net = <optimized out>
        r = <optimized out>
        e = <optimized out>
        err = <optimized out>
#10 0x0000557833a91163 in virtio_net_set_status (vdev=<optimized out>, status=15 '\017') at ../hw/net/virtio-net.c:290
        n = <optimized out>
        i = <optimized out>
        q = <optimized out>
        queue_status = <optimized out>
#11 0x0000557833abb7c7 in virtio_set_status (vdev=0x55783731e6a0, val=15 '\017') at ../hw/virtio/virtio.c:1947
        k = 0x557835dab720
        ret = <optimized out>
#12 0x000055783397e5ce in virtio_pci_common_write (opaque=0x557837316300, addr=<optimized out>, val=15, size=<optimized out>)
    at ../hw/virtio/virtio-pci.c:1293
        proxy = 0x557837316300
        vdev = 0x55783731e6a0
#13 0x0000557833a3d9d9 in memory_region_dispatch_write (mr=0x557837316e10, addr=20, data=<optimized out>, op=<optimized out>, attrs=...)
    at ../softmmu/memory.c:554
        size = <optimized out>
#14 0x0000557833a46dd5 in flatview_write_continue (fv=<optimized out>, addr=34360786964, attrs=..., ptr=<optimized out>, len=1, addr1=61246, 
    l=1, mr=<optimized out>) at ../softmmu/physmem.c:2814
        buf = <optimized out>
        release_lock = true
        result = 0
        val = 6
        ram_ptr = <optimized out>
#15 0x0000557833a4ae19 in address_space_write (as=<optimized out>, addr=34360786964, attrs=..., 
    buf=0x7fa22be6257c <__pthread_kill_implementation+284>, len=1) at ../softmmu/physmem.c:2856
        result = 0
--Type <RET> for more, q to quit, c to continue without paging--
        fv = 0x7f9bd4561de0
#16 0x0000557833b6dfc0 in kvm_cpu_exec (cpu=<optimized out>) at ../softmmu/physmem.c:2962
        run = <optimized out>
        ret = <optimized out>
        run_ret = <optimized out>
#17 0x0000557833b702ea in kvm_vcpu_thread_fn (arg=0x557835eb9320) at ../accel/kvm/kvm-accel-ops.c:49
        r = <optimized out>
        cpu = <optimized out>
#18 0x0000557833da325a in qemu_thread_start (args=0x557835ec7f10) at ../util/qemu-thread-posix.c:556
        __clframe = {__cancel_routine = <optimized out>, __cancel_arg = 0x0, __do_it = 1, __cancel_type = <synthetic pointer>}
        qemu_thread_args = 0x557835ec7f10
        start_routine = 0x557833b70170 <kvm_vcpu_thread_fn>
        arg = 0x557835eb9320
        r = <optimized out>
#19 0x00007fa22be60832 in start_thread () from /lib64/libc.so.6
No symbol table info available.
#20 0x00007fa22be004c0 in clone3 () from /lib64/libc.so.6
No symbol table info available.

Actual results:
qemu core dump

Expected results:
The guest boots successfully.

Additional info:
1. This problem only occurs with "uefi guest" + "mq vhost-vdpa"; a single-queue device works well (see the sketch after this list)
2. A SeaBIOS guest works well with both multi-queue and single-queue devices
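
For comparison, a minimal single-queue variant of the same setup (a sketch based on point 1 above, not separately verified in this report) recreates the vdpa device with one queue pair and drops the multi-queue options from the qemu command line:
# vdpa dev del vdpa0
# vdpa dev add name vdpa0 mgmtdev pci/0000:3b:00.2 mac 00:11:22:33:44:03 max_vqp 1
Then, in the qemu command line from step 2, replace the network device pair with:
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:11:22:33:44:02,bus=pcie-root-port-3,addr=0x0 \
-netdev vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=hostnet0 \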

Comment 2 jason wang 2022-05-09 09:17:52 UTC
Looks like a duplication of bz2069946.

Thanks

Comment 3 Lei Yang 2022-05-09 10:13:11 UTC
(In reply to jason wang from comment #2)
> Looks like a duplication of bz2069946.
> 
> Thanks

Hello Jason

According to the test results, I think you're right; they have the same core dump info. I tried to test bz2069946's scenario; the core dump details are as follows:

1. PXE-boot a guest with a multi-queue vdpa device
# cat seabios.sh 
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-machine q35,memory-backend=mem-machine_mem \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 24576 \
-object memory-backend-ram,size=24576M,id=mem-machine_mem  \
-smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2  \
-cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/test.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-netdev vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=hostnet0,queues=8 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mq=on,vectors=18,mac=ce:19:60:6d:33:df,bus=pcie-root-port-3,addr=0x0 \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=on \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
-monitor stdio \

2. Qemu core dump
QEMU 7.0.0 monitor - type 'help' for more information
(qemu) qemu-kvm: ../hw/virtio/vhost-vdpa.c:716: int vhost_vdpa_get_vq_index(struct vhost_dev *, int): Assertion `idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs' failed.

# gdb /usr/libexec/qemu-kvm core.qemu-kvm.63750
......
#0  0x00007f374b95957c in __pthread_kill_implementation () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7f37429be640 (LWP 63755))]
(gdb) bt full
#0  0x00007f374b95957c in __pthread_kill_implementation () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f374b90cd56 in raise () from /lib64/libc.so.6
No symbol table info available.
#2  0x00007f374b8df833 in abort () from /lib64/libc.so.6
No symbol table info available.
#3  0x00007f374b8df75b in __assert_fail_base.cold () from /lib64/libc.so.6
No symbol table info available.
#4  0x00007f374b905cd6 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#5  0x000055cd168697a0 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:716
No locals.
#6  0x000055cd1685f7e1 in vhost_virtqueue_mask (hdev=0x55cd18256c10, vdev=<optimized out>, n=6, mask=<optimized out>)
    at ../hw/virtio/vhost.c:1550
        file = {index = 1, fd = 86}
        index = <optimized out>
        vvq = <optimized out>
        r = <optimized out>
#7  0x000055cd1671bb20 in virtio_pci_set_guest_notifier (d=<optimized out>, n=<optimized out>, assign=<optimized out>, 
    with_irqfd=<optimized out>) at ../hw/virtio/virtio-pci.c:975
        proxy = <optimized out>
        vdev = 0x55cd19533af0
        vdc = <optimized out>
        vq = <optimized out>
        notifier = 0x7f373beaa1a8
#8  0x000055cd16718110 in virtio_pci_set_guest_notifiers (d=<optimized out>, nvqs=3, assign=<optimized out>)
    at ../hw/virtio/virtio-pci.c:1020
        proxy = <optimized out>
        vdev = 0x55cd19533af0
        k = 0x55cd1816ab20
        with_irqfd = false
        n = 2
        r = <optimized out>
        notifiers_error = <optimized out>
#9  0x000055cd166bc443 in vhost_net_start (dev=0x55cd19533af0, ncs=<optimized out>, data_queue_pairs=<optimized out>, cvq=<optimized out>)
    at ../hw/net/vhost_net.c:361
        qbus = 0x55cd19533a68
--Type <RET> for more, q to quit, c to continue without paging--
        vbus = 0x55cd19533a68
        total_notifiers = 3
        k = 0x55cd180df7f0
        index_end = <optimized out>
        nvhosts = 2
        n = 0x55cd19533af0
        i = <optimized out>
        peer = <optimized out>
        net = <optimized out>
        r = <optimized out>
        e = <optimized out>
        err = <optimized out>
#10 0x000055cd1682e163 in virtio_net_set_status (vdev=<optimized out>, status=15 '\017') at ../hw/net/virtio-net.c:290
        n = <optimized out>
        i = <optimized out>
        q = <optimized out>
        queue_status = <optimized out>
#11 0x000055cd168587c7 in virtio_set_status (vdev=0x55cd19533af0, val=15 '\017') at ../hw/virtio/virtio.c:1947
        k = 0x55cd1816ab20
        ret = <optimized out>
#12 0x000055cd1671b5ce in virtio_pci_common_write (opaque=0x55cd1952b750, addr=<optimized out>, val=15, size=<optimized out>)
    at ../hw/virtio/virtio-pci.c:1293
        proxy = 0x55cd1952b750
        vdev = 0x55cd19533af0
#13 0x000055cd167da9d9 in memory_region_dispatch_write (mr=0x55cd1952c260, addr=20, data=<optimized out>, op=<optimized out>, attrs=...)
    at ../softmmu/memory.c:554
        size = <optimized out>
#14 0x000055cd167e3dd5 in flatview_write_continue (fv=<optimized out>, addr=4246732820, attrs=..., ptr=<optimized out>, len=1, addr1=63755, 
    l=1, mr=<optimized out>) at ../softmmu/physmem.c:2814
        buf = <optimized out>
        release_lock = true
        result = 0
        val = 6
        ram_ptr = <optimized out>
#15 0x000055cd167e7e19 in address_space_write (as=<optimized out>, addr=4246732820, attrs=..., 
    buf=0x7f374b95957c <__pthread_kill_implementation+284>, len=1) at ../softmmu/physmem.c:2856
        result = 0
--Type <RET> for more, q to quit, c to continue without paging--
        fv = 0x7f31347130b0
#16 0x000055cd1690afc0 in kvm_cpu_exec (cpu=<optimized out>) at ../softmmu/physmem.c:2962
        run = <optimized out>
        ret = <optimized out>
        run_ret = <optimized out>
#17 0x000055cd1690d2ea in kvm_vcpu_thread_fn (arg=0x55cd1825b440) at ../accel/kvm/kvm-accel-ops.c:49
        r = <optimized out>
        cpu = <optimized out>
#18 0x000055cd16b4025a in qemu_thread_start (args=0x55cd1826adc0) at ../util/qemu-thread-posix.c:556
        __clframe = {__cancel_routine = <optimized out>, __cancel_arg = 0x0, __do_it = 1, __cancel_type = <synthetic pointer>}
        qemu_thread_args = 0x55cd1826adc0
        start_routine = 0x55cd1690d170 <kvm_vcpu_thread_fn>
        arg = 0x55cd1825b440
        r = <optimized out>
#19 0x00007f374b957832 in start_thread () from /lib64/libc.so.6
No symbol table info available.
#20 0x00007f374b8f74c0 in clone3 () from /lib64/libc.so.6
No symbol table info available.

Comment 4 Lei Yang 2022-05-09 10:20:29 UTC
Hello Jason

Could you please help review Bug 2082782, maybe this bug is also a duplication of bz2069946.

Thanks
Lei

Comment 5 jason wang 2022-05-10 07:30:56 UTC
(In reply to Lei Yang from comment #4)
> Hello Jason
> 
> Could you please help review Bug 2082782, maybe this bug is also a
> duplication of bz2069946.
> 
> Thanks
> Lei

Yes, I think it's another duplication.

Thanks

Comment 7 Laurent Vivier 2022-05-10 09:25:32 UTC

*** This bug has been marked as a duplicate of bug 2070804 ***

