Description of problem: As virtual devices, virtio devices should support FLR. When exposed to userspace, such as in bug 1662901 where dpdk is being used, supporting FLR makes it easier to reset the device between userspace and kernel usage and helps to provide consistent behavior in configurations where a bus reset may not be practical.
Posted upstream: [Qemu-devel] [PATCH] virtio-pci: Add Function Level Reset support
Merged upstream and will be part of qemu-4.2, commit eb1556c493
This is fixed upstream in qemu-4.2. Julia, please next time move the BZ to POST as you did, but also set Fixed-in: qemu-4.2.
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks
Hi Julia,

I'm verifying this bug. Could you please check whether the steps below are enough to verify it? Does QE need any further functional testing of Function Level Reset? Thank you.

Versions:
qemu-kvm-4.2.0-8.module+el8.2.0+5607+dc756904.x86_64

== Part 1: Check x-pcie-flr-init attribute ==

Test PASS.

Steps:
1. Boot qemu with two kinds of virtio devices: virtio-net-pci, virtio-blk-pci
2. Check x-pcie-flr-init of all virtio devices. All of them have the "x-pcie-flr-init = true" attribute.

(qemu) info qtree
bus: main-system-bus
  type System
  ...
  dev: q35-pcihost, id ""
    ...
    bus: pcie.0
      type PCIE
      dev: pcie-root-port, id "pci.7"
        ...
        bus: pci.7
          type PCIE
          dev: virtio-net-pci, id "net2"
            ...
            x-pcie-flr-init = true
      dev: pcie-root-port, id "pci.6"
        ...
        bus: pci.6
          type PCIE
          dev: virtio-net-pci, id "net1"
            ...
            x-pcie-flr-init = true
      ...
      dev: pcie-root-port, id "pci.3"
        ...
        bus: pci.3
          type PCIE
          dev: virtio-net-pci, id "net0"
            ...
            x-pcie-flr-init = true
      dev: pcie-root-port, id "pci.2"
        ...
        bus: pci.2
          type PCIE
          dev: virtio-blk-pci, id "virtio-disk0"
            ...
            x-pcie-flr-init = true
  ...
  ...
  dev: kvmvapic, id ""
(qemu)

== Part 2: NFV acceptance testing ==

All scenarios cover virtio-net-pci (vhost-user) + dpdk. Test PASS.

(1) Guest with device assignment (PF) throughput testing (1G hugepage size): PASS
(2) Guest with device assignment (PF) throughput testing (2M hugepage size): PASS
(3) Guest with device assignment (VF) throughput testing: PASS
(4) PVP (host dpdk testpmd as vswitch) 1Q throughput testing: PASS
(5) PVP vhost-user 2Q throughput testing: PASS
(6) PVP vhost-user 1Q cross numa node throughput testing: PASS
(7) Guest with vhost-user 2 queues throughput testing: PASS
(8) vhost-user reconnect with dpdk-client, qemu-server: ovs reconnect: PASS
(9) vhost-user reconnect with dpdk-client, qemu-server: qemu reconnect: PASS
(10) PVP 1Q live migration testing: PASS
(11) PVP 1Q cross numa node live migration testing: PASS
(12) Guest with ovs+dpdk+vhost-user 1Q live migration testing: PASS
(13) Guest with ovs+dpdk+vhost-user 1Q live migration testing (2M): PASS
(14) Guest with ovs+dpdk+vhost-user 2Q live migration testing: PASS

Best regards,
Pei
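Note: the Part 1 check could also be scripted instead of read by eye. The following is only a minimal sketch, assuming the "info qtree" output has been saved to a text file and that each property appears on its own line as in the listing above. The function name flr_init_summary is hypothetical and is not part of QEMU.

```shell
# Sketch: summarize x-pcie-flr-init per virtio PCI device from saved
# "info qtree" output read on stdin. flr_init_summary is a hypothetical name.
flr_init_summary() {
    awk '
        # A "dev:" line starts a new device; track it only if it is virtio-*-pci.
        /^[[:space:]]*dev:/ {
            dev = ""
            if ($2 ~ /^virtio-.*-pci,?$/) {
                dev = $2
                sub(/,$/, "", dev)
                id = $4
                gsub(/"/, "", id)
            }
            next
        }
        # Print id, device type and property value for the tracked device.
        dev != "" && $1 == "x-pcie-flr-init" {
            print id, dev, $3
            dev = ""
        }
    '
}
```

Usage would look like "flr_init_summary < qtree.txt", where qtree.txt is an assumed file name; each output line has the form "id device-type value".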
Hi Pei,

Yes, function testing is needed. It can be done like this:

1. Boot QEMU with a Linux guest and a multifunction virtio-pci device inside (any device types are fine):

-device pcie-root-port,id=rp0,bus=pcie.0,chassis=2,addr=2 \
-device virtio-rng-pci,id=dev0,multifunction=on,bus=rp0,addr=0.0 \
-device virtio-rng-pci,id=dev1,x-pcie-flr-init=off,bus=rp0,addr=0.1 \

(FLR on dev1 is off)

2. Enable the qdev_reset trace event.

3. Check that reset works on dev0. Inside the guest terminal:

# echo 1 > /sys/bus/pci/devices/(dev0 pci address)/reset

You should see trace events like this:

27765:qdev_reset obj=0x563b89ec2960(virtio-rng-device)
27765:qdev_reset obj=0x563b89eba7d0(virtio-rng-pci)

4. Check that reset doesn't work on dev1:

# echo 1 > /sys/bus/pci/devices/(dev1 pci address)/reset

No trace events
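A possible cross-check that does not depend on trace events, offered as an assumption rather than something confirmed in this thread: "lspci -vv" prints "FLReset+" in the DevCap block of a PCIe function that advertises Function Level Reset, and "FLReset-" otherwise. If x-pcie-flr-init controls that capability bit, dev0 would be expected to show "FLReset+" and dev1 "FLReset-". The sketch below only parses lspci text from stdin; the function name has_flr is hypothetical.

```shell
# Sketch: report whether "lspci -vv" output for one function advertises FLR.
# Reads the lspci text on stdin and prints "yes" or "no".
# has_flr is a hypothetical helper name, not an existing tool.
has_flr() {
    awk '
        /DevCap:/ { in_cap = 1 }              # start of the DevCap block
        /DevCtl:/ { in_cap = 0 }              # DevCtl follows DevCap
        in_cap && /FLReset\+/ { found = 1 }   # capability bit is set
        END { print (found ? "yes" : "no") }
    '
}
```

Inside the guest this could be run as "lspci -vv -s (dev0 pci address) | has_flr". Reading the capability list usually requires root.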
(In reply to Julia Suvorova from comment #9)
> Hi Pei,
> Yes, function testing is needed. It can be done like this:
>
> 1. Boot QEMU with Linux guest and a multifunctional virtio-pci device inside
> (any device types are fine):
>
> -device pcie-root-port,id=rp0,bus=pcie.0,chassis=2,addr=2 \
> -device virtio-rng-pci,id=dev0,multifunction=on,bus=rp0,addr=0.0 \
> -device virtio-rng-pci,id=dev1,x-pcie-flr-init=off,bus=rp0,addr=0.1 \
>
> (FLR on dev1 is off)
>
> 2. Enable qdev_reset trace event.

Hi Julia,

Thank you very much for the instructions.

Could you please share how to enable the qdev_reset trace event? I tried the methods below, but they don't seem to work.

1. In host:
# strace -p $(pidof qemu-kvm) -e qdev_reset
strace: invalid system call 'qdev_reset'

2. In host:
Checked the QMP command, but nothing was returned.

Best regards,
Pei
(In reply to Pei Zhang from comment #10)
...
> Could you please share how to enable qdev_reset trace event? I tried below
> methods, but seems they don't work.
...

3. The qemu command line with "-trace enable=qdev_reset" doesn't work either:

# /usr/libexec/qemu-kvm -trace enable=qdev_reset
qemu-kvm: -trace enable=qdev_reset: warning: trace event 'qdev_reset' does not exist
VNC server running on ::1:5900
Another update:

For step "3. Check that reset works on dev0", I tried virtio-rng-pci, virtio-net-pci vhost and virtio-net-pci vhost-user.

After executing # echo 1 > /sys/bus/pci/devices/(dev0 pci address)/reset:

virtio-rng-pci: qemu, guest and host work well.
virtio-net-pci vhost: qemu, guest and host work well.
virtio-net-pci vhost-user: qemu and host work well, but the guest hangs.

I filed a new bug to track the virtio-net-pci vhost-user issue:
Bug 1805656 - Guest hang after "echo 1 > /sys/bus/pci/devices/$vhost_user_nic_pcie/reset"
(In reply to Pei Zhang from comment #11)
...
> 3. qemu cmd with "-trace enable=qdev_reset" also doesn't work well.
> # /usr/libexec/qemu-kvm -trace enable=qdev_reset
> qemu-kvm: -trace enable=qdev_reset: warning: trace event 'qdev_reset' does
> not exist
> VNC server running on ::1:5900

These are qemu traces, so the third variant is correct.

Sorry, I didn't notice that these events are quite new and aren't included in QEMU 4.2. Will it be enough for testing if I provide you with a scratch build that includes the trace events?

Otherwise, I'll try to find another type of verification.
(In reply to Julia Suvorova from comment #13)
...
> > 3. qemu cmd with "-trace enable=qdev_reset" also doesn't work well.
> > # /usr/libexec/qemu-kvm -trace enable=qdev_reset
> > qemu-kvm: -trace enable=qdev_reset: warning: trace event 'qdev_reset' does
> > not exist
> > VNC server running on ::1:5900
>
> These are qemu traces, so the third variant is correct.
>
> Sorry, didn't notice that these events are quite new, and aren't included in
> 4.2 QEMU.
> Will it be enough for testing if I provide you scratch build with trace
> events included?
>
> Otherwise, I'll try to find another type of verification.

Hi Julia,

The scratch build is OK for verifying this bug. However, I have one concern: if we add this new feature to the test plan for regression testing, it seems we cannot test it this way. It would be great if you could find another type of verification.

Thank you very much.

Best regards,
Pei
(In reply to Pei Zhang from comment #12)
> Another update:
>
> With step "3 Check that reset works on dev0", I tried virtio-rng-pci,
> virtio-net-pci vhost and virtio-net-pci vhost-user.
>
> After executing # echo 1 > /sys/bus/pci/devices/(dev0 pci address)/reset:
>
> virtio-rng-pci: qemu, guest and host work well.
> virtio-net-pci vhost: qemu, guest and host work well.
> virtio-net-pci vhost-user: qemu and host work well, but guest hang.
>
> I filed a new bug to track the virtio-net-pci vhost-user issue:
> Bug 1805656 - Guest hang after "echo 1 >
> /sys/bus/pci/devices/$vhost_user_nic_pcie/reset"

Does this happen without multifunction devices involved?
(In reply to Julia Suvorova from comment #15)
...
> > I filed a new bug to track the virtio-net-pci vhost-user issue:
> > Bug 1805656 - Guest hang after "echo 1 >
> > /sys/bus/pci/devices/$vhost_user_nic_pcie/reset"
>
> Does this happen without multifunctional devices involved?

Yes, this issue can be reproduced without multifunction devices. With virtio-net-pci vhost-user it reproduces whether or not multifunction is used.
Thank you very much, Julia, for your instructions and efforts. It works well now.

Verification:

1. Boot qemu with 2 multifunction virtio devices, one with the default x-pcie-flr-init and one with x-pcie-flr-init=off.

-device virtio-rng-pci,id=dev0,multifunction=on,bus=pci.6,addr=0.0 \
-device virtio-rng-pci,id=dev1,x-pcie-flr-init=off,bus=pci.6,addr=0.1 \

2. In the host, trace the pcie_flr_reset event.

# qemu-trace-stap -v run /usr/libexec/qemu-kvm 'pcie_flr_reset'

3. In the guest, trigger a reset of each of these 2 devices. A pcie_flr_reset trace is shown for dev0 and no trace is shown for dev1. This is expected.

(1) With dev0, a pcie_flr_reset trace is shown in qemu-trace-stap.

# echo 1 > /sys/bus/pci/devices/0000\:06\:00.0/reset

13104@1582691208853984170 pcie_flr_reset dev virtio-rng-pci

(2) With dev1, no trace is shown in qemu-trace-stap.

# echo 1 > /sys/bus/pci/devices/0000\:06\:00.1/reset

(No trace)

So this bug has been fixed. Moving to 'VERIFIED'.
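For regression runs, the pass/fail decision in step 3 could be automated by counting pcie_flr_reset lines in the captured qemu-trace-stap output. This is only a sketch, assuming the output was redirected to a file and that each event line has the event name as its second whitespace-separated field, as in the sample line above. The function name count_flr_events is hypothetical.

```shell
# Sketch: count pcie_flr_reset events in captured qemu-trace-stap output
# read on stdin. count_flr_events is a hypothetical helper name.
count_flr_events() {
    # Field 1 is "pid@timestamp" and field 2 is the event name;
    # "n + 0" makes awk print 0 when no event was seen.
    awk '$2 == "pcie_flr_reset" { n++ } END { print n + 0 }'
}
```

With a capture taken while resetting only dev0 the count would be expected to be at least 1, and with a capture taken while resetting only dev1 it would be expected to be 0.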
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2017