This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at the Red Hat Issue Tracker.
Bug 1947422 - [RHEL9]packed=on: guest fails to recover receiving packets after vhost-user reconnect
Summary: [RHEL9]packed=on: guest fails to recover receiving packets after vhost-user reconnect
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: beta
Target Release: ---
Assignee: Eugenio Pérez Martín
QA Contact: Yanghang Liu
Docs Contact: Daniel Vozenilek
URL:
Whiteboard:
Depends On: 1792683
Blocks: 1897025
 
Reported: 2021-04-08 13:02 UTC by Pei Zhang
Modified: 2023-06-08 08:44 UTC
CC: 10 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
.Restarting the OVS service on a host might block network connectivity on its running VMs
When the Open vSwitch (OVS) service restarts or crashes on a host, virtual machines (VMs) that are running on this host cannot recover the state of the networking device. As a consequence, VMs might be completely unable to receive packets. This problem only affects systems that use the packed virtqueue format in their `virtio` networking stack. To work around this problem, use the `packed=off` parameter in the `virtio` networking device definition to disable packed virtqueue. With packed virtqueue disabled, the state of the networking device can, in some situations, be recovered from RAM.
Clone Of: 1792683
Environment:
Last Closed: 2023-04-08 07:28:03 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
Red Hat Issue Tracker: RHEL-333 (Private: 0, Priority: None, Status: None, Summary: None), last updated 2023-06-08 08:44:26 UTC

Description Pei Zhang 2021-04-08 13:02:04 UTC
Created attachment 1770258 [details]
Guest XML

+++ This bug was initially created as a clone of Bug #1792683 +++

Description of problem:
Boot a guest with vhost-user interfaces using packed=on, then reconnect vhost-user by restarting dpdk's testpmd on the host. After the reconnect, testpmd in the guest fails to recover receiving packets.

Version-Release number of selected component (if applicable):
5.12.0-0.rc5.180.el9.x86_64
qemu-kvm-5.2.0-11.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot dpdk's testpmd as vhost-user client

# cat pvp.sh
/usr/bin/dpdk-testpmd \
	-l 2,4,6,8,10 \
	--socket-mem 1024,1024 \
	-n 4 \
	--vdev net_vhost0,iface=/tmp/vhost-user1,queues=1,client=1,iommu-support=1 \
	--vdev net_vhost1,iface=/tmp/vhost-user2,queues=1,client=1,iommu-support=1 \
	--block 0000:3b:00.0 --block 0000:3b:00.1 \
	-d /usr/lib64/librte_net_vhost.so \
	-- \
	--portmask=f \
	-i \
	--rxd=512 --txd=512 \
	--rxq=1 --txq=1 \
	--nb-cores=4 \
	--forward-mode=io

# sh pvp.sh

testpmd> set portlist 0,2,1,3
testpmd> 
Port 0: link state change event

Port 1: link state change event

testpmd> 
testpmd> start 


2. Boot guest with vhost-user packed=on. Full XML is attached.

<domain type='kvm' id='1' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>rhel9.0</name>
...
  <devices>
...
    <interface type='bridge'>
      <mac address='88:66:da:5f:dd:01'/>
      <source bridge='switch'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='88:66:da:5f:dd:12'/>
      <source type='unix' path='/tmp/vhost-user1' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' rx_queue_size='1024' iommu='on' ats='on'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='88:66:da:5f:dd:13'/>
      <source type='unix' path='/tmp/vhost-user2' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' rx_queue_size='1024' iommu='on' ats='on'/>
      <alias name='net2'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>
...
  </devices>
...
  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.net1.packed=on'/>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.net2.packed=on'/>
  </qemu:commandline>
</domain>
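
Note: the qemu:commandline passthrough above is one way to enable packed virtqueues. Newer libvirt also exposes this directly as a packed attribute on the interface <driver> element (the virt-install command in comment 7 uses the equivalent driver.packed=on option). A hedged sketch of that native form for net1, reusing the driver settings from the XML above:

    <driver name='vhost' rx_queue_size='1024' iommu='on' ats='on' packed='on'/>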



3. Start testpmd in the guest and start MoonGen on another host; the guest receives packets.


testpmd> show port stats all 

  ######################## NIC statistics for port 0  ########################
  RX-packets: 2742484    RX-missed: 0          RX-bytes:  164549040
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 2738731    TX-errors: 0          TX-bytes:  164323860

  Throughput (since last show)
  Rx-pps:        70846          Rx-bps:     34006104
  Tx-pps:        70772          Tx-bps:     33971032
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 2740010    RX-missed: 0          RX-bytes:  164400600
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 2741234    TX-errors: 0          TX-bytes:  164474040

  Throughput (since last show)
  Rx-pps:        70771          Rx-bps:     33970192
  Tx-pps:        70853          Tx-bps:     34009640
  ############################################################################


4. Reconnect vhost-user by restarting dpdk's testpmd on the host

# pkill testpmd

# sh pvp.sh

5. Check testpmd in the guest; packet reception does not recover.

testpmd> show port stats all 

  ######################## NIC statistics for port 0  ########################
  RX-packets: 3034162    RX-missed: 0          RX-bytes:  182049720
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 3030089    TX-errors: 0          TX-bytes:  181805340

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 3031364    RX-missed: 0          RX-bytes:  181881840
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 3032908    TX-errors: 0          TX-bytes:  181974480

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################
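
(Optional sanity check, not part of the original report: it can be confirmed on the host that the restarted dpdk-testpmd client did re-establish both vhost-user connections to QEMU, i.e. the reconnect itself succeeded and only packet reception in the guest is stuck. Both /tmp/vhost-user1 and /tmp/vhost-user2, the socket paths from step 1, should show up as connected unix sockets belonging to the qemu process and the new dpdk-testpmd process.)

# ss -xp | grep -E 'vhost-user[12]'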


Actual results:
DPDK packet reception does not recover after the vhost-user reconnect.

Expected results:
DPDK packet reception should recover after the vhost-user reconnect.


Additional info:
1. Without packed=on, DPDK packet reception recovers without problems after the reconnect (see the configuration sketch below).
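
For illustration only, a minimal sketch of how one of the vhost-user interfaces could be defined with packed virtqueue explicitly disabled, assuming a libvirt version that supports the packed attribute on the interface <driver> element (the attached XML instead enables packed via the qemu:commandline passthrough shown in step 2); all other values are copied from the attachment:

<interface type='vhostuser'>
  <mac address='88:66:da:5f:dd:12'/>
  <source type='unix' path='/tmp/vhost-user1' mode='server'/>
  <model type='virtio'/>
  <!-- packed='off' keeps the device on split virtqueues, which is the
       workaround described in the Doc Text field above -->
  <driver name='vhost' rx_queue_size='1024' iommu='on' ats='on' packed='off'/>
</interface>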

Comment 5 Laurent Vivier 2022-07-19 15:51:20 UTC
Eugenio,

if you don't plan to fix this issue in 9.1.0, could you set the ITR to 9.2.0 or '---' (backlog).

Thanks

Comment 6 Eugenio Pérez Martín 2022-07-20 05:29:25 UTC
(In reply to Laurent Vivier from comment #5)
> Eugenio,
> 
> if you don't plan to fix this issue in 9.1.0, could you set the ITR to 9.2.0
> or '---' (backlog).
> 
> Thanks

Moving to ITR 9.2.0, as there are still a few features needed to make this work.

Thanks!

Comment 7 Yanghang Liu 2022-07-22 13:10:00 UTC
This problem can be reproduced in the following test env:
5.14.0-133.el9.x86_64
qemu-kvm-7.0.0-9.el9.x86_64
dpdk-21.11-1.el9_0.x86_64
libvirt-8.5.0-2.el9.x86_64



The detailed test steps:
(1) set up the first host's test env
grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1G" --update-kernel=`grubby --default-kernel` 
echo "isolated_cores=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,31,29,27,25,23,21,19,17,15,13,11"  >> /etc/tuned/cpu-partitioning-variables.conf  
tuned-adm profile cpu-partitioning
reboot
echo 20 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
echo 20 > /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
modprobe vfio
modprobe vfio-pci
dpdk-devbind.py --bind=vfio-pci 0000:5e:00.0
dpdk-devbind.py --bind=vfio-pci 0000:5e:00.1


(3) start a testpmd on the first host
# dpdk-testpmd -l 2,4,6,8,10 --socket-mem 1024,1024 -n 4  --vdev 'net_vhost0,iface=/tmp/vhost-user1,queues=1,client=1,iommu-support=1' --vdev 'net_vhost1,iface=/tmp/vhost-user2,queues=1,client=1,iommu-support=1'  -b 0000:3b:00.0 -b 0000:3b:00.1  -d /usr/lib64/librte_net_vhost.so  -- --portmask=f -i --rxd=512 --txd=512 --rxq=1 --txq=1 --nb-cores=4 --forward-mode=io

testpmd> set portlist 0,2,1,3
testpmd> start 


(4) start a vm with two packed=on vhostuser interfaces [1][2]

# virt-install  --graphics type=vnc,listen=0.0.0.0 --name=rhel9.1 --machine q35 --vcpu=6,vcpu.placement="static" --memory=8192,hugepages=yes --memorybacking hugepages=yes,size=1,unit=G,locked=yes,access.mode=shared --cpu host,numa.cell0.memory=8388608,numa.cell0.unit='KiB',numa.cell0.id="0",numa.cell0.cpus="0-5",numa.cell0.memAccess="shared" --numatune memory.mode="strict",memory.nodeset="0",memnode.cellid="0",memnode.mode="strict",memnode.nodeset="0" --features pmu.state="off",ioapic.driver="qemu" --memballoon virtio,driver.iommu=on,driver.ats=on --disk path=/home/images_nfv-virt-rt-kvm/rhel9.1.qcow2,bus=virtio,cache=none,format=qcow2,io=threads,size=20,driver.iommu=on,driver.ats=on   --network bridge=switch,model=virtio,mac=88:66:da:5f:dd:11,driver.iommu=on,driver.ats=on  --osinfo detect=on,require=off --check all=off --iommu model='intel',driver.intremap='on',driver.caching_mode='on',driver.iotlb='on' --cputune vcpupin0.vcpu=0,vcpupin0.cpuset=30,vcpupin1.vcpu=1,vcpupin1.cpuset=28,vcpupin2.vcpu=2,vcpupin2.cpuset=26,vcpupin3.vcpu=3,vcpupin3.cpuset=24,vcpupin4.vcpu=4,vcpupin4.cpuset=22,vcpupin5.vcpu=5,vcpupin5.cpuset=20,emulatorpin.cpuset="25,27,29,31" --network type=vhostuser,mac.address=18:66:da:5f:dd:22,model=virtio,source.type=unix,source.path=/tmp/vhost-user1,source.mode=server,driver.name=vhost,driver.iommu=on,driver.ats=on,driver.packed=on --network type=vhostuser,mac.address=18:66:da:5f:dd:23,model=virtio,source.type=unix,source.path=/tmp/vhost-user2,source.mode=server,driver.name=vhost,driver.iommu=on,driver.ats=on,driver.packed=on --import --noautoconsole --noreboot 


[1] --network type=vhostuser,mac.address=18:66:da:5f:dd:22,model=virtio,source.type=unix,source.path=/tmp/vhost-user1,source.mode=server,driver.name=vhost,driver.iommu=on,driver.ats=on,driver.packed=on 
[2] --network type=vhostuser,mac.address=18:66:da:5f:dd:23,model=virtio,source.type=unix,source.path=/tmp/vhost-user2,source.mode=server,driver.name=vhost,driver.iommu=on,driver.ats=on,driver.packed=on


(5) set up the VM's kernel options
grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1G" --update-kernel=`grubby --default-kernel` 
echo "isolated_cores=1,2,3,4,5"  >> /etc/tuned/cpu-partitioning-variables.conf  
tuned-adm profile cpu-partitioning
reboot
echo 2 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
modprobe vfio
modprobe vfio-pci
dpdk-devbind.py --bind=vfio-pci 0000:02:00.0
dpdk-devbind.py --bind=vfio-pci 0000:03:00.0

(6) start a testpmd in the vm

dpdk-testpmd -l 1,2,3 -n 4  -d /usr/lib64/librte_net_virtio.so  -- --nb-cores=2 -i --disable-rss --rxd=512 --txd=512 --rxq=1 --txq=1 
testpmd> start

(7) set up the second host: download the MoonGen tool into the /home/ directory, then run MoonGen
grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1G" --update-kernel=`grubby --default-kernel` 
echo "isolated_cores=2,4,6,8,10,12,14,16,18"  >> /etc/tuned/cpu-partitioning-variables.conf  
tuned-adm profile cpu-partitioning
reboot
tuned-adm profile cpu-partitioning
dpdk-devbind.py --bind=vfio-pci 0000:82:00.0
dpdk-devbind.py --bind=vfio-pci 0000:82:00.1
echo 10 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
echo 10 > /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
cd /home/MoonGen/
./build/MoonGen examples/opnfv-vsperf.lua > /tmp/throughput.log

(8) check the Traffic Statistics in the vm

testpmd> show port stats all 

  ######################## NIC statistics for port 0  ########################
  RX-packets: 8046105833 RX-missed: 0          RX-bytes:  482766349980
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 5578939316 TX-errors: 0          TX-bytes:  334736358960

  Throughput (since last show)
  Rx-pps:      8274489          Rx-bps:   3971755176
  Tx-pps:      7096092          Tx-bps:   3406124624
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 5578939316 RX-missed: 0          RX-bytes:  334736358960
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 8046105821 TX-errors: 0          TX-bytes:  482766349260

  Throughput (since last show)
  Rx-pps:      7096092          Rx-bps:   3406124624
  Tx-pps:      8274471          Tx-bps:   3971746368
  ############################################################################


(9) restart the first host's dpdk-testpmd
# pkill dpdk-testpmd
# dpdk-testpmd -l 2,4,6,8,10 --socket-mem 1024,1024 -n 4  --vdev 'net_vhost0,iface=/tmp/vhost-user1,queues=1,client=1,iommu-support=1' --vdev 'net_vhost1,iface=/tmp/vhost-user2,queues=1,client=1,iommu-support=1'  -b 0000:3b:00.0 -b 0000:3b:00.1  -d /usr/lib64/librte_net_vhost.so  -- --portmask=f -i --rxd=512 --txd=512 --rxq=1 --txq=1 --nb-cores=4 --forward-mode=io
testpmd> set portlist 0,2,1,3
testpmd> start 


(10) re-run the Moongen
cd /home/MoonGen/
./build/MoonGen examples/opnfv-vsperf.lua > /tmp/throughput.log


(11) re-check the Traffic Statistics in the vm
testpmd> show port stats all 

  ######################## NIC statistics for port 0  ########################
  RX-packets: 8046105833 RX-missed: 0          RX-bytes:  482766349980
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 5578939316 TX-errors: 0          TX-bytes:  334736358960

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 5578939316 RX-missed: 0          RX-bytes:  334736358960
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 8046105821 TX-errors: 0          TX-bytes:  482766349260

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################

Comment 8 Yanghang Liu 2022-07-22 15:44:34 UTC
This bug cannot be reproduced when the vhost-user interfaces do not have the packed=on option.



The traffic statistics in the VM after the vhost-user interfaces reconnect are as follows:

testpmd> show port stats all 

  ######################## NIC statistics for port 0  ########################
  RX-packets: 17545208821 RX-missed: 0          RX-bytes:  1052712529260
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 11816831315 TX-errors: 0          TX-bytes:  709009878900

  Throughput (since last show)
  Rx-pps:      1157235          Rx-bps:    555472864
  Tx-pps:       860093          Tx-bps:    412844640
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 11816831315 RX-missed: 0          RX-bytes:  709009878900
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 17545208629 TX-errors: 0          TX-bytes:  1052712517740

  Throughput (since last show)
  Rx-pps:       860092          Rx-bps:    412844632
  Tx-pps:      1157234          Tx-bps:    555472488
  ############################################################################

Comment 14 RHEL Program Management 2023-04-08 07:28:03 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

