Bug 1791267 - Guest vhost-user ports stop receiving MoonGen packets after migration
Summary: Guest vhost-user ports stop receiving MoonGen packets after migration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch
Version: FDP 20.A
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Maxime Coquelin
QA Contact: Pei Zhang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-01-15 11:56 UTC by Pei Zhang
Modified: 2020-04-14 14:31 UTC (History)

Fixed In Version: openvswitch-2.9.0-126.el7fdn
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-14 14:31:27 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2020:1456  0        None      None    None     2020-04-14 14:31:30 UTC

Description Pei Zhang 2020-01-15 11:56:07 UTC
Description of problem:
vhost-user ports stop receiving packets after migration.

Version-Release number of selected component (if applicable):
openvswitch-2.9.0-124.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-15.el7fdp.noarch
3.10.0-1122.el7.x86_64
qemu-kvm-rhev-2.12.0-42.el7.x86_64
libvirt-4.5.0-31.el7.x86_64
dpdk-18.11.2-1.el7_6.x86_64

How reproducible:
100%

Steps to Reproduce:

1. Boot OVS with dpdkvhostuserclient ports on both the source and destination hosts; see [1].

# ovs-vsctl show
83dd9e54-2fb4-4eed-be58-cab15e4d4c3e
    Bridge "ovsbr1"
        Port "ovsbr1"
            Interface "ovsbr1"
                type: internal
        Port "vhost-user1"
            Interface "vhost-user1"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser1.sock"}
        Port "dpdk1"
            Interface "dpdk1"
                type: dpdk
                options: {dpdk-devargs="0000:5e:00.1", n_rxq="1"}
    Bridge "ovsbr0"
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
                options: {dpdk-devargs="0000:5e:00.0", n_rxq="1"}
        Port "vhost-user0"
            Interface "vhost-user0"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser0.sock"}

2. Boot the VM on the source host.

3. Start testpmd in the guest.

# /usr/bin/testpmd \
-l 1,2,3,4,5 \
-n 4 \
-d /usr/lib64/librte_pmd_virtio.so \
-w 0000:06:00.0 -w 0000:07:00.0 \
--iova-mode pa \
-- \
--nb-cores=4 \
-i \
--disable-rss \
--rxd=512 --txd=512 \
--rxq=1 --txq=1


4. Start MoonGen on another host.

# ./build/MoonGen /home/nfv-virt-rt-kvm/tests/utils/rfc1242.lua 0 1 64 3000000 4.1

5. Check testpmd in the guest; it receives packets normally.

testpmd> show port stats all

  ######################## NIC statistics for port 0  ########################
  RX-packets: 10094535   RX-missed: 0          RX-bytes:  605672100
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 10093278   TX-errors: 0          TX-bytes:  605596680

  Throughput (since last show)
  Rx-pps:       995333
  Tx-pps:       995336
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 10094564   RX-missed: 0          RX-bytes:  605673840
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 10093305   TX-errors: 0          TX-bytes:  605598300

  Throughput (since last show)
  Rx-pps:       995324
  Tx-pps:       995321
  ############################################################################


6. Migrate the guest from the source host to the destination host.

# /bin/virsh migrate --verbose --persistent --live rhel7.8 qemu+ssh://10.73.72.196/system

7. Check testpmd in the guest: it has stopped receiving packets, and both Rx-pps and Tx-pps are 0.

testpmd> show port stats all

  ######################## NIC statistics for port 0  ########################
  RX-packets: 124726123  RX-missed: 0          RX-bytes:  7483569564
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 124724797  TX-errors: 0          TX-bytes:  7483490008

  Throughput (since last show)
  Rx-pps:            0
  Tx-pps:            0
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 124726111  RX-missed: 0          RX-bytes:  7483568844
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 124724482  TX-errors: 0          TX-bytes:  7483471108

  Throughput (since last show)
  Rx-pps:            0
  Tx-pps:            0
  ############################################################################
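
The check in step 7 can be scripted once the stats text has been captured. The following is only a sketch: the function name stalled_ports is hypothetical, and it assumes the output of "show port stats all" keeps the layout shown above and has been saved to a file or piped in.

```shell
#!/bin/bash
# Hypothetical helper: read testpmd "show port stats all" text on stdin and
# print the number of every port whose Rx-pps reading is 0.
stalled_ports() {
    awk '
        /NIC statistics for port/ {
            # The port number is the field right after the word "port".
            for (i = 1; i < NF; i++) if ($i == "port") port = $(i + 1)
        }
        /Rx-pps:/ {
            # $2 is the packets-per-second value on the Rx-pps line.
            if ($2 + 0 == 0) print port
        }
    '
}
```

For example, "stalled_ports < stats.txt" prints one port number per line for each stalled port, and prints nothing when every port still reports a non-zero Rx-pps.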



Actual results:
vhost-user ports stop receiving packets after migration.

Expected results:
vhost-user should keep receiving packets after migration.

Additional info:
1. This is a regression: openvswitch-2.9.0-122.el7fdp.x86_64 works correctly.
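
To tell quickly which side of the regression a host is on, the package release number can be compared with the builds named in this report. This is a minimal sketch: ovs_release and build_status are hypothetical helper names, and the mapping covers only releases 122, 124 and 126 because those are the only builds this report mentions.

```shell
#!/bin/bash
# Hypothetical helper: extract the release number from an openvswitch
# package name such as "openvswitch-2.9.0-124.el7fdp.x86_64".
ovs_release() {
    local nvr=$1
    local rest=${nvr#openvswitch-2.9.0-}   # drop the name and version prefix
    echo "${rest%%.*}"                      # keep the text before the first dot
}

# Hypothetical helper: map a package name to what this report says about it.
build_status() {
    case "$(ovs_release "$1")" in
        122) echo "works" ;;                      # last known good build
        124) echo "affected" ;;                   # build the bug was reported on
        126) echo "fixed" ;;                      # "Fixed In Version" build
        *)   echo "not covered by this report" ;; # no data in this bug
    esac
}
```

On a host, 'build_status "$(rpm -q openvswitch)"' applies the same lookup to the installed package.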


Reference:
[1]
# cat boot_ovs_client.sh 
#!/bin/bash

set -e

echo "killing old ovs process"
pkill -f ovs-vswitchd || true
sleep 5
pkill -f ovsdb-server || true

echo "probing ovs kernel module"
modprobe -r openvswitch || true
modprobe openvswitch

echo "clean env"
DB_FILE=/etc/openvswitch/conf.db
rm -rf /var/run/openvswitch
mkdir /var/run/openvswitch
rm -f $DB_FILE

echo "init ovs db and boot db server"
export DB_SOCK=/var/run/openvswitch/db.sock
ovsdb-tool create /etc/openvswitch/conf.db /usr/share/openvswitch/vswitch.ovsschema
ovsdb-server --remote=punix:$DB_SOCK --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile --detach --log-file
ovs-vsctl --no-wait init

echo "start ovs vswitch daemon"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask="0x1"
ovs-vsctl --no-wait set Open_vSwitch . other_config:vhost-iommu-support=true
ovs-vswitchd unix:$DB_SOCK --pidfile --detach --log-file=/var/log/openvswitch/ovs-vswitchd.log

echo "creating bridge and ports"

ovs-vsctl --if-exists del-br ovsbr0
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:5e:00.0 
ovs-vsctl add-port ovsbr0 vhost-user0 -- set Interface vhost-user0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser0.sock
ovs-ofctl del-flows ovsbr0
ovs-ofctl add-flow ovsbr0 "in_port=1,idle_timeout=0 actions=output:2"
ovs-ofctl add-flow ovsbr0 "in_port=2,idle_timeout=0 actions=output:1"

ovs-vsctl --if-exists del-br ovsbr1
ovs-vsctl add-br ovsbr1 -- set bridge ovsbr1 datapath_type=netdev
ovs-vsctl add-port ovsbr1 dpdk1 -- set Interface dpdk1 type=dpdk options:dpdk-devargs=0000:5e:00.1
ovs-vsctl add-port ovsbr1 vhost-user1 -- set Interface vhost-user1 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser1.sock
ovs-ofctl del-flows ovsbr1
ovs-ofctl add-flow ovsbr1 "in_port=1,idle_timeout=0 actions=output:2"
ovs-ofctl add-flow ovsbr1 "in_port=2,idle_timeout=0 actions=output:1"

ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x1
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x1554
ovs-vsctl set Interface dpdk0 options:n_rxq=1
ovs-vsctl set Interface dpdk1 options:n_rxq=1

echo "all done"
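
The hex masks in the script above are easier to review when expanded into CPU indexes. The sketch below does only that bit arithmetic; mask_to_cpus is a hypothetical helper name and is not part of OVS or of the original script.

```shell
#!/bin/bash
# Hypothetical helper: print the comma-separated CPU indexes whose bits are
# set in a hex mask, e.g. the pmd-cpu-mask or dpdk-lcore-mask values.
mask_to_cpus() {
    local mask=$(( $1 )) cpu=0 cpus=""
    while (( mask > 0 )); do
        if (( mask & 1 )); then
            # Append the index, with a comma once the list is non-empty.
            cpus+="${cpus:+,}${cpu}"
        fi
        mask=$(( mask >> 1 ))
        cpu=$(( cpu + 1 ))
    done
    echo "$cpus"
}
```

With the values from the script, "mask_to_cpus 0x1554" prints 2,4,6,8,10,12 (the PMD cores) and "mask_to_cpus 0x1" prints 0 (the lcore).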

Comment 1 Eelco Chaudron 2020-01-16 10:49:31 UTC
The only related changes that went in are the following:

+Patch1270: 0001-vhost-add-number-of-fds-to-vhost-user-messages.patch
+Patch1271: 0002-vhost-fix-possible-denial-of-service-on-SET_VRING_NU.patch
+Patch1272: 0003-vhost-fix-possible-denial-of-service-by-leaking-FDs.patch

As I'm not too familiar with the vhost code, it makes more sense for Maxime to take a look.
Maybe Maxime already has an idea of what could cause this. Maxime?

Comment 2 Maxime Coquelin 2020-01-16 10:53:50 UTC
Hi Pei,

Could you please share the ovs-vswitchd logs?

Comment 9 Pei Zhang 2020-02-19 12:34:16 UTC
Verified with openvswitch-2.9.0-126.el7fdn.x86_64:

DPDK testpmd in the guest keeps receiving packets after migration.

Other versions used in testing:
3.10.0-1127.el7.x86_64
qemu-kvm-rhev-2.12.0-44.el7.x86_64
tuned-2.11.0-8.el7.noarch
libvirt-4.5.0-33.el7.x86_64
dpdk-18.11.2-1.el7.x86_64
openvswitch-2.9.0-126.el7fdn.x86_64

So this bug has been fixed. Moving to 'VERIFIED'.

Comment 13 errata-xmlrpc 2020-04-14 14:31:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1456

