Bug 1493358 - The nic inside guest changed to down when change mtu many times with ping traffic [NEEDINFO]
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: openvswitch
Version: 7.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Eelco Chaudron
QA Contact: liting
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-09-20 03:01 UTC by liting
Modified: 2017-12-14 14:09 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-14 14:09:31 UTC
Target Upstream Version:
echaudro: needinfo? (tli)



Description liting 2017-09-20 03:01:16 UTC
Description of problem:
The NIC inside the guest goes down when the MTU is changed many times while ping traffic is running.

Version-Release number of selected component (if applicable):
openvswitch-2.7.2-8.git20170719.el7fdp.x86_64.rpm 

How reproducible:


Steps to Reproduce:
The test used two directly connected machines: "cisco-c220m3-01.rhts.eng.pek2.redhat.com" and "dell-per730-02.rhts.eng.pek2.redhat.com".
1. Configured OVS on the Cisco machine as follows:
/usr/bin/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
/usr/bin/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x2
/usr/bin/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=1024,0
/usr/bin/ovs-vsctl --timeout 10 set Open_vSwitch . other_config:pmd-cpu-mask=1
ovs-vsctl --timeout 10 set Open_vSwitch . other_config:pmd-cpu-mask=30
systemctl restart openvswitch
/usr/bin/ovs-vsctl --timeout 10 add-br br0 -- set bridge br0 datapath_type=netdev
/usr/bin/ovs-vsctl --timeout 10 add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:0b:00.0 
/usr/bin/ovs-vsctl --timeout 10 add-port br0 dpdkvhostuser0 -- set Interface dpdkvhostuser0 type=dpdkvhostuser 
sudo /usr/bin/ovs-ofctl -O OpenFlow13 --timeout 10 del-flows br0 
/usr/bin/ovs-ofctl -O OpenFlow13 --timeout 10 add-flow br0 idle_timeout=0,in_port=1,action=output:2
/usr/bin/ovs-ofctl -O OpenFlow13 --timeout 10 add-flow br0 idle_timeout=0,in_port=2,action=output:1
chmod 777 /var/run/openvswitch/dpdkvhostuser0
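
For reference, the CPU masks in step 1 can be decoded into core IDs. This is a minimal sketch, assuming the masks are read as hex strings with or without a leading 0x, as the Open vSwitch documentation describes. mask_to_cores is a hypothetical helper and not part of Open vSwitch. Note that the second pmd-cpu-mask setting (30) replaces the first (1).

```shell
#!/bin/sh
# Decode a hex CPU mask (with or without a leading 0x) into a list of core IDs.
# mask_to_cores is a hypothetical helper for illustration only.
# The body runs in a subshell so its variables do not leak into the caller.
# Limited to masks that fit in the shell's integer type.
mask_to_cores() (
    mask=$(( 0x${1#0x} ))
    core=0
    cores=""
    while [ "$mask" -gt 0 ]; do
        # Bit N set in the mask means core N is selected.
        if [ $(( mask & 1 )) -eq 1 ]; then
            cores="${cores:+$cores }$core"
        fi
        mask=$(( mask >> 1 ))
        core=$(( core + 1 ))
    done
    echo "$cores"
)

mask_to_cores 30     # pmd-cpu-mask from step 1
mask_to_cores 0x2    # dpdk-lcore-mask from step 1
```

Under that assumption, pmd-cpu-mask=30 selects cores 4 and 5, and dpdk-lcore-mask=0x2 selects core 1.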

A guest runs on the Cisco machine. Configured the IP address of the guest's eth1 as follows:
ip addr add 192.168.10.1/24 dev eth1

2. Configured the IP address on the dell02 machine:
ip addr add 192.168.10.2/24 dev p3p1
and pinged the guest on the Cisco machine as follows:
ping -n -i 0.001 192.168.10.1

3. Changed the MTU many times on the Cisco machine, for example with the following commands:
ovs-vsctl set int dpdkvhostuser0 mtu_request=1900
ovs-vsctl set int dpdkvhostuser0 mtu_request=2000
ovs-vsctl set int dpdkvhostuser0 mtu_request=2200
ovs-vsctl set int dpdkvhostuser0 mtu_request=2300
ovs-vsctl set int dpdkvhostuser0 mtu_request=9000
ovs-vsctl set int dpdkvhostuser0 mtu_request=2000
ovs-vsctl set int dpdkvhostuser0 mtu_request=1500


Actual results:
After the MTU was changed many times with ping traffic running, eth1 inside the guest on the Cisco machine went down as shown below, and the ping from dell02 to "192.168.10.1" failed.
eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
    link/ether 56:48:4f:53:54:01 brd ff:ff:ff:ff:ff:ff
    inet 192.168.10.1/24 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5448:4fff:fe53:5401/64 scope link 
       valid_lft forever preferred_lft forever
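
Note that the flags above show UP together with NO-CARRIER, i.e. the interface is administratively up but has no link from its backend. A small check for this state could look like the sketch below. has_no_carrier is a hypothetical helper; it only inspects the flags in a line of "ip link" output.

```shell
#!/bin/sh
# Return 0 if the given "ip link" output line lists NO-CARRIER in its flags.
# has_no_carrier is a hypothetical helper for illustration only.
has_no_carrier() {
    case "$1" in
        *"<"*NO-CARRIER*">"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample line taken from the output above.
line='eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000'
if has_no_carrier "$line"; then
    echo "eth1 has lost carrier"
fi
```

Inside the guest it could be called as: has_no_carrier "$(ip -o link show dev eth1)"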

There was no segfault in /var/log/messages.
The OVS service appeared to be working normally, as shown below.
[root@cisco-c220m3-01 openvswitch]# ovs-vsctl show
8ab35f85-52e0-498d-a1fd-3dd3408ba1ce
    Bridge "br0"
        Port "dpdkvhostuser0"
            Interface "dpdkvhostuser0"
                type: dpdkvhostuser
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
                options: {dpdk-devargs="0000:0b:00.0"}
        Port "br0"
            Interface "br0"
                type: internal
    ovs_version: "2.7.2"

Running "ip link set eth1 up" did not bring the interface back up; the guest had to be restarted before eth1 came up again.

I tested this with both an ixgbe NIC and an i40e NIC; both showed the issue.

Expected results:
eth1 should stay up when the MTU is changed many times with ping traffic running.

Additional info:

Comment 2 Eelco Chaudron 2017-10-27 11:40:33 UTC
I tried to replicate this issue on two of my setups with the configuration below, but I was not able to reproduce it. Can you make your setup available so I can look at it? Also, do you have any vswitchd/system logs, and which versions of QEMU etc. are you running? Openvswitch may be restarting, which would cause traffic to stop.
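
The requested information could be gathered on the host with something like the sketch below. collect_info and the output file names are made up for this example, the vswitchd log path is the usual default and may differ on the affected host, and the service name is taken from the "systemctl restart openvswitch" step in the description.

```shell
#!/bin/sh
# Collect version and log information into a directory.
# collect_info and the output file names are hypothetical, chosen for this sketch.
collect_info() {
    out=$1
    mkdir -p "$out" || return 1
    # Package versions, using the same query as for the two systems below.
    rpm -q openvswitch kernel qemu-kvm-rhev libvirt > "$out/versions.txt" 2>&1
    # Current switch layout.
    ovs-vsctl --timeout 10 show > "$out/ovs-vsctl-show.txt" 2>&1
    # vswitchd log; assumed default path, adjust if it differs.
    log=/var/log/openvswitch/ovs-vswitchd.log
    if [ -r "$log" ]; then
        cp "$log" "$out/"
    else
        echo "$log not readable" > "$out/ovs-vswitchd.log.missing"
    fi
    # Time the service last became active, to spot a restart during the test.
    systemctl show openvswitch -p ActiveEnterTimestamp > "$out/service-start.txt" 2>&1
    return 0
}

# Usage:
# collect_info /tmp/bz1493358
```

Running it once before and once after the MTU changes makes it possible to compare the service start time and see whether openvswitch restarted in between.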


INFO ON THE TWO SYSTEMS I TRIED IT ON:
======================================


[wsfd-netdev67:~]$ rpm -q openvswitch kernel qemu-kvm-rhev libvirt
openvswitch-2.7.2-8.git20170719.el7fdp.x86_64
kernel-3.10.0-693.el7.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.8.x86_64
libvirt-3.2.0-14.el7_4.3.x86_64


[wsfd-netdev67:~]$ lshw -c network -businfo
Bus info          Device      Class          Description
========================================================
pci@0000:03:00.0  p6p1        network        MT27520 Family [ConnectX-3 Pro]
pci@0000:01:00.0              network        82599ES 10-Gigabit SFI/SFP+ Network Connection
pci@0000:01:00.1              network        82599ES 10-Gigabit SFI/SFP+ Network Connection



[wsfd-netdev64:~]$ rpm -q openvswitch kernel qemu-kvm-rhev libvirt
openvswitch-2.7.2-8.git20170719.el7fdp.x86_64
kernel-3.10.0-680.el7.gre_test_branch.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.21.x86_64
libvirt-3.2.0-9.el7.x86_64

[wsfd-netdev64:~]$ lshw -c network -businfo
Bus info          Device       Class          Description
=========================================================
pci@0000:05:00.0               network        Ethernet Controller XL710 for 40GbE QSFP+
pci@0000:05:00.1               network        Ethernet Controller XL710 for 40GbE QSFP+



for i in {1..100}
do
    echo "========== RUN $i =========="
    ovs-vsctl set int vhost0 mtu_request=1900
    sleep 1
    ovs-vsctl set int vhost0 mtu_request=2000
    sleep 1
    ovs-vsctl set int vhost0 mtu_request=2200
    sleep 1
    ovs-vsctl set int vhost0 mtu_request=2300
    sleep 1
    ovs-vsctl set int vhost0 mtu_request=9000
    sleep 1
    ovs-vsctl set int vhost0 mtu_request=2000
    sleep 1
    ovs-vsctl set int vhost0 mtu_request=1500
    sleep 5
done

Comment 3 Eelco Chaudron 2017-12-14 14:09:31 UTC
This BZ has been waiting on needinfo/setup access for over a month. Closing the BZ for now; please re-open it when the setup and additional info are ready.

