Bug 1663373 - [i40e] ping fail for DPDK over SR-IOV VF after ethtool changing vlan offload setting on PF
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch
Version: FDP 18.12
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Target Release: FDP 19.03
Assignee: Eelco Chaudron
QA Contact: Jiying Qiu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-01-04 06:11 UTC by qding
Modified: 2019-04-19 08:14 UTC (History)

Fixed In Version: 3.10.0-1034.el7.x86_64
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-19 08:14:44 UTC
Target Upstream Version:
Embargoed:


Attachments:
the log (17.78 KB, text/plain)
2019-01-04 06:11 UTC, qding
reproducer (958 bytes, text/plain)
2019-04-12 03:09 UTC, qding

Description qding 2019-01-04 06:11:59 UTC
Created attachment 1518296 [details]
the log

Description of problem:

The issue was found while verifying bz1637893.

Two hosts are connected back-to-back.

On dell-per730-51.rhts.eng.pek2.redhat.com:

ovs-vsctl del-br br0
systemctl stop openvswitch
ip link set p4p1 vf 0 vlan 0
driverctl unset-override 0000:82:02.0
driverctl unset-override 0000:82:02.1
modprobe -r i40e
modprobe  i40e

echo 2 > /sys/bus/pci/devices/0000:82:00.0/sriov_numvfs
ip link show
ip link set p4p1 vf 0 mac 00:00:00:00:00:01 spoofchk off
ip link show p4p1
driverctl set-override 0000:82:02.0 vfio-pci
driverctl set-override 0000:82:02.1 vfio-pci
dpdk-devbind -s
ip link show p4p1
systemctl start openvswitch
ovs-vsctl get Open_vSwitch . other_config
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:82:02.0 ofport_request=10
ovs-vsctl show
ifconfig br0 up
ip add add 192.168.124.1/24 dev br0
ping 192.168.124.2
ip link set p4p1 vf 0 vlan 10
ping 192.168.124.2
ethtool -k p4p1 | grep vlan
ethtool -K p4p1 rxvlan off
ping 192.168.124.2
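
The `ethtool -k p4p1 | grep vlan` step above can be turned into a check that a script can act on. This is a minimal sketch, assuming the usual `feature: on|off` layout of `ethtool -k` output. The function name `vlan_offload_state` is hypothetical and is not taken from the attached log.

```shell
#!/bin/sh
# Print the state ("on" or "off") of one offload feature, reading
# `ethtool -k <dev>` output from stdin.
# Assumption: lines look like "rx-vlan-offload: on", optionally
# followed by "[fixed]". The legacy flag `rxvlan` used with
# `ethtool -K` corresponds to the "rx-vlan-offload" feature name.
vlan_offload_state() {
    feature="$1"
    # Match the feature name (with its trailing colon) in the first
    # field and print the second field; stop at the first match.
    awk -v f="${feature}:" '$1 == f { print $2; exit }'
}
```

Possible usage on the PF from this report: `ethtool -k p4p1 | vlan_offload_state rx-vlan-offload`, run before and after `ethtool -K p4p1 rxvlan off` to confirm that the setting actually changed.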


The remote host:

[root@dell-per730-55 rpmbuild]# ifconfig p6p1.10
p6p1.10: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.124.2  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fe80::faf2:1eff:fe02:c4a0  prefixlen 64  scopeid 0x20<link>
        ether f8:f2:1e:02:c4:a0  txqueuelen 1000  (Ethernet)
        RX packets 495  bytes 36564 (35.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 437  bytes 38498 (37.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@dell-per730-55 rpmbuild]# 

Please note that reloading the PF driver (the modprobe commands in the first several steps) is important: it cleans out the old configuration.


Version-Release number of selected component (if applicable):
openvswitch2.10-2.10.0-28.el7fdp.1.x86_64

82:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)


How reproducible: always


Actual results:
ping fails

Expected results:
ping succeeds

Additional info:
Please see the attached log, bz1637893.log.

Comment 1 Rashid Khan 2019-02-12 14:19:51 UTC
Is this how OSP is going to use NIC partitioning?
We want to work on OSP's way of doing NIC partitioning.
Anita mentioned that this test setup did not come from her.

Christophe to confirm whether this is a valid setup from the OSP perspective.
If it is not, we might have to lower the priority of this bug considerably.

Comment 5 Eelco Chaudron 2019-04-10 09:08:52 UTC
(In reply to qding from comment #0)
> Created attachment 1518296 [details]
> the log
> 
> Description of problem:
> 
> The issue is found during verifying bz1637893.
> 
> Two hosts are connected back-to-back
> 
> on dell-per730-51.rhts.eng.pek2.redhat.com
> 
> ovs-vsctl del-br br0
> systemctl stop openvswitch
> ip link set p4p1 vf 0 vlan 0
> driverctl unset-override 0000:82:02.0
> driverctl unset-override 0000:82:02.1
> modprobe -r i40e
> modprobe  i40e
> 
> echo 2 > /sys/bus/pci/devices/0000:82:00.0/sriov_numvfs
> ip link show
> ip link set p4p1 vf 0 mac 00:00:00:00:00:01 spoofchk off
> ip link show p4p1
> driverctl set-override 0000:82:02.0 vfio-pci
> driverctl set-override 0000:82:02.1 vfio-pci
> dpdk-devbind -s
> ip link show p4p1
> systemctl start openvswitch
> ovs-vsctl get Open_vSwitch . other_config
> ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
> ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
> options:dpdk-devargs=0000:82:02.0 ofport_request=10
> ovs-vsctl show
> ifconfig br0 up
> ip add add 192.168.124.1/24 dev br0
> ping 192.168.124.2
> ip link set p4p1 vf 0 vlan 10
> ping 192.168.124.2
> ethtool -k p4p1 | grep vlan
> ethtool -K p4p1 rxvlan off

If you disable VLAN stripping on the physical port, I assume this also stops the stripping of VLANs on the VFs.
That would mean you need to add VLAN handling to your OVS configuration. Can you try this and see if it works?

> ping 192.168.124.2

Comment 6 qding 2019-04-11 07:37:51 UTC
(In reply to Eelco Chaudron from comment #5)
> > 
> > 
> If you disable VLAN stripping on the physical port, I assume this also stops
> stripping of the VLANs on the VFs.
> Meaning that you need to add VLAN handling to your OVS configuration. Can
> you try this and see if it works?
> 

Sorry for the late reply. I don't think it is a configuration problem, because without the steps below the issue is hard to reproduce. Also, how do I "disable VLAN stripping on the physical port"? This is probably the same issue as bz1691682. If there is anything more I need to do, please let me know. Thanks.

modprobe -r i40e
modprobe  i40e

Comment 7 Eelco Chaudron 2019-04-11 09:39:06 UTC
So if you repeat the steps above without the modprobe the second time it works fine?
Can you be more clear on when it does and does not work?

So, for example, the following sequence would work?

> ovs-vsctl del-br br0
> systemctl stop openvswitch
> ip link set p4p1 vf 0 vlan 0
> driverctl unset-override 0000:82:02.0
> driverctl unset-override 0000:82:02.1
> modprobe -r i40e
> modprobe  i40e
 
> echo 2 > /sys/bus/pci/devices/0000:82:00.0/sriov_numvfs
> ip link show
> ip link set p4p1 vf 0 mac 00:00:00:00:00:01 spoofchk off
> ip link show p4p1
> driverctl set-override 0000:82:02.0 vfio-pci
> driverctl set-override 0000:82:02.1 vfio-pci
> dpdk-devbind -s
> ip link show p4p1
> systemctl start openvswitch
> ovs-vsctl get Open_vSwitch . other_config
> ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
> ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
> options:dpdk-devargs=0000:82:02.0 ofport_request=10
> ovs-vsctl show
> ifconfig br0 up
> ip add add 192.168.124.1/24 dev br0
> ping 192.168.124.2
> ip link set p4p1 vf 0 vlan 10
> ping 192.168.124.2
> ethtool -k p4p1 | grep vlan
> ethtool -K p4p1 rxvlan off

> echo 0 > /sys/bus/pci/devices/0000:82:00.0/sriov_numvfs

> ovs-vsctl del-br br0
> systemctl stop openvswitch
> ip link set p4p1 vf 0 vlan 0
> driverctl unset-override 0000:82:02.0
> driverctl unset-override 0000:82:02.1
 
> echo 2 > /sys/bus/pci/devices/0000:82:00.0/sriov_numvfs
> ip link show
> ip link set p4p1 vf 0 mac 00:00:00:00:00:01 spoofchk off
> ip link show p4p1
> driverctl set-override 0000:82:02.0 vfio-pci
> driverctl set-override 0000:82:02.1 vfio-pci
> dpdk-devbind -s
> ip link show p4p1
> systemctl start openvswitch
> ovs-vsctl get Open_vSwitch . other_config
> ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
> ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
> options:dpdk-devargs=0000:82:02.0 ofport_request=10
> ovs-vsctl show
> ifconfig br0 up
> ip add add 192.168.124.1/24 dev br0
> ping 192.168.124.2
> ip link set p4p1 vf 0 vlan 10
> ping 192.168.124.2
> ethtool -k p4p1 | grep vlan
> ethtool -K p4p1 rxvlan off

Please let me know, and I'll try to run some tests early next week.
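
The two variants being compared above differ only in whether the i40e driver is reloaded before the VFs are created. The sketch below prints either variant so it can be reviewed before anything is run. The commands are taken from this report, but the function name `print_setup_steps` and its `reload`/`noreload` argument are hypothetical, the display-only commands are left out, and `-c 3` is added to `ping` so that it terminates. It is not the attached reproducer script.

```shell
#!/bin/sh
# Print the setup sequence from this report, with or without the
# i40e reload. Nothing is executed by this function.
# PF name and PCI addresses are the ones used in the report.
PF=p4p1
PF_PCI=0000:82:00.0
VF0_PCI=0000:82:02.0
VF1_PCI=0000:82:02.1

print_setup_steps() {
    mode="$1"   # "reload" or "noreload" (hypothetical argument)

    # Tear down the previous configuration.
    cat <<EOF
ovs-vsctl del-br br0
systemctl stop openvswitch
ip link set $PF vf 0 vlan 0
driverctl unset-override $VF0_PCI
driverctl unset-override $VF1_PCI
EOF

    # The only difference between the two variants.
    if [ "$mode" = "reload" ]; then
        printf '%s\n' "modprobe -r i40e" "modprobe i40e"
    fi

    # Create the VFs, attach VF 0 to OVS-DPDK, then change the
    # VLAN offload setting on the PF and test connectivity.
    cat <<EOF
echo 2 > /sys/bus/pci/devices/$PF_PCI/sriov_numvfs
ip link set $PF vf 0 mac 00:00:00:00:00:01 spoofchk off
driverctl set-override $VF0_PCI vfio-pci
driverctl set-override $VF1_PCI vfio-pci
systemctl start openvswitch
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=$VF0_PCI ofport_request=10
ifconfig br0 up
ip add add 192.168.124.1/24 dev br0
ip link set $PF vf 0 vlan 10
ethtool -K $PF rxvlan off
ping -c 3 192.168.124.2
EOF
}
```

On a test host, `print_setup_steps reload | sh -x` would run the variant with the driver reload, and `print_setup_steps noreload | sh -x` the variant without it.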

Comment 8 qding 2019-04-12 03:06:35 UTC
(In reply to Eelco Chaudron from comment #7)
> So if you repeat the steps above without the modprobe the second time it
> works fine?
> Can you be more clear on when it does and does not work?
> 
> So, for example, the following sequence would work?
> 
> > ovs-vsctl del-br br0
> > systemctl stop openvswitch
> > ip link set p4p1 vf 0 vlan 0
> > driverctl unset-override 0000:82:02.0
> > driverctl unset-override 0000:82:02.1
> > modprobe -r i40e
> > modprobe  i40e
>  
> > echo 2 > /sys/bus/pci/devices/0000:82:00.0/sriov_numvfs
> > ip link show
> > ip link set p4p1 vf 0 mac 00:00:00:00:00:01 spoofchk off
> > ip link show p4p1
> > driverctl set-override 0000:82:02.0 vfio-pci
> > driverctl set-override 0000:82:02.1 vfio-pci
> > dpdk-devbind -s
> > ip link show p4p1
> > systemctl start openvswitch
> > ovs-vsctl get Open_vSwitch . other_config
> > ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
> > ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
> > options:dpdk-devargs=0000:82:02.0 ofport_request=10
> > ovs-vsctl show
> > ifconfig br0 up
> > ip add add 192.168.124.1/24 dev br0
> > ping 192.168.124.2
> > ip link set p4p1 vf 0 vlan 10
> > ping 192.168.124.2
> > ethtool -k p4p1 | grep vlan
> > ethtool -K p4p1 rxvlan off
> 

The issue is reproduced with the above sequence of commands.

> > echo 0 > /sys/bus/pci/devices/0000:82:00.0/sriov_numvfs

The command to remove the VFs doesn't make sense at this point: if it is run before "systemctl stop openvswitch", it hangs.

When I run only the commands below (without the modprobe reload), I never see the issue.

A new finding: with the latest RHEL-7.7 distro, the issue is not reproduced even with the first sequence of commands above.

> 
> > ovs-vsctl del-br br0
> > systemctl stop openvswitch
> > ip link set p4p1 vf 0 vlan 0
> > driverctl unset-override 0000:82:02.0
> > driverctl unset-override 0000:82:02.1
>  
> > echo 2 > /sys/bus/pci/devices/0000:82:00.0/sriov_numvfs
> > ip link show
> > ip link set p4p1 vf 0 mac 00:00:00:00:00:01 spoofchk off
> > ip link show p4p1
> > driverctl set-override 0000:82:02.0 vfio-pci
> > driverctl set-override 0000:82:02.1 vfio-pci
> > dpdk-devbind -s
> > ip link show p4p1
> > systemctl start openvswitch
> > ovs-vsctl get Open_vSwitch . other_config
> > ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
> > ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
> > options:dpdk-devargs=0000:82:02.0 ofport_request=10
> > ovs-vsctl show
> > ifconfig br0 up
> > ip add add 192.168.124.1/24 dev br0
> > ping 192.168.124.2
> > ip link set p4p1 vf 0 vlan 10
> > ping 192.168.124.2
> > ethtool -k p4p1 | grep vlan
> > ethtool -K p4p1 rxvlan off
> 
> Please let me know, and I'll try to run some test early next week.

Comment 9 qding 2019-04-12 03:09:33 UTC
Created attachment 1554703 [details]
reproducer

The script to reproduce.

Comment 10 Eelco Chaudron 2019-04-15 13:32:49 UTC
I replicated this issue on my 7.5 system with kernel-3.10.0-957.1.3. However, after upgrading to the latest available kernel, kernel-3.10.0-1038, I no longer see the problem.

Since you indicate that you no longer see it with the latest RHEL 7.7 either, I suggest you double-check and close this BZ.

Comment 11 qding 2019-04-16 07:29:06 UTC
RHEL 7.7 is still under development, and the latest RHEL 7.6 still has the issue, so I'm not sure whether it's OK to close this BZ.

Comment 12 Eelco Chaudron 2019-04-16 08:36:35 UTC
(In reply to qding from comment #11)
> rhel7.7 is still under development. The latest rhel7.6 has the issue. I'm
> not sure if it's ok to close it.

Can you try the latest kernel-3.10.0 build, which is the same one I tested on RHEL 7.5?
I don't have a RHEL 7.6 setup, but I think it is also on the 3.10 kernel.

Comment 14 qding 2019-04-16 09:42:03 UTC
Tested with 3.10.0-957.el7.x86_64, which is the RHEL 7.6 kernel: it has the issue.
Tested with 3.10.0-1034.el7.x86_64: it doesn't have the issue.
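
A test script could check whether a host runs a kernel at or above the build that passed here. This is a minimal sketch: the helper name `kernel_at_least` is hypothetical, and GNU `sort -V` is only an approximation of RPM version comparison. The thread establishes only that 3.10.0-957 fails and 3.10.0-1034 passes, so treating 1034 as the threshold is an assumption; the builds in between were not tested.

```shell
#!/bin/sh
# Succeed if kernel release $1 sorts at or above release $2.
# Relies on GNU `sort -V` (version sort), which compares the numeric
# parts numerically, so 957 sorts below 1034.
kernel_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Build reported as passing in this thread (assumed threshold).
GOOD_KERNEL=3.10.0-1034.el7.x86_64
```

For example, `kernel_at_least "$(uname -r)" "$GOOD_KERNEL" || echo "kernel predates the build that passed"` would flag a host still on the 957 series.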

Comment 15 Eelco Chaudron 2019-04-18 10:31:14 UTC
Ok, can we close this BZ as fixed in 1034?

Comment 16 qding 2019-04-19 05:17:11 UTC
(In reply to Eelco Chaudron from comment #15)
> Ok, can we close this BZ as fixed in 1034?

Yes. No problem for me.

Thanks

Comment 17 Eelco Chaudron 2019-04-19 08:14:44 UTC
The issue is no longer present in the latest kernel build, 3.10.0-1034.el7.x86_64, so I am closing this BZ.

