Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
The FDP team is no longer accepting new bugs in Bugzilla. Please report your issues under the FDP project in Jira. Thanks.

Bug 2231081

Summary: i40e driver: both IPv4 and IPv6 ping fail when an SR-IOV VF is added to an OVS bridge on OVS 3.2
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: liting <tli>
Component: openvswitch
Sub component: ovs-dpdk
Assignee: Eelco Chaudron <echaudro>
QA Contact: liting <tli>
Status: CLOSED NEXTRELEASE
Severity: unspecified
Priority: unspecified
CC: ctrautma, dmarchan, echaudro, fleitner, jhsiao, ktraynor
Version: FDP 23.F
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openvswitch3.2-3.2.0-17.el9fdp
Last Closed: 2023-12-15 13:42:15 UTC
Type: Bug

Description liting 2023-08-10 13:47:34 UTC
Description of problem:


Version-Release number of selected component (if applicable):
[root@dell-per730-52 ~]# uname -r
5.14.0-284.27.1.el9_2.x86_64
[root@dell-per730-52 ~]# rpm -qa|grep openvswitch
openvswitch-selinux-extra-policy-1.0-34.el9fdp.noarch
openvswitch3.2-3.2.0-0.2.el9fdp.x86_64
[root@dell-per730-52 ~]# rpm -qa|grep dpdk
dpdk-22.11-3.el9_2.x86_64
dpdk-tools-22.11-3.el9_2.x86_64


How reproducible:


Steps to Reproduce:
 dell730-52 i40e <--> dell730-53 i40e

on server side:
1. Create two VFs on one PF:
[root@dell-per730-52 ~]# ip link show
7: enp7s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:ad:bc:e8 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 00:60:2f:c5:65:b3 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1     link/ether 52:54:00:11:8f:ea brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust on
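
The log above shows the two VFs already present; on i40e they are typically created through the PF's sriov_numvfs sysfs knob (a sketch — the same method Comment 15 below uses for the E810; the PF name is taken from the log, the exact method used on this setup is not recorded):

```shell
PF=enp7s0f0                                 # PF name from the log above
vfs=/sys/class/net/$PF/device/sriov_numvfs
echo 0 > "$vfs"                             # clear any existing VFs first
echo 2 > "$vfs"                             # create vf 0 and vf 1
ip link show "$PF"                          # both VFs should now be listed
```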


2. Bind VF 1 to DPDK, and add dpdk0 to the OVS bridge:
 /usr/share/dpdk/usertools/dpdk-devbind.py -b vfio-pci 0000:07:02.1
 systemctl restart openvswitch
 ovs-vsctl set Open_vSwitch . 'other_config={}'
 ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
 ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=1024,1024
 ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x50000005000000
 ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
 ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:07:02.1
 ovs-vsctl add-port ovsbr0 dpdkvhostuserclient0 -- set Interface dpdkvhostuserclient0 type=dpdkvhostuserclient -- set Interface dpdkvhostuserclient0 options:vhost-server-path=/tmp/dpdkvhostuserclient0

[root@dell-per730-52 ~]# ovs-vsctl show
2b1cdab2-dca9-44b7-b666-3cfa27edf054
    Bridge ovsbr0
        datapath_type: netdev
        Port dpdkvhostuserclient0
            Interface dpdkvhostuserclient0
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/dpdkvhostuserclient0"}
        Port ovsbr0
            Interface ovsbr0
                type: internal
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:07:02.1"}
    ovs_version: "3.2.0"


3. Configure the VF 1 MAC to match the guest's MAC:
ip link set enp7s0f0 vf 1 mac $guest_mac
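
$guest_mac is left unexpanded above; one way to fill it in (a sketch — the domain name rhel9_guest is a placeholder, and it assumes the guest's data NIC is the vhostuser interface) is to read it from `virsh domiflist`, whose fifth column is the MAC:

```shell
# Grab the MAC of the guest's vhostuser NIC (column 5 of `virsh domiflist`).
guest_mac=$(virsh domiflist rhel9_guest | awk '$2 == "vhostuser" {print $5}')
ip link set enp7s0f0 vf 1 mac "$guest_mac"
```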

4. Inside the guest, configure IPv4 and IPv6 addresses for enp4s0:
ip addr add 20.0.0.1/24 dev enp4s0
ip addr add 2001:5c0:9168::1/24 dev enp4s0

inside guest:
[root@localhost ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:bb:63:7b brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.138/24 brd 192.168.122.255 scope global dynamic noprefixroute enp2s0
       valid_lft 3257sec preferred_lft 3257sec
    inet6 fe80::da5d:7eab:c8e6:e0dc/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:8f:ea brd ff:ff:ff:ff:ff:ff
    inet 20.0.0.1/24 scope global enp4s0
       valid_lft forever preferred_lft forever
    inet6 2001:5c0:9168::1/24 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::522a:fe0c:c0c7:8045/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever


on client side:
[root@dell-per730-53 ~]# ip link show
6: enp4s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:ad:bf:c4 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 00:60:2f:1b:23:73 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1     link/ether 52:54:00:11:8f:e9 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust on

2. Bind VF 1 to DPDK, and add dpdk0 to the OVS bridge:
 /usr/share/dpdk/usertools/dpdk-devbind.py -b vfio-pci 0000:04:02.1
 systemctl restart openvswitch
 ovs-vsctl set Open_vSwitch . 'other_config={}'
 ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
 ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=1024,1024
 ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x50000005000000
 ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
 ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:04:02.1
 ovs-vsctl add-port ovsbr0 dpdkvhostuserclient0 -- set Interface dpdkvhostuserclient0 type=dpdkvhostuserclient -- set Interface dpdkvhostuserclient0 options:vhost-server-path=/tmp/dpdkvhostuserclient0

[root@dell-per730-53 ~]# ovs-vsctl show
7bbd9e22-cbf4-434a-86a5-1e8e2949638f
    Bridge ovsbr0
        datapath_type: netdev
        Port dpdkvhostuserclient0
            Interface dpdkvhostuserclient0
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/dpdkvhostuserclient0"}
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:04:02.1"}
        Port ovsbr0
            Interface ovsbr0
                type: internal
    ovs_version: "3.2.0"

3. Configure the VF 1 MAC to match the guest's MAC:
ip link set enp4s0f0 vf 1 mac $guest_mac

4. Inside the guest, configure IPv4 and IPv6 addresses for enp4s0:
ip addr add 20.0.0.2/24 dev enp4s0
ip addr add 2001:5c0:9168::2/24 dev enp4s0

guest:
[root@localhost ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:bb:63:7b brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.138/24 brd 192.168.122.255 scope global dynamic noprefixroute enp2s0
       valid_lft 3025sec preferred_lft 3025sec
    inet6 fe80::da5d:7eab:c8e6:e0dc/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:8f:e9 brd ff:ff:ff:ff:ff:ff
    inet 20.0.0.2/24 scope global enp4s0
       valid_lft forever preferred_lft forever
    inet6 2001:5c0:9168::2/24 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::ee61:e1d4:cbef:19a9/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever


5. Inside the client guest, ping the IPv4 and IPv6 addresses of the guest on the server:
ping 20.0.0.1
ping6 2001:5c0:9168::1

Actual results:
Both the IPv4 and IPv6 pings fail.


Expected results:
Both the IPv4 and IPv6 pings succeed.

Additional info:
https://beaker.engineering.redhat.com/jobs/8173772

Comment 3 Flavio Leitner 2023-08-10 18:30:04 UTC
Does it work with a recent OVS 3.1?
Thanks
fbl

Comment 8 Eelco Chaudron 2023-08-17 13:28:51 UTC
Looks like this is caused by the following commit:

 * 5d11c47d3 userspace: Enable IP checksum offloading by default.

Needs a follow-up to understand why.

Comment 9 Eelco Chaudron 2023-08-21 08:23:26 UTC
David figured out where the problem was in DPDK, and submitted the following patch:

https://patchwork.dpdk.org/project/dpdk/patch/20230818090351.2402519-1-david.marchand@redhat.com/

I tried it on @liting's setup, and it solved the problem.

We need to wait for the patch to make it into DPDK 22.11.x.

Comment 10 David Marchand 2023-08-28 19:16:33 UTC
The net/iavf driver in DPDK is faulty in that it was evaluating some mbuf fields because of an (unneeded) flag present in ol_flags.
A fix for this has been sent to DPDK and merged in next-net-intel, as the issue could affect other DPDK applications.
https://patchwork.dpdk.org/project/dpdk/patch/20230823062911.2483926-1-david.marchand@redhat.com/


As a safety measure (in case other DPDK drivers exhibit the same bug), an upstream OVS change now clears the (unneeded) flag before calling into the DPDK drivers.
This change has been backported to the upstream OVS 3.2 branch, so it should hopefully be picked up by our downstream robot soon.
https://github.com/openvswitch/ovs/commit/9b7e1a75378f

Comment 11 Eelco Chaudron 2023-09-18 09:52:59 UTC
The fix has been integrated in openvswitch3.2-3.2.0-10.el9fdp; can you please verify?

Comment 12 liting 2023-09-25 00:28:08 UTC
(In reply to Eelco Chaudron from comment #11)
> The fix has been integrated in openvswitch3.2-3.2.0-10.el9fdp, can you
> please verify.

I ran the case with openvswitch3.2-3.2.0-10.el9fdp; it still does not work.
https://beaker.engineering.redhat.com/jobs/8347458

Comment 13 Eelco Chaudron 2023-09-25 08:52:48 UTC
This is odd, David verified it on his setup with E810 cards, and it works fine there. Can you make your setup available to me, so I can take another look?

Comment 14 liting 2023-10-09 08:49:43 UTC
(In reply to Eelco Chaudron from comment #13)
> This is odd, David verified it on his setup with E810 cards, and it works
> fine there. Can you make your setup available to me, so I can take another
> look?

I ran the case and it also does not work on an E810 card.
E810 card job:
https://beaker.engineering.redhat.com/jobs/8403091
I have a test environment with i40e cards; you can access it to take a look. The i40e card of dell730-52 is directly connected to the i40e card of dell730-53.
dell-per730-52.rhts.eng.pek2.redhat.com(root/redhat)
dell-per730-53.rhts.eng.pek2.redhat.com(root/redhat)

Comment 15 Eelco Chaudron 2023-10-11 12:08:34 UTC
I was able to replicate the same issue on my local system with an Intel E810 card, where I looped back the two ports with a cable, so all I needed was a single machine. Here are my instructions to replicate it:


Set up NIC partitioning, and then create the VFs:
===========================================

  lshw -c network -businfo
  Bus info          Device     Class          Description
  =======================================================
  pci@0000:31:00.0  eno12399   network        Ethernet Controller E810-XXV for SFP
  pci@0000:31:00.1  eno12409   network        Ethernet Controller E810-XXV for SFP

  echo 0 > /sys/class/net/eno12399/device/sriov_numvfs
  echo 2 > /sys/class/net/eno12399/device/sriov_numvfs


  lshw -c network -businfo
  Bus info          Device      Class          Description
  ========================================================
  pci@0000:31:00.0  eno12399    network        Ethernet Controller E810-XXV for SFP
  pci@0000:31:00.1  eno12409    network        Ethernet Controller E810-XXV for SFP
  pci@0000:31:01.0  eno12399v0  network        Ethernet Adaptive Virtual Function
  pci@0000:31:01.1  eno12399v1  network        Ethernet Adaptive Virtual Function


Setup the second vf (vf 1):
===========================

  ip link show dev eno12399
  4: eno12399: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
      link/ether b4:83:51:01:ce:52 brd ff:ff:ff:ff:ff:ff
      vf 0     link/ether aa:c4:23:98:85:0a brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
      vf 1     link/ether c2:9c:1c:e8:dc:f2 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
      altname enp49s0f0

  ip link set dev eno12399 vf 1 spoofchk on trust on

  ip link show dev eno12399
  4: eno12399: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
      link/ether b4:83:51:01:ce:52 brd ff:ff:ff:ff:ff:ff
      vf 0     link/ether aa:c4:23:98:85:0a brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
      vf 1     link/ether c2:9c:1c:e8:dc:f2 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust on
      altname enp49s0f0


Setup openvswitch + driverctl:
==============================

  driverctl -v set-override 0000:31:01.1 vfio-pci

  ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem=2048
  ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x4000000040000000
  ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true

  ovs-vsctl del-br ovs_pvp_br0

  ovs-vsctl add-br ovs_pvp_br0 -- \
            set bridge ovs_pvp_br0 datapath_type=netdev

  ovs-vsctl add-port ovs_pvp_br0 dpdk0p0 -- \
            set Interface dpdk0p0 type=dpdk -- \
            set Interface dpdk0p0 options:dpdk-devargs=0000:31:01.1 -- \
            set interface dpdk0p0 options:n_rxq=2

  ovs-vsctl add-port ovs_pvp_br0 vhost0 -- \
            set Interface vhost0 type=dpdkvhostuserclient -- \
            set Interface vhost0 options:vhost-server-path='/tmp/vhost-sock0' -- \
            set Interface vhost0 options:n_rxq=2


Start the VM and get its MAC address:
==================================

  virsh start rhel9_loopback
  virsh console rhel9_loopback

  driverctl unset-override 0000:01:00.0

  ip a show dev eth0
  4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
      link/ether 52:54:00:9e:e6:3d brd ff:ff:ff:ff:ff:ff
      altname enp1s0
      inet 1.1.1.1/24 brd 1.1.1.255 scope global noprefixroute eth0
         valid_lft forever preferred_lft forever


Set the VF's MAC to the VM's one:
==============================

  ip link set eno12399 vf 1 mac 52:54:00:9e:e6:3d



Add IP to OVS bridge to make sure the client can ping it:
=========================================================

  ip a a 1.1.1.10/24 dev ovs_pvp_br0

  ip link set ovs_pvp_br0 up

  ping 1.1.1.1
  PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
  64 bytes from 1.1.1.1: icmp_seq=1 ttl=64 time=0.196 ms
  64 bytes from 1.1.1.1: icmp_seq=2 ttl=64 time=0.060 ms
  64 bytes from 1.1.1.1: icmp_seq=3 ttl=64 time=0.043 ms
  ^C
  --- 1.1.1.1 ping statistics ---
  3 packets transmitted, 3 received, 0% packet loss, time 2047ms
  rtt min/avg/max/mdev = 0.043/0.099/0.196/0.068 ms

  ovs-appctl dpctl/dump-flows
  flow-dump from the main thread:
  recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth(src=a6:53:f9:9f:44:4e,dst=52:54:00:9e:e6:3d),eth_type(0x0800),ipv4(frag=no), packets:3, bytes:294, used:1.530s, actions:4
  flow-dump from pmd on cpu core: 62
  recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=52:54:00:9e:e6:3d,dst=a6:53:f9:9f:44:4e),eth_type(0x0800),ipv4(frag=no), packets:3, bytes:294, used:1.530s, actions:2

  [VM] ping 1.1.1.10
       PING 1.1.1.10 (1.1.1.10) 56(84) bytes of data.
       64 bytes from 1.1.1.10: icmp_seq=1 ttl=64 time=0.137 ms
       64 bytes from 1.1.1.10: icmp_seq=2 ttl=64 time=0.061 ms
       64 bytes from 1.1.1.10: icmp_seq=3 ttl=64 time=0.052 ms

       --- 1.1.1.10 ping statistics ---
       3 packets transmitted, 3 received, 0% packet loss, time 2063ms
       rtt min/avg/max/mdev = 0.052/0.083/0.137/0.038 ms


We looped ports 1 and 2 of the NIC, so set up a network namespace to ping the other port:
==========================================================================
  ip netns add ns_remote

  ip link set eno12409 netns ns_remote

  ip -n ns_remote link set dev lo up
  ip -n ns_remote link set dev eno12409 up
  ip -n ns_remote address add 1.1.1.100/24 dev eno12409


Now we should be able to ping from the VM to the ns_remote IP:
==============================================================

  However, this is not working due to the issue at hand :(

  [root@loopback ~]# ping 1.1.1.100
  PING 1.1.1.100 (1.1.1.100) 56(84) bytes of data.

  ^C
  --- 1.1.1.100 ping statistics ---
  6763 packets transmitted, 0 received, 100% packet loss, time 6924274ms

Comment 18 Eelco Chaudron 2023-10-12 07:51:20 UTC
So it looks like there was a problem with the internal build process, which is now fixed. The openvswitch3.2-3.2.0-17.el9fdp.x86_64.rpm version should work. I did verify this on my test setup with E810 cards. Please verify this on your setup.
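
The verification can be scripted; a minimal sketch (the release-number comparison assumes the el9fdp NVR scheme from this bug, and the addresses are the ones from the reproducer):

```shell
# Is the installed build at least the fixed one (3.2.0-17.el9fdp)?
rel=$(rpm -q --qf '%{RELEASE}' openvswitch3.2)   # e.g. "17.el9fdp"
[ "${rel%%.*}" -ge 17 ] && echo "fix present"

# From inside the client guest: ping exits non-zero on total loss,
# so its status can gate the pass/fail decision.
ping -4 -c 3 -W 2 20.0.0.1         && echo "ipv4 ok" || echo "ipv4 FAILED"
ping -6 -c 3 -W 2 2001:5c0:9168::1 && echo "ipv6 ok" || echo "ipv6 FAILED"
```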

Comment 19 liting 2023-10-18 01:10:49 UTC
(In reply to Eelco Chaudron from comment #18)
> So it looks like there was a problem with the internal build process, which
> is now fixed. The openvswitch3.2-3.2.0-17.el9fdp.x86_64.rpm version should
> work. I did verify this on my test setup with E810 cards. Please verify this
> on your setup.

The IPv4 ping succeeds on openvswitch3.2-3.2.0-17.el9fdp.x86_64. Here is the job:
https://beaker.engineering.redhat.com/jobs/8440484
However, the IPv6 ping still fails; that is the following known issue, so this bug can be verified as passing:
https://bugzilla.redhat.com/show_bug.cgi?id=1921462

Comment 20 Eelco Chaudron 2023-12-15 13:42:15 UTC
Closing this BZ as the issue is solved.