Bug 1657467

Summary: [OVS-DPDK] ping failed over OVS-dpdk/Synergy QEDE
Product: Red Hat Enterprise Linux 7 Reporter: Jean-Tsung Hsiao <jhsiao>
Component: openvswitch Assignee: Rasesh Mody <rasesh.mody>
Status: CLOSED DUPLICATE QA Contact: Jean-Tsung Hsiao <jhsiao>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.5 CC: arahman, atragler, ctrautma, jhsiao, ktraynor, kzhang, ovs-qe, qding, tredaelli
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-28 09:27:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jean-Tsung Hsiao 2018-12-08 16:25:11 UTC
Description of problem: [OVS-DPDK] ping failed over OVS-dpdk/Synergy QEDE

[root@hpe-netqe-syn480g10-07 jhsiao]# ovs-vsctl show
6c65ca24-43e4-432f-8ada-57b9b8a42294
    Bridge "ovsbr0"
        Port "dpdk-10"
            Interface "dpdk-10"
                type: dpdk
                options: {dpdk-devargs="0000:37:00.0", n_rxq="1"}
        Port "int0"
            Interface "int0"
                type: internal
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
    ovs_version: "2.10.0"

Version-Release number of selected component (if applicable):
[root@hpe-netqe-syn480g10-07 jhsiao]# uname -a
Linux hpe-netqe-syn480g10-07.knqe.lab.eng.bos.redhat.com 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@hpe-netqe-syn480g10-07 jhsiao]# rpm -q openvswitch2.10
openvswitch2.10-2.10.0-28.el7fdp.x86_64
[root@hpe-netqe-syn480g10-07 jhsiao]# driverctl -v list-overrides
0000:37:00.0 vfio-pci (FastLinQ QL45000 Series 25GbE Controller)
0000:37:00.1 vfio-pci (FastLinQ QL45000 Series 25GbE Controller)


How reproducible: Reproducible


Steps to Reproduce: 
1. Configure OVS-dpdk on two Synergy servers
2. Add QEDE as a dpdk interface to each OVS-dpdk bridge
3. Add int0 as an internal port to each OVS-dpdk bridge
4. Ping from one server to the other across the OVS-dpdk bridges. The ping fails.

Actual results:


Expected results:


Additional info:

Comment 2 Jean-Tsung Hsiao 2018-12-08 16:30:58 UTC
NOTE: The OVS-kernel case works. Based on earlier experience with OVS-dpdk over Cisco UCS, the switch fabric configuration could be the issue here.

Comment 3 qding 2018-12-10 07:05:36 UTC
When the qede port runs under DPDK, all packets received on the port carry a QinQ header, and packets sent out from the port are not seen on the peer; they may be filtered by the switch in between.

Please see the logs below. The only difference between the two tests is whether the qede port runs as a kernel interface or as a DPDK interface.

1. When the qede port is added to the OVS bridge as a kernel interface

[root@hpe-netqe-syn480g10-07 ~]# ovs-vsctl show
ad14b061-c705-4eeb-91bc-e316be40ecd4
    Bridge "ovsbr0"
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
        Port "ens3f0"
            Interface "ens3f0"
    ovs_version: "2.10.0"
[root@hpe-netqe-syn480g10-07 ~]# 
[root@hpe-netqe-syn480g10-07 ~]# ovs-tcpdump -nev -i ovsbr0
...

01:25:53.807497 14:02:ec:d3:8e:34 > Broadcast, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.123.1 tell 192.168.123.2, length 46
01:25:54.809662 14:02:ec:d3:8e:34 > Broadcast, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.123.1 tell 192.168.123.2, length 46

[root@hpe-netqe-syn480g10-07 ~]# ovs-appctl   dpif/dump-flows ovsbr0
recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth(src=14:02:ec:d3:8e:34,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=192.168.123.2,tip=192.168.123.1,op=1/0xff), packets:23, bytes:1380, used:1.252s, actions:1
recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:05:73:b2:81:db,dst=01:00:0c:cc:cc:cd),eth_type(0/0xffff), packets:18, bytes:1152, used:0.220s, actions:drop
recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth(src=08:97:34:2a:65:fd,dst=01:14:c2:44:1e:cc),eth_type(0/0xffff), packets:7, bytes:420, used:0.210s, actions:1
[root@hpe-netqe-syn480g10-07 ~]# 

2. When the qede port is added to the OVS bridge as a DPDK interface

[root@hpe-netqe-syn480g10-07 ~]# ovs-vsctl show
ad14b061-c705-4eeb-91bc-e316be40ecd4
    Bridge "ovsbr0"
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
        Port "dpdk-10"
            Interface "dpdk-10"
                type: dpdk
                options: {dpdk-devargs="0000:37:00.0", n_rxq="1"}
    ovs_version: "2.10.0"
[root@hpe-netqe-syn480g10-07 ~]# ovs-tcpdump -nnev -i dpdk-10
...

01:18:17.246741 14:02:ec:d3:8e:34 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q-QinQ (0x88a8), length 64: vlan 2, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.123.1 tell 192.168.123.2, length 46
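The capture line above shows the ARP request wrapped in an outer tag with ethertype 0x88a8 and VLAN 2. As an illustration only, here is a minimal Python sketch of how one might check a raw Ethernet frame for such an outer tag. The function name `parse_vlan_tags` and the two sample frames are hypothetical; the frames are built by hand to resemble the captured packet and are not taken from the actual traffic.

```python
import struct

ETH_P_8021Q = 0x8100   # 802.1Q customer VLAN tag
ETH_P_8021AD = 0x88A8  # 802.1ad service tag, printed by tcpdump as 802.1Q-QinQ
ETH_P_ARP = 0x0806


def parse_vlan_tags(frame: bytes):
    """Return ([(tpid, vid, pcp), ...], inner_ethertype) for a raw Ethernet frame."""
    offset = 12  # skip destination MAC (6 bytes) and source MAC (6 bytes)
    tags = []
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    # Walk every stacked VLAN tag until a non-VLAN ethertype is reached.
    while ethertype in (ETH_P_8021Q, ETH_P_8021AD):
        (tci,) = struct.unpack_from("!H", frame, offset + 2)
        tags.append((ethertype, tci & 0x0FFF, tci >> 13))
        offset += 4
        (ethertype,) = struct.unpack_from("!H", frame, offset)
    return tags, ethertype


# Hypothetical frames modelled on the captures: a broadcast ARP from
# 14:02:ec:d3:8e:34, once with a single 0x88a8 tag (VLAN 2, priority 0)
# and once untagged. The payload is zero padding, not real ARP data.
dst = bytes.fromhex("ffffffffffff")
src = bytes.fromhex("1402ecd38e34")
tagged = dst + src + struct.pack("!HHH", ETH_P_8021AD, 2, ETH_P_ARP) + bytes(46)
untagged = dst + src + struct.pack("!H", ETH_P_ARP) + bytes(46)

for name, frame in (("tagged", tagged), ("untagged", untagged)):
    tags, inner = parse_vlan_tags(frame)
    print(name, [(hex(tpid), vid, pcp) for tpid, vid, pcp in tags], hex(inner))
```

A frame that reports a 0x88a8 entry in the tag list would correspond to the DPDK case here, while an empty tag list would correspond to the kernel case.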

[root@hpe-netqe-syn480g10-07 ~]# ovs-appctl   dpif/dump-flows ovsbr0
recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:05:73:b2:81:db,dst=01:00:0c:cc:cc:cd),eth_type(0x88a8),vlan(vid=2),encap(), packets:260, bytes:17680, used:1.141s, actions:drop
recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth(src=08:97:34:2a:65:fd,dst=01:14:c2:44:1e:cc),eth_type(0x88a8),vlan(vid=2),encap(), packets:101, bytes:6464, used:3.045s, actions:1
recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth(src=14:02:ec:d3:8e:34,dst=ff:ff:ff:ff:ff:ff),eth_type(0x88a8),vlan(vid=2),encap(eth_type(0x0806),arp(sip=192.168.123.2,tip=192.168.123.1,op=1/0xff)), packets:66, bytes:4224, used:0.663s, actions:1
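To compare the two dump-flows outputs without reading them by eye, a small script can pull out the outer eth_type of each flow. The following is a rough Python sketch, assuming the output format shown in this comment; the helper name `summarize_flows` and the regular expressions are my own and are not part of any OVS tooling. The two sample lines are copied from the kernel and DPDK dumps above.

```python
import re

# Matches the outer Ethernet match and the eth_type that immediately follows it.
# An eth_type nested inside encap(...) is deliberately not matched.
FLOW_RE = re.compile(
    r"eth\(src=(?P<src>[0-9a-f:]{17}),dst=(?P<dst>[0-9a-f:]{17})\),"
    r"eth_type\((?P<eth_type>[^)]*)\)"
)
PACKETS_RE = re.compile(r"packets:(\d+)")


def summarize_flows(dump: str):
    """Return one dict per flow line with its source MAC, outer eth_type and packet count."""
    rows = []
    for line in dump.splitlines():
        m = FLOW_RE.search(line)
        if not m:
            continue  # skip blank lines, prompts, and anything that is not a flow
        packets = PACKETS_RE.search(line)
        rows.append({
            "src": m["src"],
            "outer_eth_type": m["eth_type"],
            "qinq": m["eth_type"] == "0x88a8",
            "packets": int(packets.group(1)) if packets else None,
        })
    return rows


SAMPLE = """\
recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth(src=14:02:ec:d3:8e:34,dst=ff:ff:ff:ff:ff:ff),eth_type(0x88a8),vlan(vid=2),encap(eth_type(0x0806),arp(sip=192.168.123.2,tip=192.168.123.1,op=1/0xff)), packets:66, bytes:4224, used:0.663s, actions:1
recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth(src=14:02:ec:d3:8e:34,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=192.168.123.2,tip=192.168.123.1,op=1/0xff), packets:23, bytes:1380, used:1.252s, actions:1
"""

for row in summarize_flows(SAMPLE):
    print(row)
```

In practice the input would be the saved text of `ovs-appctl dpif/dump-flows ovsbr0`; any flow with `qinq` set indicates traffic arriving with the 0x88a8 outer tag.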

Comment 4 qding 2018-12-10 07:52:16 UTC
More info:

[root@hpe-netqe-syn480g10-07 qed]# lspci -m -s 0000:37:00.0
37:00.0 "Ethernet controller" "QLogic Corp." "FastLinQ QL45000 Series 25GbE Controller" -r10 "Hewlett Packard Enterprise" "Device 0245"

[root@hpe-netqe-syn480g10-07 qed]# ethtool -i ens3f0
driver: qede
version: 8.33.0.20_dup7.5
firmware-version: mfw 8.35.29.0 storm 8.33.11.0
expansion-rom-version: 
bus-info: 0000:37:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: yes

[root@hpe-netqe-syn480g10-07 qed]# ll
total 18400
-rwxr-xr-x. 1 root root 1434652 Nov 20 15:49 qed_init_values-8.10.9.0.bin
-rwxr-xr-x. 1 root root 1459040 Nov 20 15:49 qed_init_values-8.14.6.0.bin
-rwxr-xr-x. 1 root root 1536984 Nov 20 15:49 qed_init_values-8.18.9.0.bin
-rwxr-xr-x. 1 root root 1443340 Nov 20 15:49 qed_init_values-8.20.0.0.bin
-rwxr-xr-x. 1 root root 1507052 Nov 20 15:49 qed_init_values-8.30.12.0.bin
-rwxr-xr-x. 1 root root 1562920 Nov 20 15:49 qed_init_values-8.33.12.0.bin
-rwxr-xr-x. 1 root root 1559500 Nov 20 15:49 qed_init_values-8.37.7.0.bin
-rw-r--r--. 1 root root  780576 Nov 20 15:49 qed_init_values_zipped-8.10.10.0.bin
-rw-r--r--. 1 root root  767532 Nov 20 15:49 qed_init_values_zipped-8.10.5.0.bin
-rw-r--r--. 1 root root  789540 Nov 20 15:49 qed_init_values_zipped-8.15.3.0.bin
-rw-r--r--. 1 root root  794456 Nov 20 15:49 qed_init_values_zipped-8.20.0.0.bin
-rwxr-xr-x. 1 root root  838612 Nov 20 15:49 qed_init_values_zipped-8.33.1.0.bin
-rwxr-xr-x. 1 root root  852456 Nov 20 15:49 qed_init_values_zipped-8.33.11.0.bin
-rw-r--r--. 1 root root  852456 Oct 25 12:42 qed_init_values_zipped-8.33.11.0_dup7.5.bin
-rwxr-xr-x. 1 root root  867472 Nov 20 15:49 qed_init_values_zipped-8.37.2.0.bin
-rwxr-xr-x. 1 root root  872296 Nov 20 15:49 qed_init_values_zipped-8.37.7.0.bin
-rw-r--r--. 1 root root  416930 Nov 20 15:49 qed_init_values_zipped-8.4.2.0.bin
-rw-r--r--. 1 root root  473792 Nov 20 15:49 qed_init_values_zipped-8.7.3.0.bin

Comment 5 Ameen Rahman 2019-01-29 21:38:57 UTC
This is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1624530
Patches fixing this issue are part of v17.11.5-rc2

Comment 6 Kevin Traynor 2019-06-28 09:27:56 UTC

*** This bug has been marked as a duplicate of bug 1624530 ***