Bug 1966157 - NeutronOVSFirewallDriver: openvswitch - ovn wrong openflow programming when vms >1 exist on the same host
Summary: NeutronOVSFirewallDriver: openvswitch - ovn wrong openflow programming when ...
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openvswitch
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: Alpha
Target Release: 17.0
Assignee: Haresh Khandelwal
QA Contact: Miguel Angel Nieto
URL:
Whiteboard:
: 2014183 (view as bug list)
Depends On:
Blocks:
 
Reported: 2021-05-31 14:30 UTC by Yariv
Modified: 2023-06-22 12:04 UTC (History)
31 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
There is a limitation when using ML2/OVN with `provider:network_type geneve` with a Mellanox adapter on a Compute node that has more than one instance on the geneve network. The floating IP of only one of the instances will be reachable. You can track the progress of the resolution on this Bugzilla ticket.
Clone Of:
: 2014183 (view as bug list)
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-1580 0 None None None 2021-09-28 14:21:14 UTC
Red Hat Issue Tracker NFV-2140 0 None None None 2022-01-13 17:53:45 UTC
Red Hat Issue Tracker OSP-4312 0 None None None 2021-11-16 15:25:57 UTC

Description Yariv 2021-05-31 14:30:04 UTC
Description of problem:

Only one guest FIP replies when 2 or more VMs are deployed on a single compute.


Version-Release number of selected component (if applicable):


How reproducible:

With puddle RHOS-16.2-RHEL-8-20210525.n.0

Steps to Reproduce:
1. Deploy OVN with hw-offload computes; refer to the following THTs:
https://gitlab.cee.redhat.com/yrachman/testing-testbed/-/tree/master/tht/ospd-16.2-geneve-ovn-hw-offload-ctlplane-dataplane-bonding-hybrid
 
2. Create 4 VMs, at least two on each compute, and attach floating IPs:

openstack server list --host computeovshwoffload-1.redhat.local --all -c 'Networks'
+-------------------------------------------------------+
| Networks                                              |
+-------------------------------------------------------+
| mellanox-geneve-provider=20.20.220.113, 10.35.141.162 |
| mellanox-geneve-provider=20.20.220.171, 10.35.141.165 |
+-------------------------------------------------------+

3. Send pings:
 ping -c 1 10.35.141.162
PING 10.35.141.162 (10.35.141.162) 56(84) bytes of data.
64 bytes from 10.35.141.162: icmp_seq=1 ttl=61 time=0.703 ms

 ping -c 1 10.35.141.165
PING 10.35.141.162 (10.35.141.162) 56(84) bytes of data.
64 bytes from 10.35.141.162: icmp_seq=1 ttl=61 time=0.703 ms

ping -c 5 10.35.141.162
PING 10.35.141.162 (10.35.141.162) 56(84) bytes of data.
^C
--- 10.35.141.162 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4115ms
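The VM creation and FIP attachment in step 2 can be sketched roughly as follows (the flavor, image, and external network names are assumptions for illustration; adjust to the actual deployment):

```shell
# Hypothetical names: m1.small, rhel-guest, and "external" are placeholders.
for i in 1 2 3 4; do
    openstack server create --flavor m1.small --image rhel-guest \
        --network mellanox-geneve-provider vm-$i
done

# Attach a floating IP to each server.
for i in 1 2 3 4; do
    fip=$(openstack floating ip create external -f value -c floating_ip_address)
    openstack server add floating ip "vm-$i" "$fip"
done
```
The scheduler must place at least two of the VMs on the same compute for the bug to show; anti-affinity must not be in effect.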

 

Actual results:
Only one FIP is responsive.

Expected results:
Ping should succeed for both floating IPs.


Additional info:
Once the FIP is mapped to a VLAN network, both FIPs are reachable and responsive.


sos report
rhos-release.virt.bos.redhat.com:/var/www/html/log/ovn-hw-fip/

Comment 1 Yariv 2021-05-31 14:33:30 UTC
BZ opened due to a regression for hw-offload on RHEL 8.4.

Comment 14 Yariv 2021-06-16 10:50:16 UTC
This test passes with the OVS backend on 16.2.

Another interesting issue:
overcloud) [stack@undercloud-0 ~]$ ping 10.35.141.162
PING 10.35.141.162 (10.35.141.162) 56(84) bytes of data.
64 bytes from 10.35.141.162: icmp_seq=2 ttl=61 time=26.9 ms
^C
--- 10.35.141.162 ping statistics ---
2 packets transmitted, 1 received, 50% packet loss, time 1012ms
rtt min/avg/max/mdev = 26.877/26.877/26.877/0.000 ms
(overcloud) [stack@undercloud-0 ~]$ ping 10.35.141.163
PING 10.35.141.163 (10.35.141.163) 56(84) bytes of data.
^C
--- 10.35.141.163 ping statistics ---
8 packets transmitted, 0 received, 100% packet loss, time 7158ms


After waiting a few minutes:


(overcloud) [stack@undercloud-0 ~]$ ping 10.35.141.163
PING 10.35.141.163 (10.35.141.163) 56(84) bytes of data.
64 bytes from 10.35.141.163: icmp_seq=1 ttl=61 time=26.10 ms
64 bytes from 10.35.141.163: icmp_seq=2 ttl=61 time=0.619 ms
^C
--- 10.35.141.163 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.619/13.800/26.981/13.181 ms
(overcloud) [stack@undercloud-0 ~]$ ping 10.35.141.162
PING 10.35.141.162 (10.35.141.162) 56(84) bytes of data.
^C
--- 10.35.141.162 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1061ms

Comment 16 Yariv 2021-06-17 17:29:21 UTC
See the sos report referenced in the description:
rhos-release.virt.bos.redhat.com:/var/www/html/log/ovn-hw-fip/

Adding the commands here:


[root@computeovshwoffload-0 ~]# ovs-vsctl show
edb712f9-21dd-4816-b0ab-e185994d2312
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port ovn-f5aa6c-0
            Interface ovn-f5aa6c-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.10.161.127"}
                bfd_status: {diagnostic="Control Detection Time Expired", flap_count="2", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down}
        Port ovn-3ea0c7-0
            Interface ovn-3ea0c7-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.10.161.129"}
        Port enp4s0f0_8
            Interface enp4s0f0_8
        Port ovn-1c2cb7-0
            Interface ovn-1c2cb7-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.10.161.135"}
                bfd_status: {diagnostic="Control Detection Time Expired", flap_count="2", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down}
        Port tap130d9819-20
            Interface tap130d9819-20
        Port enp4s0f0_0
            Interface enp4s0f0_0
        Port br-int
            Interface br-int
                type: internal
        Port ovn-ccf1be-0
            Interface ovn-ccf1be-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.10.161.125"}
                bfd_status: {diagnostic="Control Detection Time Expired", flap_count="2", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down}
    Bridge br-link1
        fail_mode: standalone
        Port bond0
            Interface enp6s0f0
            Interface enp6s0f1
        Port br-link1
            Interface br-link1
                type: internal
    Bridge br-link0
        fail_mode: standalone
        Port br-link0
            Interface br-link0
                type: internal
        Port mx-bond
            Interface mx-bond
    ovs_version: "2.15.1"
[root@computeovshwoffload-0 ~]# 



[root@computeovshwoffload-0 ~]# ovs-vsctl list Open_vSwitch
_uuid               : edb712f9-21dd-4816-b0ab-e185994d2312
bridges             : [087d9b34-53da-42b1-9052-50dc7819cf0b, 1d9b091c-c8e8-4d50-8ee5-07b51d9f5b53, 8d8e1d31-df9a-4c03-9876-14de0828c795]
cur_cfg             : 69
datapath_types      : [netdev, system]
datapaths           : {}
db_version          : "8.2.0"
dpdk_initialized    : false
dpdk_version        : "DPDK 20.11.1"
external_ids        : {hostname=computeovshwoffload-0.redhat.local, ovn-bridge=br-int, ovn-bridge-mappings="mx-network:br-link0,mgmt:br-link1", ovn-encap-ip="10.10.161.101", ovn-encap-type=geneve, ovn-openflow-probe-interval="60", ovn-remote="tcp:10.10.160.115:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="ff44434d-03f6-4cd2-9d57-94dfa476ca32"}
iface_types         : [bareudp, erspan, geneve, gre, gtpu, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
manager_options     : [fb560279-de47-49d8-9883-bce497d67e1e]
next_cfg            : 69
other_config        : {hw-offload="true"}
ovs_version         : "2.15.1"
ssl                 : []
statistics          : {}
system_type         : rhel
system_version      : "8.4"


[root@computeovshwoffload-0 ~]# ovs-vsctl list Open_vSwitch
_uuid               : edb712f9-21dd-4816-b0ab-e185994d2312
bridges             : [087d9b34-53da-42b1-9052-50dc7819cf0b, 1d9b091c-c8e8-4d50-8ee5-07b51d9f5b53, 8d8e1d31-df9a-4c03-9876-14de0828c795]
cur_cfg             : 69
datapath_types      : [netdev, system]
datapaths           : {}
db_version          : "8.2.0"
dpdk_initialized    : false
dpdk_version        : "DPDK 20.11.1"
external_ids        : {hostname=computeovshwoffload-0.redhat.local, ovn-bridge=br-int, ovn-bridge-mappings="mx-network:br-link0,mgmt:br-link1", ovn-encap-ip="10.10.161.101", ovn-encap-type=geneve, ovn-openflow-probe-interval="60", ovn-remote="tcp:10.10.160.115:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="ff44434d-03f6-4cd2-9d57-94dfa476ca32"}
iface_types         : [bareudp, erspan, geneve, gre, gtpu, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
manager_options     : [fb560279-de47-49d8-9883-bce497d67e1e]
next_cfg            : 69
other_config        : {hw-offload="true"}
ovs_version         : "2.15.1"
ssl                 : []
statistics          : {}
system_type         : rhel
system_version      : "8.4"
[root@computeovshwoffload-0 ~]# ovs-vsctl list Bridge
_uuid               : 1d9b091c-c8e8-4d50-8ee5-07b51d9f5b53
auto_attach         : []
controller          : []
datapath_id         : "00003cfdfe33a5c0"
datapath_type       : ""
datapath_version    : "<unknown>"
external_ids        : {}
fail_mode           : standalone
flood_vlans         : []
flow_tables         : {}
ipfix               : []
mcast_snooping_enable: false
mirrors             : []
name                : br-link1
netflow             : []
other_config        : {}
ports               : [74c7db58-b96b-4ae1-997a-4b4fe129eca4, d07bb466-f10a-4d4f-a30d-4bbc47d65f6b]
protocols           : []
rstp_enable         : false
rstp_status         : {}
sflow               : []
status              : {}
stp_enable          : false

_uuid               : 087d9b34-53da-42b1-9052-50dc7819cf0b
auto_attach         : []
controller          : []
datapath_id         : "0000ea39dbd27b00"
datapath_type       : system
datapath_version    : "<unknown>"
external_ids        : {ct-zone-130d9819-2725-4acb-bd71-08bcc4627bb5_dnat="3", ct-zone-130d9819-2725-4acb-bd71-08bcc4627bb5_snat="2", ct-zone-21fe1f0f-c5c4-4fec-96b6-f75f0d6d84ca="1", ct-zone-3f31490f-4a85-40b4-a55c-e404afe37a14_dnat="9", ct-zone-3f31490f-4a85-40b4-a55c-e404afe37a14_snat="8", ct-zone-5816989a-7d23-4baf-bcad-0f19d7556a0b="4", ct-zone-760c2c71-7f41-487b-ae42-97c5a8e68dbf="5", ct-zone-bacb5372-84c4-442d-8af9-6fe912043908_dnat="6", ct-zone-bacb5372-84c4-442d-8af9-6fe912043908_snat="7", ct-zone-provnet-9944dd33-b0ff-41d3-b1dc-503febf31976="10", ovn-nb-cfg="332"}
fail_mode           : secure
flood_vlans         : []
flow_tables         : {}
ipfix               : []
mcast_snooping_enable: false
mirrors             : []
name                : br-int
netflow             : []
other_config        : {disable-in-band="true", hwaddr="ea:39:db:d2:7b:00"}
ports               : [1fd0ad86-4fc5-4d7f-ba75-a7d83b5ec2cc, 47b547c4-e7ac-40bf-b0c5-3850cbfd3c43, 4f0ab7f9-0757-4bb1-919d-a447eccf03ae, 6e22db8e-de6f-448a-800b-27b7cac5cb9d, cf999796-32d0-4758-b5a7-92c1a9d342ae, d01b3af0-e70a-4c5e-933f-e06e1877a053, dfa292bb-6a5b-4f2a-af64-1b9b22d970a2, f0a86926-91bf-48a6-89ac-7a2fc84ed099]
protocols           : []
rstp_enable         : false
rstp_status         : {}
sflow               : []
status              : {}
stp_enable          : false

_uuid               : 8d8e1d31-df9a-4c03-9876-14de0828c795
auto_attach         : []
controller          : []
datapath_id         : "0000043f72b8bb5e"
datapath_type       : ""
datapath_version    : "<unknown>"
external_ids        : {}
fail_mode           : standalone
flood_vlans         : []
flow_tables         : {}
ipfix               : []
mcast_snooping_enable: false
mirrors             : []
name                : br-link0
netflow             : []
other_config        : {}
ports               : [46682d89-4531-4101-be78-00e743e37230, 65e8a46a-d494-486a-b1e1-a95613b46d11]
protocols           : []
rstp_enable         : false
rstp_status         : {}
sflow               : []
status              : {}
stp_enable          : false

[root@computeovshwoffload-0 ~]# ovs-vsctl list Interface
_uuid               : f5e85f91-dd6b-4889-90ab-13db7dc572e1
admin_state         : up
bfd                 : {enable="true"}
bfd_status          : {diagnostic="Control Detection Time Expired", flap_count="2", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : []
error               : []
external_ids        : {}
ifindex             : 68
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 0
link_speed          : []
link_state          : up
lldp                : {}
mac                 : []
mac_in_use          : "7a:1d:c7:73:af:9c"
mtu                 : []
mtu_request         : []
name                : ovn-ccf1be-0
ofport              : 4
ofport_request      : []
options             : {csum="true", key=flow, remote_ip="10.10.161.125"}
other_config        : {}
statistics          : {rx_bytes=445455, rx_packets=6513, tx_bytes=8671086, tx_packets=131274}
status              : {tunnel_egress_iface=vlan161, tunnel_egress_iface_carrier=up}
type                : geneve

_uuid               : 53cfde91-9e82-476c-8179-159e39ed810f
admin_state         : up
bfd                 : {}
bfd_status          : {}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : []
error               : []
external_ids        : {}
ifindex             : 62
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 1
link_speed          : []
link_state          : up
lldp                : {}
mac                 : []
mac_in_use          : "3c:fd:fe:33:a5:c0"
mtu                 : 9000
mtu_request         : []
name                : br-link1
ofport              : 65534
ofport_request      : []
options             : {}
other_config        : {}
statistics          : {collisions=0, rx_bytes=735902, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=15801, tx_bytes=3386, tx_dropped=0, tx_errors=0, tx_packets=47}
status              : {driver_name=openvswitch}
type                : internal

_uuid               : ebc8ded2-c187-4287-94cc-451519dd07bf
admin_state         : up
bfd                 : {}
bfd_status          : {}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : full
error               : []
external_ids        : {iface-id="5816989a-7d23-4baf-bcad-0f19d7556a0b"}
ifindex             : 87
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 0
link_speed          : 10000000000
link_state          : up
lldp                : {}
mac                 : []
mac_in_use          : "26:24:bc:47:5f:ca"
mtu                 : 1500
mtu_request         : []
name                : tap130d9819-20
ofport              : 28
ofport_request      : []
options             : {}
other_config        : {}
statistics          : {collisions=0, rx_bytes=23784, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=207, tx_bytes=52633, tx_dropped=0, tx_errors=0, tx_packets=740}
status              : {driver_name=veth, driver_version="1.0", firmware_version=""}
type                : ""

_uuid               : 5ab9c9e3-0efa-4e59-a404-27bf20fdfeb2
admin_state         : up
bfd                 : {enable="true"}
bfd_status          : {diagnostic="Control Detection Time Expired", flap_count="2", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : []
error               : []
external_ids        : {}
ifindex             : 68
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 0
link_speed          : []
link_state          : up
lldp                : {}
mac                 : []
mac_in_use          : "b6:2a:c4:15:7c:9d"
mtu                 : []
mtu_request         : []
name                : ovn-1c2cb7-0
ofport              : 3
ofport_request      : []
options             : {csum="true", key=flow, remote_ip="10.10.161.135"}
other_config        : {}
statistics          : {rx_bytes=409134, rx_packets=6199, tx_bytes=8647254, tx_packets=131019}
status              : {tunnel_egress_iface=vlan161, tunnel_egress_iface_carrier=up}
type                : geneve

_uuid               : 869157d3-e7c3-4418-8c8d-5f65f9ca83d5
admin_state         : up
bfd                 : {}
bfd_status          : {}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : full
error               : []
external_ids        : {}
ifindex             : 18
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 0
link_speed          : 10000000000
link_state          : up
lldp                : {}
mac                 : []
mac_in_use          : "04:3f:72:b8:bb:5e"
mtu                 : 9000
mtu_request         : []
name                : mx-bond
ofport              : 1
ofport_request      : []
options             : {}
other_config        : {}
statistics          : {collisions=0, rx_bytes=80336928, rx_crc_err=0, rx_dropped=4497, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=1185360, tx_bytes=21290897, tx_dropped=0, tx_errors=0, tx_packets=311882}
status              : {driver_name=bonding, driver_version="4.18.0-305.3.1.el8_4.x86_64", firmware_version="2"}
type                : ""

_uuid               : 012981dc-10fd-4d48-ae20-24c22c7ba9e2
admin_state         : down
bfd                 : {}
bfd_status          : {}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : []
error               : []
external_ids        : {}
ifindex             : 67
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 0
link_speed          : []
link_state          : down
lldp                : {}
mac                 : []
mac_in_use          : "ea:39:db:d2:7b:00"
mtu                 : 1500
mtu_request         : []
name                : br-int
ofport              : 65534
ofport_request      : []
options             : {}
other_config        : {}
statistics          : {collisions=0, rx_bytes=0, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=0, tx_bytes=0, tx_dropped=0, tx_errors=0, tx_packets=0}
status              : {driver_name=openvswitch}
type                : internal

_uuid               : 3526b34c-a22a-4c93-89ba-e5c4f09002e3
admin_state         : up
bfd                 : {}
bfd_status          : {}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : []
error               : []
external_ids        : {}
ifindex             : 68
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 0
link_speed          : []
link_state          : up
lldp                : {}
mac                 : []
mac_in_use          : "7e:ae:de:54:55:90"
mtu                 : []
mtu_request         : []
name                : ovn-3ea0c7-0
ofport              : 1
ofport_request      : []
options             : {csum="true", key=flow, remote_ip="10.10.161.129"}
other_config        : {}
statistics          : {rx_bytes=5140, rx_packets=68, tx_bytes=4912, tx_packets=94}
status              : {tunnel_egress_iface=vlan161, tunnel_egress_iface_carrier=up}
type                : geneve

_uuid               : fc67aa9c-f166-4bbb-937d-d2b1889e8520
admin_state         : down
bfd                 : {}
bfd_status          : {}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : []
error               : []
external_ids        : {}
ifindex             : 61
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 2
link_speed          : []
link_state          : down
lldp                : {}
mac                 : []
mac_in_use          : "04:3f:72:b8:bb:5e"
mtu                 : 9000
mtu_request         : []
name                : br-link0
ofport              : 65534
ofport_request      : []
options             : {}
other_config        : {}
statistics          : {collisions=0, rx_bytes=0, rx_crc_err=0, rx_dropped=15847, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=0, tx_bytes=0, tx_dropped=0, tx_errors=0, tx_packets=0}
status              : {driver_name=openvswitch}
type                : internal

_uuid               : 276bfcb5-1a91-404a-b5aa-7efe4c58926a
admin_state         : up
bfd                 : {}
bfd_status          : {}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : []
error               : []
external_ids        : {attached-mac="fa:16:3e:ae:f2:b1", iface-id="760c2c71-7f41-487b-ae42-97c5a8e68dbf", iface-status=active, ovn-installed="true", vm-uuid="fd5d78d2-dce2-4790-b446-f5a0a4e34249"}
ifindex             : 30
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 0
link_speed          : []
link_state          : up
lldp                : {}
mac                 : []
mac_in_use          : "da:cc:cd:60:8a:e9"
mtu                 : 8942
mtu_request         : []
name                : enp4s0f0_0
ofport              : 29
ofport_request      : []
options             : {}
other_config        : {}
statistics          : {collisions=0, rx_bytes=133508, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=1679, tx_bytes=69898, tx_dropped=0, tx_errors=0, tx_packets=889}
status              : {driver_name=mlx5e_rep, driver_version="4.18.0-305.3.1.el8_4.x86_64", firmware_version="16.27.6120 (DEL0000000015)"}
type                : ""

_uuid               : 38dd099e-9c69-4910-9add-86636404c528
admin_state         : up
bfd                 : {}
bfd_status          : {}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : full
error               : []
external_ids        : {}
ifindex             : 7
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 0
link_speed          : 10000000000
link_state          : up
lldp                : {}
mac                 : []
mac_in_use          : "3c:fd:fe:33:a5:c2"
mtu                 : 9000
mtu_request         : []
name                : enp6s0f1
ofport              : 2
ofport_request      : []
options             : {}
other_config        : {}
statistics          : {collisions=0, rx_bytes=43101385, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=718156, tx_bytes=12603, tx_dropped=0, tx_errors=0, tx_packets=106}
status              : {driver_name=i40e, driver_version="4.18.0-305.3.1.el8_4.x86_64", firmware_version="5.40 0x80002d36 18.0.17"}
type                : ""

_uuid               : f34e95c6-4d67-4a91-ae19-e4945249fd24
admin_state         : up
bfd                 : {}
bfd_status          : {}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : []
error               : []
external_ids        : {attached-mac="fa:16:3e:35:1c:81", iface-id="21fe1f0f-c5c4-4fec-96b6-f75f0d6d84ca", iface-status=active, ovn-installed="true", vm-uuid="c0d00331-9ee5-4242-8494-f6ae60442d16"}
ifindex             : 38
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 0
link_speed          : []
link_state          : up
lldp                : {}
mac                 : []
mac_in_use          : "0a:19:98:0d:ff:c6"
mtu                 : 8942
mtu_request         : []
name                : enp4s0f0_8
ofport              : 27
ofport_request      : []
options             : {}
other_config        : {}
statistics          : {collisions=0, rx_bytes=144624, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=1871, tx_bytes=80485, tx_dropped=0, tx_errors=0, tx_packets=1028}
status              : {driver_name=mlx5e_rep, driver_version="4.18.0-305.3.1.el8_4.x86_64", firmware_version="16.27.6120 (DEL0000000015)"}
type                : ""

_uuid               : 77fa407e-1b70-4f9f-8b7c-a0fbe207a433
admin_state         : up
bfd                 : {enable="true"}
bfd_status          : {diagnostic="Control Detection Time Expired", flap_count="2", forwarding="false", remote_diagnostic="No Diagnostic", remote_state=down, state=down}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : []
error               : []
external_ids        : {}
ifindex             : 68
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 0
link_speed          : []
link_state          : up
lldp                : {}
mac                 : []
mac_in_use          : "02:2e:4b:89:e5:50"
mtu                 : []
mtu_request         : []
name                : ovn-f5aa6c-0
ofport              : 2
ofport_request      : []
options             : {csum="true", key=flow, remote_ip="10.10.161.127"}
other_config        : {}
statistics          : {rx_bytes=499044, rx_packets=6693, tx_bytes=8728759, tx_packets=131496}
status              : {tunnel_egress_iface=vlan161, tunnel_egress_iface_carrier=up}
type                : geneve

_uuid               : 32e14c57-950b-431e-94ba-03be0cd3e80b
admin_state         : up
bfd                 : {}
bfd_status          : {}
cfm_fault           : []
cfm_fault_status    : []
cfm_flap_count      : []
cfm_health          : []
cfm_mpid            : []
cfm_remote_mpids    : []
cfm_remote_opstate  : []
duplex              : full
error               : []
external_ids        : {}
ifindex             : 6
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current        : []
link_resets         : 0
link_speed          : 10000000000
link_state          : up
lldp                : {}
mac                 : []
mac_in_use          : "3c:fd:fe:33:a5:c0"
mtu                 : 9000
mtu_request         : []
name                : enp6s0f0
ofport              : 1
ofport_request      : []
options             : {}
other_config        : {}
statistics          : {collisions=0, rx_bytes=43102825, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=718180, tx_bytes=15425, tx_dropped=0, tx_errors=0, tx_packets=147}
status              : {driver_name=i40e, driver_version="4.18.0-305.3.1.el8_4.x86_64", firmware_version="5.40 0x80002d36 18.0.17"}
type                : ""
[root@computeovshwoffload-0 ~]#

Comment 17 Aaron Conole 2021-06-23 14:20:30 UTC
Can you make sure that the system has the tc utilities installed? For example, the sosreport doesn't contain any of the 'tc' commands I would expect ('tc -s filter show {devname} ingress', etc.) - maybe you can capture them.

I'm not sure about that errno - ENOENT. I think it usually implies a generic error installing the flow along the hw datapath. If too many flows got offloaded, I would expect ENOSPC, and if the flow wasn't supported, something like EOPNOTSUPP or similar.

Maybe mleitner can see something I don't.
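The capture I'm after would look something like this sketch, run on the affected compute (the representor and uplink names are taken from the ovs-vsctl output above; adjust as needed):

```shell
# Dump the ingress filters with stats for each VF representor and the
# bond uplink; failures installing flows in hardware show up here.
for dev in enp4s0f0_0 enp4s0f0_8 mx-bond; do
    echo "== $dev =="
    tc -s filter show dev "$dev" ingress
done
```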

Comment 20 Marcelo Ricardo Leitner 2021-07-22 14:41:55 UTC
(In reply to Aaron Conole from comment #17)
> I'm not sure about that errno - ENOENT - usually I think it implies a
> generic error installing the
> flow along the hw datapath.  I would expect if too many flows got offloaded,
> we would see ENOSPC, and
> if the flow wasn't supported we would see something like EOPNOTSUPP or
> similar.
> 
> Maybe mleitner can see something I don't.

While we don't have https://bugzilla.redhat.com/show_bug.cgi?id=1916418, we can use a perf probe on
https://github.com/torvalds/linux/commit/7e3ce05e7f650371061d0b9eec1e1cf74ed6fca0
to find exactly where and why this error was returned.

Btw, it is interesting how the 1st packet gets through and then the others don't. That pretty much means the upcall handles it and updates the datapath, and then things break somehow. But if the filter failed to be added in tc, it should have been added in dp:ovs. Weird.
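A minimal sketch of the perf-probe approach (the probed symbol is an assumption picked for illustration; probe whatever function the commit above touches in the running kernel):

```shell
# Probe the return value of the mlx5e flower-offload entry point,
# reproduce the failing ping while recording, then read back retvals.
perf probe --add 'mlx5e_configure_flower%return ret=$retval'
perf record -e probe:mlx5e_configure_flower__return -aR -- sleep 30 &
ping -c 5 10.35.141.162      # reproduce while recording
wait
perf script                  # ret=-2 would be the ENOENT being chased
perf probe --del probe:mlx5e_configure_flower__return
```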

Comment 23 Yariv 2021-08-19 05:46:38 UTC
Bug not reproduced with the previous puddle.

The same test is failing with this puddle:
RHOS-16.2-RHEL-8-20210811.n.1

Checking if the issue persists; will update.

Comment 24 Yariv 2021-08-19 09:23:00 UTC
(In reply to Yariv from comment #23)
> Bug not reproduced with the previous puddle.
> 
> Same test is failing with this puddle
> RHOS-16.2-RHEL-8-20210811.n.1
> 
> Checking if the issue persists; will update.

The problem still persists with RHOS-16.2-RHEL-8-20210811.n.1:

(overcloud) [stack@undercloud-0 ~]$ openstack server list --all --host computeovshwoffload-0.redhat.local
+--------------------------------------+------------------------------------------+--------+---------------------------------------------------------------------------------------------+---------------------------------------+--------+
| ID                                   | Name                                     | Status | Networks                                                                                    | Image                                 | Flavor |
+--------------------------------------+------------------------------------------+--------+---------------------------------------------------------------------------------------------+---------------------------------------+--------+
| 02f9e32d-7078-4f7b-869f-90b75f70dc56 | tempest-TestNfvOffload-server-1658772737 | ACTIVE | mellanox-geneve-provider=20.20.220.192, 10.35.141.167; mellanox-vlan-provider=30.30.220.182 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
| 8935550e-35d2-4177-a845-641e2a305c6e | tempest-TestNfvOffload-server-530477859  | ACTIVE | mellanox-geneve-provider=20.20.220.122, 10.35.141.172; mellanox-vlan-provider=30.30.220.125 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
+--------------------------------------+------------------------------------------+--------+---------------------------------------------------------------------------------------------+---------------------------------------+--------+
(overcloud) [stack@undercloud-0 ~]$ ping 10.35.141.167
PING 10.35.141.167 (10.35.141.167) 56(84) bytes of data.
64 bytes from 10.35.141.167: icmp_seq=1 ttl=61 time=17.9 ms
^C
--- 10.35.141.167 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 17.891/17.891/17.891/0.000 ms
(overcloud) [stack@undercloud-0 ~]$ ping 10.35.141.172
PING 10.35.141.172 (10.35.141.172) 56(84) bytes of data.

@mleitner, would you like to look at the machines?

Comment 34 Marcelo Ricardo Leitner 2021-08-24 15:03:16 UTC
(In reply to Yariv from comment #24)
> @mleitner, would you like to look at the machines?

Not really. :-)
Haresh can debug OSP better than I do. I'm here if anything, though.

Comment 35 Marcelo Ricardo Leitner 2021-08-24 21:36:54 UTC
Restoring need-info that I cleared by mistake.

Comment 41 Miguel Angel Nieto 2021-09-24 15:00:58 UTC
I have been debugging the issue together with Haresh and I think we have found the root cause.

The problem happens when there are 2 VMs on the same compute connected to the same provider network. There is an issue with the flow programming: packets go to the wrong VM, so the ping fails.
It is not related to floating IPs; ping fails between two IPs on the same provider network. If one of those IPs is used as a floating IP, then the floating IP fails too.
If there is a single VM per compute, there is no issue.

Here I provide an example:

I create 4 VMs (2 on each compute). The VMs do not have floating IPs; I will use the console:
(venv) (overcloud) [stack@undercloud-0 ~]$ openstack server list --a
+--------------------------------------+------------------------------------------+--------+------------------------------------------------------------------------------+---------------------------------------+--------+
| ID                                   | Name                                     | Status | Networks                                                                     | Image                                 | Flavor |
+--------------------------------------+------------------------------------------+--------+------------------------------------------------------------------------------+---------------------------------------+--------+
| 048a6bcd-5ada-4127-b95a-9962cf31e80f | tempest-TestNfvOffload-server-1590230603 | ACTIVE | mellanox-geneve-provider=20.20.220.112; mellanox-vlan-provider=30.30.220.147 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
| f71166db-2a49-4d73-aa73-4573b3bf8db5 | tempest-TestNfvOffload-server-741569861  | ACTIVE | mellanox-geneve-provider=20.20.220.106; mellanox-vlan-provider=30.30.220.132 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
| c71d4bdf-4aa7-4b69-8405-6633c5864a7d | tempest-TestNfvOffload-server-922847220  | ACTIVE | mellanox-geneve-provider=20.20.220.196; mellanox-vlan-provider=30.30.220.172 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
| 1ac387cf-31cf-4cf6-97b4-acb3034cca31 | tempest-TestNfvOffload-server-1803927283 | ACTIVE | mellanox-geneve-provider=20.20.220.188; mellanox-vlan-provider=30.30.220.175 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
+--------------------------------------+------------------------------------------+--------+------------------------------------------------------------------------------+---------------------------------------+

These are the ports and MACs:
(venv) (overcloud) [stack@undercloud-0 ~]$ openstack port list | egrep "220.112|220.106|220.196|220.188|220.147|220.132|220.172|220.175"                                                                          
| 54397464-6866-40dd-98cd-8c3e1f48e018 | tempest-port-smoke-1207250115 | fa:16:3e:40:bf:a6 | ip_address='30.30.220.132', subnet_id='82969e14-51b1-4993-90d2-0269dd0bdf8d' | ACTIVE |                              
| 5f893749-f4ba-4230-b46a-9a1491b0bc6c | tempest-port-smoke-331751474  | fa:16:3e:05:48:82 | ip_address='20.20.220.106', subnet_id='8aa962a5-354f-4406-8fa1-baa73ff14f2b' | ACTIVE |                              
| 7052864e-8624-493d-b23c-6b8ecbf54d8f | tempest-port-smoke-1880359353 | fa:16:3e:2d:1c:71 | ip_address='30.30.220.175', subnet_id='82969e14-51b1-4993-90d2-0269dd0bdf8d' | ACTIVE |                              
| 883f8950-5930-4549-a5e8-c6a8d44cf9c8 | tempest-port-smoke-1206809988 | fa:16:3e:e7:e1:1e | ip_address='20.20.220.112', subnet_id='8aa962a5-354f-4406-8fa1-baa73ff14f2b' | ACTIVE |                              
| 92c7fd6f-929c-4f2a-b378-6ac69946ffc8 | tempest-port-smoke-441515917  | fa:16:3e:b6:fa:2c | ip_address='20.20.220.188', subnet_id='8aa962a5-354f-4406-8fa1-baa73ff14f2b' | ACTIVE |                              
| a31551fc-bc18-4c11-971f-9acc6e12d51c | tempest-port-smoke-1291037828 | fa:16:3e:4c:0c:21 | ip_address='30.30.220.147', subnet_id='82969e14-51b1-4993-90d2-0269dd0bdf8d' | ACTIVE |                              
| ec05e264-e102-4c00-a04c-9873d1c7c9b9 | tempest-port-smoke-612065238  | fa:16:3e:52:ed:2a | ip_address='30.30.220.172', subnet_id='82969e14-51b1-4993-90d2-0269dd0bdf8d' | ACTIVE |                              
| ffbae10d-2440-4af7-8a29-7da30d537f59 | tempest-port-smoke-1068018430 | fa:16:3e:37:96:8b | ip_address='20.20.220.196', subnet_id='8aa962a5-354f-4406-8fa1-baa73ff14f2b' | ACTIVE | 

These are the representor ports used:
hypervisor 192.0.50.18
29: enp4s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000
    link/ether 98:03:9b:9c:50:58 brd ff:ff:ff:ff:ff:ff
    vf 8     link/ether fa:16:3e:37:96:8b brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 9     link/ether fa:16:3e:05:48:82 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
50: enp4s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000
    link/ether 98:03:9b:9c:50:58 brd ff:ff:ff:ff:ff:ff permaddr 98:03:9b:9c:50:59
    vf 8     link/ether fa:16:3e:52:ed:2a brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 9     link/ether fa:16:3e:40:bf:a6 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
hypervisor 192.0.50.11
29: enp4s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000
    link/ether ec:0d:9a:7d:7d:32 brd ff:ff:ff:ff:ff:ff
    vf 8     link/ether fa:16:3e:b6:fa:2c brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 9     link/ether fa:16:3e:e7:e1:1e brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
50: enp4s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000
    link/ether ec:0d:9a:7d:7d:32 brd ff:ff:ff:ff:ff:ff permaddr ec:0d:9a:7d:7d:33
    vf 8     link/ether fa:16:3e:2d:1c:71 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 9     link/ether fa:16:3e:4c:0c:21 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off


Test: ping from 20.20.220.188 to 20.20.220.196 fails. If we run tcpdump on 20.20.220.106 while pinging, we can see the packets arriving there, but they do not go to 20.20.220.196, so the ping fails. Below are the flows, and we can see that the traffic is being delivered to enp4s0f0_9 instead of enp4s0f0_8.

flows:
Hypervisor 192.0.50.11

ufid:bb8da3b2-727e-415b-b55c-fb1f19310940, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_8),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:b6:fa:2c,dst=fa:16:3e:37:96:8b),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=20.20.220.192/255.255.255.224,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:22, bytes:3520, used:0.360s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x3,dst=10.10.121.176,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x20003}),flags(key))),genev_sys_6081


Hypervisor 192.0.50.18

ufid:734179ef-b530-42f8-aed6-e41ba813d5e7, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.121.146,dst=10.10.121.176,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x20004/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:b6:fa:2c,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:68, bytes:6664, used:0.290s, offloaded:yes, dp:tc, actions:enp4s0f0_9

ufid:6ecf939a-0d5c-4c69-a396-5bbc424e5cfb, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.121.146,dst=10.10.121.176,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x20004/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:b6:fa:2c,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0806),arp(sip=0.0.0.0/0.0.0.0,tip=0.0.0.0/0.0.0.0,op=0/0,sha=00:00:00:00:00:00/00:00:00:00:00:00,tha=00:00:00:00:00:00/00:00:00:00:00:00), packets:0, bytes:0, used:1.310s, offloaded:yes, dp:tc, actions:enp4s0f0_9
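
A rough mental model of what the offloaded rx path appears to be doing here (hypothetical, simplified Python; the rule contents are loosely based on the dump above, and nothing below is driver or OVS code): each rx rule should steer by the geneve TLV option, but if the hardware matcher drops the option from the key, every tunnel packet hits the first rule installed and lands on that rule's representor:

```python
# Hedged model of the rx classification bug (hypothetical, simplified).
# Each offloaded rule matches the outer tunnel plus a geneve TLV option
# and outputs to one representor port.
rules = [
    {"tun_id": 0x3, "opt": 0x20004, "out": "enp4s0f0_9"},
    {"tun_id": 0x3, "opt": 0x20003, "out": "enp4s0f0_8"},
]

def classify(pkt, honor_opt):
    """Return the representor a packet is delivered to. When honor_opt
    is False (the bug), the geneve option is ignored and the first
    tunnel match wins regardless of the intended destination."""
    for r in rules:
        if r["tun_id"] == pkt["tun_id"] and (not honor_opt or r["opt"] == pkt["opt"]):
            return r["out"]
    return None

pkt_for_vm8 = {"tun_id": 0x3, "opt": 0x20003}
print(classify(pkt_for_vm8, honor_opt=True))   # enp4s0f0_8 (correct)
print(classify(pkt_for_vm8, honor_opt=False))  # enp4s0f0_9 (the bug seen above)
```

This matches the observed symptom: both geneve flows on hypervisor 192.0.50.18 end in `actions:enp4s0f0_9`, so the VM behind enp4s0f0_8 never receives its traffic.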

Comment 55 maord 2021-10-24 08:48:47 UTC
The issue may be related to this fix, which solves a problem with HW offload of different geneve tunnels that have the same tunnel src/dst IP, ID and port but different geneve options.

commit 929a2faddd55290fbb0b73f453b200ed1b2b2947
Author: Dima Chumak <dchumak>
Date:   Thu Feb 11 09:36:33 2021 +0200

    net/mlx5e: Consider geneve_opts for encap contexts
    
    Current algorithm for encap keys is legacy from initial vxlan
    implementation and doesn't take into account all possible fields of a
    tunnel. For example, for a Geneve tunnel, which may have additional TLV
    options, they are ignored when comparing encap keys and a rule can be
    attached to an incorrect encap entry.
    
    Fix that by introducing encap_info_equal() operation in
    struct mlx5e_tc_tunnel. Geneve tunnel type uses custom implementation,
    which extends generic algorithm and considers options if they are set.
    
    Fixes: 7f1a546e3222 ("net/mlx5e: Consider tunnel type for encap contexts")
    Signed-off-by: Dima Chumak <dchumak>
    Reviewed-by: Vlad Buslov <vladbu>
    Signed-off-by: Saeed Mahameed <saeedm>
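
To illustrate the conflation the commit message describes, here is a hedged, simplified model (plain Python, not driver code; the cache and field values are hypothetical, loosely based on the dumps above): when the encap-entry key omits the geneve options, two tunnels that differ only in those options collide on one cache entry, so the second rule silently reuses the first tunnel's encap header.

```python
# Hedged model (hypothetical, simplified) of the encap-context bug that
# 929a2faddd55 fixes: encap entries are cached by a key built from the
# outer tunnel fields only, so tunnels differing only in geneve TLV
# options collide on the same entry.
cache = {}

def get_encap(tun, key_includes_opts):
    key = (tun["dst"], tun["id"], tun["port"])
    if key_includes_opts:  # the fix: options become part of the key
        key += (tun["opts"],)
    return cache.setdefault(key, tun)  # first tunnel wins for its key

a = {"dst": "10.10.121.176", "id": 0x3, "port": 6081, "opts": 0x20003}
b = {"dst": "10.10.121.176", "id": 0x3, "port": 6081, "opts": 0x20004}

cache.clear()
broken = (get_encap(a, False), get_encap(b, False))
print(broken[0] is broken[1])  # True: b got a's encap header (the bug)

cache.clear()
fixed = (get_encap(a, True), get_encap(b, True))
print(fixed[0] is fixed[1])    # False: distinct encap contexts
```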

Comment 56 Haresh Khandelwal 2021-10-25 12:44:37 UTC
Thanks Maor.
Hi Amir, is it possible to get a 4.18 test kernel with the fix? We can try it out and, if it fixes the issue, request a backport.

Thanks

Comment 57 Amir Tzin (Mellanox) 2021-10-27 08:55:06 UTC
(In reply to Haresh Khandelwal from comment #56)
> Thanks Maor,
> Hi Amir, Is it possible to get 4.18 kernel patch with fix? We can try out
> and if it fixes the issue, can request for backport.
> 
> Thanks

Hi, 

we already have this fix in kernel-4.18.0-324.el8 and above, from the RHEL 8.5 branch.
Is that enough for your testing?

Comment 58 Haresh Khandelwal 2021-10-27 11:07:39 UTC
Thanks Amir, 

So, shall I assume that commit 929a2faddd55290fbb0b73f453b200ed1b2b2947 would fix this issue?
RHOSP 16.2.x will ship with RHEL 8.4 throughout its life. The latest compose has kernel version 4.18.0-305.19.1.el8_4.
I am not aware of how RHEL picks kernel versions or whether the next 8.4.z would include the fix. If not, then we need to backport it.

Marcelo, can you help here? 

Thanks

Comment 59 Marcelo Ricardo Leitner 2021-10-27 15:43:27 UTC
8.4.z kernels will always be 4.18.0-305.*.el8_4. With that, yes, we would need to backport the fix to 8.4.z so that RHOSP can have it.

In theory we would need two tests here:
- one with y-stream/8.5 kernel, to be sure the issue is fixed in y-stream
  we don't want regressions for customers updating from 8.4.z to 8.5 or 8.6 later on.

- one with a test kernel on 8.4.z, to be sure that the fix is complete and no dependencies were missed
  we don't want to backport something that later on we find out "ooops, missed this other commit".

We can skip one of them if there's enough confidence, though. 
I think the patch is spot on. If Nvidia agrees, we can proceed with just the 2nd test, with a test kernel for 8.4.z.

Comment 60 Haresh Khandelwal 2021-10-28 07:25:03 UTC
Hi Marcelo,

(In reply to Marcelo Ricardo Leitner from comment #59)
> 8.4.z kernels will always be 4.18.0-305.*.el8_4. With that, yes, we would
> need to backport the fix to 8.4.z so that RHOSP can have it.
Good, Thanks
> 
> In theory we would need two tests here:
> - one with y-stream/8.5 kernel, to be sure the issue is fixed in y-stream
>   we don't want regressions for customers updating from 8.4.z to 8.5 or 8.6
> later on.
RHOSP has no plan to use RHEL 8.5 ever, RHOSP17 will be based on RHEL 9.
> 
> - one with a test kernel on 8.4.z, to be sure that the fix is complete and
> no dependencies were missed
>   we don't want to backport something that later on we find out "ooops,
> missed this other commit".
> 
> We can skip one of them if there's enough confidence, though. 
> I think the patch is spot on. If Nvidia agrees, we can proceed with just the
> 2nd test, with a test kernel for 8.4.z.
Yes, this BZ was found in our CI, so it would be easy to validate quickly once we have a fix.

Thanks

Comment 61 Marcelo Ricardo Leitner 2021-10-28 18:51:24 UTC
(In reply to maord from comment #55)
> The issue can be related to this fix.
> This fix solves an issue with HW offload of different geneve tunnel with the
> same tunnel src/dst ip, id and port but different geneve options.
> 
> commit 929a2faddd55290fbb0b73f453b200ed1b2b2947
> Author: Dima Chumak <dchumak>
> Date:   Thu Feb 11 09:36:33 2021 +0200
> 
>     net/mlx5e: Consider geneve_opts for encap contexts

For the record, this bz was originally backported via https://bugzilla.redhat.com/show_bug.cgi?id=1915308
and that's where the 8.4.z will need to be requested, once the test confirms it.

Comment 63 Karrar Fida 2021-10-29 16:19:43 UTC
@atzin any news on the test NVIDIA build?

Comment 64 Amir Tzin (Mellanox) 2021-10-31 11:05:56 UTC
(In reply to Karrar Fida from comment #63)
> @atzin any news on the test NVIDIA build?

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=40713341
test kernel of RHEL-8.4 with 
929a2faddd55 ("net/mlx5e: Consider geneve_opts for encap contexts")

Comment 65 Elvira 2021-11-02 14:22:28 UTC
We think this is test-only for the NFV team. Please switch it to our DFG if you think we are wrong!

Comment 66 Miguel Angel Nieto 2021-11-02 14:33:21 UTC
Hi, I have tested with that patch and the problem is not solved.

I have installed the patch:
(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin.50.20 "uname -a"
Warning: Permanently added '192.0.50.20' (ECDSA) to the list of known hosts.
Linux computeovshwoffload-0 4.18.0-305.26.1.el8_4.UNSUPPORTED_1966157.x86_64 #1 SMP Sun Oct 31 06:28:11 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin.50.22 "uname -a"
Warning: Permanently added '192.0.50.22' (ECDSA) to the list of known hosts.
Linux computeovshwoffload-1 4.18.0-305.26.1.el8_4.UNSUPPORTED_1966157.x86_64 #1 SMP Sun Oct 31 06:28:11 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

I have checked, and I still have problems with the ping:
(overcloud) [stack@undercloud-0 ~]$ ping -c 1 -w 1 10.35.141.53;sleep 12; ping -c 1 -w 1 10.35.141.53
PING 10.35.141.53 (10.35.141.53) 56(84) bytes of data.

--- 10.35.141.53 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

PING 10.35.141.53 (10.35.141.53) 56(84) bytes of data.
64 bytes from 10.35.141.53: icmp_seq=1 ttl=61 time=10.0 ms

--- 10.35.141.53 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 10.013/10.013/10.013/0.000 ms

Comment 69 Marcelo Ricardo Leitner 2021-11-03 14:38:23 UTC
Hi Miguel,

If the test had been positive, we would have skipped this one, but since it wasn't:
Please try the latest 8.5 kernel as well. Maybe something went sour in the backport to 8.4.z, e.g. a missed patch dependency.
You can download it from here: http://download.eng.bos.redhat.com/brewroot/packages/kernel/4.18.0/348.4.el8/

Thanks.

Comment 70 Miguel Angel Nieto 2021-11-03 15:46:08 UTC
Hi

I tried with the previous patch and it does not solve the problem either.

Apart from installing the RPMs and rebooting the computes, should I do anything else to ensure that the patch is installed properly? With uname -a I can see that I have the correct kernel version; is that enough?


(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin.50.20 "uname -a"
Warning: Permanently added '192.0.50.20' (ECDSA) to the list of known hosts.
Linux computeovshwoffload-0 4.18.0-348.4.el8.x86_64 #1 SMP Mon Oct 25 15:08:07 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin.50.22 "uname -a"
Warning: Permanently added '192.0.50.22' (ECDSA) to the list of known hosts.
Linux computeovshwoffload-1 4.18.0-348.4.el8.x86_64 #1 SMP Mon Oct 25 15:08:07 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux




(undercloud) [stack@undercloud-0 ~]$ ping -c 1 -w 1 10.35.141.51;sleep 12; ping -c 1 -w 1 10.35.141.51
PING 10.35.141.51 (10.35.141.51) 56(84) bytes of data.

--- 10.35.141.51 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

PING 10.35.141.51 (10.35.141.51) 56(84) bytes of data.
64 bytes from 10.35.141.51: icmp_seq=1 ttl=61 time=2.48 ms

--- 10.35.141.51 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.475/2.475/2.475/0.000 ms

Comment 71 Marcelo Ricardo Leitner 2021-11-03 17:06:57 UTC
(In reply to Miguel Angel Nieto from comment #70)
> I tried with previous patch and it is not solving the problem either. 

Thanks. That's a very important piece of information.

> 
> Apart from installing rpms and rebooting computes, should I do anything else
> to  ensure the that patch is installed properly? With  uname -a I can see
> that I have the correct kernel version, is it enough?

It is enough, yes.

Comment 72 Amir Tzin (Mellanox) 2021-11-04 13:26:53 UTC
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=40903423

test kernel of RHEL-8.4 with 
929a2faddd55 ("net/mlx5e: Consider geneve_opts for encap contexts")

I think that the build from comment 64 did not actually contain the fix, due to my mistake.

Comment 73 Ariel Levkovich 2021-11-08 15:12:35 UTC
Hi Folks,

The fix that Maor suggested above actually solves a problem on the encap side, where different encap headers with different geneve options were not respected. Therefore the effects of this bug may be observed only on the receiver side, which does the matching and classification on the geneve options. So a few questions:
1. Did you install and test the fix also on the client/traffic-sender side?
2. If what I mentioned is true, you should see this behavior also with HW offload turned off (as well as the same geneve options for all traffic in tcpdump). Can you please confirm that you verified it works without HW offload?
3. We need to understand whether this issue is already resolved upstream or whether it is something we should reproduce and debug properly in house. Can you please confirm for us?

Thanks,
Ariel

Comment 74 Miguel Angel Nieto 2021-11-08 16:26:36 UTC
Hi

I answer the questions:
1. Yes, I patched the overcloud image during deployment with the new kernel, so the kernel update was applied everywhere. The issue happens on any compute node that has 2 or more VMs attached to the same geneve network.
2. When I tested the kernel patch I only tested with offload enabled, but from previous tests I can confirm that the issue only happens with HW offload. There is no issue if offload is disabled.
3. I will try to get more information about this point

Regards
Miguel

Comment 75 Marcelo Ricardo Leitner 2021-11-08 16:52:34 UTC
(In reply to Miguel Angel Nieto from comment #74)
> 3. I will try to get more information about this point

You can use an ARK kernel for that, btw.
It's the kernel-* packages at  https://odcs.fedoraproject.org/composes/production/latest-Fedora-ELN/compose/BaseOS/x86_64/os/Packages/

They should be fresh enough for this test, and they install nicely on RHEL 8.

Comment 76 Miguel Angel Nieto 2021-11-09 02:02:35 UTC
Thanks for the repo. I have tried it today. I didn't have any issue updating the kernel packages, but the servers are not booting properly after the update; some services are not working and ssh is broken. I will need some more time to see what is happening.

Comment 78 Ariel Levkovich 2021-11-16 15:21:54 UTC
Hi Folks, any update on the upstream testing here?

Comment 79 Miguel Angel Nieto 2021-11-17 15:59:09 UTC
Sorry for the delay. I will test it between Thursday and Friday.

Comment 80 Miguel Angel Nieto 2021-11-19 23:51:23 UTC
Hi

I tested with the upstream kernel and did not see the issue; I think it is working properly.
Linux computeovshwoffload-0 5.15.0-60.eln113.x86_64 #1 SMP Mon Nov 1 16:50:20 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux


I pinged both VMs at the same time and collected the flows:

VMS:
(overcloud) [stack@undercloud-0 ~]$ openstack server list --a | egrep "141.58|141.55"
| e378517a-66de-4363-882c-6a7b11035f24 | tempest-TestNfvOffload-server-2033717592 | ACTIVE | mellanox-geneve-provider=20.20.220.180, 10.35.141.58; mellanox-vlan-provider=30.30.220.116 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |
| d075c648-197f-460a-a88f-5f19933447e3 | tempest-TestNfvOffload-server-1328589616 | ACTIVE | mellanox-geneve-provider=20.20.220.171, 10.35.141.55; mellanox-vlan-provider=30.30.220.162 | rhel-guest-image-7-6-210-x86-64-qcow2 |        |

PORTS
(overcloud) [stack@undercloud-0 ~]$ openstack port list | egrep "180|171"
| 89979eb3-b59d-4c6b-b81c-5264aecd60c8 | tempest-port-smoke-988384003  | fa:16:3e:03:8d:eb | ip_address='20.20.220.180', subnet_id='b25f99f4-5441-4df1-ab99-b3d1c5885042' | ACTIVE |
| 89d9df81-3bea-412d-a71b-663df1ed0ce7 | tempest-port-smoke-297925914  | fa:16:3e:de:26:4c | ip_address='20.20.220.171', subnet_id='b25f99f4-5441-4df1-ab99-b3d1c5885042' | ACTIVE |

VFS:
11: enp4s0f0: <BROADCAST,MULTICAST,PROMISC,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000
    link/ether 98:03:9b:9c:50:58 brd ff:ff:ff:ff:ff:ff
    vf 2     link/ether fa:16:3e:03:8d:eb brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 9     link/ether fa:16:3e:de:26:4c brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off

FLOWS
ufid:951f5f75-93a2-482b-a090-9d28a84cf28e, skb_priority(0/0),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_9),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:de:26:4c,dst=fa:16:3e:96:d9:20),eth_type(0x0800),ipv4(src=20.20.220.128/255.255.255.192,dst=10.35.0.0/255.255.128.0,proto=1,tos=0/0x3,ttl=64,frag=no),icmp(type=0/0,code=0/0), packets:372, bytes:59520, used:0.850s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x2,dst=10.10.121.103,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(csum|key))),set(eth(src=fa:16:3e:74:20:44,dst=9c:cc:83:58:1c:60)),set(ipv4(ttl=63)),genev_sys_6081
ufid:0951ac83-ff60-4464-a216-5f52fde3307f, skb_priority(0/0),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_9),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:de:26:4c,dst=fa:16:3e:96:d9:20),eth_type(0x0800),ipv4(src=20.20.220.128/255.255.255.192,dst=32.0.0.0/224.0.0.0,proto=17,tos=0/0x3,ttl=64,frag=no),udp(src=0/0,dst=0/0x800), packets:1, bytes:152, used:3.070s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x2,dst=10.10.121.103,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(csum|key))),set(eth(src=fa:16:3e:74:20:44,dst=9c:cc:83:58:1c:60)),set(ipv4(ttl=63)),genev_sys_6081
ufid:aa2a6075-74e8-4022-b3d6-9d3192cc880d, skb_priority(0/0),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_2),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:03:8d:eb,dst=fa:16:3e:96:d9:20),eth_type(0x0800),ipv4(src=20.20.220.128/255.255.255.192,dst=10.35.0.0/255.255.128.0,proto=1,tos=0/0x3,ttl=64,frag=no),icmp(type=0/0,code=0/0), packets:379, bytes:60640, used:0.850s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x2,dst=10.10.121.103,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(csum|key))),set(eth(src=fa:16:3e:74:20:44,dst=9c:cc:83:58:1c:60)),set(ipv4(ttl=63)),genev_sys_6081
ufid:28fb9f44-4f87-4fe8-b2e6-6ee76847673a, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.121.103,dst=10.10.121.169,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x60005/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:96:d9:20,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=1,tos=0/0,ttl=0/0,frag=no),icmp(type=0/0,code=0/0), packets:379, bytes:37142, used:0.850s, offloaded:yes, dp:tc, actions:enp4s0f0_2
ufid:8bbf6429-398f-46d3-b719-5fbbf506d539, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.121.103,dst=10.10.121.169,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x60003/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:96:d9:20,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=1,tos=0/0,ttl=0/0,frag=no),icmp(type=0/0,code=0/0), packets:372, bytes:36456, used:0.850s, offloaded:yes, dp:tc, actions:enp4s0f0_9
ufid:d7a56785-6997-45ed-b162-ffe59dab9364, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.121.103,dst=10.10.121.169,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x60003/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:96:d9:20,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=17,tos=0/0,ttl=0/0,frag=no),udp(src=0/0,dst=32768/0x8000), packets:3, bytes:270, used:3.070s, offloaded:yes, dp:tc, actions:enp4s0f0_9
ufid:4e9f7708-d134-48fd-9eb0-ad9f646afb14, recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(enp6s0f0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=d0:07:ca:34:e9:17,dst=01:00:5e:00:00:01),eth_type(0x8100),vlan(vid=124,pcp=0/0x0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no)), packets:0, bytes:0, used:never, dp:ovs, actions:userspace(pid=4183418373,slow_path(match))
ufid:e3a5a9ff-3f1d-4a5d-ac75-0a006da3b30e, recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x0,src=10.10.121.172,dst=10.10.121.169,ttl=0/0,flags(-df+csum+key)),in_port(genev_sys_6081),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=17,tos=0/0,ttl=0/0,frag=no),udp(src=0/0,dst=3784), packets:931, bytes:61446, used:0.923s, dp:ovs, actions:userspace(pid=3978135798,slow_path(bfd))
ufid:b652e835-b598-4b30-8be1-379a4b94df21, recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x0,src=10.10.121.131,dst=10.10.121.169,ttl=0/0,flags(-df+csum+key)),in_port(genev_sys_6081),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=17,tos=0/0,ttl=0/0,frag=no),udp(src=0/0,dst=3784), packets:931, bytes:61446, used:0.359s, dp:ovs, actions:userspace(pid=3978135798,slow_path(bfd))
ufid:6d0b12db-380d-4f64-81e1-bc1d6614025e, recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x0,src=10.10.121.103,dst=10.10.121.169,ttl=0/0,flags(-df+csum+key)),in_port(genev_sys_6081),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=17,tos=0/0,ttl=0/0,frag=no),udp(src=0/0,dst=3784), packets:938, bytes:61908, used:0.207s, dp:ovs, actions:userspace(pid=3978135798,slow_path(bfd))
ufid:3f0eb711-5f01-46ca-91ed-bf5cd2a0cb80, recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(enp4s0f0_9),skb_mark(0/0),ct_state(0/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=fa:16:3e:de:26:4c,dst=fa:16:3e:96:d9:20),eth_type(0x0806),arp(sip=20.20.220.171,tip=20.20.220.254,op=1/0xff,sha=fa:16:3e:de:26:4c,tha=00:00:00:00:00:00), packets:0, bytes:0, used:never, dp:ovs, actions:userspace(pid=2567040171,slow_path(action))
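
As a quick sanity check over the two offloaded icmp rx flows above, each geneve TLV option now reaches its own representor. The parsing below is a rough sketch against these trimmed lines (taken from the dump, with unrelated fields elided), not a general dump-flows parser:

```python
import re

# The two offloaded icmp rx flows from the working (v5.15) dump above,
# trimmed to the fields the regexes need.
dump = [
    "tunnel(tun_id=0x3,geneve({class=0x102,type=0x80,len=4,0x60005/0x7fffffff})),in_port(genev_sys_6081), actions:enp4s0f0_2",
    "tunnel(tun_id=0x3,geneve({class=0x102,type=0x80,len=4,0x60003/0x7fffffff})),in_port(genev_sys_6081), actions:enp4s0f0_9",
]

mapping = {}
for line in dump:
    # geneve TLV option value and the representor the flow outputs to
    opt = re.search(r"len=4,(0x[0-9a-fA-F]+)", line).group(1)
    port = re.search(r"actions:(\S+)$", line).group(1)
    mapping[opt] = port

print(mapping)  # each option maps to its own representor
```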

Comment 81 Ariel Levkovich 2021-11-21 21:26:20 UTC
So, do we have any idea between which versions/commits we should bisect?

Comment 82 Haresh Khandelwal 2021-11-22 17:41:36 UTC
Marcelo, Do you have any idea for comment#81?

Comment 83 Marcelo Ricardo Leitner 2021-11-22 20:20:09 UTC
According to comment #70, it DID NOT work with 8.5 kernel 4.18.0-348.4.el8.
That kernel has driver rebased to v5.12 as per https://bugzilla.redhat.com/show_bug.cgi?id=1915308.

It also has tc rebased to "latest upstream" (fuzzy) by https://bugzilla.redhat.com/show_bug.cgi?id=1946986, which seems it's v5.13.

I don't see any net/openvswitch changes between 8.5 and current net-next, 89f971182417cb27abd82cfc48a7f36b99352ddc.

Comment #80 says it worked with v5.15.

With that, I'm thinking the haystack in which we're looking for this needle is v5.12..v5.15.



And then, while checking the driver diff between 8.5 and 89f971182417cb27abd82cfc48a7f36b99352ddc, I noticed this commit:

$ git show 3442e0335e70f348728c17bca924ec507ad6358a
commit 3442e0335e70f348728c17bca924ec507ad6358a
Author: Yevgeny Kliteynik <kliteyn>
Date:   Sun Feb 7 04:27:48 2021 +0200

    net/mlx5: DR, Add support for matching on geneve TLV option

    Enable matching on tunnel geneve TLV option using the flex parser.


Well, that's precisely what is being done here. The commit has:

@@ -360,10 +365,14 @@ static int dr_matcher_set_ste_builders(struct mlx5dr_matcher *matcher,
                if (dr_mask_is_tnl_vxlan_gpe(&mask, dmn))
                        mlx5dr_ste_build_tnl_vxlan_gpe(ste_ctx, &sb[idx++],
                                                       &mask, inner, rx);
-               else if (dr_mask_is_tnl_geneve(&mask, dmn))
+               else if (dr_mask_is_tnl_geneve(&mask, dmn)) {
                        mlx5dr_ste_build_tnl_geneve(ste_ctx, &sb[idx++],
                                                    &mask, inner, rx);
-
+                       if (dr_mask_is_tnl_geneve_tlv_opt(&mask.misc3))
+                               mlx5dr_ste_build_tnl_geneve_tlv_opt(ste_ctx, &sb[idx++],
+                                                                   &mask, &dmn->info.caps,
+                                                                   inner, rx);
+               }
                if (DR_MASK_IS_ETH_L4_MISC_SET(mask.misc3, outer))
                        mlx5dr_ste_build_eth_l4_misc(ste_ctx, &sb[idx++],
                                                     &mask, inner, rx);

This is too deep in the driver for me now, but apparently up to this commit it was ignoring this part of the information.
This commit is on DR (direct rules, AKA software steering), which OSP is using.

Checking the patchset that introduced this commit, the cover letter mentions:
https://lore.kernel.org/netdev/20210420032018.58639-1-saeed%40kernel.org/T/

"""
3) Dynamic Flex parser support:
   Flex parser is a HW parser that can support protocols that are not
    natively supported by the HCA, such as Geneve (TLV options) and GTP-U.
    There are 8 such parsers, and each of them can be assigned to parse a
    specific set of protocols.

4) Enable matching on Geneve TLV options
"""

Given item #4 there, that apparently confirms it.

The patch that we attempted earlier,
929a2faddd55 ("net/mlx5e: Consider geneve_opts for encap contexts"),
AFAICT is meant for tx, right? While 3442e0335e70 ("net/mlx5: DR, Add support for matching on geneve TLV option")
is on the rx side, and we would need both. Note that the test here is failing by delivering the packets from the
wire to the wrong VF, which is 'rx' in my wording here.

Depending on Nvidia's review now, perhaps we can narrow that v5.12..v5.15 range down further.
Ariel, thoughts? Any other tests we can do?

Comment 84 Marcelo Ricardo Leitner 2021-11-22 20:28:59 UTC
(In reply to Marcelo Ricardo Leitner from comment #83)
> $ git show 3442e0335e70f348728c17bca924ec507ad6358a
> commit 3442e0335e70f348728c17bca924ec507ad6358a
> Author: Yevgeny Kliteynik <kliteyn>
> Date:   Sun Feb 7 04:27:48 2021 +0200
> 
>     net/mlx5: DR, Add support for matching on geneve TLV option
> 
>     Enable matching on tunnel geneve TLV option using the flex parser.

This commit is probably slated for 8.6, via https://bugzilla.redhat.com/show_bug.cgi?id=1982191 .
But Alaa/Amir will know better.

Comment 85 Karrar Fida 2021-11-30 11:39:41 UTC
This is not to be backported to 8.4 and will therefore be release-noted.

Comment 86 Ariel Levkovich 2021-12-01 01:38:47 UTC
(In reply to Marcelo Ricardo Leitner from comment #83)
> According to comment #70, it DID NOT work with 8.5 kernel 4.18.0-348.4.el8.
> That kernel has driver rebased to v5.12 as per
> https://bugzilla.redhat.com/show_bug.cgi?id=1915308.
> 
> It also has tc rebased to "latest upstream" (fuzzy) by
> https://bugzilla.redhat.com/show_bug.cgi?id=1946986, which seems it's v5.13.
> 
> I don't see any net/openvswitch changes between 8.5 and current net-next,
> 89f971182417cb27abd82cfc48a7f36b99352ddc.
> 
> Comment #80 says it worked with v5.15.
> 
> With that, I'm thinking the haystack that we're looking for this needle is
> v5.12..v5.15.
> 
> 
> 
> [... remainder of the quote trimmed; it repeats the commit 3442e0335e70 analysis from comment #83 above verbatim ...]

I think we have a bingo. Nice catch, Marcelo. This is indeed affecting matching on geneve headers on the RX path.
Looks like you already have a valid test for the RX fix.
To validate the TX side, we need to try sending traffic with different geneve options (but the same tunnel IPs) from the same host and verify that the different flows indeed carry different options.

Comment 87 Marcelo Ricardo Leitner 2021-12-02 12:18:33 UTC
Thanks Ariel.
With that, Amir, can we have 8.4.z test kernel with this fix/series also please? Thanks.

Comment 88 Ariel Levkovich 2021-12-06 17:21:26 UTC
(In reply to Marcelo Ricardo Leitner from comment #87)
> Thanks Ariel.
> With that, Amir, can we have 8.4.z test kernel with this fix/series also
> please? Thanks.

Marcelo, we need to confirm that the repro you have is with SW steering, because the patch you pointed out is relevant only to that mode.

Ariel

Comment 89 Marcelo Ricardo Leitner 2021-12-06 19:31:26 UTC
Right. I thought I had asked folks that already, but if I did, I don't know where. :-}

Yariv, Haresh, can you please confirm Ariel's question on comment #88?

Thanks,
Marcelo

Comment 90 Haresh Khandelwal 2021-12-07 09:21:57 UTC
Hi Ariel, Marcelo,

Steering mode in OSP (16.1.3 onwards) is smfs. 

Thanks

Comment 91 Marcelo Ricardo Leitner 2021-12-07 13:36:31 UTC
And the test was using 16.2, ok. Thanks Haresh.

Ariel, can we assume the switch to smfs is unlikely to fail? If it did fail, the host would be using dmfs, but in a normal run I have never seen the switch to smfs fail. Asking because AFAIK OSP ignores the failure and continues with dmfs.

Comment 92 Ariel Levkovich 2021-12-07 17:04:37 UTC
As long as it is set while in legacy mode (not switchdev) it is not likely to fail.

Comment 93 Marcelo Ricardo Leitner 2021-12-09 00:15:51 UTC
With all the above, my understanding is that we can safely assume the host was using smfs at that moment.
Please speak up if anyone disagrees. :-)

Comment 94 Haresh Khandelwal 2021-12-09 13:24:45 UTC
Hi Miguel,

From comment #48, this issue is only related to ML2/OVN (and thus geneve); can you please update the bug summary and remove "ovs"?

Thanks

Comment 98 Marcelo Ricardo Leitner 2022-01-06 21:03:27 UTC
Considering the patch from comment #86 is already present in 9.0 beta, and we agreed today not to backport this to RHEL 8 unless requested by a customer (or via a general driver update), we're good from the RHEL side on this issue.

Comment 99 Marcelo Ricardo Leitner 2022-01-13 12:26:45 UTC
Haresh, considering the above, what should we do with this bz then?

Comment 101 Haresh Khandelwal 2022-02-21 16:20:34 UTC
*** Bug 2014183 has been marked as a duplicate of this bug. ***

Comment 106 Lukas Svaty 2023-06-08 11:06:56 UTC
Raising severity due to the AutomationBlocker keyword

Comment 111 Miguel Angel Nieto 2023-06-22 12:04:03 UTC
This issue is not happening in 17.1. I have 2 VMs on each compute and I can ping all of them.

(overcloud) [stack@undercloud-0 ~]$ openstack server list --all-projects --long
+--------------------------------------+------------------------------------------+--------+------------+-------------+------------------------------------------------------+----------------------------------------------+--------------------------------------+--------------------+-------------------+-----------------------------------+------------+
| ID                                   | Name                                     | Status | Task State | Power State | Networks                                             | Image Name                                   | Image ID                             | Flavor             | Availability Zone | Host                              | Properties |
+--------------------------------------+------------------------------------------+--------+------------+-------------+------------------------------------------------------+----------------------------------------------+--------------------------------------+--------------------+-------------------+-----------------------------------+------------+
| 9c4b9a42-ad31-4e4c-af8b-cdf6f616ef84 | tempest-TestNfvOffload-server-1581528699 | ACTIVE | None       | Running     | mellanox-geneve-provider=10.46.228.40, 20.20.220.178 | rhel-guest-image-nfv-2-8.7-1660.x86_64.qcow2 | c4d2c2e7-536e-4aa8-b3b7-e8f0f6d0cc90 | nfv_qe_ag_flavor_1 | nova              | computehwoffload-r740.localdomain |            |
| 384e3861-fe40-4b32-a3e5-5df6bba27d08 | tempest-TestNfvOffload-server-123555249  | ACTIVE | None       | Running     | mellanox-geneve-provider=10.46.228.39, 20.20.220.149 | rhel-guest-image-nfv-2-8.7-1660.x86_64.qcow2 | c4d2c2e7-536e-4aa8-b3b7-e8f0f6d0cc90 | nfv_qe_ag_flavor_0 | nova              | computehwoffload-r730.localdomain |            |
| fdc3a9a1-1271-46db-b6e6-1e69adac4944 | tempest-TestNfvOffload-server-1714201295 | ACTIVE | None       | Running     | mellanox-geneve-provider=10.46.228.35, 20.20.220.118 | rhel-guest-image-nfv-2-8.7-1660.x86_64.qcow2 | c4d2c2e7-536e-4aa8-b3b7-e8f0f6d0cc90 | nfv_qe_ag_flavor_1 | nova              | computehwoffload-r740.localdomain |            |
| cb50f5de-7a1c-4ead-a7cc-2b5d12cc5f03 | tempest-TestNfvOffload-server-514728000  | ACTIVE | None       | Running     | mellanox-geneve-provider=10.46.228.34, 20.20.220.140 | rhel-guest-image-nfv-2-8.7-1660.x86_64.qcow2 | c4d2c2e7-536e-4aa8-b3b7-e8f0f6d0cc90 | nfv_qe_ag_flavor_0 | nova              | computehwoffload-r730.localdomain |            |
+--------------------------------------+------------------------------------------+--------+------------+-------------+------------------------------------------------------+----------------------------------------------+--------------------------------------+--------------------+-------------------+-----------------------------------+------------+
(overcloud) [stack@undercloud-0 ~]$ ping -w 1 -c 1 10.46.228.40
PING 10.46.228.40 (10.46.228.40) 56(84) bytes of data.
64 bytes from 10.46.228.40: icmp_seq=1 ttl=61 time=7.34 ms

--- 10.46.228.40 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.343/7.343/7.343/0.000 ms
(overcloud) [stack@undercloud-0 ~]$ ping -w 1 -c 1 10.46.228.39
PING 10.46.228.39 (10.46.228.39) 56(84) bytes of data.
64 bytes from 10.46.228.39: icmp_seq=1 ttl=61 time=7.64 ms

--- 10.46.228.39 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.637/7.637/7.637/0.000 ms
(overcloud) [stack@undercloud-0 ~]$ ping -w 1 -c 1 10.46.228.35
PING 10.46.228.35 (10.46.228.35) 56(84) bytes of data.
64 bytes from 10.46.228.35: icmp_seq=1 ttl=61 time=5.51 ms

--- 10.46.228.35 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.510/5.510/5.510/0.000 ms
(overcloud) [stack@undercloud-0 ~]$ ping -w 1 -c 1 10.46.228.34
PING 10.46.228.34 (10.46.228.34) 56(84) bytes of data.
64 bytes from 10.46.228.34: icmp_seq=1 ttl=61 time=2.19 ms

--- 10.46.228.34 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.189/2.189/2.189/0.000 ms

(overcloud) [stack@undercloud-0 ~]$ cat core_puddle_version 
RHOS-17.1-RHEL-9-20230613.n.1(overcloud) [stack@undercloud-0 ~]

