Bug 1927799

Summary: [OVN] mlx5_core shows unsupported parameters while offloading ingress geneve tunnel
Product: Red Hat OpenStack
Reporter: Haresh Khandelwal <hakhande>
Component: python-networking-ovn
Assignee: Alaa Hleihel (NVIDIA Mellanox) <ahleihel>
Status: CLOSED ERRATA
QA Contact: Eran Kuris <ekuris>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 16.1 (Train)
CC: ahleihel, apevec, cswanson, igallagh, jraju, kfida, lhh, lmartins, majopela, mleitner, mnietoji, scohen, spower, sputhenp, supadhya
Target Milestone: z4
Keywords: TestBlocker, TestOnly, Triaged
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1927807
Environment:
Last Closed: 2021-03-17 15:36:53 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1789380, 1927807, 1932407
Bug Blocks:

Description Haresh Khandelwal 2021-02-11 14:52:06 UTC
Description of problem:

On the sender (compute 0):

ufid:82f3fade-f87d-4815-a635-6d8584e13c8c, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f1_1),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/01:00:00:00:00:00,dst=f8:f2:1e:03:bf:f4),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:662, bytes:105920, used:0.610s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x2,dst=152.20.0.11,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x20003}),flags(key))),genev_sys_6081

The flow is offloaded.

On the receiver (compute 1):

ufid:69865966-0160-4181-8fd7-c2bab95b2805, skb_priority(0/0),tunnel(tun_id=0x2,src=152.20.0.206,dst=152.20.0.11,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x20003/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=f8:f2:1e:03:bf:f2,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:11, bytes:924, used:0.010s, dp:tc, actions:ens1f1_1

The flow is programmed in the tc software datapath (dp:tc) but is not offloaded to hardware.

The same happens in the reverse direction, i.e. when compute 1 is the sender and compute 0 is the receiver: incoming tunneled packets are not offloaded.

I see the messages below in dmesg on the receiver nodes, but I can't confirm whether they are caused by the Geneve tunnel flow.

[ 4755.573872] mlx5_core 0000:65:00.1: Mask contains unsupported parameters
[ 4755.573874] mlx5_core 0000:65:00.1: Mask contains unsupported parameters
[ 4755.573877] mlx5_core 0000:65:00.1: Mask contains unsupported parameters
[ 4755.573879] mlx5_core 0000:65:00.1: Mask contains unsupported parameters
[ 4755.573881] mlx5_core 0000:65:00.1: mlx5_cmd_dr_create_flow_group:163:(pid 2412): Failed creating matcher
[ 4755.689293] mlx5_core 0000:65:00.0: Mask contains unsupported parameters
[ 4755.689296] mlx5_core 0000:65:00.0: Mask contains unsupported parameters
[ 4755.689298] mlx5_core 0000:65:00.0: Mask contains unsupported parameters
[ 4755.689301] mlx5_core 0000:65:00.0: Mask contains unsupported parameters
[ 4755.689303] mlx5_core 0000:65:00.0: mlx5_cmd_dr_create_flow_group:163:(pid 2412): Failed creating matcher
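
To help correlate these messages with a particular port, the dmesg excerpt above can be grouped per PCI function. The following is only a minimal sketch: the helper name `summarize_mlx5_messages` and the regular expressions are illustrative assumptions, not part of any mlx5 or Red Hat tooling, and the sample text is a shortened copy of the log above.

```python
import re
from collections import Counter

# Matches kernel log lines such as:
# [ 4755.573872] mlx5_core 0000:65:00.1: Mask contains unsupported parameters
LINE_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+mlx5_core\s+"
    r"(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]):\s+(?P<msg>.*)$"
)
# Optional "function:line:(pid N): " prefix that some mlx5_core messages carry.
PREFIX_RE = re.compile(r"^\w+:\d+:\(pid \d+\):\s*")


def summarize_mlx5_messages(dmesg_text):
    """Count mlx5_core messages per (PCI address, message text). Hypothetical helper."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not an mlx5_core line
        message = PREFIX_RE.sub("", match.group("msg"))
        counts[(match.group("pci"), message)] += 1
    return counts


# Shortened copy of the dmesg output quoted in this report.
SAMPLE = """\
[ 4755.573872] mlx5_core 0000:65:00.1: Mask contains unsupported parameters
[ 4755.573874] mlx5_core 0000:65:00.1: Mask contains unsupported parameters
[ 4755.573881] mlx5_core 0000:65:00.1: mlx5_cmd_dr_create_flow_group:163:(pid 2412): Failed creating matcher
[ 4755.689293] mlx5_core 0000:65:00.0: Mask contains unsupported parameters
"""

if __name__ == "__main__":
    for (pci, message), count in sorted(summarize_mlx5_messages(SAMPLE).items()):
        print(f"{pci}  {count:3d}  {message}")
```

Comparing the timestamps of these messages with the moment a tunnel ingress flow is installed is one possible way to check whether the two are related.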

[root@cmpt-offload-1 openvswitch]# ethtool -i ens1f0
driver: mlx5e_rep
version: 4.18.0-193.41.1.el8_2.x86_64
firmware-version: 16.25.8000 (DEL0000000004)
expansion-rom-version:
bus-info: 0000:65:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no
[root@cmpt-offload-1 openvswitch]# ethtool -i ens1f1   
driver: mlx5e_rep
version: 4.18.0-193.41.1.el8_2.x86_64
firmware-version: 16.25.8000 (DEL0000000004)
expansion-rom-version:
bus-info: 0000:65:00.1
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no
[root@cmpt-offload-1 openvswitch]#

[root@cmpt-offload-1 openvswitch]# podman exec -it nova_compute tc filter show dev  ens1f1_1 ingress
filter protocol ip pref 4 flower chain 0
filter protocol ip pref 4 flower chain 0 handle 0x1
  dst_mac f8:f2:1e:03:bf:f2
  src_mac f8:f2:1e:03:bf:f4/01:00:00:00:00:00
  eth_type ipv4
  ip_tos 0x0/3
  ip_flags nofrag
  in_hw in_hw_count 1
action order 1: tunnel_key  set
src_ip 0.0.0.0  <<<<<<<<<<<<<<
dst_ip 152.20.0.206
key_id 2
dst_port 6081
geneve_opt 0102:80:00030002
csum
ttl 64 pipe
index 1 ref 1 bind 1

action order 2: mirred (Egress Redirect to device genev_sys_6081) stolen
  index 3 ref 1 bind 1
  cookie e4b1bb15b942d31aeb7b6fbc61fb8b98
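
The `geneve_opt 0102:80:00030002` value in the filter above follows tc's CLASS:TYPE:DATA hex notation and corresponds to the `class=0x102,type=0x80,len=4` option shown in the OVS flow dumps. A minimal sketch for decoding it follows. The split of the 32-bit data into a 15-bit logical ingress port and a 16-bit logical egress port is an assumption based on the encoding described in ovn-architecture(7), and `decode_ovn_geneve_opt` is a hypothetical helper name.

```python
def decode_ovn_geneve_opt(opt):
    """Split a tc 'geneve_opt' string (CLASS:TYPE:DATA, all hex) into fields.

    Assumes the OVN layout: 1 reserved bit, 15-bit logical ingress port,
    16-bit logical egress port.
    """
    opt_class, opt_type, data = opt.split(":")
    value = int(data, 16)
    return {
        "class": int(opt_class, 16),
        "type": int(opt_type, 16),
        "len": len(data) // 2,                  # option data length in bytes
        "ingress_port": (value >> 16) & 0x7FFF,  # bits 16..30
        "egress_port": value & 0xFFFF,           # bits 0..15
    }


if __name__ == "__main__":
    fields = decode_ovn_geneve_opt("0102:80:00030002")
    print(fields)
```

Under this assumption, the mask `0x7fffffff` seen in the receiver-side flow match covers exactly the ingress and egress port bits and leaves the reserved top bit unmatched.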


Version-Release number of selected component (if applicable):
Kernel: 4.18.0-193.41.1.el8_2.x86_64
RHEL: 8.2
mlx5_core driver: 4.18.0-193.41.1.el8_2.x86_64
firmware: 16.25.8000 (DEL0000000004)

How reproducible:
Yes

Steps to Reproduce:
1. Deploy ML2/OVN with Geneve hardware offload.
2. Create a VM on a Geneve provider network.
3. Ping from the VM to the other side.
4. Observe the datapath flow rules (ovs-appctl dpctl/dump-flows -m).
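
The last step can be partly automated by scanning the `ovs-appctl dpctl/dump-flows -m` output for flows that lack the `offloaded:yes` marker. The sketch below is illustrative only: the helper names are hypothetical, and the two sample lines are abbreviated versions of the sender and receiver flows quoted in the description (most match fields are omitted).

```python
import re


def parse_flow(line):
    """Extract the fields needed to judge the offload state of one flow line."""
    in_port = re.search(r"in_port\(([^)]+)\)", line)
    datapath = re.search(r"\bdp:(\w+)", line)
    actions = re.search(r"actions:(.*)$", line)
    return {
        "in_port": in_port.group(1) if in_port else None,
        "dp": datapath.group(1) if datapath else None,
        "offloaded": "offloaded:yes" in line,
        "actions": actions.group(1).strip() if actions else None,
    }


def not_offloaded(dump_text):
    """Return the parsed flows that do not carry 'offloaded:yes'."""
    flows = (parse_flow(line) for line in dump_text.splitlines() if line.strip())
    return [flow for flow in flows if not flow["offloaded"]]


# Abbreviated from the flows in this report; not complete dump-flows lines.
SAMPLE = """\
in_port(ens1f1_1),eth_type(0x0800), packets:662, bytes:105920, used:0.610s, offloaded:yes, dp:tc, actions:genev_sys_6081
tunnel(tun_id=0x2,src=152.20.0.206,dst=152.20.0.11,tp_dst=6081),in_port(genev_sys_6081),eth_type(0x0800), packets:11, bytes:924, used:0.010s, dp:tc, actions:ens1f1_1
"""

if __name__ == "__main__":
    for flow in not_offloaded(SAMPLE):
        print(f"not offloaded: in_port={flow['in_port']} dp={flow['dp']} actions={flow['actions']}")
```

In practice the text would come from running `ovs-appctl dpctl/dump-flows -m` on each compute node. A flow whose `in_port` is `genev_sys_6081` and that appears in this list corresponds to the ingress tunnel case reported here.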

Actual results:
The Geneve tunnel ingress flow is not offloaded.

Expected results:
The Geneve tunnel ingress flow should be offloaded.

Additional info:
Spoke to Alaa (NVIDIA) about this issue. RHEL 8.2 is missing a few kernel patches for Geneve in steering mode; they need to be backported.

Comment 4 Miguel Angel Nieto 2021-03-17 11:40:24 UTC
Verified

Below is the flow programming for one ping.

[root@computeovshwoffload-1 heat-admin]# ovs-appctl dpctl/dump-flows -m
ufid:2775e8b4-0f80-4368-b069-a25b6f83504a, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp7s0f0_7),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/01:00:00:00:00:00,dst=f8:f2:1e:03:bf:f2),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:12, bytes:1920, used:0.880s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x3,dst=10.10.111.117,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x50004}),flags(key))),genev_sys_6081

skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.111.117,dst=10.10.111.109,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x40005/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=f8:f2:1e:03:bf:f2,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:12, bytes:1176, used:0.880s, offloaded:yes, dp:tc, actions:enp7s0f0_7

skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.111.117,dst=10.10.111.109,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x40005/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=f8:f2:1e:03:bf:f2,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0806),arp(sip=0.0.0.0/0.0.0.0,tip=0.0.0.0/0.0.0.0,op=0/0,sha=00:00:00:00:00:00/00:00:00:00:00:00,tha=00:00:00:00:00:00/00:00:00:00:00:00), packets:1, bytes:60, used:1.900s, offloaded:yes, dp:tc, actions:enp7s0f0_7

ufid:af5ff244-f977-4acf-9b0a-f9fee6494c52, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp7s0f0_7),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/01:00:00:00:00:00,dst=f8:f2:1e:03:bf:f2),eth_type(0x0806),arp(sip=0.0.0.0/0.0.0.0,tip=0.0.0.0/0.0.0.0,op=0/0,sha=00:00:00:00:00:00/00:00:00:00:00:00,tha=00:00:00:00:00:00/00:00:00:00:00:00), packets:1, bytes:122, used:1.900s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x3,dst=10.10.111.117,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x50004}),flags(key))),genev_sys_6081

I also ran a performance test and got 28 Mpps, which shows that offloading is working.

Comment 5 Miguel Angel Nieto 2021-03-17 12:02:49 UTC
I forgot to mention versions:

RHOS-16.1-RHEL-8-20210311.n.1
kernel-4.18.0-193.47.1.el8_2.x86_64
openvswitch2.13-2.13.0-79.5.el8fdp.x86_64
ovn2.13-20.12.0-17.el8fdp.x86_64

Comment 9 errata-xmlrpc 2021-03-17 15:36:53 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.4 director bug fix advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0817