Bug 1946162 - [OSP16.1/16.2/RHEL8.4][ML2-OVN][Hw-Offload] Geneve HW Offload is broken in RHEL 8.4 without Connection Tracking
Keywords:
Status: CLOSED COMPLETED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 16.2 (Train)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: z3
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Haresh Khandelwal
QA Contact: Miguel Angel Nieto
URL:
Whiteboard:
Depends On: 1983111 1997381 2022001
Blocks:
 
Reported: 2021-04-05 06:25 UTC by Itai Levy
Modified: 2023-10-30 01:20 UTC (History)
18 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-10-30 01:20:25 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
os-net-config example of vf-lag (2.49 KB, text/plain)
2021-07-08 06:22 UTC, Itai Levy


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker NFV-2086 0 None None None 2022-02-16 06:30:11 UTC
Red Hat Issue Tracker OSP-1783 0 None None None 2021-11-10 14:40:30 UTC

Description Itai Levy 2021-04-05 06:25:55 UTC
Description of problem:

Geneve offload with conntrack disabled, which works in 16.1.4 / RHEL 8.2 (GA), is broken on RHEL 8.4: only one traffic direction is offloaded to HW.


Version-Release number of selected component (if applicable):
• OSP16.1.4 / 16.2
• RHEL8.4 with kernel 4.18.0-302.el8.x86_64
• ovn2.13-20.12.0-17.el8fdp.x86_64 
• openvswitch2.13-2.13.0-79.5.el8fdp.x86_64
• geneve tenant network, direct ports with "switchdev" capabilities and without security groups / port_security (conntrack disabled)
• VMs running iperf3 test


How reproducible:
Every time. Seen in both the NVIDIA/Mellanox and RH labs.

Steps to Reproduce:
1. Deploy the cloud
2. Create a geneve tenant network
3. Create direct ports with: --binding-profile '{"capabilities":["switchdev"]}' --no-security-group --disable-port-security
4. Create instances with the ports and run traffic (iperf3) between the VMs over the geneve tunnels

Actual results:

RHEL 8.2: bidirectional flows are offloaded to HW
RHEL 8.4: only one direction is offloaded to HW, resulting in degraded performance


Expected results:
Full offload in RHEL 8.4, as in RHEL 8.2.

Additional info:

RHEL 8.2 flows dump - both directions are offloaded:

ufid:0e605a8f-c090-4890-835c-62a69f12e47d, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp219s0f0_14),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:6c:d8:c6,dst=fa:16:3e:c1:ec:b3),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=33.33.33.64/255.255.255.192,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:31993146, bytes:287914846441, used:0.940s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x1,dst=172.16.1.240,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x20003}),flags(key))),genev_sys_6081

ufid:1cb14b16-35e7-4d2c-b721-2b0a5b1673c5, skb_priority(0/0),tunnel(tun_id=0x1,src=172.16.1.240,dst=172.16.1.52,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:c1:ec:b3,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:1580713, bytes:104327070, used:0.940s, offloaded:yes, dp:tc, actions:enp219s0f0_14


RHEL 8.4 flows dump - only the direction from the representor port to the tunnel is offloaded:


ufid:3a659834-4cc0-42d4-9090-afcb350aa7ec, skb_priority(0/0),tunnel(tun_id=0x1,src=172.16.0.144,dst=172.16.0.46,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:c1:ec:b3,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:61802862, bytes:548996370821, used:0.000s, dp:tc, actions:ens1f0_15

ufid:f547928a-efe7-4b5d-8f62-849d63d68935, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f0_15),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:6c:d8:c6,dst=fa:16:3e:c1:ec:b3),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=33.33.33.64/255.255.255.192,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:5816010, bytes:789140212, used:0.600s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x1,dst=172.16.0.144,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x20003}),flags(key))),genev_sys_6081
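The two dumps above can be compared mechanically. Below is a minimal sketch (the helper name `classify_flows` is mine) that assumes only the `ovs-appctl dpctl/dump-flows -m` line format shown in this report, and extracts each flow's ingress port, datapath type, and offload status:

```python
def classify_flows(dump_text):
    """Parse `ovs-appctl dpctl/dump-flows -m` output and return a list of
    (in_port, dp, offloaded) tuples, one per ufid line."""
    flows = []
    for line in dump_text.splitlines():
        if not line.startswith("ufid:"):
            continue
        # in_port(...) names the ingress port of the flow
        port = line.split("in_port(", 1)[1].split(")", 1)[0]
        # dp:tc (TC datapath) vs dp:ovs (software datapath)
        dp = line.split("dp:", 1)[1].split(",", 1)[0].strip()
        # offloaded:yes is only printed for flows actually in hardware
        offloaded = "offloaded:yes" in line
        flows.append((port, dp, offloaded))
    return flows
```

Run over the RHEL 8.4 dump above, this would report the `genev_sys_6081` ingress flow with `dp:tc` but no `offloaded:yes`, i.e. only one direction in hardware.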

Comment 1 Itai Levy 2021-04-06 06:15:34 UTC
Some more info (ovs/tc dumps of the relevant flow) comparing RHEL 8.2 to 8.4:

RHEL 8.2

ufid:af679970-048e-4633-aa48-12a7c74149ce, skb_priority(0/0),tunnel(tun_id=0x1,src=172.16.1.115,dst=172.16.1.195,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x20003/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:1f:82:94,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:20378733, bytes:182195059227, used:0.500s, offloaded:yes, dp:tc, actions:enp219s0f0_12


# tc -s filter show dev genev_sys_6081 parent ffff:
filter protocol ip pref 6 flower chain 0 
filter protocol ip pref 6 flower chain 0 handle 0x1 
  dst_mac fa:16:3e:ca:a3:0c/01:00:00:00:00:00
  src_mac fa:16:3e:1f:82:94
  eth_type ipv4
  enc_dst_ip 172.16.1.195
  enc_src_ip 172.16.1.115
  enc_key_id 1
  enc_dst_port 6081
  enc_tos 0x0/ff
  geneve_opt 0102:80:00020003/ffff:ff:7fffffff
  ip_flags nofrag
  in_hw in_hw_count 2
        action order 1: tunnel_key  unset pipe
         index 1 ref 1 bind 1 installed 95 sec used 95 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
        backlog 0b 0p requeues 0

        action order 2: mirred (Egress Redirect to device enp219s0f0_12) stolen
        index 1 ref 1 bind 1 installed 95 sec used 0 sec
        Action statistics:
        Sent 437677715741 bytes 48948425 pkt (dropped 0, overlimits 0 requeues 0) 
        Sent software 0 bytes 0 pkt
        Sent hardware 437677715741 bytes 48948425 pkt
        backlog 0b 0p requeues 0
        cookie 02680f21a44df49d2f118d8d5be0d5f6


RHEL 8.4

ufid:7c9178d5-b6d8-4cd4-8abe-165cc32ac4e0, skb_priority(0/0),tunnel(tun_id=0x1,src=172.16.1.195,dst=172.16.1.115,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:ca:a3:0c,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:196068, bytes:10195548, used:0.000s, dp:tc, actions:enp219s0f0_15

filter ingress protocol ip pref 5 flower chain 0 
filter ingress protocol ip pref 5 flower chain 0 handle 0x1 
  dst_mac fa:16:3e:1f:82:94/01:00:00:00:00:00
  src_mac fa:16:3e:ca:a3:0c
  eth_type ipv4
  enc_dst_ip 172.16.1.115
  enc_src_ip 172.16.1.195
  enc_key_id 1
  enc_dst_port 6081
  enc_tos 0x0/ff
  geneve_opt 0102:80:00030002/ffff:ff:7fffffff
  ip_flags nofrag
  not_in_hw
        action order 1: tunnel_key  unset pipe
         index 2 ref 1 bind 1 installed 115 sec
        Action statistics:
        Sent 151352748 bytes 2910621 pkt (dropped 0, overlimits 0 requeues 0) 
        backlog 0b 0p requeues 0

        action order 2: mirred (Egress Redirect to device enp219s0f0_15) stolen
        index 6 ref 1 bind 1 installed 115 sec
        Action statistics:
        Sent 151352748 bytes 2910621 pkt (dropped 0, overlimits 0 requeues 0) 
        backlog 0b 0p requeues 0
        cookie 062d2a1cb048c9f748bff3bd21ef794f
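The key difference between the two `tc -s filter show` dumps is the `in_hw` / `not_in_hw` marker. A small helper (a sketch; the function name is mine, the format is the one in the dumps above) that maps each flower filter handle to its hardware status:

```python
def hw_offload_status(tc_output):
    """Parse `tc filter show` output and map each flower filter handle
    to True for `in_hw` or False for `not_in_hw`."""
    status = {}
    handle = None
    for raw in tc_output.splitlines():
        line = raw.strip()
        if line.startswith("filter") and "handle" in line:
            # e.g. "filter protocol ip pref 6 flower chain 0 handle 0x1"
            handle = line.split("handle", 1)[1].split()[0]
        elif handle and line.startswith("not_in_hw"):
            status[handle] = False
        elif handle and line.startswith("in_hw"):
            status[handle] = True
    return status
```

On the RHEL 8.2 dump this would yield `{"0x1": True}`; on the RHEL 8.4 dump, `{"0x1": False}`.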

Comment 2 Haresh Khandelwal 2021-04-06 15:49:46 UTC
Hi, do you see any errors in dmesg?
The OVN flow programming looks fine.

Comment 3 Itai Levy 2021-04-12 05:52:51 UTC
I didn't see any suspicious errors or "syndrome" messages in dmesg.
One message I noticed on the RHEL 8.4 setup that didn't show up on RHEL 8.2 was:
E-Switch: Supported tc offload range - chains: 4294967294, prios: 4294967295

I had to destroy the setup; once I have the chance to rebuild it, I will collect the entire log file.
Can you try reproducing in your lab with the same SW versions for the relevant components (OSP / kernel / OVN / OVS) plus the latest GA ConnectX firmware?

Comment 4 Haresh Khandelwal 2021-04-12 08:29:14 UTC
Hi Jaison,
Do you have a reproducer? I can look into it.

Comment 6 Haresh Khandelwal 2021-04-14 07:44:54 UTC
Hi Itai, FYI:
Pradipta is able to offload geneve tunnel traffic.

Hi Pradipta,
Can you please share what the issue was and how you got it working? This may help Itai as well.
Removing the "TestBlocker" flag since we are able to make progress.

Comment 7 Itai Levy 2021-04-14 07:46:36 UTC
Hi Haresh, 

Pradipta is using an older kernel than mine...

Comment 8 Marcelo Ricardo Leitner 2021-04-14 14:56:26 UTC
FWIW, the kernel difference is quite significant:
kernel-4.18.0-293.el8..kernel-4.18.0-302.el8
The newer one Itai is using includes, among others, these mlx5 BZs:
Bugzilla: http://bugzilla.redhat.com/1928671
Bugzilla: http://bugzilla.redhat.com/1856795
Bugzilla: http://bugzilla.redhat.com/1913616
Bugzilla: http://bugzilla.redhat.com/1919807
Bugzilla: http://bugzilla.redhat.com/1925439
Bugzilla: http://bugzilla.redhat.com/1926120
Bugzilla: http://bugzilla.redhat.com/1928706
Bugzilla: http://bugzilla.redhat.com/1929119
Bugzilla: http://bugzilla.redhat.com/1929166

Comment 11 Pradipta Kumar Sahoo 2021-06-10 15:57:56 UTC
Hi Marcelo/Haresh,

In the OVN hw-offload environment, we can still reproduce the issue. Please find the reproduction details below.

Reproduction steps:
1. Deployed OSP16.2 cloud with latest compose puddle: RHOS-16.2-RHEL-8-20210525.n.0

2. RHEL, OVS and OVN versions of DUT
    # cat /etc/redhat-release 
    Red Hat Enterprise Linux release 8.4 (Ootpa)

    # uname -a
    Linux overcloud-r640compute-0 4.18.0-305.el8.x86_64 #1 SMP Thu Apr 29 08:54:30 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

    # rpm -qa| grep openvswi
    rhosp-network-scripts-openvswitch-2.15-4.el8ost.1.noarch
    openvswitch2.15-2.15.0-15.el8fdp.x86_64
    rhosp-openvswitch-2.15-4.el8ost.1.noarch
    openvswitch-selinux-extra-policy-1.0-28.el8fdp.noarch
    network-scripts-openvswitch2.15-2.15.0-15.el8fdp.x86_64

    # rpm -qa|grep ovn
    rhosp-ovn-2021-4.el8ost.1.noarch
    ovn-2021-host-21.03.0-21.el8fdp.x86_64
    ovn-2021-21.03.0-21.el8fdp.x86_64
    rhosp-ovn-host-2021-4.el8ost.1.noarch

    Kernel Boot Parameter:
    # cat /proc/cmdline 
BOOT_IMAGE=(hd0,msdos2)/boot/vmlinuz-4.18.0-305.el8.x86_64 root=UUID=3092f72c-9609-48a6-9452-91212b9f3d44 ro console=ttyS0 console=ttyS0,115200n81 no_timer_check crashkernel=auto rhgb quiet default_hugepagesz=1GB hugepagesz=1G hugepages=300 iommu=pt intel_iommu=on isolcpus=4-39,44-79 mitigations=off skew_tick=1 nohz=on nohz_full=4-39,44-79 rcu_nocbs=4-39,44-79 tuned.non_isolcpus=00000f00,0000000f intel_pstate=disable nosoftlockup


3. Firmware and Driver details PF
    # ethtool -i ens2f0
    driver: mlx5e_rep
    version: 4.18.0-305.el8.x86_64
    firmware-version: 16.27.6106 (DEL0000000015)
    expansion-rom-version: 
    bus-info: 0000:5e:00.0
    supports-statistics: yes
    supports-test: no
    supports-eeprom-access: no
    supports-register-dump: no
    supports-priv-flags: no

4. Switchdev and firmware hw-offload flag status of PF
    # devlink dev eswitch show pci/0000:5e:00.0
    pci/0000:5e:00.0: mode switchdev inline-mode none encap-mode basic

    # ethtool -k ens2f0|grep hw-tc-offload
    hw-tc-offload: on

    # ethtool -k ens2f0_1|grep hw-tc-offload
    hw-tc-offload: on

    DMESG log for 0000:5e:00.0
    [   82.649271] mlx5_core 0000:5e:00.0: E-Switch: Disable: mode(LEGACY), nvfs(10), active vports(11)
    [   84.596560] mlx5_core 0000:5e:00.0: E-Switch: Supported tc offload range - chains: 4294967294, prios: 4294967295
    [   84.614446] mlx5_core 0000:5e:00.0: mlx5dr_actions_build_ste_arr:681:(pid 7058): Connecting table to a lower/same level destination table
    [   84.617425] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
    [   84.696860] mlx5_core 0000:5e:00.0 ens2f0: renamed from eth0
    [   84.707356] device ens2f0 entered promiscuous mode
    [   84.748873] ib_srpt MAD registration failed for mlx5_0-1.
    [   84.813680] ib_srpt srpt_add_one(mlx5_0) failed.
    [   84.822119] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
    [   84.879151] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
    [   84.941814] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
    [   85.007311] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
    [   85.065626] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
    [   85.124919] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
    [   85.184557] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
    [   85.248976] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
    [   85.307848] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
    [   85.365262] mlx5_core 0000:5e:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
    [   85.426371] mlx5_core 0000:5e:00.0: E-Switch: Enable: mode(OFFLOADS), nvfs(10), active vports(11)
    [   85.595808] mlx5_core 0000:5e:00.0 ens2f0: Link up


5. VF bind details
    # ip link show ens2f0
    21: ens2f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
        link/ether 0c:42:a1:d1:d5:80 brd ff:ff:ff:ff:ff:ff
        vf 0     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
        vf 1     link/ether fa:16:3e:36:6c:13 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
        vf 2     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
        vf 3     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
        vf 4     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
        vf 5     link/ether fa:16:3e:e4:1d:b1 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
        vf 6     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
        vf 7     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
        vf 8     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
        vf 9     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off


6. OpenStack neutron offload port properties:

    $ openstack port show --fit-width c47d4abe-4fd2-48ef-bd42-f25c1920d7bf
    +-------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Field                   | Value                                                                                                                                                            |
    +-------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | admin_state_up          | UP                                                                                                                                                               |
    | allowed_address_pairs   |                                                                                                                                                                  |
    | binding_host_id         | overcloud-r640compute-0.scalelab.local                                                                                                                           |
    | binding_profile         | capabilities='['switchdev']', pci_slot='0000:5e:00.3', pci_vendor_info='15b3:1018', physical_network=                                                            |
    | binding_vif_details     | port_filter='True'                                                                                                                                               |
    | binding_vif_type        | ovs                                                                                                                                                              |
    | binding_vnic_type       | direct                                                                                                                                                           |
    | created_at              | 2021-06-10T15:05:57Z                                                                                                                                             |
    | data_plane_status       | None                                                                                                                                                             |
    | description             |                                                                                                                                                                  |
    | device_id               | 98996ed4-bd2b-4916-aef6-189577b67253                                                                                                                             |
    | device_owner            | compute:r640-zone                                                                                                                                                |
    | dns_assignment          | fqdn='host-192-168-2-16.openstacklocal.', hostname='host-192-168-2-16', ip_address='192.168.2.16'                                                                |
    | dns_domain              | None                                                                                                                                                             |
    | dns_name                |                                                                                                                                                                  |
    | extra_dhcp_opts         |                                                                                                                                                                  |
    | fixed_ips               | ip_address='192.168.2.16', subnet_id='f405679f-05fc-4562-8366-c3ffc853cd80'                                                                                      |
    | id                      | c47d4abe-4fd2-48ef-bd42-f25c1920d7bf                                                                                                                             |
    | location                | cloud='', project.domain_id=, project.domain_name='Default', project.id='c10e431e893943e08d4730315a51be10', project.name='admin', region_name='regionOne', zone= |
    | mac_address             | fa:16:3e:36:6c:13                                                                                                                                                |
    | name                    | internal2-p1-VM1                                                                                                                                                 |
    | network_id              | 733bd460-22cb-46b5-aaf9-791109037f3e                                                                                                                             |
    | port_security_enabled   | False                                                                                                                                                            |
    | project_id              | c10e431e893943e08d4730315a51be10                                                                                                                                 |
    | propagate_uplink_status | None                                                                                                                                                             |
    | qos_policy_id           | None                                                                                                                                                             |
    | resource_request        | None                                                                                                                                                             |
    | revision_number         | 4                                                                                                                                                                |
    | security_group_ids      |                                                                                                                                                                  |
    | status                  | ACTIVE                                                                                                                                                           |
    | tags                    |                                                                                                                                                                  |
    | trunk_details           | None                                                                                                                                                             |
    | updated_at              | 2021-06-10T15:06:31Z                                                                                                                                             |
    +-------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Ping test between two VMs hosted on different compute nodes, using the Geneve offload tunnel network.
During the ping, the DP flows are not properly offloaded: only one direction appears to be offloaded, and the flows are inconsistent when monitored at 0.1 s intervals.


Sample flow details from Datapath.

# ovs-appctl dpctl/dump-flows -m | grep ens2f0_1
ufid:56293100-f4e9-4e0f-b464-1dbdced2726e, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens2f0_1),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:36:6c:13,dst=fa:16:3e:22:f5:57),eth_type(0x0806),arp(sip=0.0.0.0/0.0.0.0,tip=192.168.2.17,op=1,sha=00:00:00:00:00:00/00:00:00:00:00:00,tha=00:00:00:00:00:00/00:00:00:00:00:00), packets:0, bytes:0, used:0.250s, dp:tc, actions:set(tunnel(tun_id=0x4,dst=172.17.2.57,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x20003}),flags(key))),genev_sys_6081
ufid:c6cf1503-b5ac-437c-8ef2-c31b820fee55, skb_priority(0/0),tunnel(tun_id=0x4,src=172.17.2.57,dst=172.17.2.46,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:22:f5:57,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0806),arp(sip=0.0.0.0/0.0.0.0,tip=0.0.0.0/0.0.0.0,op=0/0,sha=00:00:00:00:00:00/00:00:00:00:00:00,tha=00:00:00:00:00:00/00:00:00:00:00:00), packets:1, bytes:52, used:0.030s, offloaded:yes, dp:tc, actions:ens2f0_1
ufid:ea3e3ff5-5d41-4d41-8eef-8c0db898e86d, recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x4,src=172.17.2.57,dst=172.17.2.46,ttl=0/0,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(-df+csum+key)),in_port(genev_sys_6081),skb_mark(0/0),ct_state(0/0x3f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x1),eth(src=fa:16:3e:22:f5:57,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:558, bytes:54684, used:0.314s, dp:ovs, actions:ens2f0_1
ufid:694fa8d6-37f8-489b-b44f-1e05eeb1c3b7, recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(ens2f0_1),skb_mark(0/0),ct_state(0/0x3f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x1),eth(src=fa:16:3e:36:6c:13,dst=fa:16:3e:22:f5:57),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=192.168.2.16/255.255.255.240,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:558, bytes:54684, used:0.314s, dp:ovs, actions:set(tunnel(tun_id=0x4,dst=172.17.2.57,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x20003}),flags(df|csum|key))),genev_sys_6081

It seems like a regression in the newer kernel, so please suggest a stable kernel version for the Geneve offload test. Our geneve offload performance baseline test activity is blocked due to this offload issue.

Regards,
Pradipta

Comment 12 Marcelo Ricardo Leitner 2021-06-15 22:13:55 UTC
Hi,

(In reply to Pradipta Kumar Sahoo from comment #11)
> ufid:694fa8d6-37f8-489b-b44f-1e05eeb1c3b7,
> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(ens2f0_1),skb_mark(0/0),
> ct_state(0/0x3f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x1),eth(src=fa:16:3e:
          ^^^^^^^^                                    ^^^^^

Is CT supposed to be active in this test? This causes this rule to be using the software datapath:

> 36:6c:13,dst=fa:16:3e:22:f5:57),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,
> dst=192.168.2.16/255.255.255.240,proto=0/0,tos=0/0x3,ttl=0/0,frag=no),
> packets:558, bytes:54684, used:0.314s, dp:ovs,
                                         ^^^^^^

> actions:set(tunnel(tun_id=0x4,dst=172.17.2.57,ttl=64,tp_dst=6081,
> geneve({class=0x102,type=0x80,len=4,0x20003}),flags(df|csum|key))),
> genev_sys_6081
> 
> It seems like a regression in the newer kernel, so please suggest the stable

Yes. -305.el8 has issues with the above ct_state match.

> kernel version for the Geneve offload test. Our geneve offload performance
> baseline test activity get block due to this offload issue. 

Please try the latest z-stream build from today, kernel-4.18.0-305.7.1.el8_4.
Amongst others, it has fixes for this bz: https://bugzilla.redhat.com/show_bug.cgi?id=1965457


It doesn't, however, explain this flow:
ufid:56293100-f4e9-4e0f-b464-1dbdced2726e, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens2f0_1),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:36:6c:13,dst=fa:16:3e:22:f5:57),eth_type(0x0806),arp(sip=0.0.0.0/0.0.0.0,tip=192.168.2.17,op=1,sha=00:00:00:00:00:00/00:00:00:00:00:00,tha=00:00:00:00:00:00/00:00:00:00:00:00), packets:0, bytes:0, used:0.250s, dp:tc, actions:set(tunnel(tun_id=0x4,dst=172.17.2.57,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x20003}),flags(key))),genev_sys_6081

that is not offloaded, but please lets see how it behaves with the kernel.
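The check Marcelo applies by eye, whether a flow really matches conntrack state, can be done programmatically: in the value/mask notation of these dumps, `ct_state(0/0)` means "don't care", while a nonzero mask such as `ct_state(0/0x3f)` or `ct_label(0/0x1)` means the flow actually matches CT. A sketch (the helper name `uses_ct` is mine):

```python
import re

# Matches ct_state/ct_zone/ct_mark/ct_label fields in value/mask notation,
# e.g. "ct_state(0/0x3f)" -> ("0", "0x3f").
CT_FIELD = re.compile(r"ct_(?:state|zone|mark|label)\(([^/)]+)/([^)]+?)\)")

def uses_ct(flow_line):
    """True if any conntrack field in a dump-flows line carries a
    nonzero mask, i.e. the flow matches on CT state."""
    # int(mask, 0) accepts both "0" and hex like "0x3f"
    return any(int(mask, 0) != 0 for _value, mask in CT_FIELD.findall(flow_line))
```

Applied to the comment #11 dump, the two `dp:ovs` flows would return True (they match `ct_state(0/0x3f)` and `ct_label(0/0x1)`), consistent with the validate_ct_state issue Marcelo points at.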

Comment 13 Marcelo Ricardo Leitner 2021-06-15 22:15:49 UTC
There was a recent fix on OVN to *not* use CT on ports that are not using security groups even if this is enabled. Not sure, but maybe that's why CT is being activated?
https://bugzilla.redhat.com/show_bug.cgi?id=1955191

Comment 14 Itai Levy 2021-06-16 14:03:03 UTC
Thanks Marcelo.

According to a very initial test I did, it seems that kernel-4.18.0-305.7.1.el8_4 indeed solves the issue in my case. I will need to re-confirm.

Pradipta - please note that in my case it was the other direction that was not offloaded (from tunnel to representor, as also appears in the initial issue description).
Can you please confirm that this kernel works for you as well?

Do we know what was the root cause of this behaviour?

Itai

Comment 15 Marcelo Ricardo Leitner 2021-06-16 22:22:32 UTC
(In reply to Itai Levy from comment #14)
> Do we know what was the root cause of this behaviour?

For the 1st part of comment #12, it was the lack of
afa536d8405a ("net/sched: cls_flower: fix only mask bit check in the validate_ct_state")

It is confusing because $summary here says "without connection tracking", but that flow was using CT somehow.

For the 2nd part, the flow using dp:tc but not offloaded, that I still don't know. Did it really get offloaded in the new test, or did it log something in dmesg?

Thanks.

Comment 16 Haresh Khandelwal 2021-06-21 13:33:48 UTC
Geneve offload (without CT) is working well with RHOSP puddle RHOS-16.2-RHEL-8-20210525.n.0.
Closing the BZ; in case you still face the issue, please re-open.

Thanks

Comment 17 Itai Levy 2021-06-23 07:44:31 UTC
Hi Haresh, 

Re-opening, as we would really like to understand the root cause of this issue and how exactly it was "solved".
In addition, we didn't get confirmation from Pradipta that the issue is indeed solved in the RH lab as well.

Itai

Comment 18 Pradipta Kumar Sahoo 2021-06-23 10:31:22 UTC
Hi Marcelo/Itai,

JFYI: we shifted the Geneve hw-offload test to the 100GbE Mellanox lab, where we noticed the packets are offloaded properly and we are able to achieve line-rate performance. We suspect the previous issue (comment #11) was related to NIC firmware, where the packets were not offloaded.

# ovs-appctl dpctl/dump-flows -m type=offloaded
ufid:7ad51eb3-93d0-40b6-9d1f-dda0f3464600, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens3f1_7),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:a0:1f:28,dst=fa:16:3e:7f:a1:49),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:294680124, bytes:440840668872, used:0.590s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x6,dst=172.17.2.163,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x20003}),flags(key))),genev_sys_6081
ufid:2497e6e8-e679-4e2a-b686-25e4f04e2322, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens3f1_7),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:70:9f:2e,dst=fa:16:3e:3d:6b:f0),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:0, bytes:0, used:0.000s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x6,dst=172.17.2.163,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x28001}),flags(key))),genev_sys_6081
ufid:d54eb96e-2118-4e78-bd07-5abfb30bf97d, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens3f1_4),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:20:ab:56,dst=fa:16:3e:e4:0a:cf),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:325083501, bytes:486324126336, used:0.590s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x3,dst=172.17.2.160,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x20003}),flags(key))),genev_sys_6081
ufid:441a4c3a-bf2a-4ecd-9609-032a3006ad2a, skb_priority(0/0),tunnel(tun_id=0x3,src=172.17.2.160,dst=172.17.2.60,tos=0x1,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:e4:0a:cf,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:358155205, bytes:513594414898, used:0.590s, offloaded:yes, dp:tc, actions:ens3f1_4
ufid:a5ae752d-1ab7-4b3e-b846-4dea7e9212fa, skb_priority(0/0),tunnel(tun_id=0x4,src=172.17.2.161,dst=172.17.2.60,tos=0x1,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:20:ab:56,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:326514839, bytes:468222237280, used:0.590s, offloaded:yes, dp:tc, actions:ens3f1_0
ufid:13f1ec67-400e-49b1-8f7b-556d81feade1, skb_priority(0/0),tunnel(tun_id=0x5,src=172.17.2.162,dst=172.17.2.60,tos=0x1,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:a0:1f:28,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:296117872, bytes:424633014896, used:0.590s, offloaded:yes, dp:tc, actions:ens3f1_8
ufid:c178a094-4806-4de2-8b0a-d2064a2ad0b9, skb_priority(0/0),tunnel(tun_id=0x6,src=172.17.2.163,dst=172.17.2.60,tos=0x1,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:7f:a1:49,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:279334584, bytes:400565642634, used:0.590s, offloaded:yes, dp:tc, actions:ens3f1_7
ufid:7e9b700e-3227-4b81-84d2-29cda1613a2f, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens3f1_8),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:7f:a1:49,dst=fa:16:3e:a0:1f:28),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:277865068, bytes:415685346540, used:0.590s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x5,dst=172.17.2.162,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x20003}),flags(key))),genev_sys_6081
ufid:4a06f9c1-0007-4d9f-946c-1c6e748db442, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens3f1_0),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:e4:0a:cf,dst=fa:16:3e:20:ab:56),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:356681140, bytes:533594172164, used:0.590s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x4,dst=172.17.2.161,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x20003}),flags(key))),genev_sys_6081
ufid:9a2d93b3-46b0-49f6-96f6-96e159da71f0, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens3f1_0),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:08:4e:b2,dst=fa:16:3e:56:d1:8f),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:0, bytes:0, used:0.000s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x4,dst=172.17.2.161,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x28001}),flags(key))),genev_sys_6081
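The offload status in a dump like the one above can also be checked mechanically. Below is a rough sketch (the helper name `classify_flows` is hypothetical, not OVS tooling) that keys only on the `ufid:`, `dp:`, and `offloaded:` markers visible in the `-m` output:

```python
# Rough helper to spot partially offloaded paths in
# `ovs-appctl dpctl/dump-flows -m` output. Illustrative only:
# it relies solely on the "offloaded:yes" and "dp:tc" markers.

def classify_flows(dump_text):
    """Return (hardware, software) lists of ufid strings."""
    hardware, software = [], []
    for line in dump_text.splitlines():
        line = line.strip()
        if not line.startswith("ufid:"):
            continue
        ufid = line.split(",", 1)[0][len("ufid:"):]
        if "offloaded:yes" in line and "dp:tc" in line:
            hardware.append(ufid)
        else:
            software.append(ufid)
    return hardware, software

# Shortened sample lines in the shape of the dump above.
sample = """\
ufid:aaaa-1, in_port(ens3f1_7), packets:1, offloaded:yes, dp:tc, actions:genev_sys_6081
ufid:bbbb-2, in_port(enp4s0f1_2), packets:2, dp:ovs, actions:genev_sys_6081
"""
hw, sw = classify_flows(sample)
print("offloaded in HW:", hw)
print("software path  :", sw)
```

A non-empty software list for tunnel traffic is the symptom discussed in this bug: one direction stays on the `dp:ovs` path while the other is in hardware.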


Please find the environment details below for reference.

# cat /etc/redhat-release; uname -r;
Red Hat Enterprise Linux release 8.4 (Ootpa)
4.18.0-305.el8.x86_64

# ovs-vswitchd --version
ovs-vswitchd (Open vSwitch) 2.15.1
DPDK 20.11.0

# ovn-sbctl --version
ovn-sbctl 21.03.1
Open vSwitch Library 2.15.90
DB Schema 20.16.1

# ethtool -i ens3f1
driver: mlx5e_rep
version: 4.18.0-305.el8.x86_64
firmware-version: 16.29.1016 (MT_0000000013)
expansion-rom-version: 
bus-info: 0000:5e:00.1
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

# ethtool ens3f1 | grep Speed
        Speed: 100000Mb/s

# lspci | grep -i Mellanox
5e:00.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
5e:00.1 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]


1500 MTU sample test with 100% line rate:
-----------------------------------------
trex>stats 
Global Statistics

connection   : localhost, Port 4501                       total_tx_L2  : 98.6 Gbps                      
version      : STL @ v2.89                                total_tx_L1  : 99.91 Gbps                     
cpu_util.    : 1.42% @ 42 cores (21 per dual port)        total_rx     : 97.83 Gbps                     
rx_cpu_util. : 9.88% / 125.84 pps                         total_pps    : 8.22 Mpps                      
async_util.  : 0% / 69.97 bps                             drop_rate    : 0 bps                          
total_cps.   : 0 cps                                      queue_full   : 0 pkts                         

Port Statistics

   port    |         0         |         1         |         2         |         3         |       total       
-----------+-------------------+-------------------+-------------------+-------------------+------------------
owner      |              root |              root |              root |              root |                   
link       |                UP |                UP |                UP |                UP |                   
state      |      TRANSMITTING |      TRANSMITTING |      TRANSMITTING |      TRANSMITTING |                   
speed      |           25 Gb/s |           25 Gb/s |           25 Gb/s |           25 Gb/s |                   
CPU util.  |             1.34% |             1.34% |              1.5% |              1.5% |                   
--         |                   |                   |                   |                   |                   
Tx bps L2  |        24.65 Gbps |        24.65 Gbps |        24.65 Gbps |        24.65 Gbps |         98.6 Gbps 
Tx bps L1  |        24.98 Gbps |        24.98 Gbps |        24.98 Gbps |        24.98 Gbps |        99.91 Gbps 
Tx pps     |         2.05 Mpps |         2.05 Mpps |         2.05 Mpps |         2.05 Mpps |         8.22 Mpps 
Line Util. |           99.91 % |           99.91 % |           99.91 % |           99.91 % |                   
---        |                   |                   |                   |                   |                   
Rx bps     |        24.46 Gbps |        24.46 Gbps |        24.46 Gbps |        24.46 Gbps |        97.83 Gbps 
Rx pps     |         2.04 Mpps |         2.04 Mpps |         2.04 Mpps |         2.04 Mpps |         8.17 Mpps 
----       |                   |                   |                   |                   |                   


Thank You,
Pradipta

Comment 19 Marcelo Ricardo Leitner 2021-06-24 01:51:48 UTC
Hi,

(In reply to Pradipta Kumar Sahoo from comment #18)
> line-rate performance. We suspected the previous issue (comment #11) was
> related to nic firmware where the packets are not offloaded.

Comment #11 had "ct_state(0/0x3f)" in it, which is not present here.

> # ovn-sbctl --version
> ovn-sbctl 21.03.1               <---
> Open vSwitch Library 2.15.90
> DB Schema 20.16.1

The CT usage removal was likely due to https://bugzilla.redhat.com/show_bug.cgi?id=1955191
(as indicated in comment #13)


Nice numbers, btw! :-)

Thanks,
Marcelo

Comment 20 Marcelo Ricardo Leitner 2021-07-01 00:13:54 UTC
Itai, are we good now or do you need something else?

Comment 21 Alaa Hleihel (NVIDIA Mellanox) 2021-07-04 08:19:39 UTC
(In reply to Marcelo Ricardo Leitner from comment #20)
> Itai, are we good now or do you need something else?

Hi, Marcelo.

There is still an open issue here.
Here is some data from the team working on it:

The reason we don't see rules offloaded to HW is that if the qdisc of the tunnel device is created before the driver is in switchdev mode, we never get the callback to register yet another callback that delivers the filter commands.
One workaround is to restart the openvswitch service, or to make sure that the ingress qdisc for any tunnel device is created after the driver is loaded and in switchdev mode.

An example to demonstrate the issue:
- Start with mlx5 driver unloaded.
- Create vxlan device using iproute2.
- Create ingress qdisc for it.
- Load mlx5 driver.
- Move to switchdev mode.
- Add rule on the vxlan device with tc.
---> The rule will *NOT* be in hardware.
- Delete the qdisc from the vxlan.
- Add qdisc to vxlan.
- Add rule on vxlan.
---> The rule will be in hardware.

Comment 22 Haresh Khandelwal 2021-07-05 08:13:24 UTC
Hi Itai,

From the RHOSP perspective,
we move the e-switch to switchdev mode at deployment time, and the qdisc configuration on the devices happens only after deployment.
So,
"
- Start with mlx5 driver unloaded.
- Create vxlan device using iproute2.
- Create ingress qdisc for it.
- Load mlx5 driver."
is not applicable when you use RHOSP (and hence to this particular BZ). Also, in the case of a stack update, we don't unload the driver.

However, I agree the test case itself is valid.
Do you still see any relevance that I might have overlooked?

Comment 23 Itai Levy 2021-07-08 06:21:37 UTC
Hi Haresh, 

Yes, let me clarify:
I deployed RHOSP 16.1 GA with a VF-LAG (bonding) VLAN interface for the tenant network (used for the Geneve tunnels). The bond is configured with switchdev VFs.
After the deployment, I upgraded the compute nodes to RHEL 8.4 and rebooted them.
At this point, when I create VMs with direct switchdev ports over a Geneve network, the decap traffic is not offloaded. An analysis we did showed that the reason is the qdisc config, as explained above (you can ignore the manual steps that demonstrate the issue on a standalone device).
To work around the issue and re-trigger the tunnel qdisc config (after every reboot, once switchdev has been set), I stop the VMs, restart OVS, start the VMs, and initiate traffic between the VMs over the tunnels to verify it is offloaded to HW.

It would be appreciated if you could verify the same issue on your OSP system. I am attaching my os-net-config example of a compute node with a VF-LAG configuration.

Itai

Comment 24 Itai Levy 2021-07-08 06:22:52 UTC
Created attachment 1799541 [details]
os-net-config example of vf-lag

Comment 25 Saravanan KR 2021-07-08 06:43:15 UTC
~~~~~~
[Unit]
Description=SR-IOV numvfs configuration
After=systemd-udev-settle.service openibd.service
Before=openvswitch.service
[Service]
Type=oneshot
ExecStart=/usr/bin/os-net-config-sriov
[Install]
WantedBy=multi-user.target
~~~~~~

The sriov_config.service [1], which configures switchdev mode on reboot, should run before openvswitch. If the qdisc is created before the switchdev mode change, then it is possible that openvswitch is starting before sriov_config on restart. Can you confirm whether there is any other trigger for openvswitch before the sriov_config service? Is network.service running earlier and triggering openvswitch?


[1] https://github.com/openstack/os-net-config/blob/master/os_net_config/utils.py#L47
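The ordering constraint described here is carried by the `Before=`/`After=` directives in the unit file pasted above. As a minimal sketch (the helper `unit_ordering` is hypothetical, not part of os-net-config), the check can be expressed as parsing those directives:

```python
# Minimal sketch: parse a systemd unit's ordering directives and
# confirm the constraint discussed above (sriov_config must run
# Before=openvswitch.service, After=openibd.service). Illustrative only.

def unit_ordering(unit_text):
    """Return (before, after) lists of unit names from Before=/After= lines."""
    before, after = [], []
    for line in unit_text.splitlines():
        line = line.strip()
        if line.startswith("Before="):
            before += line[len("Before="):].split()
        elif line.startswith("After="):
            after += line[len("After="):].split()
    return before, after

# The [Unit] section pasted in comment 25.
UNIT = """\
[Unit]
Description=SR-IOV numvfs configuration
After=systemd-udev-settle.service openibd.service
Before=openvswitch.service
"""
before, after = unit_ordering(UNIT)
print("starts before:", before)
print("starts after :", after)
```

Note that `Before=` only orders units that are both scheduled in the same transaction; it does not by itself prevent another trigger (such as network.service) from starting openvswitch, which is exactly the question raised above.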

Comment 26 Itai Levy 2021-07-08 07:42:50 UTC
Hi Saravanan,
The sriov_config.service in my setup has exactly the same services order as you pasted here.
According to messages file OVS is started immediately after sriov_config:

 overcloud-computesriov-rack0-1 systemd[1]: sriov_config.service: Succeeded.
 overcloud-computesriov-rack0-1 systemd[1]: Started SR-IOV numvfs configuration.
 overcloud-computesriov-rack0-1 systemd[1]: Starting Open vSwitch...
 overcloud-computesriov-rack0-1 systemd[1]: Started Open vSwitch.
 overcloud-computesriov-rack0-1 systemd[1]: Starting LSB: Bring up/down networking...

Comment 27 Haresh Khandelwal 2021-07-09 08:52:36 UTC
Hi Itai,

As you mentioned in comment #26, the config flow looks fine.
However, to confirm, I will try this in my lab.
Let me list the steps to reproduce so we are on the same page.

1) Have VMs with working offload (ml2/ovn, geneve tunnel); I will use a 16.2 compose with RHEL 8.4.
2) Reboot the host node.
3) Check whether the existing VMs' traffic is offloaded.
4) Create new VMs and check whether their traffic is offloaded as well.

Please let me know if this is not what you tried.

Thanks

Comment 28 Itai Levy 2021-07-11 13:58:02 UTC
Hi Haresh, 

In my case, right after deployment, when I create the VMs for the first time and try to run traffic over the tunnels, I already see the offload issue.
Rebooting the nodes does not solve it; only an OVS restart after the reboot does.
Please make sure you are using a bond (VF-LAG acceleration) over a VLAN interface for the tunnels, and that both flow directions are offloaded (encap + decap).

Itai

Comment 29 Haresh Khandelwal 2021-07-13 07:52:51 UTC
Hi Itai,

I did a fresh deployment with compose RHOS-16.2-RHEL-8-20210614.n.1.
    Bridge br-data
        fail_mode: standalone
        Port mx-bond
            Interface mx-bond
        Port patch-provnet-6fd73eaa-0b31-40bc-b5f0-a6b0a6beaab1-to-br-int
            Interface patch-provnet-6fd73eaa-0b31-40bc-b5f0-a6b0a6beaab1-to-br-int
                type: patch
                options: {peer=patch-br-int-to-provnet-6fd73eaa-0b31-40bc-b5f0-a6b0a6beaab1}
        Port br-data
            Interface br-data
                type: internal
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port br-int
            Interface br-int
                type: internal
        Port patch-br-int-to-provnet-6fd73eaa-0b31-40bc-b5f0-a6b0a6beaab1
            Interface patch-br-int-to-provnet-6fd73eaa-0b31-40bc-b5f0-a6b0a6beaab1
                type: patch
                options: {peer=patch-provnet-6fd73eaa-0b31-40bc-b5f0-a6b0a6beaab1-to-br-int}
        Port enp4s0f0_0
            Interface enp4s0f0_0
        Port enp4s0f1_0
            Interface enp4s0f1_0
        Port ovn-9c9f2a-0
            Interface ovn-9c9f2a-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.10.51.120"}
        Port ovn-C7-0
            Interface ovn-C7-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.10.51.121"}
    ovs_version: "2.15.1"


28: vlan402@mx-bond: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 04:3f:72:d9:c0:48 brd ff:ff:ff:ff:ff:ff

[root@hareshcomputesriovoffload-0 ~]# cat /proc/net/bonding/mx-bond 
Ethernet Channel Bonding Driver: v4.18.0-305.el8.x86_64

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp4s0f0
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enp4s0f0
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 04:3f:72:d9:c0:48
Slave queue ID: 0

Slave Interface: enp4s0f1
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 04:3f:72:d9:c0:49
Slave queue ID: 0
[root@hareshcomputesriovoffload-0 ~]# 

I created VMs and I am able to offload traffic for both VLAN and Geneve.

ufid:ee5cffad-8c30-43fb-aa04-ca3b8a53caf3, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f1_0),packet_type(ns=0/0,id=0/0),eth(src=f8:f2:1e:03:bf:f6,dst=fa:16:3e:c2:ce:3a),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:79010, bytes:3853636, used:0.220s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x2,dst=10.10.51.121,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x28001}),flags(key))),genev_sys_6081

ufid:6eebae6c-d8ae-451d-abf8-e0b5f97fa2dd, skb_priority(0/0),tunnel(tun_id=0x2,src=10.10.51.121,dst=10.10.51.159,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:c2:ce:3a,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:79014, bytes:3318560, used:0.220s, offloaded:yes, dp:tc, actions:enp4s0f1_0

I didn't need any reboots/ovs restarts. 

Thanks

Comment 30 Itai Levy 2021-07-13 07:58:04 UTC
Haresh, 
Can you please try with an active-active LACP bond? (This is what I used; sorry for not specifying earlier.)

Thanks
Itai

Comment 31 Haresh Khandelwal 2021-07-13 08:50:24 UTC
Hi Itai,

Here too, it is working.

ufid:4ec83157-1f19-4546-b7c9-9bdecaab614b, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f1_0),packet_type(ns=0/0,id=0/0),eth(src=f8:f2:1e:03:bf:f6,dst=fa:16:3e:6f:11:ff),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:114112, bytes:7202588, used:0.580s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x1,dst=10.10.51.121,ttl=64,tp_dst=6081,key6(bad key length 1, expected 0)(01)geneve({class=0x102,type=0x80,len=4,0x28001}),flags(key))),genev_sys_6081

ufid:9635f2b1-fd53-4b50-a0a5-fce291915fa0, skb_priority(0/0),tunnel(tun_id=0x1,src=10.10.51.121,dst=10.10.51.159,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:6f:11:ff,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:161817, bytes:6630638, used:0.580s, offloaded:yes, dp:tc, actions:enp4s0f1_0

[root@hareshcomputesriovoffload-0 ~]# cat /proc/net/bonding/mx-bond 
Ethernet Channel Bonding Driver: v4.18.0-305.el8.x86_64

Bonding Mode: load balancing (xor)  <<<<<<<<<<<<<<<<<<<<<<<
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enp4s0f0
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 04:3f:72:d9:c0:48
Slave queue ID: 0

Slave Interface: enp4s0f1
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 04:3f:72:d9:c0:49
Slave queue ID: 0
[root@hareshcomputesriovoffload-0 ~]#

Comment 32 Itai Levy 2021-07-13 10:16:21 UTC
Thanks for the update, Haresh.
There must be a difference between our setups that we are missing.
Have you seen the os-net-config I attached? How many VFs are you enabling during boot? 16 per PF, like me?

Itai

Comment 33 Haresh Khandelwal 2021-07-13 17:30:44 UTC
Hi Itai,

I think we have reproduced the issue you are facing.
I rebooted the node and started the traffic.

ufid:c658ac3b-95bb-42fd-b426-c2aea7fe94da, skb_priority(0/0),tunnel(tun_id=0x1,src=10.10.51.121,dst=10.10.51.159,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:6f:11:ff,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:72456, bytes:3043110, used:0.740s, offloaded:yes, dp:tc, actions:enp4s0f1_2

ufid:3e399b6e-b609-4317-9cab-2a119faa496f, recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(enp4s0f1_2),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=f8:f2:1e:03:bf:f6,dst=fa:16:3e:6f:11:ff),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:73162, bytes:3804424, used:0.002s, dp:ovs, actions:userspace(pid=3867753320,controller(reason=1,dont_send=0,continuation=0,recirc_id=24,rule_cookie=0xff2250a2,controller_id=0,max_len=65535)),set(tunnel(tun_id=0x1,dst=10.10.51.121,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x28001}),flags(df|csum|key))),genev_sys_6081

As you can see above, ingress traffic is offloaded while egress is not. 

From the OVS log, I could see that the Geneve system interface was not added for qdisc offload:

2021-07-13T13:33:55.099Z|00001|netdev_linux(revalidator2)|INFO|ioctl(SIOCGIFINDEX) on genev_sys_6081 device failed: No such device

Restarting OVS fixes it:
2021-07-13T13:46:05.113Z|00017|netdev_offload_tc|INFO|added ingress qdisc to genev_sys_6081
2021-07-13T13:46:05.113Z|00018|netdev_offload|INFO|genev_sys_6081: Assigned flow API 'linux_tc'.


The qdisc state just after reboot and before the OVS restart (which deletes and re-adds the qdisc on the port):
[root@hareshcomputesriovoffload-0 /]# tc -s qdisc show  dev genev_sys_6081
qdisc noqueue 0: root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc ingress ffff: parent ffff:fff1 ---------------- 
 Sent 476 bytes 17 pkt (dropped 0, overlimits 0 requeues 0)   <<<<<<<<<<<<<<<<<<<<<<
 backlog 0b 0p requeues 0

[root@hareshcomputesriovoffload-0 /]# tc -s qdisc show  dev mx-bond
qdisc noqueue 0: root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc ingress ffff: parent ffff:fff1 ingress_block 27 ---------------- 
 Sent 112334 bytes 461 pkt (dropped 0, overlimits 0 requeues 0)   <<<<<<<<<<<<<<
 backlog 0b 0p requeues 0
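The telltale difference between the two outputs above is that the bond's ingress qdisc carries an `ingress_block` while the Geneve interface's does not, meaning no tc block is registered for offload on it. A rough sketch of that check (the helper `has_offload_block` is hypothetical, not tc tooling) over `tc -s qdisc show dev <dev>` output:

```python
# Sketch of the check done by eye above: on an offload-capable setup,
# the device's ingress qdisc line should carry "ingress_block ...".
# An ingress qdisc without it (like genev_sys_6081 here) will not
# get hardware rules until the qdisc is recreated. Illustrative only.

def has_offload_block(tc_output):
    """True if an ingress qdisc with an ingress_block is present."""
    for line in tc_output.splitlines():
        if "qdisc ingress" in line:
            return "ingress_block" in line
    return False  # no ingress qdisc at all

# The two ingress qdisc lines pasted above.
GENEV = "qdisc ingress ffff: parent ffff:fff1 ----------------"
BOND = "qdisc ingress ffff: parent ffff:fff1 ingress_block 27 ----------------"
print("genev_sys_6081 offload-ready:", has_offload_block(GENEV))
print("mx-bond offload-ready       :", has_offload_block(BOND))
```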

I will discuss with Saravanan how to address this issue.

Thanks

Comment 34 Marcelo Ricardo Leitner 2021-07-15 03:02:13 UTC
I guess this issue deserves a new bz of its own by now. :-)

Comment 35 Itai Levy 2021-07-15 06:35:36 UTC
Thanks for the update Haresh.
Please notice that in my case it was the egress traffic which was offloaded while the ingress was not.

Marcelo - why new BZ? :) 
It's still the same behaviour of partial offload described initially...

Comment 36 Haresh Khandelwal 2021-07-15 07:24:07 UTC
Hi Itai,

(In reply to Itai Levy from comment #35)
> Thanks for the update Haresh.
> Please notice that in my case it was the egress traffic which was offloaded
> while the ingress was not.

That may be due to the nature of the traffic. I don't use a normal ping, so there is no ARP (broadcast).
Can you paste the flows here for my reference?

> 
> Marcelo - why new BZ? :) 
> Its still the same behaviour described initially of partial offload...
We shall create a new one, but only after root-causing it. My suspect right now is OVS, and if that turns out to be the right one, we will need one on OVS.

Comment 37 Itai Levy 2021-07-15 09:35:34 UTC
Hi Haresh, 
You have the flows (iperf traffic) in the BZ description.

Itai

Comment 38 Alaa Hleihel (NVIDIA Mellanox) 2021-08-24 11:26:30 UTC
Hi, Marcelo.

The fixes for this bug were accepted upstream:
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=c1c5cb3aee05
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=74fc4f828769


That is:
c1c5cb3aee05 net/core: Remove unused field from struct flow_indr_dev
74fc4f828769 net: Fix offloading indirect devices dependency on qdisc order creation

Both fixes are in kernel net-core.
Should this BZ be moved to Kernel component or copied?

Thanks,
Alaa

Comment 39 Marcelo Ricardo Leitner 2021-08-24 14:54:10 UTC
Hey Alaa,

(In reply to Alaa Hleihel (NVIDIA Mellanox) from comment #38)
> That is:
> c1c5cb3aee05 net/core: Remove unused field from struct flow_indr_dev
> 74fc4f828769 net: Fix offloading indirect devices dependency on qdisc order
> creation

Oh, nice. And I was thinking that this would be just an initscript thingie. :)

> Both fixes are in kernel net-core.
> Should this BZ be moved to Kernel component or copied?

Copied please. So OSP QE can still test it afterwards.

Btw, what about the child bz here then, https://bugzilla.redhat.com/show_bug.cgi?id=1983111 ?
With the description in these patches, seems that's not needed anymore.

Thanks,
Marcelo

Comment 40 Alaa Hleihel (NVIDIA Mellanox) 2021-08-25 06:24:52 UTC
(In reply to Marcelo Ricardo Leitner from comment #39)
> Hey Alaa,
> 
> (In reply to Alaa Hleihel (NVIDIA Mellanox) from comment #38)
> > That is:
> > c1c5cb3aee05 net/core: Remove unused field from struct flow_indr_dev
> > 74fc4f828769 net: Fix offloading indirect devices dependency on qdisc order
> > creation
> 
> Oh, nice. And I was thinking that this would be just a initscript thingie. :)
> 
> > Both fixes are in kernel net-core.
> > Should this BZ be moved to Kernel component or copied?
> 
> Copied please. So OSP QE can still test it afterwards.
> 

Sounds good, created BZ #1997381.

> Btw, what about the child bz here then,
> https://bugzilla.redhat.com/show_bug.cgi?id=1983111 ?
> With the description in these patches, seems that's not needed anymore.

I also thought the same, but going over your discussion with Haresh, I see both of you think there is probably another issue to fix in that BZ, so let's keep it till that is confirmed.

Thanks,
Alaa

Comment 41 Marcelo Ricardo Leitner 2021-12-14 14:22:01 UTC
(In reply to Alaa Hleihel (NVIDIA Mellanox) from comment #40)
> (In reply to Marcelo Ricardo Leitner from comment #39)
> > (In reply to Alaa Hleihel (NVIDIA Mellanox) from comment #38)
> > > That is:
> > > c1c5cb3aee05 net/core: Remove unused field from struct flow_indr_dev
> > > 74fc4f828769 net: Fix offloading indirect devices dependency on qdisc order
> > > creation
> > > Both fixes are in kernel net-core.
> > > Should this BZ be moved to Kernel component or copied?
> > 
> > Copied please. So OSP QE can still test it afterwards.
> > 
> 
> Sounds good, created BZ #1997381.

Please note that the fix is available in 8.4.z already, in kernel-4.18.0-305.30.1.el8_4, as per
https://bugzilla.redhat.com/show_bug.cgi?id=2022406 (the 8.4.z one)
So this bug can be verified and closed.

Comment 42 Marcelo Ricardo Leitner 2022-01-05 12:58:22 UTC
(In reply to Marcelo Ricardo Leitner from comment #41)
> Please note that the fix is available in 8.4.z already, in
> kernel-4.18.0-305.30.1.el8_4, as per
> https://bugzilla.redhat.com/show_bug.cgi?id=2022406 (the 8.4.z one)
> So this bug can be verified and closed.

Itai, Yariv, thoughts?

Comment 43 Itai Levy 2022-01-06 10:48:02 UTC
Hi Marcelo, 

If the fix was taken into RHOSP (was it?), I assume RH will validate it during testing cycles, right?

Itai

Comment 44 Marcelo Ricardo Leitner 2022-01-06 11:50:44 UTC
Hi Itai. That's my thinking as well.

Hi Haresh. Do we have TestOnly bzs for OSP? :-)

Comment 45 Haresh Khandelwal 2022-01-10 14:40:01 UTC
Hi Itai, Marcelo,

We are still not clear on the "steps to reproduce". I will try comment #33 again on the latest compose and check. Also, please note the bug was reported against 4.18.0-302.el8.x86_64; the latest compose will carry a kernel newer than 4.18.0-305.28.1.el8_4, with ovs/ovn versions to match. One more thing: we haven't heard of Geneve + OVN related issues after 16.2.0, so this issue may already have been fixed somewhere.

Thanks
-Haresh

Comment 46 Itai Levy 2022-01-11 07:36:18 UTC
Hi Haresh, Marcelo, 

Can we just check if those kernel patches were taken into the kernel being used in OSP 16.2?

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=c1c5cb3aee05
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=74fc4f828769

Itai

Comment 47 Marcelo Ricardo Leitner 2022-01-17 15:11:38 UTC
(In reply to Itai Levy from comment #46)
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=c1c5cb3aee05
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=74fc4f828769

Double checking comment #41 here:
Both are backported to RHEL via https://bugzilla.redhat.com/show_bug.cgi?id=1997381
Which was z-streamed to 8.4.z via https://bugzilla.redhat.com/show_bug.cgi?id=2022406 and present in kernels >= kernel-4.18.0-305.30.1.el8_4.

Thx.

Comment 48 Haresh Khandelwal 2022-02-21 13:26:07 UTC
Hi,

Followed the steps. 

1. deploy cloud
2. create geneve tenant network
3. create direct ports with: --binding-profile '{"capabilities":["switchdev"]}' --no-security-group --disable-port-security
4. create an external provider vlan network
5. create vrouter with both subnets (--external-gateway for the "external" network)
6. create floating IPs on the "external" network
7. create instances with the geneve direct ports and assign external floating IPs
8. run traffic (iperf) between VMs or between a VM and an external iperf server via the floating IP

Versions:
OSP: RHOS-16.2-RHEL-8-20220201.n.1
kernel: 4.18.0-305.34.2.el8_4.x86_64
ovs: openvswitch2.15-2.15.0-55.el8fdp.x86_64
ovn: ovn-2021-21.12.0-11.el8fdp.x86_64

I see traffic is offloaded. Below are the flows.

ufid:52106039-4ffb-4767-be6a-a12c7222862d, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(mx-bond),packet_type(ns=0/0,id=0/0),eth(src=26:d0:92:6c:bb:cf,dst=fa:16:3e:c2:32:fb),eth_type(0x8100),vlan(vid=405,pcp=0),encap(eth_type(0x0800),ipv4(src=10.10.54.96/255.255.255.248,dst=10.10.54.132,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0)), packets:10759005, bytes:16288625322, used:0.770s, offloaded:yes, dp:tc, actions:pop_vlan,ct(zone=3,nat),recirc(0x89)

ufid:35de8587-2901-405d-bfb7-23c3ab52100f, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x89),dp_hash(0/0),in_port(mx-bond),packet_type(ns=0/0,id=0/0),eth(src=26:d0:92:6c:bb:cf,dst=fa:16:3e:c2:32:fb),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=7.7.7.14,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0), packets:10908051, bytes:16493831420, used:0.770s, offloaded:yes, dp:tc, actions:ct_clear,set(eth(src=fa:16:3e:a9:a7:49,dst=f8:f2:1e:03:bf:f2)),set(ipv4(ttl=63)),enp4s0f1_1

ufid:1d4c80f8-622e-4863-99f7-cba02f31b4de, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f1_1),packet_type(ns=0/0,id=0/0),eth(src=f8:f2:1e:03:bf:f2,dst=fa:16:3e:a9:a7:49),eth_type(0x0800),ipv4(src=7.7.7.14,dst=10.10.54.100,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0), packets:243916, bytes:16267560, used:0.770s, offloaded:yes, dp:tc, actions:set(eth(src=fa:16:3e:c2:32:fb,dst=26:d0:92:6c:bb:cf)),set(ipv4(ttl=63)),ct(zone=3,nat),recirc(0x8b)

ufid:da20bf8e-42e1-42db-849b-0fb791577fbc, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x8b),dp_hash(0/0),in_port(enp4s0f1_1),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:c2:32:fb,dst=26:d0:92:6c:bb:cf),eth_type(0x0800),ipv4(src=8.0.0.0/248.0.0.0,dst=10.10.54.0/255.255.255.128,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:243916, bytes:17243208, used:0.770s, offloaded:yes, dp:tc, actions:ct_clear,push_vlan(vid=405,pcp=0),mx-bond

[root@computesriov-0 heat-admin]# cat /proc/net/nf_conntrack | grep 10.10.54
ipv4     2 tcp      6 src=10.10.54.100 dst=10.10.54.132 sport=60380 dport=5201 src=7.7.7.14 dst=10.10.54.100 sport=5201 dport=60380 [HW_OFFLOAD] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=3 use=3
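The `[HW_OFFLOAD]` flag in the conntrack entry above confirms the connection-tracking entry itself is offloaded. A minimal sketch of that check (the helper `hw_offloaded_entries` is hypothetical) over `/proc/net/nf_conntrack`-style text:

```python
# Sketch: scan /proc/net/nf_conntrack-style lines for the [HW_OFFLOAD]
# flag, which marks a conntrack entry as offloaded to hardware
# (as in the entry shown above). Illustrative only.

def hw_offloaded_entries(ct_text, needle=""):
    """Return lines containing `needle` that carry the [HW_OFFLOAD] flag."""
    return [line for line in ct_text.splitlines()
            if needle in line and "[HW_OFFLOAD]" in line]

# Shortened sample: one offloaded entry, one not.
CT = ("ipv4 2 tcp 6 src=10.10.54.100 dst=10.10.54.132 sport=60380 "
      "dport=5201 [HW_OFFLOAD] mark=0 zone=3 use=3\n"
      "ipv4 2 tcp 6 src=192.0.2.1 dst=192.0.2.2 sport=1 dport=2 mark=0\n")
matches = hw_offloaded_entries(CT, "10.10.54")
print("offloaded conntrack entries:", len(matches))
```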

Moving to ON_QA for formal validation of the bug.

Thanks

Comment 54 Miguel Angel Nieto 2022-06-15 09:36:31 UTC
Verified. I see the flows below, and I have checked with tcpdump that there are no packets on the representor ports, so packets are being offloaded. I do not see any problem.
Used puddle RHOS-16.2-RHEL-8-20220513.n.2

Verified using the following testcase
python -m testtools.run nfv_tempest_plugin.tests.scenario.test_nfv_offload.TestNfvOffload.test_offload_tcp
 

ufid:e31a145a-05b0-4a00-82d5-8beb8b89d8b1, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_2),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:94:f3:c3,dst=fa:16:3e:92:48:1a),eth_type(0x0800),ipv4(src=20.20.220.128/255.255.255.192,dst=10.35.0.0/255.255.128.0,proto=6,tos=0/0x3,ttl=64,frag=no),tcp(src=0/0,dst=0/0), packets:0, bytes:0, used:10.400s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x2,dst=10.10.141.172,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(csum|key))),set(eth(src=fa:16:3e:3d:5d:d9,dst=9c:cc:83:58:1c:60)),set(ipv4(ttl=63)),genev_sys_6081
ufid:c4810d65-434b-4708-adfd-8dd5cf4a78b4, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_2),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:94:f3:c3,dst=fa:16:3e:5a:17:99),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=20.20.220.0/255.255.255.128,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:313926, bytes:40182636, used:1.120s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x3,dst=10.10.141.151,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x20003}),flags(csum|key))),genev_sys_6081
ufid:26024442-b9af-4cef-b8a4-9726493a0be2, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_2),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:94:f3:c3,dst=fa:16:3e:92:48:1a),eth_type(0x0800),ipv4(src=20.20.220.128/255.255.255.192,dst=128.0.0.0/192.0.0.0,proto=17,tos=0/0x3,ttl=64,frag=no),udp(src=0/0,dst=0/0x800), packets:0, bytes:0, used:0.400s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x2,dst=10.10.141.172,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(csum|key))),set(eth(src=fa:16:3e:3d:5d:d9,dst=9c:cc:83:58:1c:60)),set(ipv4(ttl=63)),genev_sys_6081
ufid:e2ecaeb9-cbc9-4909-8caf-77af9512fe04, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.141.151,dst=10.10.141.174,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:5a:17:99,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:3674565, bytes:32909247779, used:1.120s, offloaded:yes, dp:tc, actions:enp4s0f0_2
ufid:fa9a4019-21e0-4f1a-96a4-64cbbcf0e17b, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.141.172,dst=10.10.141.174,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x40002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:92:48:1a,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=6,tos=0/0,ttl=0/0,frag=no),tcp(src=0/0,dst=0/0), packets:65, bytes:13518, used:9.180s, offloaded:yes, dp:tc, actions:enp4s0f0_2
ufid:31de7ff1-5f06-464f-aa9f-e58193255f0e, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.141.172,dst=10.10.141.174,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x40002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:92:48:1a,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=17,tos=0/0,ttl=0/0,frag=no),udp(src=0/0,dst=32768/0x8000), packets:0, bytes:0, used:0.390s, offloaded:yes, dp:tc, actions:enp4s0f0_2
ufid:28c45ce6-1041-4fcb-8e9a-9806b3b84167, skb_priority(0/0),tunnel(tun_id=0x3,src=10.10.141.151,dst=10.10.141.174,ttl=0/0,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002/0x7fffffff}),flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(genev_sys_6081),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:5a:17:99,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0806),arp(sip=0.0.0.0/0.0.0.0,tip=0.0.0.0/0.0.0.0,op=0/0,sha=00:00:00:00:00:00/00:00:00:00:00:00,tha=00:00:00:00:00:00/00:00:00:00:00:00), packets:0, bytes:0, used:5.210s, offloaded:yes, dp:tc, actions:enp4s0f0_2
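Each flow in the dump above reports "offloaded:yes, dp:tc", confirming the datapath flows landed in TC hardware for both directions. A small sketch, under the assumption that the input is "ovs-appctl dpctl/dump-flows -m"-style output like the lines above, of flagging any flow that is not offloaded:

```python
# Hypothetical checker: scan a dump-flows text blob and return the ufids
# of datapath flows that are NOT reported as offloaded to TC hardware.
# An empty result means every flow carries "offloaded:yes, dp:tc".

def unoffloaded_flows(dump_text):
    bad = []
    for line in dump_text.splitlines():
        line = line.strip()
        if not line.startswith("ufid:"):
            continue  # skip blank lines and any non-flow output
        if "offloaded:yes" not in line or "dp:tc" not in line:
            bad.append(line.split(",", 1)[0])  # report the ufid only
    return bad

dump = (
    "ufid:e31a145a-05b0-4a00-82d5-8beb8b89d8b1, in_port(enp4s0f0_2), "
    "packets:0, bytes:0, offloaded:yes, dp:tc, actions:genev_sys_6081\n"
    "ufid:deadbeef-0000-0000-0000-000000000000, in_port(enp4s0f0_2), "
    "packets:5, bytes:500, dp:ovs, actions:genev_sys_6081\n"
)

print(unoffloaded_flows(dump))
# -> ['ufid:deadbeef-0000-0000-0000-000000000000']
```

Combined with an empty tcpdump on the representor ports, an empty result from a check like this is the verification performed in this comment.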
