Description of problem:
IPv6 traffic between two VFs on mlx5_core in an OVN setup is not offloaded.
IPv4 traffic between the same VFs is offloaded.

Version-Release number of selected component (if applicable):
openvswitch2.15-2.15.0-23.el8fdp.x86_64

How reproducible:
Always

Steps to Reproduce:
1. setup vf:
echo 4 > /sys/bus/pci/devices/0000:3b:00.0/sriov_numvfs
echo 0000:3b:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:3b:00.3 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:3b:00.4 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:3b:00.5 > /sys/bus/pci/drivers/mlx5_core/unbind
devlink dev eswitch set pci/0000:3b:00.0 mode switchdev

2. create guest and attach representer into guest:
virt-install --name g0 --vcpus=2 --ram=2048 --disk path=/var/lib/libvirt/images/g0.qcow2,device=disk,bus=virtio,format=qcow2 --network bridge=virbr0,model=virtio --boot hd --accelerate --force --graphics none --noautoconsole
virt-install --name g2 --vcpus=2 --ram=2048 --disk path=/var/lib/libvirt/images/g2.qcow2,device=disk,bus=virtio,format=qcow2 --network bridge=virbr0,model=virtio --boot hd --accelerate --force --graphics none --noautoconsole

cat vf.xml
<interface type='hostdev' managed='yes'>
  <source>
    <address type='pci' domain='0x0000' bus='0x3b' slot='0x00' function='0x2'/>
  </source>
  <mac address='00:00:00:01:01:13'/>
</interface>
virsh attach-device g0 vf.xml

cat vf.xml
<interface type='hostdev' managed='yes'>
  <source>
    <address type='pci' domain='0x0000' bus='0x3b' slot='0x00' function='0x4'/>
  </source>
  <mac address='00:00:00:01:02:13'/>
</interface>
virsh attach-device g2 vf.xml

3. add representer into ovn
ip link set eth0 down
ip link set eth0 name s_pf0vf0
ovs-vsctl add-port br-int s_pf0vf0 -- set interface s_pf0vf0 external_ids:iface-id=s_pf0vf0
ip link set s_pf0vf0 up
ip link set eth2 down
ip link set eth2 name s_pf0vf2
ovs-vsctl add-port br-int s_pf0vf2 -- set interface s_pf0vf2 external_ids:iface-id=s_pf0vf2
ip link set s_pf0vf2 up

ovn-nbctl ls-add ls1
ovn-nbctl ls-add ls2
ovn-nbctl lr-add lr1
ovn-nbctl lrp-add lr1 lr1-ls1 00:00:00:00:00:01 192.168.1.254/24 2001::a/64
ovn-nbctl lsp-add ls1 ls1-lr1
ovn-nbctl lsp-set-type ls1-lr1 router
ovn-nbctl lsp-set-options ls1-lr1 router-port=lr1-ls1
ovn-nbctl lsp-set-addresses ls1-lr1 router
ovn-nbctl lrp-add lr1 lr1-ls2 00:00:00:00:00:02 172.17.$ip_subnet.254/24 7777:$ip_subnet::a/64
ovn-nbctl lsp-add ls2 ls2-lr1
ovn-nbctl lsp-set-type ls2-lr1 router
ovn-nbctl lsp-set-options ls2-lr1 router-port=lr1-ls2
ovn-nbctl lsp-set-addresses ls2-lr1 router
ovn-nbctl lsp-add ls1 s_pf0vf0
ovn-nbctl lsp-set-addresses s_pf0vf0 "00:00:00:01:01:11 192.168.1.11 2001::11"
ovn-nbctl lsp-add ls2 s_pf0vf2
ovn-nbctl lsp-set-addresses s_pf0vf2 "00:00:00:01:02:11 172.17.174.11 7777:174::11"

4. enable hw offload for ovs:
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
systemctl restart openvswitch

5. ping6 7777:174::11 on g0.
Actual results:
IPv6 packets can be captured on the representer for s_pf0vf2: not hw offloaded

Expected results:
IPv6 packets should not be captured on the representer for s_pf0vf2: hw offloaded

Additional info:

[root@wsfd-advnetlab18 ~]# ovn-nbctl show
switch c36f3372-3e51-4c9a-a706-9bc447c3c913 (ls1)
    port s_pf1vf1
        addresses: ["00:00:00:01:01:16 192.168.1.16 2001::16"]
    port c_pf0vf1
        addresses: ["00:00:00:01:01:14 192.168.1.14 2001::14"]
    port c_pf0vf0
        addresses: ["00:00:00:01:01:13 192.168.1.13 2001::13"]
    port s_pf1vf0
        addresses: ["00:00:00:01:01:15 192.168.1.15 2001::15"]
    port ls1-lr1
        type: router
        router-port: lr1-ls1
    port s_pf0vf0
        addresses: ["00:00:00:01:01:11 192.168.1.11 2001::11"]
    port s_pf0vf1
        addresses: ["00:00:00:01:01:12 192.168.1.12 2001::12"]
switch 5dc10d4b-56d8-47a1-84d4-bf82599c078b (ls2)
    port c_pf0vf2
        addresses: ["00:00:00:01:02:13 172.17.174.13 7777:174::13"]
    port ls2-lr1
        type: router
        router-port: lr1-ls2
    port s_pf0vf3
        addresses: ["00:00:00:01:02:12 172.17.174.12 7777:174::12"]
    port c_pf0vf3
        addresses: ["00:00:00:01:02:14 172.17.174.14 7777:174::14"]
    port s_pf0vf2
        addresses: ["00:00:00:01:02:11 172.17.174.11 7777:174::11"]
router f8f6b318-42d0-45b9-8aad-f29c3cab59cf (lr1)
    port lr1-ls1
        mac: "00:00:00:00:00:01"
        networks: ["192.168.1.254/24", "2001::a/64"]
    port lr1-ls2
        mac: "00:00:00:00:00:02"
        networks: ["172.17.174.254/24", "7777:174::a/64"]

[root@wsfd-advnetlab18 ~]# ovs-appctl dpctl/dump-flows -m --names
ufid:005fab43-9392-4e7b-b4cd-049d0cf8a46b, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf0),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:01:11,dst=00:00:00:00:00:01),eth_type(0x0800),ipv4(src=192.168.1.0/255.255.255.128,dst=172.17.174.11,proto=1,tos=0/0,ttl=64,frag=no),icmp(type=0/0,code=0/0), packets:2268, bytes:222264, used:0.340s, offloaded:yes, dp:tc, actions:ct_clear,ct_clear,set(eth(src=00:00:00:00:00:02,dst=00:00:00:01:02:11)),set(ipv4(ttl=63)),s_pf0vf2 <==== ipv4 is offloaded
ufid:14952cf7-4625-4f21-9b13-47a74c217761, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf0),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:01:11,dst=00:00:00:00:00:01),eth_type(0x86dd),ipv6(src=2001::/ffff:ffff:ffff:ffff::,dst=7777:174::11,label=0/0,proto=58,tclass=0/0,hlimit=64,frag=no),icmpv6(type=0/0,code=0/0), packets:17, bytes:1768, used:0.981s, dp:tc, actions:ct_clear,ct_clear,set(eth(src=00:00:00:00:00:02,dst=00:00:00:01:02:11)),set(ipv6(hlimit=63)),s_pf0vf2 <==== ipv6 is not offloaded
ufid:a6055232-a637-4f7a-9615-c772d29e7675, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf2),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:02:11,dst=00:00:00:00:00:02),eth_type(0x0800),ipv4(src=172.17.174.0/255.255.255.128,dst=192.168.1.11,proto=1,tos=0/0,ttl=64,frag=no),icmp(type=0/0,code=0/0), packets:2268, bytes:222264, used:0.340s, offloaded:yes, dp:tc, actions:ct_clear,ct_clear,set(eth(src=00:00:00:00:00:01,dst=00:00:00:01:01:11)),set(ipv4(ttl=63)),s_pf0vf0
ufid:3e8cc87a-7746-411a-bff0-815ea790fa57, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf2),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:02:11,dst=00:00:00:00:00:02),eth_type(0x86dd),ipv6(src=7777:174::/ffff:ffff:ffff:ffff::,dst=2001::11,label=0/0,proto=58,tclass=0/0,hlimit=64,frag=no),icmpv6(type=0/0,code=0/0), packets:17, bytes:1768, used:0.980s, dp:tc, actions:ct_clear,ct_clear,set(eth(src=00:00:00:00:00:01,dst=00:00:00:01:01:11)),set(ipv6(hlimit=63)),s_pf0vf0

[root@wsfd-advnetlab18 ~]# uname -a
Linux wsfd-advnetlab18.anl.lab.eng.bos.redhat.com 4.18.0-305.el8.x86_64 #1 SMP Thu Apr 29 08:54:30 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

[root@wsfd-advnetlab18 ~]# rpm -qa | grep -E "openvswitch2.15|ovn-2.21"
python3-openvswitch2.15-2.15.0-23.el8fdp.x86_64
openvswitch2.15-2.15.0-23.el8fdp.x86_64
ovn-2021-21.03.0-40.el8fdp.x86_64
ovn-2021-host-21.03.0-40.el8fdp.x86_64
ovn-2021-central-21.03.0-40.el8fdp.x86_64

[root@wsfd-advnetlab18 ~]# cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-305.el8.x86_64 root=/dev/mapper/rhel_wsfd--advnetlab18-root ro crashkernel=auto resume=/dev/mapper/rhel_wsfd--advnetlab18-swap rd.lvm.lv=rhel_wsfd-advnetlab18/root rd.lvm.lv=rhel_wsfd-advnetlab18/swap console=ttyS1,115200 intel_iommu=on iommu=pt

[root@wsfd-advnetlab18 ~]# ethtool -i ens1f0
driver: mlx5e_rep
version: 4.18.0-305.el8.x86_64
firmware-version: 16.27.2008 (MT_0000000013)
expansion-rom-version:
bus-info: 0000:3b:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

[root@wsfd-advnetlab18 ~]# ip link sh ens1f0
249: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 0c:42:a1:08:0b:02 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether 00:00:00:01:01:11 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 1 link/ether 00:00:00:01:01:12 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 2 link/ether 00:00:00:01:02:11 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
    vf 3 link/ether 00:00:00:01:02:12 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off

[root@wsfd-advnetlab18 ~]# ethtool -i s_pf0vf0
driver: mlx5e_rep
version: 4.18.0-305.el8.x86_64
firmware-version: 16.27.2008 (MT_0000000013)
expansion-rom-version:
bus-info:
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

[root@wsfd-advnetlab18 ~]# ethtool -i s_pf0vf2
driver: mlx5e_rep
version: 4.18.0-305.el8.x86_64
firmware-version: 16.27.2008 (MT_0000000013)
expansion-rom-version:
bus-info:
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

[root@wsfd-advnetlab18 ~]# lspci | grep mell -i
3b:00.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
3b:00.1 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
3b:00.2 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
3b:00.3 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
3b:00.4 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
3b:00.5 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
3b:01.2 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
3b:01.3 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
Created attachment 1788771: flows for br-int
[root@wsfd-advnetlab18 ~]# ovs-vsctl show
928f0153-86ca-4031-9f1d-73287246b0aa
    Bridge br-int
        fail_mode: secure
        Port s_pf1vf0
            Interface s_pf1vf0
        Port s_pf0vf3
            Interface s_pf0vf3
        Port br-int
            Interface br-int
                type: internal
        Port s_pf0vf0
            Interface s_pf0vf0
        Port ovn-hv0-0
            Interface ovn-hv0-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="20.0.174.26"}
        Port s_pf0vf2
            Interface s_pf0vf2
        Port s_pf1vf1
            Interface s_pf1vf1
        Port s_pf0vf1
            Interface s_pf0vf1
    ovs_version: "2.15.1"
TCP and UDP are offloaded:

[root@wsfd-advnetlab16 ~]# ovs-appctl dpctl/dump-flows -m --names
ufid:176d34f0-cc49-490b-9234-e6997ba86aa5, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf0),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:01:11,dst=00:00:00:00:00:01),eth_type(0x0800),ipv4(src=192.168.1.0/255.255.255.128,dst=64.0.0.0/224.0.0.0,proto=0/0,tos=0/0,ttl=64,frag=no), packets:0, bytes:0, used:never, dp:tc, actions:ct_clear
ufid:cfedf95e-7647-4700-b80c-980d3df4f0d5, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf0),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:01:11,dst=00:00:00:00:00:01),eth_type(0x86dd),ipv6(src=2001::/ffff:ffff:ffff:ffff::,dst=7777:172::11,label=0/0,proto=6,tclass=0/0,hlimit=64,frag=no),tcp(src=0/0,dst=0/0), packets:2097272, bytes:3170661917, used:0.521s, offloaded:yes, dp:tc, actions:ct_clear,ct_clear,set(eth(src=00:00:00:00:00:02,dst=00:00:00:01:02:11)),set(ipv6(hlimit=63)),s_pf0vf2
ufid:d357c98f-62a1-4bc9-8e40-c826288aba02, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf2),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:02:11,dst=00:00:00:00:00:02),eth_type(0x86dd),ipv6(src=7777:172::/ffff:ffff:ffff:ffff::,dst=2001::11,label=0/0,proto=6,tclass=0/0,hlimit=64,frag=no),tcp(src=0/0,dst=0/0), packets:22151, bytes:1904998, used:0.520s, offloaded:yes, dp:tc, actions:ct_clear,ct_clear,set(eth(src=00:00:00:00:00:01,dst=00:00:00:01:01:11)),set(ipv6(hlimit=63)),s_pf0vf0
ufid:c7aef2c3-9dba-44fb-ab61-d14b2fd8d1f2, recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(s_pf0vf0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:01:01:11,dst=00:00:00:00:00:01),eth_type(0x0806),arp(sip=192.168.1.11,tip=192.168.1.254,op=1/0xff,sha=00:00:00:01:01:11,tha=00:00:00:00:00:00), packets:0, bytes:0, used:never, dp:ovs, actions:userspace(pid=3636835820,slow_path(action))

[root@wsfd-advnetlab16 ~]# ovs-appctl dpctl/dump-flows -m --names
ufid:d4f2d986-a742-4104-8859-6461588be0e5, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf0),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:01:11,dst=00:00:00:00:00:01),eth_type(0x86dd),ipv6(src=2001::/ffff:ffff:ffff:ffff::,dst=7777:172::11,label=0/0,proto=6,tclass=0/0,hlimit=64,frag=no),tcp(src=0/0,dst=0/0), packets:10, bytes:1023, used:0.180s, offloaded:yes, dp:tc, actions:ct_clear,ct_clear,set(eth(src=00:00:00:00:00:02,dst=00:00:00:01:02:11)),set(ipv6(hlimit=63)),s_pf0vf2
ufid:e76a055c-ecfb-4d39-8be8-2ce235982e6c, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf0),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:01:11,dst=00:00:00:00:00:01),eth_type(0x86dd),ipv6(src=2001::/ffff:ffff:ffff:ffff::,dst=7777:172::11,label=0/0,proto=17,tclass=0/0,hlimit=64,frag=no),udp(src=0/0,dst=0/0), packets:92, bytes:137080, used:0.180s, offloaded:yes, dp:tc, actions:ct_clear,ct_clear,set(eth(src=00:00:00:00:00:02,dst=00:00:00:01:02:11)),set(ipv6(hlimit=63)),s_pf0vf2
ufid:e1ca52da-64d6-4d00-a2b6-0bf533004e8b, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf2),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:02:11,dst=00:00:00:00:00:02),eth_type(0x86dd),ipv6(src=7777:172::/ffff:ffff:ffff:ffff::,dst=2001::11,label=0/0,proto=6,tclass=0/0,hlimit=64,frag=no),tcp(src=0/0,dst=0/0), packets:7, bytes:607, used:0.181s, offloaded:yes, dp:tc, actions:ct_clear,ct_clear,set(eth(src=00:00:00:00:00:01,dst=00:00:00:01:01:11)),set(ipv6(hlimit=63)),s_pf0vf0
ufid:21199520-c3b6-4715-a4fb-959170e300db, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf2),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:02:11,dst=00:00:00:00:00:02),eth_type(0x86dd),ipv6(src=7777:172::/ffff:ffff:ffff:ffff::,dst=2001::11,label=0/0,proto=17,tclass=0/0,hlimit=64,frag=no),udp(src=0/0,dst=0/0), packets:0, bytes:0, used:1.220s, offloaded:yes, dp:tc, actions:ct_clear,ct_clear,set(eth(src=00:00:00:00:00:01,dst=00:00:00:01:01:11)),set(ipv6(hlimit=63)),s_pf0vf0

Only ICMPv6 is not offloaded.
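Spotting the non-offloaded entries by eye in these dumps is tedious. As a rough aid, here is a minimal Python sketch that scans `ovs-appctl dpctl/dump-flows -m --names` text for flows that went through TC (`dp:tc`) but lack `offloaded:yes`. The function name `unoffloaded_tc_flows` is hypothetical, and the sketch assumes one flow per line, as the CLI prints them.

```python
import re


def unoffloaded_tc_flows(dump):
    """Return a list of dicts for dp:tc flows that are not marked offloaded.

    `dump` is the raw text of `ovs-appctl dpctl/dump-flows -m --names`,
    assumed to contain one flow per line, each starting with "ufid:".
    """
    flows = []
    for line in dump.splitlines():
        line = line.strip()
        if not line.startswith("ufid:"):
            continue  # skip shell prompts and blank lines
        if "dp:tc" not in line or "offloaded:yes" in line:
            continue  # only keep TC flows that stayed in software
        eth_type = re.search(r"eth_type\((0x[0-9a-fA-F]+)\)", line)
        # For a wildcarded field such as "proto=0/0" this captures the value 0.
        ip_proto = re.search(r"proto=(\d+)", line)
        flows.append({
            "ufid": line.split(",", 1)[0][len("ufid:"):],
            "eth_type": eth_type.group(1) if eth_type else None,
            "ip_proto": int(ip_proto.group(1)) if ip_proto else None,
        })
    return flows
```

Fed the dump from comment #0, this should report the two `eth_type(0x86dd)`, `proto=58` flows. Flows that never saw traffic (`used:never`) and carry no `offloaded:yes` marker are reported too, so the result still needs a human look.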
Hi Alaa,

Any known limitation on offloading icmp6 as above in comment #0? It is using dp:tc, but not getting offloaded.

Jianlin, maybe you can grab an extack message out of the failure to offload with the new perf probe that was added via
https://bugzilla.redhat.com/show_bug.cgi?id=1956983

Thanks,
Marcelo
(In reply to Marcelo Ricardo Leitner from comment #4)
> Hi Alaa,
>
> Any known limitation on offloading icmp6 as above in comment #0? It is using
> dp:tc, but not getting offloaded.
>
> Jianlin, maybe you can grab an extack message out of the failure to offload
> with the new perf probe that was added via
> https://bugzilla.redhat.com/show_bug.cgi?id=1956983
>
> Thanks,
> Marcelo

[root@wsfd-advnetlab16 ~]# uname -a
Linux wsfd-advnetlab16.anl.lab.eng.bos.redhat.com 4.18.0-312.el8.x86_64 #1 SMP Wed Jun 2 16:30:46 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

[root@wsfd-advnetlab16 ~]# perf script -v
build id event received for [kernel.kallsyms]: d62e5584320038183a19ae49ebdee183cbdf29b5 [20]
build id event received for [vdso]: 8eebcc6a6fa31251db815633bb5eb9259bbbcb55 [20]
Looking at the vmlinux_path (8 entries long)
symsrc__init: cannot get elf header.
Using /proc/kcore for kernel data
Using /proc/kallsyms for symbols
handler2 8892 [010] 11156.013083: netlink:netlink_extack: msg=mlx5_core: can't offload TC csum action for some header/s
handler2 8892 [010] 11156.029937: netlink:netlink_extack: msg=mlx5_core: can't offload TC csum action for some header/s
Thanks, Jianlin.

There is another message in dmesg log with the problematic flag.
From your system:

/var/log/messages:Jun 16 04:56:28 wsfd-advnetlab16 kernel: s_pf0vf0: can't offload TC csum action for some header/s - flags 0x2

Flag 0x2 is TCA_CSUM_UPDATE_FLAG_ICMP
And we can see in function csum_offload_supported() that csum on ICMP is not supported:
https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c#L2836

rewrite ICMP header + TC csum action is not supported, so closing the BZ.

Regards
Alaa
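For readers who want to decode other `flags` values from that dmesg message, here is a small Python sketch. The bit values are those of the kernel UAPI header `linux/tc_act/tc_csum.h`. The `MLX5_SUPPORTED` mask (IPv4 header, TCP, UDP) is an assumption based on a reading of `csum_offload_supported()` and may differ between kernel versions; the helper names `decode_csum_flags` and `unsupported_csum_flags` are hypothetical.

```python
# Bit values as defined in the Linux UAPI header linux/tc_act/tc_csum.h.
TCA_CSUM_UPDATE_FLAGS = {
    0x01: "TCA_CSUM_UPDATE_FLAG_IPV4HDR",
    0x02: "TCA_CSUM_UPDATE_FLAG_ICMP",
    0x04: "TCA_CSUM_UPDATE_FLAG_IGMP",
    0x08: "TCA_CSUM_UPDATE_FLAG_TCP",
    0x10: "TCA_CSUM_UPDATE_FLAG_UDP",
    0x20: "TCA_CSUM_UPDATE_FLAG_UDPLITE",
    0x40: "TCA_CSUM_UPDATE_FLAG_SCTP",
}

# Assumption: the set mlx5 accepts, as read from csum_offload_supported().
# Check the kernel version in use before relying on this.
MLX5_SUPPORTED = 0x01 | 0x08 | 0x10


def decode_csum_flags(flags):
    """Return the names of all csum update flags set in `flags`."""
    return [name for bit, name in sorted(TCA_CSUM_UPDATE_FLAGS.items())
            if flags & bit]


def unsupported_csum_flags(flags):
    """Return the names of flags outside the assumed mlx5-supported set."""
    return decode_csum_flags(flags & ~MLX5_SUPPORTED)
```

With the value from the log, `decode_csum_flags(0x2)` yields only the ICMP flag, and that flag falls outside the assumed supported set.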
(In reply to Alaa Hleihel (NVIDIA Mellanox) from comment #6)
> Thanks, Jianlin.
>
> There is another message in dmesg log with the problematic flag.
> From your system:
>
> /var/log/messages:Jun 16 04:56:28 wsfd-advnetlab16 kernel: s_pf0vf0: can't
> offload TC csum action for some header/s - flags 0x2
>
> Flag 0x2 is TCA_CSUM_UPDATE_FLAG_ICMP
> And we can see in function csum_offload_supported() that csum on ICMP is not
> supported:
> https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c#L2836
>
> rewrite ICMP header + TC csum action is not supported, so closing the BZ.

@dumitru, is the "rewrite ICMP header + TC csum action" required by the OVS flows added by OVN? The topology is described in the description: ls-lr-ls.
(In reply to Jianlin Shi from comment #7)
> @dumitru , is the "rewrite ICMP header + TC csum action" required by ovs
> flow added by ovn? the topo is described in description: ls-lr-ls

OVN adds flows that manipulate headers, including ICMP/ICMPv6. OVN does *not* control hw offload and any TC rules, that's external, and handled by OVS.

However, looking at the flow that wasn't offloaded I don't see any ICMPv6 header changes:

ufid:3e8cc87a-7746-411a-bff0-815ea790fa57, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf2),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:02:11,dst=00:00:00:00:00:02),eth_type(0x86dd),ipv6(src=7777:174::/ffff:ffff:ffff:ffff::,dst=2001::11,label=0/0,proto=58,tclass=0/0,hlimit=64,frag=no),icmpv6(type=0/0,code=0/0), packets:17, bytes:1768, used:0.980s, dp:tc, actions:ct_clear,ct_clear,set(eth(src=00:00:00:00:00:01,dst=00:00:00:01:01:11)),set(ipv6(hlimit=63)),s_pf0vf0

The only ipv6 change there is decrementing hlimit due to routing.

Regards,
Dumitru
(In reply to Dumitru Ceara from comment #8)
> However, looking at the flow that wasn't offloaded I don't see any ICMPv6
> header changes:

Good point.

> ufid:3e8cc87a-7746-411a-bff0-815ea790fa57, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(s_pf0vf2),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:01:02:11,dst=00:00:00:00:00:02),eth_type(0x86dd),ipv6(src=7777:174::/ffff:ffff:ffff:ffff::,dst=2001::11,label=0/0,proto=58,tclass=0/0,hlimit=64,frag=no),icmpv6(type=0/0,code=0/0), packets:17, bytes:1768, used:0.980s, dp:tc, actions:ct_clear,ct_clear,set(eth(src=00:00:00:00:00:01,dst=00:00:00:01:01:11)),set(ipv6(hlimit=63)),s_pf0vf0
>
> The only ipv6 change there is decrementing hlimit due to routing.

That 0x2 from dmesg in comment #6 means TCA_CSUM_UPDATE_FLAG_ICMP, and apparently nobody other than OVS specifies it.

    static inline int
    csum_update_flag(struct tc_flower *flower,
                     enum pedit_header_type htype) {
        /* Explictily specifiy the csum flags so HW can return EOPNOTSUPP
         * if it doesn't support a checksum recalculation of some headers.
         * And since OVS allows a flow such as
         * eth(dst=<mac>),eth_type(0x0800) actions=set(ipv4(src=<new_ip>))
         * we need to force a more specific flow as this can, for example,
         * need a recalculation of icmp checksum if the packet that passes
         * is ICMPv6 and tcp checksum if its tcp. */

        switch (htype) {
        case TCA_PEDIT_KEY_EX_HDR_TYPE_IP4:
            flower->csum_update_flags |= TCA_CSUM_UPDATE_FLAG_IPV4HDR;
            /* Fall through. */
        case TCA_PEDIT_KEY_EX_HDR_TYPE_IP6:
        case TCA_PEDIT_KEY_EX_HDR_TYPE_TCP:
        case TCA_PEDIT_KEY_EX_HDR_TYPE_UDP:
            if (flower->key.ip_proto == IPPROTO_TCP) {
                flower->needs_full_ip_proto_mask = true;
                flower->csum_update_flags |= TCA_CSUM_UPDATE_FLAG_TCP;
            } else if (flower->key.ip_proto == IPPROTO_UDP) {
                flower->needs_full_ip_proto_mask = true;
                flower->csum_update_flags |= TCA_CSUM_UPDATE_FLAG_UDP;
            } else if (flower->key.ip_proto == IPPROTO_ICMP) {
                flower->needs_full_ip_proto_mask = true;
            } else if (flower->key.ip_proto == IPPROTO_ICMPV6) {
                flower->needs_full_ip_proto_mask = true;
                flower->csum_update_flags |= TCA_CSUM_UPDATE_FLAG_ICMP;

But AFAICT from nl_msg_put_flower_rewrite_pedits, that should be handling the per-header pedit requests.

Let's reopen this one for now. There's still smoke coming from this bush.
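To make the effect of the quoted branches concrete, here is a small Python model of them. It is only a sketch of the excerpt shown above: the excerpt is cut off, so other header types and other protocols are not modelled, and the function name `csum_update_flags` and the string header-type labels are hypothetical stand-ins for the C enum values.

```python
# IP protocol numbers (IANA) and TC csum flag bits (linux/tc_act/tc_csum.h).
IPPROTO_ICMP, IPPROTO_TCP, IPPROTO_UDP, IPPROTO_ICMPV6 = 1, 6, 17, 58
FLAG_IPV4HDR, FLAG_ICMP, FLAG_TCP, FLAG_UDP = 0x01, 0x02, 0x08, 0x10


def csum_update_flags(htype, ip_proto):
    """Model the csum flags requested for one pedit on header `htype`.

    `htype` is one of "IP4", "IP6", "TCP", "UDP" (hypothetical labels for
    TCA_PEDIT_KEY_EX_HDR_TYPE_*). Only the branches visible in the quoted
    excerpt are modelled; anything else returns 0 here.
    """
    flags = 0
    if htype == "IP4":
        flags |= FLAG_IPV4HDR  # then falls through to the shared L4 logic
    if htype in ("IP4", "IP6", "TCP", "UDP"):
        if ip_proto == IPPROTO_TCP:
            flags |= FLAG_TCP
        elif ip_proto == IPPROTO_UDP:
            flags |= FLAG_UDP
        elif ip_proto == IPPROTO_ICMPV6:
            flags |= FLAG_ICMP
        # IPPROTO_ICMP: the excerpt adds no L4 checksum flag for ICMPv4.
    return flags
```

Under this model, an IPv6 header rewrite such as `set(ipv6(hlimit=63))` on a `proto=58` flow yields 0x2, which matches the `flags 0x2` value in dmesg, while the equivalent IPv4 rewrite on a `proto=1` flow yields only the IPv4 header flag. That is consistent with the IPv4 ICMP flow being offloaded and the ICMPv6 one not.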
It could be that tc.c:calc_offsets() is calculating something wrongly, confusing csum_update_flag() above. But this is just a theory ATM.
(In reply to Alaa Hleihel (NVIDIA Mellanox) from comment #6)
> Thanks, Jianlin.
>
> There is another message in dmesg log with the problematic flag.
> From your system:
>
> /var/log/messages:Jun 16 04:56:28 wsfd-advnetlab16 kernel: s_pf0vf0: can't
> offload TC csum action for some header/s - flags 0x2
>
> Flag 0x2 is TCA_CSUM_UPDATE_FLAG_ICMP
> And we can see in function csum_offload_supported() that csum on ICMP is not
> supported:
> https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c#L2836
>
> rewrite ICMP header + TC csum action is not supported, so closing the BZ.

Will this kind of operation be supported in the future?
By the way, is there any documentation on what the hw offload supports?
(In reply to Jianlin Shi from comment #11)
> will this kind of operation be supported in the future?

No, it's a HW limitation, so it won't be supported.
When doing a header rewrite, OVS can ask for the relevant L3/L4 checksums to be recalculated, and we do not support that for ICMPv6.

> btw, do you have any documentation about what the hw offload support?

Unfortunately, there is no such document...
What I don't follow is why icmp6 checksum calc is getting activated. It could be because it's matching against icmp type, but then, if it's a match, it shouldn't need to recompute the checksum. "...,icmpv6(type=0/0,code=0/0)..." maybe ovs is just being extra safe?
Please note that for HWOL I'm considering this bug a low-priority one. ICMP is really low-volume traffic and doesn't impact the solution much. If you disagree, please comment. Thanks.
(In reply to Marcelo Ricardo Leitner from comment #13)
> What I don't follow is why icmp6 checksum calc is getting activated.
> It could be because it's matching against icmp type, but then, if it's a
> match, it shouldn't need to recompute the checksum.
> "...,icmpv6(type=0/0,code=0/0)..."
> maybe ovs is just being extra safe?

Hrrm... maybe we should consider a change like the following:

https://github.com/orgcandman/ovs/tree/rfc_csum_ip6

WDYT?
It's a step in the right direction, I think, but I don't get why some protocols got 'true' for the new flag, such as TCA_PEDIT_KEY_EX_HDR_TYPE_ETH and TCA_PEDIT_KEY_EX_HDR_TYPE_TCP.