
Bug 1685616

Summary: TCP checksum issues when using kernel space OVS with netdev datapath
Product: Red Hat Enterprise Linux 7
Component: openvswitch
Version: 7.6
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Reporter: Andreas Karis <akaris>
Assignee: Eelco Chaudron <echaudro>
QA Contact: ovs-qe
CC: atragler, maxime.coquelin, ovs-qe, tredaelli
Last Closed: 2019-04-01 13:17:37 UTC
Type: Bug

Description Andreas Karis 2019-03-05 16:37:50 UTC
Description of problem:
In https://bugzilla.redhat.com/show_bug.cgi?id=1670169 , I found issues with TCP checksum offloading when kernel space OVS is used and the bridge is switched to datapath_type netdev. While I find this combination highly odd, we *do* permit it, and hence we have 2 options:
* fix the TCP checksum offload issues 
* deny this combination of settings

Version-Release number of selected component (if applicable):
I tested with a recent test build of OVS, but older versions of OVS 2.9 are affected as well:
~~~
[root@overcloud-compute-0 ~]# rpm -qa | grep openvswitch
openvswitch-ovn-central-2.9.0-98.el7fdn.x86_64
openstack-neutron-openvswitch-9.4.1-32.el7ost.noarch
openvswitch-2.9.0-98.el7fdn.x86_64
python-openvswitch-2.9.0-98.el7fdn.x86_64
openvswitch-ovn-common-2.9.0-98.el7fdn.x86_64
openvswitch-selinux-extra-policy-1.0-3.el7fdp.noarch
openvswitch-ovn-host-2.9.0-98.el7fdn.x86_64
~~~

How reproducible:

Start OVS in kernel space mode (hence do not configure DPDK stuff), then add a new bridge, namespaces, veth pairs:
~~~
ovs-vsctl add-br test
ip link add name right1 type veth peer name left1
ip link add name right2 type veth peer name left2
ovs-vsctl add-port test left1
ovs-vsctl add-port test left2
ip link set dev left1 up
ip link set dev left2 up
ip netns add netns1
ip netns add netns2
ip link set dev right1 netns netns1
ip link set dev right2 netns netns2
ip netns exec netns1 ip link set dev lo up
ip netns exec netns1 ip link set dev right1 up
ip netns exec netns2 ip link set dev right2 up
ip netns exec netns2 ip link set dev lo up
ip netns exec netns1 ip a a dev right1 192.168.0.1/24
ip netns exec netns2 ip a a dev right2 192.168.0.2/24
~~~

Now, make sure to disable TX offloading on the server side and start netcat:
~~~
[root@overcloud-compute-0 ~]# ip netns exec netns2 ethtool -K right2 tx off
Actual changes:
tx-checksumming: off
        tx-checksum-ip-generic: off
        tx-checksum-sctp: off
tcp-segmentation-offload: off
        tx-tcp-segmentation: off [requested on]
        tx-tcp-ecn-segmentation: off [requested on]
        tx-tcp6-segmentation: off [requested on]
        tx-tcp-mangleid-segmentation: off [requested on]
udp-fragmentation-offload: off [requested on]
[root@overcloud-compute-0 ~]# ip netns exec netns2 nc -k -l -p 8000
~~~

With this, we can now run the following tests on the client side:

i) working, tx csum offloading off:
~~~
[root@overcloud-compute-0 ~]# ip netns exec netns1 ethtool -K right1 tx off
Actual changes:
tx-checksumming: off
        tx-checksum-ip-generic: off
        tx-checksum-sctp: off
tcp-segmentation-offload: off
        tx-tcp-segmentation: off [requested on]
        tx-tcp-ecn-segmentation: off [requested on]
        tx-tcp6-segmentation: off [requested on]
        tx-tcp-mangleid-segmentation: off [requested on]
udp-fragmentation-offload: off [requested on]
[root@overcloud-compute-0 ~]# echo "test" | ip netns exec netns1 nc 192.168.0.2 8000
~~~

ii) not working, tx csum offloading on:
~~~
[root@overcloud-compute-0 ~]# ip netns exec netns1 ethtool -K right1 tx on
Actual changes:
tx-checksumming: on
        tx-checksum-ip-generic: on
        tx-checksum-sctp: on
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp6-segmentation: on
        tx-tcp-mangleid-segmentation: on
udp-fragmentation-offload: on
[root@overcloud-compute-0 ~]# echo "test" | ip netns exec netns1 nc 192.168.0.2 8000
Ncat: Connection timed out.
~~~

iii) Switch the datapath_type of the bridge to "system":
~~~
[root@overcloud-compute-0 ~]# ovs-vsctl set bridge test datapath_type=system
[root@overcloud-compute-0 ~]# ip netns exec netns2 nc -k -l -p 8000
test
~~~

And this will work even with csum offloading on:
~~~
[root@overcloud-compute-0 ~]# echo "test" | ip netns exec netns1 nc 192.168.0.2 8000
[root@overcloud-compute-0 ~]#
~~~
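
To confirm that the timeout in case ii) comes from checksum validation on the receiving side, one option is to compare the TCP checksum error counter in netns2 before and after the failing attempt. This is only a sketch: it assumes `nstat` from iproute2 is installed and that the counter appears under the name `TcpInCsumErrors`; the helper name `tcp_csum_errors` is made up for this example.

~~~shell
# Extract the TcpInCsumErrors value from `nstat -az` output read on stdin.
# (Hypothetical helper, not part of OVS or iproute2.)
tcp_csum_errors() {
    awk '$1 == "TcpInCsumErrors" { print $2 }'
}

# On the test host (needs root), run before and after the failing attempt:
#   ip netns exec netns2 nstat -az TcpInCsumErrors | tcp_csum_errors
~~~

If the value grows while the client retries with the netdev datapath and stays flat with datapath_type=system, that would point at the receiver dropping segments because of bad checksums.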

Additional info:
The combination of netdev + kernel OVS may not be desirable. If so, we need to make sure that users cannot configure this combination, or we should at least log a big WARN message.

Or, if we allow this combination, then we need to find out where this issue is coming from.

Comment 2 Andreas Karis 2019-03-05 16:51:20 UTC
In the above reproducer, I missed the most important step:
~~~
[root@overcloud-compute-0 ~]# ovs-vsctl set bridge test datapath_type=netdev
~~~

Hence, the beginning of the instructions should be as follows ...

Start OVS in kernel space mode (hence do not configure DPDK stuff), then add a new bridge, namespaces, veth pairs:
~~~
ovs-vsctl add-br test
ip link add name right1 type veth peer name left1
ip link add name right2 type veth peer name left2
ovs-vsctl add-port test left1
ovs-vsctl add-port test left2
ip link set dev left1 up
ip link set dev left2 up
ip netns add netns1
ip netns add netns2
ip link set dev right1 netns netns1
ip link set dev right2 netns netns2
ip netns exec netns1 ip link set dev lo up
ip netns exec netns1 ip link set dev right1 up
ip netns exec netns2 ip link set dev right2 up
ip netns exec netns2 ip link set dev lo up
ip netns exec netns1 ip a a dev right1 192.168.0.1/24
ip netns exec netns2 ip a a dev right2 192.168.0.2/24
ovs-vsctl set bridge test datapath_type=netdev
~~~
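
Since the missing step is easy to overlook, it may help to check which datapath the bridge actually uses before running the tests. The sketch below classifies the value printed by `ovs-vsctl get bridge test datapath_type`; the helper name `datapath_kind` is made up for this example, and it assumes an empty value (printed as `""`) means the default kernel ("system") datapath.

~~~shell
# Classify the value printed by `ovs-vsctl get bridge <br> datapath_type`.
# (Hypothetical helper.) Quotes are stripped because ovs-vsctl may print
# string values quoted; an empty value is treated as the default "system".
datapath_kind() {
    case "$(printf '%s' "$1" | tr -d '"')" in
        netdev)    echo netdev ;;
        ''|system) echo system ;;
        *)         echo unknown ;;
    esac
}

# On the test host (needs root and a running ovs-vswitchd):
#   datapath_kind "$(ovs-vsctl get bridge test datapath_type)"
~~~

The reproducer is only expected to fail when this reports `netdev`.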

Comment 5 Eelco Chaudron 2019-03-25 10:53:15 UTC
After some discussion upstream, the conclusion is that this is a perfectly valid configuration, as the netdev datapath supports Linux-type devices just fine. The only thing to keep in mind is how they are handled.

Linux devices are accessed using a PF_PACKET socket and are processed in the main thread (so not in the PMD threads). As packets are read through the PF_PACKET interface, additional information related to HW offload features is stripped. Or as Ilya explained it upstream:

"""
The issue here is that OVS netdev datapath doesn't
support TX checksum offloading (this is not easy task with arguable profit).
i.e. if packet arrives with bad/no checksum it will be sent to the output port
with same bad/no checksum. Everything works in case of kernel datapath because
the packet doesn't leave the kernel space. In case of netdev datapath some
information (like CHECKSUM_VALID skb flags) is lost while receiving via
socket in userspace and subsequently kernel expects valid checksum while
receiving the packet from userspace because TX offloading is not enabled.

This kind of issues usually mitigated by disabling TX offloading on the
"right*" interfaces, or by setting iptables to fill the checksums like this:

iptables -A POSTROUTING -t mangle -p udp -m udp -j CHECKSUM --checksum-fill

Some related OpenStack bug: https://bugs.launchpad.net/neutron/+bug/1244589

Also, note that this happens only for virtual interfaces like veth/tap because
kernel always tries to delay checksum calculation/validation as much as possible.
Correct packets received from the wire will always have correct checksums.
"""
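
Applied to the reproducer from the description, the first mitigation Ilya mentions (disabling TX offloading on the "right*" interfaces) could look roughly like the sketch below. The function name `disable_tx_offload` and the `APPLY` switch are made up for this example; the `ethtool -K ... tx off` command itself is the one already used in the description. By default the function only prints the command, so it can be reviewed before being run as root.

~~~shell
# Print (or, with APPLY=1, run as root) the command that disables TX
# checksum offload on an interface inside a network namespace.
# (Hypothetical helper.) $1 = namespace, $2 = interface in that namespace.
disable_tx_offload() {
    set -- ip netns exec "$1" ethtool -K "$2" tx off
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

disable_tx_offload netns1 right1
disable_tx_offload netns2 right2
~~~

With TX offloading off on both namespace-side interfaces, the kernel fills in checksums before the packets reach the PF_PACKET socket, which matches the working case i) in the description.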

In addition, to make this unwanted configuration easier to spot, the following patch was sent upstream and applied:

  https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/357438.html


Also, the following patch was sent to make sure patch ports only work with compatible bridges:

  https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/357466.html