Bug 1685616
| Summary: | TCP checksum issues when using kernel space OVS with netdev datapath | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Andreas Karis <akaris> |
| Component: | openvswitch | Assignee: | Eelco Chaudron <echaudro> |
| Status: | CLOSED WONTFIX | QA Contact: | ovs-qe |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.6 | CC: | atragler, maxime.coquelin, ovs-qe, tredaelli |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-04-01 13:17:37 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Andreas Karis
2019-03-05 16:37:50 UTC
In the above reproducer, I missed the most important step:

~~~
[root@overcloud-compute-0 ~]# ovs-vsctl set bridge test datapath_type=netdev
~~~

Hence, the beginning of the instructions should be as follows ...

Start OVS in kernel space mode (that is, do not configure any DPDK options), then add a new bridge, namespaces, and veth pairs:

~~~
ovs-vsctl add-br test
ip link add name right1 type veth peer name left1
ip link add name right2 type veth peer name left2
ovs-vsctl add-port test left1
ovs-vsctl add-port test left2
ip link set dev left1 up
ip link set dev left2 up
ip netns add netns1
ip netns add netns2
ip link set dev right1 netns netns1
ip link set dev right2 netns netns2
ip netns exec netns1 ip link set dev lo up
ip netns exec netns1 ip link set dev right1 up
ip netns exec netns2 ip link set dev right2 up
ip netns exec netns2 ip link set dev lo up
ip netns exec netns1 ip a a dev right1 192.168.0.1/24
ip netns exec netns2 ip a a dev right2 192.168.0.2/24
ovs-vsctl set bridge test datapath_type=netdev
~~~

After some discussion with upstream, this is a perfectly valid configuration, as the netdev datapath supports linux type devices just fine. The only thing to keep in mind is how they are handled. Linux devices are accessed using a PF_PACKET socket and are processed in the main thread (so not in the PMD threads). As packets are read through the PF_PACKET interface, additional information related to HW offload features is stripped.

Or as Ilya explained it upstream:

"""
The issue here is that OVS netdev datapath doesn't support TX checksum offloading (this is not an easy task with arguable profit), i.e. if a packet arrives with bad/no checksum it will be sent to the output port with the same bad/no checksum. Everything works in case of the kernel datapath because the packet doesn't leave the kernel space.

In case of the netdev datapath some information (like CHECKSUM_VALID skb flags) is lost while receiving via socket in userspace, and subsequently the kernel expects a valid checksum while receiving the packet from userspace because TX offloading is not enabled.

This kind of issue is usually mitigated by disabling TX offloading on the "right*" interfaces, or by setting iptables to fill the checksums like this:

    iptables -A POSTROUTING -t mangle -p udp -m udp -j CHECKSUM --checksum-fill

Some related OpenStack bug: https://bugs.launchpad.net/neutron/+bug/1244589

Also, note that this happens only for virtual interfaces like veth/tap because the kernel always tries to delay checksum calculation/validation as much as possible. Correct packets received from the wire will always have correct checksums.
"""

In addition, to make spotting this unwanted configuration easier, the following patch was sent upstream and applied: https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/357438.html

Also, the following patch was sent to make sure patch ports only work with compatible bridges: https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/357466.html
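
The first mitigation named in the quote, disabling TX offloading on the "right*" interfaces, could look roughly like the sketch below for the reproducer's two namespaces. The quote does not say which tool to use; `ethtool -K <dev> tx off` is assumed here, and the helper names `run` and `disable_tx_offload` as well as the `DRY_RUN` variable are illustrative only, not part of OVS or of the bug report. By default the script only prints the commands, so it can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch: turn off TX checksum offload on the namespace side of each
# veth pair, so the kernel fills in checksums before packets reach the
# netdev (userspace) datapath. Assumes ethtool is installed.

# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 executes them (needs root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

disable_tx_offload() {
    # $1 = network namespace, $2 = interface inside that namespace
    run ip netns exec "$1" ethtool -K "$2" tx off
}

# Namespace and interface names are taken from the reproducer above.
disable_tx_offload netns1 right1
disable_tx_offload netns2 right2
```

After running it with `DRY_RUN=0`, `ip netns exec netns1 ethtool -k right1` should report `tx-checksumming: off`, and `ovs-vsctl get bridge test datapath_type` can be used to confirm that the bridge is indeed using the netdev datapath.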