Bug 1565205

Summary: TCP packet re-ordering noticed between instance on same compute using dpdk network
Product: Red Hat OpenStack Reporter: Jaison Raju <jraju>
Component: openvswitch Assignee: Aaron Conole <aconole>
Status: CLOSED CURRENTRELEASE QA Contact: Ofer Blaut <oblaut>
Severity: high Docs Contact:
Priority: high    
Version: 10.0 (Newton) CC: aconole, aguetta, akaris, amcleod, apevec, asoni, atelang, atragler, cfields, cfontain, chrisw, dwojewod, fbaudin, fherrman, fsoppels, glamb, jraju, ksundara, mcroce, mowens, nchandek, pablo.iranzo, rhos-maint, rkhan, srevivo, supadhya, tredaelli, vkhitrin, yrachman
Target Milestone: async Keywords: TestOnly, Triaged, ZStream
Target Release: 10.0 (Newton)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openvswitch-2.6.1-31.git20180130 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-07 18:02:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Patched RPM none
Patched RPM - not src version none
openvswitch package with the EMC reorder backport none

Description Jaison Raju 2018-04-09 15:27:39 UTC
Description of problem:
TCP packet re-ordering is seen between instances on the same compute node using a dpdk network.

A 1GB file is downloaded over http with wget from one instance to another. Both instances run on the same compute node and are attached to a vxlan tenant netdev/dpdk network.
TCP packets are seen out of order by ~0.1ms.
These are not drops, but actual segments arriving late.
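
For reference, late segments of this kind can be picked out of a capture by comparing each segment's sequence number with the highest one already seen. The following is only a minimal sketch: the function name and the (time, seq, length) tuple layout are hypothetical, sequence-number wraparound is ignored, and a retransmission would be flagged the same way as a genuinely late segment.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout: (arrival time in seconds, TCP sequence number, payload length)
Segment = Tuple[float, int, int]


def find_late_segments(segments: Iterable[Segment]) -> List[Tuple[int, float]]:
    """Return (seq, lateness) for every segment that arrives after a
    segment with a higher sequence number has already been seen."""
    late = []
    max_seq = -1         # highest sequence number seen so far
    max_seq_time = 0.0   # arrival time of the segment that carried max_seq
    for ts, seq, _length in segments:
        if seq < max_seq:
            # Arrived behind newer data: record how late it was.
            late.append((seq, ts - max_seq_time))
        else:
            max_seq = seq
            max_seq_time = ts
    return late


# Made-up sample data: the segment with seq 2448 arrives after seq 3896.
capture = [
    (0.0000, 1000, 1448),
    (0.0001, 3896, 1448),
    (0.0002, 2448, 1448),
    (0.0003, 5344, 1448),
]
late = find_late_segments(capture)
```

With the sample data, only the segment with sequence number 2448 is reported, roughly 0.1ms behind the segment that overtook it.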

Version-Release number of selected component (if applicable):
RHOS10
openvswitch-2.6.1-13.git20161206.el7ost.x86_64

How reproducible:
Only in customer environment

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
i. Both instances use a single queue and run on the same compute node.
ii. Confirmed that a single vcpu is servicing eth2 interrupts in the guest.
iii. Non-isolated cpus are used for the guest. CPUAffinity is moved away from these cores via systemd though.
iv. HT enabled.
v. Tested with 4 pmd cores and with a single pmd core.
The issue is still seen in both cases.
vi. MTU of vhu interface is 1950, but tcpdump indicates 1514 segment size.

Despite fragmentation, we still don't expect packets to go out of order.

Comment 7 Jaison Raju 2018-04-25 07:52:34 UTC
I think these 1514-byte frames are seen because the MTU on br-tun is 1500.

This bridge is automatically created by neutron.
I don't see any option to configure an MTU for br-tun. Any suggestions?

I have asked the customer to test again with:
sudo ip link set dev br-tun mtu 1950
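
The 1514 figure is consistent with a 1500-byte MTU once the Ethernet header is counted, since tcpdump reports the frame length including the 14-byte header. A small sketch of that arithmetic follows; the helper name is hypothetical, and it assumes untagged frames (no VLAN tag) with the FCS not captured.

```python
ETH_HEADER = 14  # destination MAC (6) + source MAC (6) + EtherType (2); no VLAN tag, no FCS


def captured_frame_size(mtu: int) -> int:
    """Largest frame length a capture would show for a given interface MTU."""
    return mtu + ETH_HEADER


frame_at_1500 = captured_frame_size(1500)  # MTU currently suspected on br-tun
frame_at_1950 = captured_frame_size(1950)  # MTU configured on the vhu interface
```

If the 1950 MTU were in effect along the whole path, full-size frames in the capture would be expected to be 1964 bytes rather than 1514.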

Comment 25 Aaron Conole 2018-06-17 13:32:24 UTC
It's possible this is related to https://mail.openvswitch.org/pipermail/ovs-dev/2018-June/348267.html

Would the customer be willing to try a build?  I can patch this in and send it along just as a wild guess (seems to possibly match the symptoms?)

Comment 28 Aaron Conole 2018-06-19 13:59:24 UTC
Created attachment 1452950 [details]
Patched RPM

Comment 31 Aaron Conole 2018-06-25 13:07:48 UTC
Created attachment 1454356 [details]
Patched RPM - not src version

Comment 76 Lon Hohberger 2018-09-25 10:37:15 UTC
According to our records, this should be resolved by openvswitch-2.9.0-56.el7fdp.  This build is available now.

Comment 77 Matteo Croce 2018-09-25 19:16:30 UTC
Created attachment 1486911 [details]
openvswitch package with the EMC reorder backport

Hi all,

This is an RPM with a six-patch series backported, most notably this one:

From 9b4f08cdcaf253175edda088683bdd3db9e4c097 Mon Sep 17 00:00:00 2001
From: Vishal Deep Ajmera <vishal.deep.ajmera>
Date: Fri, 27 Jul 2018 23:56:37 +0530
Subject: [PATCH] dpif-netdev: Avoid reordering of packets in a batch with same megaflow

full commit message here:

https://github.com/openvswitch/ovs/commit/9b4f08cdcaf253175edda088683bdd3db9e4c097
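
As a rough illustration of the kind of reordering the patch subject describes, the toy model below queues a received batch in two passes: packets that hit the exact match cache (EMC) first, then the packets that missed. This is not OVS code. The function names and the per-packet emc_hit flag are hypothetical, and how a given packet comes to hit or miss the cache is outside the model.

```python
from typing import List, Tuple

# Hypothetical packet model: (sequence number, whether the EMC lookup hit)
Packet = Tuple[int, bool]


def two_pass_enqueue(batch: List[Packet]) -> List[int]:
    """Toy model: EMC hits are queued first, misses are appended afterwards."""
    hits = [seq for seq, emc_hit in batch if emc_hit]
    misses = [seq for seq, emc_hit in batch if not emc_hit]
    return hits + misses


def in_order_enqueue(batch: List[Packet]) -> List[int]:
    """Toy model of the desired behaviour: keep arrival order within the flow,
    regardless of which lookup resolved each packet."""
    return [seq for seq, _emc_hit in batch]


# Made-up batch of four packets from one flow, with alternating miss/hit.
batch = [(1, False), (2, True), (3, False), (4, True)]
reordered = two_pass_enqueue(batch)
preserved = in_order_enqueue(batch)
```

In this model the two-pass variant emits 2, 4, 1, 3, while the order-preserving variant emits 1, 2, 3, 4. The linked commit message remains the reference for what the real datapath does.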

Comment 89 Alex McLeod 2019-04-01 16:20:10 UTC
Hi,

If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Comment 92 Yariv 2019-04-15 10:04:48 UTC
Hi 

Up to 10z8 we were using openvswitch-2.6.1-16.git20161206.el7ost.x86_64.rpm.
10z9 deployments moved to openvswitch-2.9.0-56.el7fdp.x86_64.rpm.


Can you advise which release and which openvswitch version to test?