Bug 1565205 - TCP packet re-ordering noticed between instance on same compute using dpdk network
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openvswitch
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Target Milestone: async
Target Release: 10.0 (Newton)
Assignee: Aaron Conole
QA Contact: Ofer Blaut
Depends On:
Reported: 2018-04-09 15:27 UTC by Jaison Raju
Modified: 2020-11-03 07:27 UTC (History)

Fixed In Version: openvswitch-2.6.1-31.git20180130
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2019-06-07 18:02:06 UTC
Target Upstream Version:

Attachments
Patched RPM (13.90 MB, application/x-rpm)
2018-06-19 13:59 UTC, Aaron Conole
Patched RPM - not src version (4.56 MB, application/x-rpm)
2018-06-25 13:07 UTC, Aaron Conole
openvswitch package with the EMC reorder backport (2.81 MB, application/x-rpm)
2018-09-25 19:16 UTC, Matteo Croce

System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3443011 0 None None None 2018-07-23 06:47:26 UTC

Description Jaison Raju 2018-04-09 15:27:39 UTC
Description of problem:
TCP packet re-ordering is seen between instances on the same compute node using a DPDK network.

A 1 GB file is downloaded over HTTP with wget from one instance to another. Both instances run on the same compute node and are connected through a VXLAN tenant network on the netdev/DPDK datapath.
TCP packets are seen out of order by about 0.1 ms.
These are not drops; the segments actually arrive late.
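
One way to quantify "late, not dropped" from a capture is to flag any segment whose sequence number is lower than the highest one already seen, and record how long after that higher segment it arrived. The sketch below is only an illustration: the function name and the sample data are hypothetical, the input is assumed to be (arrival time in microseconds, TCP sequence number) pairs already extracted from tcpdump output, and it ignores sequence-number wraparound and does not distinguish retransmissions from reordering.

```python
def find_late_segments(segments):
    """Return (seq, lateness_us) for segments that arrive after a higher seq.

    `segments` is a list of (arrival_us, tcp_seq) tuples in arrival order.
    Assumption: no sequence wraparound and no retransmissions in the input.
    """
    late = []
    highest_seq = None
    highest_arrival = None
    for arrival_us, seq in segments:
        if highest_seq is not None and seq < highest_seq:
            # This segment arrived after one that logically follows it.
            late.append((seq, arrival_us - highest_arrival))
        else:
            highest_seq, highest_arrival = seq, arrival_us
    return late


# Hypothetical capture: the segment with seq 2448 shows up 100 us (0.1 ms)
# after the segment with seq 3896.
capture = [(0, 1000), (100, 3896), (200, 2448), (300, 5344)]
print(find_late_segments(capture))  # [(2448, 100)]
```

A non-empty result with small lateness values and no gaps in the final byte stream would be consistent with reordering rather than loss.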

Version-Release number of selected component (if applicable):

How reproducible:
Only in customer environment

Steps to Reproduce:

Actual results:

Expected results:

Additional info:
i. The instances use a single queue and run on the same compute node.
ii. Confirmed that a single vCPU services eth2 interrupts in the guest.
iii. Non-isolated CPUs are used for the guest. CPUAffinity is moved away from these cores via systemd, though.
iv. HT is enabled.
v. Tested with 4 PMD cores and with a single PMD core.
The issue is still seen in both cases.
vi. The MTU of the vhu interface is 1950, but tcpdump shows a segment size of 1514.

Despite fragmentation, we still don't expect packets to go out of order.

Comment 7 Jaison Raju 2018-04-25 07:52:34 UTC
I think these 1514-byte frames are seen because the MTU on br-tun is 1500.
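
The arithmetic behind that guess can be sketched as follows, assuming the 1514 reported by tcpdump is the captured Ethernet frame length, that the frames are untagged (no VLAN header), and that the capture does not include the 4-byte FCS. The helper name is hypothetical.

```python
ETH_HEADER_LEN = 14  # dst MAC (6) + src MAC (6) + EtherType (2), untagged


def captured_frame_len(ip_mtu, eth_header_len=ETH_HEADER_LEN):
    """Largest frame length a capture would show for a given L3 MTU."""
    return ip_mtu + eth_header_len


print(captured_frame_len(1500))  # 1514
print(captured_frame_len(1950))  # 1964
```

Under these assumptions, a 1500-byte MTU somewhere on the path gives exactly the 1514-byte frames observed, while an effective MTU of 1950 would show frames of up to 1964 bytes.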

This bridge is automatically created by neutron.
I don't see any option to configure an MTU for br-tun. Any suggestions?

I have asked the customer to test again with:
sudo ip link set dev br-tun mtu 1950

Comment 25 Aaron Conole 2018-06-17 13:32:24 UTC
It's possible this is related to https://mail.openvswitch.org/pipermail/ovs-dev/2018-June/348267.html

Would the customer be willing to try a build? I can patch this in and send it along, just as a wild guess (it seems to possibly match the symptoms).

Comment 28 Aaron Conole 2018-06-19 13:59:24 UTC
Created attachment 1452950
Patched RPM

Comment 31 Aaron Conole 2018-06-25 13:07:48 UTC
Created attachment 1454356
Patched RPM - not src version

Comment 76 Lon Hohberger 2018-09-25 10:37:15 UTC
According to our records, this should be resolved by openvswitch-2.9.0-56.el7fdp.  This build is available now.

Comment 77 Matteo Croce 2018-09-25 19:16:30 UTC
Created attachment 1486911
openvswitch package with the EMC reorder backport

Hi all,

This is an RPM with a 6-patch series backported, most notably this one:

From 9b4f08cdcaf253175edda088683bdd3db9e4c097 Mon Sep 17 00:00:00 2001
From: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Date: Fri, 27 Jul 2018 23:56:37 +0530
Subject: [PATCH] dpif-netdev: Avoid reordering of packets in a batch with same megaflow
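
To illustrate the kind of reordering the patch subject refers to, here is a toy model of a two-stage lookup with per-flow batching. It is a sketch under stated assumptions, not the OVS implementation: it assumes that packets hitting the exact match cache (EMC) are queued to their megaflow's batch first, and that packets missing the EMC are resolved in a second pass and appended afterwards. The function name, the keys "A" and "B", and the megaflow name "mf1" are all hypothetical.

```python
def process_batch(rx_batch, emc, megaflow_of):
    """Toy two-stage lookup with per-megaflow output batches.

    rx_batch:    list of (packet_id, microflow_key) in arrival order
    emc:         set of microflow keys currently in the exact match cache
    megaflow_of: dict mapping microflow_key -> megaflow name
    Returns a dict mapping megaflow name -> packet ids in output order.
    """
    per_flow = {}
    misses = []

    # Stage 1: EMC hits are queued to their megaflow's batch immediately.
    for pkt, key in rx_batch:
        if key in emc:
            per_flow.setdefault(megaflow_of[key], []).append(pkt)
        else:
            misses.append((pkt, key))

    # Stage 2: EMC misses are classified afterwards and appended,
    # so they land behind later packets that hit the EMC.
    for pkt, key in misses:
        per_flow.setdefault(megaflow_of[key], []).append(pkt)

    return per_flow


# Two exact-match keys that map to the same megaflow; only "A" is cached.
rx = [(1, "A"), (2, "B"), (3, "A"), (4, "B")]
out = process_batch(rx, emc={"A"}, megaflow_of={"A": "mf1", "B": "mf1"})
print(out["mf1"])  # [1, 3, 2, 4]
```

In this model packets 2 and 4 leave after packet 3 even though they arrived earlier, which is the sort of within-batch reordering for a single megaflow that the patch title describes avoiding.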

full commit message here:


Comment 89 Alex McLeod 2019-04-01 16:20:10 UTC

If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Comment 92 Yariv 2019-04-15 10:04:48 UTC

Until 10z8 we were using openvswitch-2.6.1-16.git20161206.el7ost.x86_64.rpm.
10z9 deployments moved to openvswitch-2.9.0-56.el7fdp.x86_64.rpm.

Can you advise which version of openvswitch to test?
