Bug 1005804 - modem-like speed when transmitting TCP to a floating IP
modem-like speed when transmitting TCP to a floating IP
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Target Milestone: z3
Target Release: 3.0
Assigned To: Radomir Vrbovsky
QA Contact: Jean-Tsung Hsiao
Keywords: Regression, Reopened, ZStream
Depends On: 997632 1091629 1132588
Blocks: 1091627
Reported: 2013-09-09 08:25 EDT by Jaroslav Henner
Modified: 2016-09-06 04:39 EDT (History)
10 users

See Also:
Fixed In Version: kernel-2.6.32-358.123.3.openstack.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1091627
Environment:
Last Closed: 2013-11-14 12:41:57 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Jaroslav Henner 2013-09-09 08:25:46 EDT
Description of problem:
We have experienced a TCP throughput regression when downloading data to our instances. After some investigation I found that switching back to kernel 114 from 118 on the controller helped.

I am not sure whether it is related, but all of our hardware machines have these NICs:
Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20).

The controller runs as a libvirt VM on the same hardware as the compute nodes; its virtio interfaces are bridged with the physical ones.

Both interfaces in each node are in the same native VLAN:

              +--------+
[Switch]      | Node 1 |
 | |          |--------|
 | +--------- o eth0   | public iface
 +=========== o eth1   | private iface,  VLANs trunk, native VLAN same as eth0
              +--------+


Version-Release number of selected component (if applicable):
dracut-kernel.noarch                  004-303.el6          @anaconda-RedHatEnterpriseLinux-201301301459.x86_64/6.4
kernel.x86_64                         2.6.32-358.111.1.openstack.el6
kernel.x86_64                         2.6.32-358.114.1.openstack.el6
kernel.x86_64                         2.6.32-358.118.1.openstack.el6
kernel-debuginfo.x86_64               2.6.32-358.118.1.openstack.el6
kernel-debuginfo-common-x86_64.x86_64 2.6.32-358.118.1.openstack.el6
kernel-devel.x86_64                   2.6.32-358.114.1.openstack.el6
kernel-devel.x86_64                   2.6.32-358.118.1.openstack.el6
kernel-firmware.noarch                2.6.32-358.118.1.openstack.el6
kernel-headers.x86_64                 2.6.32-358.118.1.openstack.el6
openstack-quantum.noarch              2013.1.3-1.el6ost    @puddle              
openstack-quantum-openvswitch.noarch  2013.1.3-1.el6ost    @puddle              
openvswitch.x86_64                    1.9.0-2.el6ost       @puddle              
python-quantum.noarch                 2013.1.3-1.el6ost    @puddle              
python-quantumclient.noarch           2:2.2.1-2.el6ost     @puddle              


How reproducible:
1/1

Steps to Reproduce:
1. Boot kernel ...114 and measure the speed with iperf from the controller to an instance, using a floating IP.
2. Boot kernel ...118 and measure again.
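The steps above can be sketched roughly as follows. This is a hypothetical illustration, not taken from the report: the floating IP (192.0.2.10), the run length, and the sample summary line are placeholders.

```shell
# Hypothetical measurement sketch (192.0.2.10 and -t 30 are placeholders):
#   on the instance:    iperf -s
#   on the controller:  iperf -c 192.0.2.10 -t 30
#
# iperf prints a summary line like the sample below; the last two
# fields give the measured bandwidth to compare between kernels:
sample='[  3]  0.0-30.0 sec  1.2 MBytes  0.34 Mbits/sec'
echo "$sample" | awk '{print $(NF-1), $NF}'
```

Repeating the client run after booting each kernel gives the two bandwidth figures to compare.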

Actual results:
The speed with 118 is like I remember from my school years: less than half a megabit per second.


Expected results:
Well over a megabit per second.

Additional info:
Comment 1 Jaroslav Henner 2013-09-09 08:28:36 EDT
Created attachment 795598 [details]
kernel-114.pcapng.bz2
Comment 7 Jaroslav Henner 2013-09-10 02:55:38 EDT
Switching back to the older kernel really helps. I checked the logs, and as IIRC the problems started on September 5th, it correlates. Setting the Regression keyword. I will check whether our other deployment suffers from this and whether I can gather some more info.


I can see those in messages:
Aug 22 09:29:55 controller kernel: Linux version 2.6.32-358.114.1.openstack.el6.x86_64 (mockbuild@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Wed Jul 3 02:11:25 EDT 2013
Sep  4 17:09:27 controller kernel: Linux version 2.6.32-358.118.1.openstack.el6.x86_64 (mockbuild@x86-007.build.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Wed Aug 14 13:18:08 EDT 2013
Sep  5 10:06:45 controller kernel: Linux version 2.6.32-358.118.1.openstack.el6.x86_64 (mockbuild@x86-007.build.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Wed Aug 14 13:18:08 EDT 2013
Sep  6 12:29:42 controller kernel: Linux version 2.6.32-358.118.1.openstack.el6.x86_64 (mockbuild@x86-007.build.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Wed Aug 14 13:18:08 EDT 2013
Comment 10 Thomas Graf 2013-09-11 07:48:29 EDT
Can you confirm that the OVS ports are all untagged? I was assuming so far that is the case but it's not entirely clear based on the information in this BZ.
Comment 11 Jaroslav Henner 2013-09-16 14:48:48 EDT
(In reply to Thomas Graf from comment #10)
> Can you confirm that the OVS ports are all untagged? I was assuming so far
> that is the case but it's not entirely clear based on the information in
> this BZ.

I don't quite understand the question. In the attachment controller_status you can see that many ports in br-int are tagged:


+ovs-vsctl show
5a53e753-ecea-45e3-8d0e-8cb9db8710bb
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
        Port "tap1ae1918a-71"
            tag: 3
            Interface "tap1ae1918a-71"
        Port "tapabe9258e-8d"
            tag: 3
            Interface "tapabe9258e-8d"
        Port "tapa3d1159a-dc"
            tag: 5
            Interface "tapa3d1159a-dc"
        Port "tap959fdd9b-c2"
            tag: 1
            Interface "tap959fdd9b-c2"
...
Comment 12 Thomas Graf 2013-09-16 18:08:41 EDT
The question was basically if this is a DUP of BZ997632 which I think is the case. Would you agree with closing this as a DUP of BZ997632?
Comment 13 Jean-Tsung Hsiao 2013-09-16 20:50:02 EDT
(In reply to Thomas Graf from comment #12)
> The question was basically if this is a DUP of BZ997632 which I think is the
> case. Would you agree with closing this as a DUP of BZ997632?

Hi Thomas,

I think you're right --- this is a dup of Bug 997632.

Today I identified that VXLAN has the issue, so I'm in the process of pinpointing the build in which it started.

Stay tuned.

Thanks!

Jean
Comment 14 Jean-Tsung Hsiao 2013-09-17 05:19:51 EDT
Was gre tunnel or vxlan part of the data path?

Thanks!

Jean
Comment 15 Jaroslav Henner 2013-09-18 04:54:51 EDT
(In reply to Jean-Tsung Hsiao from comment #14)
> Was gre tunnel or vxlan part of the data path?
> 
> Thanks!
> 
> Jean

No. As I drew above in the bug description, my bare-metal machines have their eth1 interfaces interconnected using trunk ports trunking several VLANs. No GRE nor VXLAN; the VLANs are terminated on the bare metal.
Comment 19 Thomas Graf 2013-10-07 06:58:50 EDT
This is a duplicate of BZ997632, so I'm marking it as such in order to keep all the information together.

*** This bug has been marked as a duplicate of bug 997632 ***
Comment 23 Radomir Vrbovsky 2013-10-28 04:49:02 EDT
Fixed in kernel-2.6.32-358.123.3.openstack.el6.
Comment 27 errata-xmlrpc 2013-11-14 12:41:57 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1520.html
Comment 28 Jean-Tsung Hsiao 2014-05-31 22:13:56 EDT
Hi,

I would like to learn how to reproduce this issue.

Thanks in advance!

Jean
