Bug 854066 - [rhel6] lvs: issues with GRO / icmp fragmentation needed
Summary: [rhel6] lvs: issues with GRO / icmp fragmentation needed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.3
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Jesper Brouer
QA Contact: Jan Tluka
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-09-03 21:31 UTC by Marcelo Ricardo Leitner
Modified: 2018-11-30 20:54 UTC (History)

Fixed In Version: kernel-2.6.32-328.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 854067
Environment:
Last Closed: 2013-02-21 06:33:53 UTC
Target Upstream Version:
Embargoed:


Attachments
0001-Backport-of-8f1b03a-ipvs-allow-transmit-of-GRO-aggre.patch (4.63 KB, patch)
2012-09-03 21:31 UTC, Marcelo Ricardo Leitner
0002-Also-handle-GSO-at-ip_vs_dr_xmit_v6.patch (974 bytes, patch)
2012-09-03 21:32 UTC, Marcelo Ricardo Leitner


Links
System: Red Hat Product Errata
ID: RHSA-2013:0496
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 6 kernel update
Last Updated: 2013-02-20 21:40:54 UTC

Description Marcelo Ricardo Leitner 2012-09-03 21:31:52 UTC
Created attachment 609477
0001-Backport-of-8f1b03a-ipvs-allow-transmit-of-GRO-aggre.patch

When using LVS LoadBalancer with GRO enabled, the server will often drop incoming packets and reply with ICMP Fragmentation Needed, nuking the performance.

This happens because GRO makes packets appear larger than they really are, which confuses the sender.

Upstream commit 8f1b03a4c18e8f3f0801447b62330faa8ed3bb37 fixes this.

Attached is my backport of it for RHEL 6.
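
For readers who want to see the failure mode in isolation, here is a toy model of the size check described above. It is only a sketch and not the kernel code. The `Packet` class, `must_reject()` and the `exempt_gso` switch are hypothetical names made up for illustration. It assumes, as the patch title suggests, that the fix lets GRO/GSO aggregates bypass the "larger than MTU" test because they are split back into MTU-sized segments on transmit.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    length: int          # total length as seen by the stack, in bytes
    dont_fragment: bool  # IPv4 DF bit
    is_gso: bool         # True for a GRO/GSO aggregate (re-segmented on transmit)


def must_reject(pkt: Packet, mtu: int, exempt_gso: bool) -> bool:
    """Return True if the forwarder would answer with ICMP Fragmentation Needed.

    exempt_gso=False models the unpatched behaviour described in this bug;
    exempt_gso=True models the assumed behaviour after the fix.
    """
    too_big = pkt.dont_fragment and pkt.length > mtu
    if exempt_gso:
        # Aggregates are not really oversized on the wire, so let them pass.
        return too_big and not pkt.is_gso
    return too_big


if __name__ == "__main__":
    aggregate = Packet(length=16000, dont_fragment=True, is_gso=True)
    for patched in (False, True):
        print("patched" if patched else "unpatched",
              "-> reject:", must_reject(aggregate, mtu=1500, exempt_gso=patched))
```

In this model an ordinary oversized packet with DF set is still rejected in both modes, so only the handling of aggregates changes.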

Comment 2 Marcelo Ricardo Leitner 2012-09-03 21:32:37 UTC
Created attachment 609478
0002-Also-handle-GSO-at-ip_vs_dr_xmit_v6.patch

This patch is also needed.

Comment 3 RHEL Program Management 2012-09-12 10:01:05 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 4 Jesper Brouer 2012-09-27 10:47:46 UTC
Fixing this is important because of the bug's effect.

The bug results in extremely bad TCP performance when
GRO/GSO is enabled on a machine running IPVS/LVS.  The TCP
connection will continue to "work", but only by retransmitting
all data (almost three times), as only TCP segments consisting of
a single packet are allowed through (without causing an ICMP
frag needed).

Comment 15 Jesper Brouer 2012-10-05 22:34:29 UTC
Simply make sure that GSO and TSO are enabled on all hosts.
 "ethtool -K ethX tso on gso on"

And run e.g. an iperf test through the LVS/IPVS setup.

It's the exact same test as in bug 854067 comment #3 (which is the RHEL5 equivalent).

On my KVM system I see the following performance numbers:
 - With GSO enabled, and no patch:  58 Kbit/sec (very low, lots of TCP retrans)
 - Without GSO, and no patch:      1.3 Gbits/sec
 - With GSO, and with patch:      12.4 Gbits/sec

You can just tcpdump the traffic and see that big packets are transmitted, and observe that no ICMP error messages or TCP retransmits occur.
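
If you would rather check a capture programmatically than by eye, the ICMP error in question is type 3 (destination unreachable), code 4 (fragmentation needed and DF set), with the next-hop MTU in the last two bytes of the 8-byte ICMP header (RFC 1191). The sketch below assumes you already have raw IPv4 datagrams as bytes, without the link-layer header. The function name `parse_frag_needed()` is hypothetical, and how the bytes are extracted from a capture is left out.

```python
import struct

ICMP_PROTOCOL = 1      # IPv4 protocol number for ICMP
ICMP_DEST_UNREACH = 3  # ICMP type: destination unreachable
ICMP_FRAG_NEEDED = 4   # ICMP code: fragmentation needed and DF set


def parse_frag_needed(packet: bytes):
    """Return the advertised next-hop MTU if `packet` (a raw IPv4 datagram)
    is an ICMP Fragmentation Needed message, otherwise None."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None  # too short, or not IPv4
    ihl = (packet[0] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20 or packet[9] != ICMP_PROTOCOL or len(packet) < ihl + 8:
        return None
    # ICMP header: type, code, checksum, unused, next-hop MTU
    icmp_type, icmp_code, _checksum, _unused, next_hop_mtu = struct.unpack(
        "!BBHHH", packet[ihl:ihl + 8])
    if icmp_type == ICMP_DEST_UNREACH and icmp_code == ICMP_FRAG_NEEDED:
        return next_hop_mtu
    return None
```

A capture taken through a fixed kernel should yield `None` for every datagram, while the unpatched case should produce a steady stream of MTU values.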

Comment 16 Jarod Wilson 2012-10-10 19:52:06 UTC
Patch(es) available on kernel-2.6.32-328.el6

Comment 19 Jan Tluka 2012-10-22 12:50:52 UTC
Reproduced on -279.el6: TCP retransmissions occur during the test and iperf only reaches about 90 Kbit/s:

------------------------------------------------------------
Client connecting to 192.168.122.6, TCP port 5001
TCP window size: 23.2 KByte (default)
------------------------------------------------------------
[  3] local 192.168.122.1 port 39810 connected with 192.168.122.6 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-23.2 sec   256 KBytes  90.4 Kbits/sec


Verified on the -330.el6 kernel: with GSO/TSO enabled I get a throughput of over 2 Gbit/s and no retransmits occur:

------------------------------------------------------------
Client connecting to 192.168.122.6, TCP port 5001
TCP window size: 23.2 KByte (default)
------------------------------------------------------------
[  3] local 192.168.122.1 port 39834 connected with 192.168.122.6 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  2.74 GBytes  2.35 Gbits/sec

Comment 22 errata-xmlrpc 2013-02-21 06:33:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0496.html

Comment 23 daryl herzmann 2013-06-11 12:14:56 UTC
For what it's probably not worth, I am still seeing this problem with RHEL6.4 2.6.32-358.11.1.el6.x86_64.

I have an LVS NAT setup with a Broadcom Corporation NetXtreme II BCM5709, and I get brutal throughput with GRO enabled.  With it turned off, things are 'fine'.

# ethtool -k eth1
Features for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: off
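
When comparing several hosts, a feature listing like the one above can be turned into a dictionary and checked in a script. This is a minimal sketch that assumes the `name: on|off` layout shown here. The function name `parse_ethtool_features()` is hypothetical, and the text is expected to come from wherever you saved the `ethtool -k` output.

```python
def parse_ethtool_features(text: str) -> dict:
    """Map each feature name in `ethtool -k` style output to True (on) or False (off)."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # no colon, e.g. the shell prompt line
        words = value.split()
        # Only the first word is used, so a trailing annotation after
        # on/off does not break parsing.
        if words and words[0] in ("on", "off"):
            features[name.strip()] = words[0] == "on"
    return features


if __name__ == "__main__":
    sample = """Features for eth1:
tcp-segmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: off
"""
    for feature, enabled in parse_ethtool_features(sample).items():
        print(feature, "on" if enabled else "off")
```

The header line ("Features for eth1:") has nothing after the colon, so it is skipped rather than treated as a feature.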

Comment 24 Marcelo Ricardo Leitner 2013-06-11 12:22:43 UTC
(In reply to Daryl Herzmann from comment #23)
> For what it's probably not worth, I am still seeing this problem with RHEL6.4
> 2.6.32-358.11.1.el6.x86_64.
> 
> I have an LVS NAT setup with a Broadcom Corporation NetXtreme II BCM5709, and I
> get brutal throughput with GRO enabled.  With it turned off, things are
> 'fine'.

Are you also seeing icmp fragmentation needed?

Anyway, as this is already in Errata state, please open a new bug. Feel free to Cc me on the new one.

Comment 25 daryl herzmann 2013-06-11 12:51:34 UTC
(In reply to Marcelo Ricardo Leitner from comment #24)
> Anyway, as this is already in Errata state, please open a new bug. Feel free
> to Cc me on the new one.

Thanks, I opened https://bugzilla.redhat.com/show_bug.cgi?id=973190

