Bug 854066 - [rhel6] lvs: issues with GRO / icmp fragmentation needed
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.3
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Jesper Brouer
QA Contact: Jan Tluka
Keywords: Patch
Depends On:
Blocks:
Reported: 2012-09-03 17:31 EDT by Marcelo Ricardo Leitner
Modified: 2013-06-11 08:51 EDT

See Also:
Fixed In Version: kernel-2.6.32-328.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 854067
Environment:
Last Closed: 2013-02-21 01:33:53 EST
Type: Bug

Attachments
0001-Backport-of-8f1b03a-ipvs-allow-transmit-of-GRO-aggre.patch (4.63 KB, patch)
2012-09-03 17:31 EDT, Marcelo Ricardo Leitner
0002-Also-handle-GSO-at-ip_vs_dr_xmit_v6.patch (974 bytes, patch)
2012-09-03 17:32 EDT, Marcelo Ricardo Leitner

Description Marcelo Ricardo Leitner 2012-09-03 17:31:52 EDT
Created attachment 609477
0001-Backport-of-8f1b03a-ipvs-allow-transmit-of-GRO-aggre.patch

When using LVS LoadBalancer with GRO enabled, the server will often drop incoming packets and reply with ICMP Fragmentation Needed, nuking the performance.

This happens because GRO makes packets appear larger than they really are, which confuses the sender.

Upstream commit 8f1b03a4c18e8f3f0801447b62330faa8ed3bb37 fixes this.

Attached is my backport of it for RHEL 6.
Comment 2 Marcelo Ricardo Leitner 2012-09-03 17:32:37 EDT
Created attachment 609478
0002-Also-handle-GSO-at-ip_vs_dr_xmit_v6.patch

This patch is also needed.
Comment 3 RHEL Product and Program Management 2012-09-12 06:01:05 EDT
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.
Comment 4 Jesper Brouer 2012-09-27 06:47:46 EDT
Fixing this is important because of the bug's effect.

The bug results in extremely bad TCP performance when
GRO/GSO is enabled on a machine running IPVS/LVS.  The TCP
connection will continue to "work", but only by retransmitting
all data (almost three times), as only TCP segments consisting of a
single packet will be allowed through (without causing an ICMP frag
needed).
Comment 15 Jesper Brouer 2012-10-05 18:34:29 EDT
Simply make sure that GSO and TSO are enabled on all hosts.
 "ethtool -K ethX tso on gso on"

And run e.g. an iperf test through the LVS/IPVS setup.

It's the exact same test as in bug 854067 comment #3 (which is the RHEL5 equiv).

On my KVM system I see the following performance numbers:
 - With GSO enabled, and no patch:  58 Kbit/sec (very low, lots of TCP retrans)
 - Without GSO, and no patch:      1.3 Gbits/sec
 - With GSO, and with patch:      12.4 Gbits/sec

You can just tcpdump the traffic and see that big packets are transmitted, and observe that no ICMP error messages or TCP retransmits occur.
Comment 16 Jarod Wilson 2012-10-10 15:52:06 EDT
Patch(es) available on kernel-2.6.32-328.el6
Comment 19 Jan Tluka 2012-10-22 08:50:52 EDT
Reproduced on -279.el6: TCP retransmissions occur during the test and iperf only reaches about 90 Kbit/s:

------------------------------------------------------------
Client connecting to 192.168.122.6, TCP port 5001
TCP window size: 23.2 KByte (default)
------------------------------------------------------------
[  3] local 192.168.122.1 port 39810 connected with 192.168.122.6 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-23.2 sec   256 KBytes  90.4 Kbits/sec


Verified on the -330.el6 kernel: with GSO/TSO enabled I get a throughput of over 2 Gbit/s and no retransmits occur:

------------------------------------------------------------
Client connecting to 192.168.122.6, TCP port 5001
TCP window size: 23.2 KByte (default)
------------------------------------------------------------
[  3] local 192.168.122.1 port 39834 connected with 192.168.122.6 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  2.74 GBytes  2.35 Gbits/sec
Comment 22 errata-xmlrpc 2013-02-21 01:33:53 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0496.html
Comment 23 daryl herzmann 2013-06-11 08:14:56 EDT
For what it's probably not worth, I am still seeing this problem with RHEL6.4 2.6.32-358.11.1.el6.x86_64

I have an LVS NAT setup with a Broadcom Corporation NetXtreme II BCM5709, and I get brutal throughput with GRO enabled.  With it turned off, things are 'fine'.

# ethtool -k eth1
Features for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: off
Comment 24 Marcelo Ricardo Leitner 2013-06-11 08:22:43 EDT
(In reply to Daryl Herzmann from comment #23)
> For what it's probably not worth, I am still seeing this problem with RHEL6.4
> 2.6.32-358.11.1.el6.x86_64
> 
> I have an LVS NAT setup with a Broadcom Corporation NetXtreme II BCM5709, and I
> get brutal throughput with GRO enabled.  With it turned off, things are
> 'fine'.

Are you also seeing icmp fragmentation needed?

Anyway, as this is already in Errata state, please open a new bug. Feel free to Cc me on the new one.
Comment 25 daryl herzmann 2013-06-11 08:51:34 EDT
(In reply to Marcelo Ricardo Leitner from comment #24)
> Anyway, as this is already in Errata state, please open a new bug. Feel free
> to Cc me on the new one.

thanks, I opened https://bugzilla.redhat.com/show_bug.cgi?id=973190
