Bug 546324

Summary: TCP receive window clamping problem
Product: Red Hat Enterprise Linux 4
Reporter: Fabio Olive Leite <fleite>
Component: kernel
Assignee: Neil Horman <nhorman>
Status: CLOSED ERRATA
QA Contact: Liang Zheng <lzheng>
Severity: high
Priority: high
Version: 4.8
CC: christopher.horton, fhirtz, jeder, jruemker, jwest, kzhang, lzheng, nhorman, tao
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2011-02-16 15:54:25 UTC
Attachments:
Proposed patch, tested and verified. (flags: none)
Python script to implement a slow receiver. (flags: none)
Python script to implement a "small sender" (flags: none)
Graph of tcp rwnd values as seen from the receiver side, showing permanent window clamp (flags: none)
Graph of tcp rwnd values as seen from the receiver side, with patch applied (flags: none)
receive window size on 4U8 (flags: none)
receive window size on 2.6.9-94 (flags: none)

Description Fabio Olive Leite 2009-12-10 16:42:12 UTC
Created attachment 377518 [details]
Proposed patch, tested and verified.

Description of problem:

RHEL-4 contains a non-recoverable TCP receive window clamping problem that was fixed upstream around kernels 2.6.14 and 2.6.15. The upstream discussion starts at:
http://marc.info/?l=linux-kernel&m=112561417224145&w=2

When a TCP connection starts getting bursts of very small packets and the receive buffer is filled with those high-overhead packets, the kernel decides to coalesce the receive buffer and clamp the receive window to slow down the sender. The problem is that the receive window never fully heals from this condition, even when the traffic pattern changes.
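
To make "high-overhead packets" concrete, here is a toy calculation, not kernel code. It assumes each queued packet is charged its payload plus a fixed bookkeeping overhead against the receive buffer; the helper name, the buffer size, and the overhead figure are all hypothetical and chosen only for illustration.

```python
def payload_that_fits(rcvbuf_bytes, payload_bytes, overhead_bytes):
    """Return (packets, payload_total) that fit when every queued packet
    is charged payload + a fixed per-packet overhead against the buffer."""
    charged = payload_bytes + overhead_bytes
    packets = rcvbuf_bytes // charged
    return packets, packets * payload_bytes


if __name__ == "__main__":
    RCVBUF = 87380    # assumed receive buffer size, illustration only
    OVERHEAD = 256    # hypothetical per-packet bookkeeping cost
    for payload in (16, 1448):
        packets, total = payload_that_fits(RCVBUF, payload, OVERHEAD)
        print(f"{payload:5d}-byte payloads: {packets} packets, "
              f"{total} payload bytes ({100 * total // RCVBUF}% of buffer)")
```

Under these assumptions, a buffer full of tiny packets holds only a small fraction of its nominal size in application data, which is the situation that leads the kernel to clamp the window.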

Version-Release number of selected component (if applicable):

Any current RHEL-4 kernel.

How reproducible:

Depends on the receive buffer size and on how much data a slow reader allows to linger there. The attached Python scripts reproduce the issue easily by simulating a financial application traffic pattern.

Steps to Reproduce:
1. Run slow_receiver.py on a RHEL-4 box;
2. Set the destination IP address in small_sender.py to the RHEL-4 box and run it;
3. Capture traffic and analyze the receive window changes.
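
The attached scripts are not reproduced in this report. As a rough idea of the traffic pattern in steps 1 and 2, the following is a minimal sketch and not the attached code: the function names, payload size, chunk size, and delays are all assumptions. It runs both ends over loopback so it is self-contained; reproducing the bug itself would still require the receiver to run on an affected RHEL-4 kernel.

```python
import socket
import threading
import time


def slow_receiver(server_sock, chunk=64, delay=0.001):
    """Accept one connection and drain it slowly in small reads."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(chunk)
            if not data:          # sender closed the connection
                break
            total += len(data)
            time.sleep(delay)     # simulate an application that reads slowly
    return total


def small_sender(host, port, count=50, payload=b"x" * 16):
    """Send many tiny writes; TCP_NODELAY stops Nagle from merging them."""
    with socket.create_connection((host, port)) as s:
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            s.sendall(payload)
    return count * len(payload)


def run_demo(count=50, payload=b"x" * 16, delay=0.001):
    """Run receiver and sender over loopback; return (sent, received) byte counts."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = []
    worker = threading.Thread(
        target=lambda: received.append(slow_receiver(server, delay=delay)))
    worker.start()
    sent = small_sender("127.0.0.1", port, count, payload)
    worker.join()
    server.close()
    return sent, received[0]


if __name__ == "__main__":
    print(run_demo())
```

For an actual reproduction the two halves would run on separate hosts, with a larger `count` and `delay` so that data lingers in the receive buffer.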
  
Actual results:

Receive window gets clamped and never fully heals, as shown in the attached graph.
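
The graphs are attached as images. One possible way to get the same series from a text capture is sketched below, assuming tcpdump-style lines in which each TCP segment carries a `win <n>` field; the helper names, the `tolerance` threshold, and the addresses used in the usage note are hypothetical. If window scaling was negotiated, the raw `win` value would still need to be multiplied by the scale factor.

```python
import re

WIN_RE = re.compile(r"\bwin (\d+)")


def advertised_windows(lines, receiver):
    """Collect the `win` value from each line sent by `receiver`,
    given as an "addr.port" string as it appears in the capture."""
    windows = []
    for line in lines:
        if f" {receiver} > " not in line:
            continue              # segment was not sent by the receiver
        match = WIN_RE.search(line)
        if match:
            windows.append(int(match.group(1)))
    return windows


def stays_clamped(windows, tolerance=0.9):
    """True if the last observed window is below `tolerance` times
    the largest window seen in the series."""
    if not windows:
        return False
    return windows[-1] < tolerance * max(windows)
```

For example, `advertised_windows(open("capture.txt"), "192.0.2.10.5000")` would return the window series advertised by a receiver at that (made-up) address and port, which can then be plotted or passed to `stays_clamped`.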

Expected results:

Receive window expands back to original value when traffic conditions allow for it. Graph with patched kernel attached as well.

Additional info:

Patch attached, based on upstream thread above and commit 326f36e9e7de362e09745ce6f84b65e7ccac33ba.

Workaround:

This issue can be worked around by raising net.ipv4.tcp_rmem substantially, so that applications that receive bursts of small packets and are slow to read them get a receive buffer large enough that the clamping is not triggered.
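
The report gives no specific values for the workaround. As a sketch only, the helper below takes the three values of `net.ipv4.tcp_rmem` (minimum, default, maximum) and builds a sysctl.conf-style line with the default and maximum multiplied by a factor. The helper name, the factor of 4, and the fallback triple are assumptions for illustration, not recommendations; suitable values depend on the application and available memory.

```python
def scaled_tcp_rmem(current, factor):
    """Build a sysctl.conf-style line from the current "min default max"
    triple, multiplying default and max by `factor` and keeping min."""
    minimum, default, maximum = (int(v) for v in current.split())
    return f"net.ipv4.tcp_rmem = {minimum} {default * factor} {maximum * factor}"


if __name__ == "__main__":
    try:
        # On Linux the live values can be read from /proc.
        with open("/proc/sys/net/ipv4/tcp_rmem") as f:
            print(scaled_tcp_rmem(f.read(), factor=4))
    except OSError:
        # Fallback triple, assumed purely for illustration.
        print(scaled_tcp_rmem("4096 87380 174760", factor=4))
```

The printed line can be reviewed and, if appropriate, added to /etc/sysctl.conf.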

Comment 1 Fabio Olive Leite 2009-12-10 16:43:33 UTC
Created attachment 377519 [details]
Python script to implement a slow receiver.

Comment 2 Fabio Olive Leite 2009-12-10 16:44:13 UTC
Created attachment 377520 [details]
Python script to implement a "small sender"

Comment 3 Fabio Olive Leite 2009-12-10 16:45:33 UTC
Created attachment 377522 [details]
Graph of tcp rwnd values as seen from the receiver side, showing permanent window clamp

Comment 4 Fabio Olive Leite 2009-12-10 16:46:32 UTC
Created attachment 377523 [details]
Graph of tcp rwnd values as seen from the receiver side, with patch applied

Comment 5 Neil Horman 2009-12-10 20:02:51 UTC
Nice work Fabio.  One question, it appears you backported most of the patch, but not all of it.  How come you didn't move the sk_rcvbuf check to outside the ofo_win conditional like the upstream patch did?

Comment 6 Fabio Olive Leite 2009-12-11 19:34:19 UTC
Neil,

With this being RHEL-4, I wanted the simplest possible solution, so I focused on the "two-liner" proposed by Alexey Kuznetsov on <http://marc.info/?l=linux-netdev&m=112568628430571&w=2>.

After removing those two lines, I noticed that app_win was no longer needed so I removed any lines calculating it as well. I did not know the effect of applying the whole patch to the older RHEL-4 kernel, so I decided to keep it simple.

If you are sure the whole patch applies well and causes no problems, feel free to consider it. :)

Regards,
Fábio Olivé

Comment 7 Neil Horman 2009-12-14 16:16:43 UTC
Well, I'm a bit worried about the potential for out-of-order windows to allow inappropriate expansion of a socket's rx buffer. But I think you're right, it's more important to minimize change at this point. We'll go with your patch as is. Thanks!

Comment 8 RHEL Program Management 2009-12-14 16:22:40 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 9 Danny Feng 2009-12-29 07:06:04 UTC
*** Bug 523425 has been marked as a duplicate of this bug. ***

Comment 13 Vivek Goyal 2010-09-02 18:09:12 UTC
Committed in 89.33.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 21 Liang Zheng 2010-12-21 13:36:46 UTC
Created attachment 469975 [details]
receive window size on 4U8

Comment 22 Liang Zheng 2010-12-21 13:37:35 UTC
Created attachment 469976 [details]
receive window size on 2.6.9-94

Comment 24 errata-xmlrpc 2011-02-16 15:54:25 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0263.html