Bug 156602 - SCTP memory consumption, additional fixes
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.3
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Neil Horman
QA Contact: Brian Brock
Blocks: 168429
Reported: 2005-05-02 08:11 EDT by Patrick C. F. Ernzer
Modified: 2007-11-30 17:07 EST (History)
Fixed In Version: RHSA-2006-0132
Doc Type: Bug Fix
Last Closed: 2006-03-07 13:58:56 EST


Attachments
patch to account for ulpevents in receive queue (633 bytes, patch)
2005-09-16 08:18 EDT, Neil Horman
updated patch to account for remaining sctp skb allocations (1.05 KB, patch)
2005-09-20 16:38 EDT, Neil Horman
Updated patch taking upstream suggestions (3.19 KB, patch)
2005-09-27 14:06 EDT, Neil Horman
latest upstream proposal patch (10.25 KB, patch)
2005-10-18 13:23 EDT, Neil Horman
final upstream version of patch (10.40 KB, patch)
2005-10-20 09:50 EDT, Neil Horman
new version of patch (9.76 KB, patch)
2005-11-11 10:07 EST, Neil Horman

Description Patrick C. F. Ernzer 2005-05-02 08:11:51 EDT
Description of problem: This is split off from BZ #146797.

This bug exists to track the non-critical memory consumption issues with SCTP
(the critical issues were fixed in BZ #146797, kernel 2.6.9-6.43, which should
be in RHEL4 U1 GA).

Version-Release number of selected component (if applicable):
kernel-2.6.9-6.43
Comment 6 Patrick C. F. Ernzer 2005-09-07 12:29:22 EDT
Update from IT:

[start]
In SIGTRAN's case, the smallest messages are at least 50 bytes. A realistic
message rate is probably 1000 messages/s in + 1000 messages/s out. We are
required to support over 2000 associations, but in reality currently there are
rarely more than 100 associations. This is lowmem we're talking about here, and
there is always less than one GB of that to go around; that's the only static
limit there is.

The possibility that so many send buffers would fill up at the same time and
exhaust lowmem is totally theoretical at this point in time.

However, with 50 byte messages each socket uses 1.85 MB of lowmem to buffer
about 100KB of outbound data. That doesn't seem reasonable. One would only need
to fill the send buffer of 487 associations to exhaust 900MB of lowmem. That's
very uncomfortable considering we must support over 2000 concurrent
associations. The number of associations used will increase in the future as
server clusters and networks increase in size.

So this is not a problem in realistic situations yet, but it will be later on.
[end]
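
The figures in the update above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes decimal units (1 MB = 1,000,000 bytes, 1 KB = 1,000 bytes), which the update does not state, and all constant names are made up for illustration.

```python
# Figures quoted in the update above. Decimal units are an assumption;
# the report does not say which convention it uses.
MESSAGE_SIZE = 50                # bytes per small SIGTRAN message
BUFFERED_PAYLOAD = 100_000       # about 100 KB of outbound data per socket
LOWMEM_PER_SOCKET = 1_850_000    # 1.85 MB of lowmem per full send buffer
LOWMEM_BUDGET = 900_000_000      # 900 MB of lowmem
REQUIRED_ASSOCIATIONS = 2000     # associations that must be supported


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up, avoiding float rounding."""
    return -(-a // b)


# How many 50-byte messages fit in the buffered payload.
messages_buffered = BUFFERED_PAYLOAD // MESSAGE_SIZE

# Lowmem consumed per buffered message, and the overall overhead factor.
lowmem_per_message = LOWMEM_PER_SOCKET // messages_buffered
overhead_factor = LOWMEM_PER_SOCKET / BUFFERED_PAYLOAD

# Full send buffers needed to use up the lowmem budget.
sockets_to_exhaust = ceil_div(LOWMEM_BUDGET, LOWMEM_PER_SOCKET)

# Lowmem needed if every required association filled its send buffer.
worst_case_lowmem = REQUIRED_ASSOCIATIONS * LOWMEM_PER_SOCKET

print(messages_buffered, lowmem_per_message, overhead_factor)
print(sockets_to_exhaust, worst_case_lowmem)
```

Under these assumptions the 487 figure is reproduced, each 50-byte message costs roughly 925 bytes of lowmem, and 2000 full send buffers would need about 3.7 GB, well beyond what lowmem can provide.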
Comment 8 Neil Horman 2005-09-16 08:18:33 EDT
Created attachment 118890 [details]
patch to account for ulpevents in receive queue

This patch has been tested, and found to improve memory use slightly.  It's not
enough by any stretch, but will likely be part of the final solution.
Comment 9 Neil Horman 2005-09-20 16:38:21 EDT
Created attachment 119047 [details]
updated patch to account for remaining sctp skb allocations

This is an improvement on my previous patch, and seems to clear up all the
missing accounting pieces for me.
Comment 10 Neil Horman 2005-09-26 11:17:15 EDT
Customer reports that latest patch provides correct accounting.  I'll propose
this upstream later today.
Comment 13 Neil Horman 2005-09-27 14:06:01 EDT
Created attachment 119314 [details]
Updated patch taking upstream suggestions

This is an updated version of the patch taking into consideration some of the
suggestions provided by upstream.  It's functionally equivalent, and provides
the same accounting as the previous patch, but uses the skb->destructor to do
its work, which in my view is a better solution.  It also consolidates receive
buffer accounting so it doesn't need to occur both in sctp_rcv and
sctp_ulpevent_set_owner, and cleans up an inadvertent double accounting error.
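
For readers unfamiliar with destructor-based accounting, the idea in the comment above can be modelled in a few lines. This is a toy user-space sketch, not the attached patch or any kernel code: `Sock`, `Buf`, `set_owner_r`, `rfree` and `free_buf` are hypothetical names that only loosely mirror the kernel pattern of charging a buffer's size to its socket in one place and uncharging it from the buffer's destructor.

```python
class Sock:
    """Toy stand-in for a socket's receive-memory counter."""

    def __init__(self):
        self.rmem_alloc = 0


class Buf:
    """Toy stand-in for an skb: a size, an owner, and a destructor hook."""

    def __init__(self, truesize):
        self.truesize = truesize
        self.owner = None
        self.destructor = None


def rfree(buf):
    # Uncharge exactly what set_owner_r charged.
    buf.owner.rmem_alloc -= buf.truesize


def set_owner_r(buf, sock):
    # The single place where receive memory is charged. Refusing a second
    # charge guards against the double accounting mentioned above.
    if buf.owner is not None:
        raise RuntimeError("buffer is already charged to a socket")
    buf.owner = sock
    buf.destructor = rfree
    sock.rmem_alloc += buf.truesize


def free_buf(buf):
    # Every free path goes through the destructor, so no caller has to
    # remember to uncharge; clearing the hook makes a repeat call a no-op.
    if buf.destructor is not None:
        buf.destructor(buf)
        buf.destructor = None
        buf.owner = None


sk = Sock()
a, b = Buf(256), Buf(512)
set_owner_r(a, sk)
set_owner_r(b, sk)
free_buf(a)
print(sk.rmem_alloc)  # only b is still charged
```

Because the charge and the uncharge are paired through the buffer itself, the counter returns to zero once every buffer has been freed, regardless of which code path freed it.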
Comment 14 Neil Horman 2005-10-18 13:23:42 EDT
Created attachment 120128 [details]
latest upstream proposal patch

This is the latest version of the upstream patch.  After going around several
times, we've come to the consensus that this is the best solution to the
immediate problem.  There are some outstanding issues with receive window size
that still need to be hashed out, but they aren't pertinent to this problem,
and the issues aren't RFC violators, nor do they have a real performance
impact. This patch passes all my tests, and as soon as I have upstream
acceptance, I'll build a test kernel for you to confirm.
Comment 15 Neil Horman 2005-10-20 09:50:10 EDT
Created attachment 120191 [details]
final upstream version of patch

This is the version of the patch that now has a commitment for upstream
inclusion from the sctp maintainer.  It's identical to the previous patch, but
with a variable name change per request of the maintainer.  I'm going to build
a kernel with this patch against the latest RHEL4 kernel for you to test with,
and post internally for inclusion if it fixes the problem for you (it should,
it passes all the test cases I've been using).
Comment 16 Neil Horman 2005-11-11 10:07:21 EST
Created attachment 120943 [details]
new version of patch

The receive buffer accounting patch uncovered an skb leak in the establishment
of stream style sockets, which the upstream maintainer rolled into the
accounting patch.  We should pick it up as well.  Same patch as before with the
additional leak fixing bits.
Comment 17 Neil Horman 2005-11-15 09:43:53 EST
Ok, customer reports this corrects the remaining memory accounting issues.
Posting the above patch to rhkl.
Comment 25 Red Hat Bugzilla 2006-03-07 13:58:56 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0132.html
