Bug 187494 - CVE-2006-2275 SCTP traffic probably never resumes
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Neil Horman
QA Contact: Brian Brock
Keywords: Security
Duplicates: 191259
Depends On:
Blocks: 181409
Reported: 2006-03-31 07:23 EST by Issue Tracker
Modified: 2007-11-30 17:07 EST (History)
Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Last Closed: 2006-08-10 18:59:14 EDT

Attachments
patch to correct deadlock (3.68 KB, patch)
2006-04-04 17:01 EDT, Neil Horman
new patch that solves the deadlock issue (2.34 KB, patch)
2006-04-07 16:43 EDT, Neil Horman
Version of patch that was accepted upstream (2.34 KB, patch)
2006-04-11 15:35 EDT, Neil Horman

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2006:0575 normal SHIPPED_LIVE Important: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 4 2006-08-10 00:00:00 EDT

Description Issue Tracker 2006-03-31 07:23:34 EST
Escalated to Bugzilla from IssueTracker
Comment 3 Neil Horman 2006-03-31 16:56:58 EST
I think I'm starting to understand what's going on here.  It appears that we
have two problems:

1) Even though we are discarding frames due to lack of buffer space on the
receive side, we continue to ack them, constantly reopening our receive window.

2) When we fill our receive window, and then receive a frame, we immediately
fill it up again with the next packet that arrives, which we then discard, as
the chunks are bundled, making for a bigger packet that we must unilaterally
accept or deny. This leads to a constant failure to receive frames, which
eventually leads to the sender giving up and sending an abort message.

I think what I need to do here is write a patch that:
1) doesn't SACK frames that are discarded (i.e. don't add a GEN_SACK command in
sctp_eat_data_* if the return from sctp_eat_data is IGNORE_TSN).

2) provides hysteresis on receive buffer accounting that only processes received
frames when there is enough open space in the receive buffer to handle as much
data as has previously been dropped (to prevent the constant fill problem).
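
The two ideas in this comment can be sketched as a small user-space model. This
is only an illustration of the plan described above, not the attached patch and
not kernel code: struct rx_model, rx_packet() and rx_read() are hypothetical
names, and the accounting is simplified to byte counts. Capping the dropped
counter at the buffer capacity is an extra assumption made here so that a fully
drained buffer can always recover.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical receive-side accounting; not the kernel's structures. */
struct rx_model {
    size_t capacity;  /* total receive buffer, in bytes */
    size_t used;      /* bytes queued and not yet read by the application */
    size_t dropped;   /* bytes discarded since the last accepted packet */
};

enum rx_verdict { RX_ACCEPT_AND_SACK, RX_DROP_NO_SACK };

/* Decide what to do with an arriving packet of len bytes. */
static enum rx_verdict rx_packet(struct rx_model *rx, size_t len)
{
    size_t free_space = rx->capacity - rx->used;

    /*
     * Idea 2 (hysteresis): after a drop, accept again only once the free
     * space covers both this packet and the amount previously dropped.
     */
    if (free_space < len || free_space < rx->dropped) {
        rx->dropped += len;
        if (rx->dropped > rx->capacity)
            rx->dropped = rx->capacity;  /* assumption: cap so recovery stays possible */
        /* Idea 1: a discarded packet is not acknowledged. */
        return RX_DROP_NO_SACK;
    }

    rx->used += len;
    rx->dropped = 0;
    return RX_ACCEPT_AND_SACK;
}

/* The application consumes len bytes from the receive queue. */
static void rx_read(struct rx_model *rx, size_t len)
{
    assert(len <= rx->used);
    rx->used -= len;
}
```

With a 1000-byte buffer holding 600 bytes, a second 600-byte packet is dropped
without a SACK. A later 100-byte packet is still refused while only 500 bytes
are free, and reception resumes once the application has drained enough space
to cover what was dropped.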
Comment 4 Neil Horman 2006-04-04 17:01:11 EDT
Created attachment 127313
patch to correct deadlock

This is the first pass at the patch that I am proposing upstream.  It still
needs some cleanup, but it's functional and solves the problem.
Comment 5 Neil Horman 2006-04-07 08:10:07 EDT
FYI, upstream had some disagreements with the patch and I am currently
reworking it.  Also, I've been meaning to mention this:  A good deal of the
delay you may be seeing with this issue comes from the fact that heartbeat
messages are on a 30-second timer; reducing this value may help your
association recover more quickly.
Comment 6 Neil Horman 2006-04-07 16:43:40 EDT
Created attachment 127482
new patch that solves the deadlock issue

This is the new version of the patch that I used to solve the deadlock issue.
It's much smaller and cleaner, and I'm currently proposing it upstream.
Comment 7 Neil Horman 2006-04-11 15:35:31 EDT
Created attachment 127633
Version of patch that was accepted upstream

This is the patch that is being accepted upstream, and what I will be
backporting to RHEL4.
Comment 9 Jason Baron 2006-04-25 06:40:33 EDT
Committed in stream U4 build 34.23. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/
Comment 12 Neil Horman 2006-05-10 11:14:13 EDT
*** Bug 191259 has been marked as a duplicate of this bug. ***
Comment 13 Marcel Holtmann 2006-05-10 11:24:27 EDT
The upstream fix is different from the proposed fix:

Comment 14 Marcel Holtmann 2006-05-10 11:26:41 EDT
This has been assigned CVE-2006-2275.
Comment 18 Neil Horman 2006-05-22 10:46:34 EDT
So I just tried the reproducer they provided, and this definitely isn't a
regression.  In fact, this isn't really a bug at all; rather, it's working as
designed.  When the receiver is reniced to +19, the receive queue slowly backs up
(as one would expect, since the scheduler doesn't run the receiver as often), to
the point where frames are occasionally dropped and retransmits are required.
So yes, traffic slows down, but it has to, because traffic isn't being consumed
at the receiver as fast.  But it's definitely not deadlocking, which is what this
bug was opened to fix.  I'm removing the regression keyword.
Comment 20 Mike Gahagan 2006-07-14 17:27:32 EDT
Patch is in -42, setting verified.
Comment 22 Red Hat Bugzilla 2006-08-10 18:59:20 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.
