Bug 187494
Field | Value
---|---
Summary | CVE-2006-2275 SCTP traffic probably never resumes
Product | Red Hat Enterprise Linux 4
Component | kernel
Version | 4.0
Hardware | All
OS | Linux
Status | CLOSED ERRATA
Severity | medium
Priority | medium
Reporter | Issue Tracker <tao>
Assignee | Neil Horman <nhorman>
QA Contact | Brian Brock <bbrock>
CC | holtmann, jbaron, security-response-team, tao
Keywords | Security
Whiteboard | impact=moderate,source=vendorsec,reported=20060509,public=20060509
Fixed In Version | RHSA-2006-0575
Doc Type | Bug Fix
Last Closed | 2006-08-10 22:59:14 UTC
Bug Blocks | 181409

Description
Issue Tracker
2006-03-31 12:23:34 UTC
I think I'm beginning to understand what is going on here. It appears that we have two problems:

1) Even though we are discarding frames due to lack of buffer space on the receive side, we continue to ack them, constantly reopening our receive window.
2) When we fill our receive window, and then receive a frame, we immediately fill it up again with the next packet that arrives, which we then discard, because the chunks are bundled, making for a bigger packet that we must unilaterally accept or deny.

This leads to a constant lack of reception of frames, which eventually leads to the sender giving up and sending an abort message.

I think what I need to do here is write a patch that:

1) doesn't SACK frames that are discarded (i.e. don't add a GEN_SACK command in sctp_eat_data_* if the return from sctp_eat_data is IGNORE_TSN).
2) provides hysteresis on receive buffer accounting that only processes received frames when there is enough open space in the receive buffer to handle as much data as has previously been dropped (to prevent the constant fill problem).

Created attachment 127313
patch to correct deadlock
This is the first pass at the patch that I am proposing upstream. It still needs some cleanup, but it's functional and solves the problem.

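For readers following along, the two changes proposed in the description (no SACK for discarded data, plus hysteresis on receive-buffer accounting) can be sketched as a toy model. This is only an illustration of the idea, not the kernel code and not any of the attached patches; `ToyReceiver` and all of its fields are hypothetical names, and sizes are arbitrary byte counts.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReceiver:
    """Toy model of a receive buffer; hypothetical, not the kernel's structures."""
    capacity: int                                  # receive buffer size in bytes
    hysteresis: bool = True                        # wait for space to reopen after a drop
    used: int = 0                                  # bytes queued for the application
    dropped: dict = field(default_factory=dict)    # tsn -> size of data discarded so far
    sacked: list = field(default_factory=list)     # TSNs that were acknowledged

    def free(self) -> int:
        return self.capacity - self.used

    def on_packet(self, tsn: int, size: int) -> bool:
        """Accept or discard one packet; only accepted data is SACKed."""
        needed = size
        if self.hysteresis and self.dropped:
            # After a drop, require room for everything dropped so far
            # (capped at the buffer size) before accepting data again.
            backlog = sum(self.dropped.values())
            needed = max(size, min(self.capacity, backlog))
        if self.free() < needed:
            self.dropped[tsn] = size   # discarded: deliberately not SACKed
            return False
        self.used += size
        self.dropped.clear()
        self.sacked.append(tsn)
        return True

    def on_read(self, nbytes: int) -> None:
        """The application consumes nbytes from the buffer."""
        self.used = max(0, self.used - nbytes)


rx = ToyReceiver(capacity=100)
rx.on_packet(1, 60)   # fits, accepted and SACKed
rx.on_packet(2, 60)   # no room, discarded, no SACK
rx.on_packet(3, 30)   # would fit, but held back until the dropped data has room
print(rx.sacked)
```

With `hysteresis=False` the third packet is accepted immediately, which refills the buffer ahead of the retransmission of the second packet; that is the "constant fill" behavior the description is trying to avoid.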
FYI, upstream had some disagreements with the patch and I am currently reworking it.

Also, I've been meaning to mention this: a good deal of the delay you may be seeing with this issue is due to the fact that heartbeat messages are on a 30 second timer. Reducing this value may help your association recover more quickly.

Created attachment 127482
new patch that solves the deadlock issue
This is the new version of the patch that I used to solve the deadlock issue. It's much smaller and cleaner, and I'm currently proposing it upstream.

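On the heartbeat suggestion: on Linux the default heartbeat interval is normally exposed as the `net.sctp.hb_interval` sysctl. The sketch below assumes the running kernel provides it at `/proc/sys/net/sctp/hb_interval` and interprets the value in milliseconds (so the 30 second timer would read as 30000); the helper names are hypothetical, and writing the value requires root.

```python
from pathlib import Path

# Assumed location of the SCTP heartbeat sysctl; value assumed to be in milliseconds.
HB_INTERVAL = Path("/proc/sys/net/sctp/hb_interval")


def read_hb_interval_ms(path: Path = HB_INTERVAL) -> int:
    """Return the current heartbeat interval in milliseconds."""
    return int(path.read_text().strip())


def write_hb_interval_ms(ms: int, path: Path = HB_INTERVAL) -> None:
    """Set the heartbeat interval; needs root when pointed at /proc."""
    if ms <= 0:
        raise ValueError("heartbeat interval must be a positive number of milliseconds")
    path.write_text(f"{ms}\n")
```

For example, `write_hb_interval_ms(5000)` would ask for a 5 second heartbeat for new associations. Whether a shorter interval is appropriate depends on the network, so treat the value as something to experiment with rather than a recommendation.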
Created attachment 127633
Version of patch that was accepted upstream
This is the patch that is getting acceptance upstream, and what I will be backporting to RHEL4.

Committed in stream U4 build 34.23. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/

*** Bug 191259 has been marked as a duplicate of this bug. ***

The upstream fix is different from the proposed fix: http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7c3ceb4fb9667f34f1599a062efecf4cdc4a4ce5

This has been assigned CVE-2006-2275.

So I just tried the reproducer they provided, and this definitely isn't a regression. In fact, this isn't really a bug at all; rather, it's working as designed. When the receiver is reniced to +19, the receive queue slowly backs up (as one would expect, since the scheduler doesn't run the receiver as often), to the point where frames are occasionally dropped and retransmits are required. So yes, traffic slows down, but it has to, because traffic isn't being consumed at the receiver as fast. But it's definitely not deadlocking, which is what this bug was opened to fix. I'm removing the regression keyword.

Patch is in -42, setting verified.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html