| Summary: | SCTP CRC message error on the receiver side | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Guy Streeter <streeter> |
| Component: | kernel | Assignee: | Thomas Graf <tgraf> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 5.4 | CC: | cevich, davem, nhorman, nobody, rkhan, tgraf |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-10-24 01:46:45 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Guy Streeter
2011-02-21 16:50:21 UTC
Note: Still confirming whether the failing CRC is internal to their message or is the protocol CRC. They notice the CRC failure only when MSG_EOR is set on the last part of a formerly partially delivered message.

Answers to the above questions:

1) Partial delivery can be forced by careful calculation of the receive window. Set the receiver RCVBUF size to a known value (knowing that the initial rwnd value is going to be min(RCVBUF/2, 1500)). If the receiver then never actually calls sctp_recvmsg, the rwnd will slowly shrink toward zero. Use small chunks to fill it until the remaining rwnd value is some small number C, then send one large data chunk whose size is > C. That should force you into partial delivery mode.

2) No. SCTP does include a 32-bit checksum in each SCTP common header for validation (Adler-32 in the original RFC 2960, CRC32c since RFC 3309); however, bad checksums do not result in log messages being created. Instead, the checksum errors counter gets incremented in /proc/net/sctp/snmp, so that should provide a definitive indicator of whether it is the customer application noting CRC errors or the protocol (my guess is the former).

I note this bz is filed against 5.4. I would strongly suggest recreating this on the 5.7 beta kernel before wasting time on a deeper investigation of the problem. In this case, if multi-homing is in use and messages are traversing multiple packets, I'd be particularly interested in seeing whether this is a dup of bz 517504. Any tcpdump captures you can provide along with re-creation code would be appreciated.

Yes, that's correct. Calling recvmsg may actually be optional for them, depending on whether they need to actually get data to validate the CRC (which I'm guessing they will have to do). Either way, yes, you enter partial delivery mode when the next chunk of data for an association would overflow its rwnd. So you can force the condition by not calling recvmsg in the application, so that the receive window closes.
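The window-filling recipe in answer 1) comes down to simple arithmetic. Below is a minimal sketch in Python, assuming the receiver's rwnd shrinks by exactly the payload size of each undelivered chunk (a real kernel may account for per-chunk overhead differently, so treat the numbers as a starting point). `plan_partial_delivery`, `FillPlan`, and their parameters are hypothetical names for illustration, not part of any SCTP API; the initial rwnd is passed in rather than derived.

```python
from typing import NamedTuple


class FillPlan(NamedTuple):
    small_chunks: int      # number of small chunks to send first
    remaining_rwnd: int    # rwnd expected to be left after those chunks
    large_chunk_size: int  # smallest payload that exceeds the remaining rwnd


def plan_partial_delivery(initial_rwnd: int, small_chunk: int, target_c: int) -> FillPlan:
    """Plan sends so that the remaining rwnd ends at or below target_c (but above 0).

    Assumption: each unread chunk reduces rwnd by exactly its payload size.
    """
    if not 0 < small_chunk <= target_c < initial_rwnd:
        raise ValueError("need 0 < small_chunk <= target_c < initial_rwnd")
    # Ceiling division: enough small chunks to bring rwnd down to target_c or less.
    n_small = -(-(initial_rwnd - target_c) // small_chunk)
    remaining = initial_rwnd - n_small * small_chunk
    # Any payload larger than the remaining window should not fit in one piece.
    return FillPlan(n_small, remaining, remaining + 1)


if __name__ == "__main__":
    print(plan_partial_delivery(initial_rwnd=32768, small_chunk=1000, target_c=1500))
```

The receiver must not read from the socket while the sender follows the plan; otherwise the window reopens and the final large chunk may fit.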
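To separate protocol-level checksum failures from application-level ones, the counter in /proc/net/sctp/snmp can be sampled before and after a reproduction run. The following is a rough sketch, assuming the file holds one whitespace-separated `Name value` pair per line and that the counter is named `SctpChecksumErrors`; both should be verified on the kernel under test. The function names are illustrative only.

```python
def parse_sctp_counter(text: str, name: str = "SctpChecksumErrors") -> int:
    """Return the value of one counter from /proc/net/sctp/snmp-style text."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == name:
            return int(fields[1])
    raise KeyError(name)


def read_checksum_errors(path: str = "/proc/net/sctp/snmp") -> int:
    """Read the current checksum error count (requires the sctp module to be loaded)."""
    with open(path) as f:
        return parse_sctp_counter(f.read())


if __name__ == "__main__":
    before = read_checksum_errors()
    input("Reproduce the failure, then press Enter... ")
    after = read_checksum_errors()
    print("protocol checksum errors during run:", after - before)
```

If the difference stays at zero while the application still reports CRC failures, the failing CRC is most likely the application's own rather than the SCTP checksum.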
This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.7, and Red Hat does not plan to fix this issue in the currently developed update. Contact your manager or support representative in case you need to escalate this bug.

Any news on reproducing this on 5.7?