Bug 537145 - Wrong TCP behavior for zero window probes containing data
Summary: Wrong TCP behavior for zero window probes containing data
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: All
OS: Linux
low
medium
Target Milestone: rc
Assignee: Neil Horman
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-11-12 16:38 UTC by Casey Dahlin
Modified: 2018-10-27 15:55 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-01-22 15:28:49 UTC
Target Upstream Version:
Embargoed:


Attachments
tcpdump for issue (924.48 KB, application/octet-stream)
2009-12-10 15:26 UTC, Casey Dahlin

Description Casey Dahlin 2009-11-12 16:38:59 UTC
Description of problem:
When a RHEL 5 host's receive window reaches zero, it does not respond to additional data appropriately. The TCP spec says that if window size reaches zero, the sender can send an empty packet as a "zero-window probe" in order to provoke an ACK and see if the window has become nonzero yet. The TCP standard also says that a packet with additional data can be sent, so that if the window has become open the receiver can accept the payload.

The customer observed that when sending TCP packets with a 1-byte payload to a RHEL 5 host with a zero receive window, the host would accept the packet /and receive the payload/ even though the receive window is zero. This goes on for some time, until eventually the kernel decides to stop stretching the buffer and simply starts dropping the probes on the floor. The sender sees the receiver go silent and kills the connection.

This seems to have arisen from a subtle misinterpretation of the spec. The kernel should drop the 1-byte payload immediately, but it should still send an ACK acknowledging the already accepted data. We want to drop the /payload/ but still respond to the /packet/.

Windows hosts are known to send this 1-byte payload.

How reproducible:
Always

Steps to Reproduce:
1. Have a Windows host saturate the receive window of a RHEL host.
2. Wait.
3. Note disaster.
  
Actual results:
A few bytes slip in past the window, and then the connection goes silent.

Expected results:
Additional bytes get ignored; preceding data gets ACK'd.

Comment 1 Neil Horman 2009-12-10 13:49:48 UTC
Do you have a tcpdump of this behavior?

Also, why is a Windows host required to saturate the receive window?

Comment 2 Casey Dahlin 2009-12-10 15:23:28 UTC
Windows' TCP implementation sends 1-byte payloads with its zero window probes; Linux's TCP implementation sends no payload with them. By my reading of the TCP spec both behaviors are allowed.

IOW the default behavior of Windows just so happens to trigger the bug. Linux could probably be rigged to trigger it as well, and there are likely other TCP stacks out there that would do it too.

Comment 3 Casey Dahlin 2009-12-10 15:26:03 UTC
Created attachment 377479
tcpdump for issue

Here's the dump

Comment 4 Neil Horman 2009-12-10 19:51:52 UTC
So, I've looked at the tcpdump above, and discussed with Casey what I see in it. In summary, the assertion in this bugzilla is that, when receiving zero window probes with single-byte payloads, the Linux system eventually starts dropping the frames, and the peer kills the connection when it goes quiet.

This isn't really what's happening, or at least it doesn't appear to be based on the tcpdump attached here. Looking at the connection between the Linux system (IP 10.208.228.99) and the Windows system (IP 10.208.228.119), we see a connection that runs smoothly until the window closes on the Linux system in frame 8920. From there we see a series of zero window probes and zero window responses, in which the Linux system consumes the offered octet in the probe. This is in fact allowed by RFC 793; the receiver just needs to advance the ACK number by 1 byte. It may also choose not to consume the byte, ACKing the last accepted sequence number instead. This probe/ACK exchange continues until frame 9070, when a probe is sent. At that point the Linux system fails to respond for approximately 30 seconds, after which it responds with an ACK to the last frame, opening the window. The Windows system then responds with a reset frame, tearing down the connection.
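
To make the two permitted responses concrete, here is a minimal Python sketch of a receiver handling a 1-byte zero window probe. It is a toy model only, not the kernel's code path; the `Receiver` class, the `on_probe` method, and the `accept` flag are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Receiver:
    """Toy model of a TCP receiver whose advertised window is closed."""
    rcv_nxt: int      # next sequence number the receiver expects
    rcv_wnd: int = 0  # advertised receive window (0 means closed)

    def on_probe(self, seq: int, payload: bytes, accept: bool) -> Tuple[int, int]:
        """Handle a zero window probe; return (ack_number, advertised_window).

        accept=True  models consuming the offered octet and advancing the ACK.
        accept=False models discarding the octet and re-ACKing the last
        accepted sequence number.
        """
        if accept and payload and seq == self.rcv_nxt:
            self.rcv_nxt += len(payload)
        # In both cases the probe itself is answered with an ACK.
        return self.rcv_nxt, self.rcv_wnd


rx = Receiver(rcv_nxt=1000)
print(rx.on_probe(1000, b"x", accept=True))   # (1001, 0): octet consumed
print(rx.on_probe(1001, b"x", accept=False))  # (1001, 0): octet discarded
```

Either way the sender gets an ACK with a zero window, which is what the probe/ACK pairs in the trace look like up to frame 9070.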

Things to note:

1) There are approximately 40 zero window probe/ACK pairs in this trace, meaning the Linux system accepted approximately 40 bytes of data during the time the window was closed. Allowing for skb overhead, that's approximately 60k worth of memory allocation. If this system were so exhausted of memory that it could not find 60k in order-0 allocations, it would have panicked, or at least been OOM-killing processes aggressively. There is no way a system constrained in that way simply began dropping frames.

2) A pause of 30 seconds between a zero window probe and an ACK isn't pretty, but it's perfectly legal. There's no requirement for a TCP peer to respond to a zero window probe within a fixed time.
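
The memory estimate in point 1 can be sanity-checked with a few lines of Python. The per-probe cost used here (1536 bytes per skb) is an assumption, chosen only because it reproduces the rough 60k figure; the real overhead depends on the kernel version and driver.

```python
# Rough check of the "approximately 60k" estimate from point 1.
PROBES_ACCEPTED = 40   # probe/ACK pairs counted in the trace (approximate)
PER_SKB_BYTES = 1536   # ASSUMPTION: memory charged per 1-byte probe

total_bytes = PROBES_ACCEPTED * PER_SKB_BYTES
print(total_bytes)          # 61440
print(total_bytes / 1024)   # 60.0
```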

I think the more likely case here is that this is all working as configured. I think the Linux system is accepting and responding to zero window probes appropriately, and at some point encounters a condition in which it's unable to respond on the network. During that time the Windows system decided that the connection had timed out, and on the next acknowledgement it reset the connection. If that's the case, that's a Windows configuration issue, not a Linux bug. We can try to determine what caused the delay during that period and see if anything can be done to reduce it, but that's just an arbitrary line. There's no specification indicating how long to wait for a zero window response; that's a peer configuration option, and it sounds like in this case the Windows system simply needs to set it higher.

So, in summary, this currently doesn't seem like a bug at all, unless some other evidence can be presented.

