Bug 595756 - socket send does not come out of flowcontrol
Summary: socket send does not come out of flowcontrol
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: All
OS: Linux
Target Milestone: rc
Assignee: Thomas Graf
QA Contact: Red Hat Kernel QE team
Depends On:
Reported: 2010-05-25 14:42 UTC by William Reich
Modified: 2018-11-14 19:03 UTC (History)
9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2011-04-27 12:28:57 UTC
Target Upstream Version:

Attachments (Terms of Use)
c program to show the problem (4.29 KB, application/octet-stream)
2010-05-25 14:42 UTC, William Reich
no flags Details
new echo_client.c that support IPV6 (5.26 KB, application/octet-stream)
2010-08-26 15:49 UTC, William Reich
no flags Details
tcpdump showing failure on 5.6 kernel (590.61 KB, application/octet-stream)
2011-02-01 12:51 UTC, William Reich
no flags Details
RH5.6 test using private LANs (3.52 MB, application/octet-stream)
2011-02-17 22:01 UTC, William Reich
no flags Details

Description William Reich 2010-05-25 14:42:57 UTC
Created attachment 416426 [details]
c program to show the problem

Description of problem:

socket 'send' goes into flow control and never comes out.

Version-Release number of selected component (if applicable):

RH 5

% uname -a
Linux boeing 2.6.18-194.3.1.el5 #1 SMP Sun May 2 04:17:42 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:

Compile the attached client program.
Run it with an IPv4 address and a port number.  The port number should be 7 (echo server).
Run this client program on one machine and give the IP address of another machine which has its echo server enabled.  The problem may not happen within the same machine.

Wait until it gets stuck.  It seems to get stuck forever.  The amount of time to wait before it gets stuck varies, but should take less than 5 minutes.

The client has two threads.  The main thread sends messages in a tight loop to the given transport address.  The created thread receives the messages in a tight loop (using the same socket) and discards the messages.

You can do a pstack on the client to see that the sending thread is stuck on send() and the receiving thread is stuck on recv().  It seems like the flow control does not subside.

Steps to Reproduce:
Actual results:

program will get stuck in 'send'

Expected results:

program should not get stuck.

Additional info:

Comment 1 Neil Horman 2010-06-10 20:02:03 UTC
Hmm, very odd.  It appears that after some time, the recv-q on the client socket is empty (hence the recv hang), but the send-q is full, and the echo service socket has a full send-q and recv-q.  But a tcpdump shows what appears to be a slew of repeated ack frames between the two hosts.  Something is getting very confused here.  I need to look further.

Comment 2 Neil Horman 2010-06-14 20:26:32 UTC
Interesting note: the problem doesn't seem to reproduce when running the reproducer over the loopback interface.  This suggests it's fairly timing-sensitive.  Given the notes above, this is likely a failure in the kernel on the side running the echo service.  I expect we get into a state in which we are waiting for a window to open on the client side, but never send probes for it.  I'll need to slice the tcpdump up and check to be sure, though.

Comment 10 William Reich 2010-08-23 12:18:21 UTC
this is RedHat case number 2030082.

Comment 11 William Reich 2010-08-23 12:19:08 UTC
As per the engineering team, it seems to be an issue with TCP timestamp incompatibility.

Below is the analysis performed on the tcpdump outputs captured from our test environment.

==> This is the last successful data packet from echo server to client with TS 1288721431

20:33:58.819433 IP dhcp208-213.gsslab.pnq.redhat.com.echo > dhcp211-105.gsslab.pnq.redhat.com.56834: Flags [.], seq 1747949171:1747956843, ack 2510806253, win 203, options [nop,nop,TS val 1288721431 ecr 10382243], length 7672

==> The ack packet from the echo client

20:33:58.820195 IP dhcp211-105.gsslab.pnq.redhat.com.56834 > dhcp208-213.gsslab.pnq.redhat.com.echo: Flags [.], ack 1747956843, win 0, options [nop,nop,TS val 10382245 ecr 1288721430], length 0

The ecr here is 1288721430, which is less than the data segment's TS val of 1288721431.

The ecr should be equal to the data segment's TS val.

That means the data packet was never acked with the matching timestamp.

Because of this, we are seeing the client keep sending duplicate ack packets.

So we suspected this could be an issue with TCP timestamps.  To confirm, we tried to reproduce this issue after disabling tcp_timestamps.

[root@dhcp208-213 ~]# sysctl -p | grep time
net.ipv4.tcp_timestamps = 0
[root@dhcp208-213 ~]#

This time the issue was not reproducible.  However, we would like to get your test result after disabling TCP timestamps in your test setup.

Comment 12 William Reich 2010-08-23 12:25:50 UTC
We ( Ulticom ) set the value to zero with the following command:

# sysctl -w net.ipv4.tcp_timestamps=0
net.ipv4.tcp_timestamps = 0

...confirmed with the following command:
% sysctl net.ipv4.tcp_timestamps   
net.ipv4.tcp_timestamps = 0

It did NOT work.  

Both the client & server machines were confirmed in this manner when the latest test failed.

Comment 13 William Reich 2010-08-26 13:01:59 UTC
Interestingly, we have discovered that the problem is not observed when we use IPv6.  An overnight test ran successfully using IPv6 addresses.

Comment 14 William Reich 2010-08-26 15:49:42 UTC
Created attachment 441264 [details]
new echo_client.c that support IPV6

Attached is a new test program that reveals the problem described in this report.  This version supports IPv6 addresses.  Inside the file are comments on how to use it.

Comment 16 William Reich 2010-10-19 11:59:15 UTC
It appears that the RedHat Case number for this problem has been changed.  Maybe a new system was put into place?

Case 00332012


(It was previously RedHat case number 2030082.)

Comment 17 Neil Horman 2010-10-22 20:34:20 UTC
Can this be re-tested on site?  I'm trying with the latest RHEL-5 kernel here and it's working quite well for me now.

Comment 18 William Reich 2010-10-22 20:58:32 UTC
Please disclose the kernel version that you are using to test with.

If it so happens that this new kernel fails for us, what data do you wish to see?

Comment 19 Neil Horman 2010-10-25 17:05:37 UTC
I'm using kernel 2.6.18-228.el5 on both systems with your echoClient against the xinetd echo-stream service.  If it fails for you, I would be interested to know what you did to ensure it broke, because it's been running for hours here.

Comment 20 William Reich 2010-10-26 18:13:24 UTC
We did a 'yum update' and the kernel only changed in the fifth 'digit'
from 2.6.18-194.3 to

Comment 21 Neil Horman 2010-10-26 18:28:25 UTC
That's because the public beta hasn't been released yet.  It will be out soon and you can try it then.

FWIW, I've had this running overnight here on that kernel and it hasn't stopped running.

Comment 22 William Reich 2010-10-27 17:57:34 UTC
It appears that the RH6 beta does not show this problem.

% uname -a
Linux dynatac 2.6.32-44.1.el6.x86_64 #1 SMP Wed Jul 14 18:51:29 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

Comment 23 Neil Horman 2010-10-28 14:31:44 UTC
Ok, good, I concur.  I'm going to close this as WORKSFORME for now.  If the problem recurs, please reopen this bug and let me know.  Thanks!

Comment 24 William Reich 2010-10-28 15:07:04 UTC

I need this fixed on RH4 & RH 5 for my customers.
RH 6 is not the solution for my customers.

Comment 25 Neil Horman 2010-10-28 15:27:41 UTC
Shoot, I'm sorry.  Wasn't thinking.  For some reason I was thinking this was a RHEL6 bug.

Comment 26 Neil Horman 2010-10-28 15:29:42 UTC
Wait, I'm sorry again.  I was testing with RHEL5 kernels on both systems; you're the one who went with the RHEL6 public beta.  You need to wait for the RHEL5.6 release to test this.  That's what I meant in comments 19 & 21.

Comment 28 Neil Horman 2010-12-30 12:59:30 UTC
Justin, please see comments 19, 20 and 21.

Comment 29 William Reich 2011-02-01 12:49:27 UTC
We just tested a 5.6 kernel, and the test failed.
The reported problem still exists in the 5.6 kernel.

Linux boeing 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

RH Case 00332012 has been updated accordingly.
A new capture of tcpdump was added to that case.
I will upload it to this bug as well.

Comment 30 William Reich 2011-02-01 12:51:04 UTC
Created attachment 476387 [details]
tcpdump showing failure on 5.6 kernel

tcpdump -w capture -C 5 -W 10

Comment 31 Neil Horman 2011-02-09 14:24:27 UTC
I don't know what to tell you, it works fine for me.

Given that, I have to ask how you have these systems connected.  I'm looking at the two tcpdumps you've sent, and it appears you are losing multiple frames for every frame that gets through.  For example, look at capture 0, frames 5 through 11.  They are all identical, and I can only match them up with frame 13 in capture 1.  This pattern continues in both directions, with losses to varying degrees.  That would certainly explain why you eventually get 'stuck' writing on a socket and reading from it.  If you keep losing this much data, your socket queues are likely to remain full or close to full indefinitely.  Do you record any drop statistics on your hardware counters or in the stack?

Comment 32 William Reich 2011-02-09 21:09:11 UTC
The dumps from comment 30 were captured on the corporate LAN.
We will rerun the test on private LANs.

Comment 33 William Reich 2011-02-17 21:59:41 UTC
We have rerun the tests on private LANs.

It still fails.

Comment 34 William Reich 2011-02-17 22:01:25 UTC
Created attachment 479412 [details]
RH5.6 test using private LANs

retest using private LANs - tcpdump capture file

tcpdump -i eth3 -w <file> -C 5 -W 10

Comment 35 Neil Horman 2011-02-21 11:59:20 UTC
Note: I need to look more closely at these captures, but the peer that is sending to the echo service has a (note: relative) sequence value of 4294967209, which is almost 0xffffffff.  I wonder if we're encountering a rollover bug of some sort.

Note also that these captures appear to be missing frames.  It's just a few, but the client sequence value goes from xxxx7209 in capture0 to xxxx7281 in capture1, back to xxxx7209 in capture2, and back again to xxxx7281 in capture3.  Each capture also begins with the acking of one of these sequence values as lost frames or duplicate acks (which I expect is simply an artifact of the capture split).  I'll splice them back together and see if this makes more sense.

Comment 36 Thomas Graf 2011-03-15 12:38:54 UTC
I have the same problem as Neil.  I have been trying to reproduce this with RHEL5.6 between two KVM guests and have not been successful so far.

Any new insights?

Comment 37 William Reich 2011-03-15 12:52:16 UTC
The machines used for this test are HP DL585.
All CPUs are active.
We are not using virtualization.
As far as I know, my team has not seen this issue when using virtual machines.

Comment 38 William Reich 2011-03-15 13:07:40 UTC
In addition, it only happens when going over a real Ethernet network.  Loopback does NOT reproduce the problem.

Comment 39 William Reich 2011-03-15 13:13:44 UTC
regarding comment 35 and "missing frames", is there
a better set of parameters that we can use with 'tcpdump' in an attempt to avoid missing frames in the data capture ?

Comment 40 Thomas Graf 2011-03-15 14:02:45 UTC
Not sure why the frames are missing when rotating, but I don't regard this as a huge problem.  The problem description is clear; the current problem is reproducing it on our side.  I have shifted my test setup to two physical machines now.

Comment 46 Thomas Graf 2011-04-08 09:08:04 UTC
(In reply to comment #45)

> If that is the case, can you please have a look at the tcpdump provided by the
> customer with his tests with the RHEL 5.6 kernel and share your findings?

There is nothing special to be found in the capture files. From a capture file POV the stalling occurs suddenly without reason.

It would be interesting to see the output of "ss -t -a -i" at the point when the connection stalls, to see whether or not we have stopped sending because of the window.

Comment 47 William Reich 2011-04-27 12:17:23 UTC
We have rerun the test on the same machines as the original test (reported in May 2010).  However, this time we confirmed/double-checked that each machine is running RH 5.6.

We determined that this test passed.  The test ran for more than 8 hours successfully.

We went back to investigate why we reported a failure on Feb 1, 2011.  It turns out only one of the machines was upgraded to RH 5.6.  The other machine was running an older version of RH.

So, we can say that this issue is resolved in RedHat 5.6.  Case closed.

Comment 48 Thomas Graf 2011-04-27 12:28:57 UTC
Thank you for retesting!

I was starting to triple-check our testing procedures.
