Bug 595756 - socket send does not come out of flowcontrol
Summary: socket send does not come out of flowcontrol
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: All
OS: Linux
Target Milestone: rc
Assignee: Thomas Graf
QA Contact: Red Hat Kernel QE team
Depends On:
Reported: 2010-05-25 14:42 UTC by William Reich
Modified: 2018-11-14 19:03 UTC (History)
9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2011-04-27 12:28:57 UTC
Target Upstream Version:

Attachments (Terms of Use)
c program to show the problem (4.29 KB, application/octet-stream)
2010-05-25 14:42 UTC, William Reich
no flags Details
new echo_client.c that support IPV6 (5.26 KB, application/octet-stream)
2010-08-26 15:49 UTC, William Reich
no flags Details
tcpdump showing failure on 5.6 kernel (590.61 KB, application/octet-stream)
2011-02-01 12:51 UTC, William Reich
no flags Details
RH5.6 test using private LANs (3.52 MB, application/octet-stream)
2011-02-17 22:01 UTC, William Reich
no flags Details

Description William Reich 2010-05-25 14:42:57 UTC
Created attachment 416426 [details]
c program to show the problem

Description of problem:

socket 'send' goes into flow control and never comes out.

Version-Release number of selected component (if applicable):

RH 5

% uname -a
Linux boeing 2.6.18-194.3.1.el5 #1 SMP Sun May 2 04:17:42 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:

Compile the attached client program.
Run it with an IPv4 address and a port number.  The port number should be 7 (echo server).
Run this client program on one machine and give the IP address of another machine which has its echo server enabled.  The problem may not happen within the same machine.

Wait until it gets stuck.  It seems to get stuck forever.  The amount of time to wait before it gets stuck varies, but should take less than 5 minutes.

The client has two threads.  The main thread sends messages in a tight loop to the given transport address.  The created thread receives the messages in a tight loop (using the same socket) and discards the messages.

You can do a pstack on the client to see that the sending thread is stuck on send() and the receiving thread is stuck on recv().  It seems like the flow control does not subside.

Steps to Reproduce:
Actual results:

program will get stuck in 'send'

Expected results:

program should not get stuck.

Additional info:

Comment 1 Neil Horman 2010-06-10 20:02:03 UTC
Hmm, very odd.  It appears that after some time, the recv-q on the client socket is empty (hence the recv hang), but the send-q is full, and the echo service socket has a full send-q and recv-q.  But a tcpdump shows what appears to be a slew of repeated ack frames between the two hosts.  Something is getting very confused here.  I need to look further.

Comment 2 Neil Horman 2010-06-14 20:26:32 UTC
Interesting note: the problem doesn't seem to reproduce when running the reproducer over the loopback interface.  This suggests it's fairly timing-sensitive.  Given the notes above, this is likely a failure in the kernel on the side running the echo service.  I expect we get into a state in which we are waiting for a window to open on the client side, but never send probes for it.  I'll need to slice the tcpdump up and check to be sure, though.

Comment 10 William Reich 2010-08-23 12:18:21 UTC
this is RedHat case number 2030082.

Comment 11 William Reich 2010-08-23 12:19:08 UTC
As per the engineering team, it seems to be an issue with TCP timestamp incompatibility.

Below is the analysis performed on the tcpdump outputs captured from our test environment.

==> This is the last successful data packet from echo server to client with TS 1288721431

20:33:58.819433 IP dhcp208-213.gsslab.pnq.redhat.com.echo > dhcp211-105.gsslab.pnq.redhat.com.56834: Flags [.], seq 1747949171:1747956843, ack 2510806253, win 203, options [nop,nop,TS val 1288721431 ecr 10382243], length 7672

==> The ack packet from the echo client

20:33:58.820195 IP dhcp211-105.gsslab.pnq.redhat.com.56834 > dhcp208-213.gsslab.pnq.redhat.com.echo: Flags [.], ack 1747956843, win 0, options [nop,nop,TS val 10382245 ecr 1288721430], length 0

The ecr here is 1288721430, which is less than the data segment's TS val of 1288721431.

The ecr should be equal to the data segment's TS val.

That means the data packet was never acked with the matching timestamp.

Because of this, we are seeing the client keep sending duplicate ack packets.

So we suspected this could be an issue with TCP timestamps.  To confirm, we tried to reproduce this issue after disabling tcp_timestamps.

[root@dhcp208-213 ~]# sysctl -p | grep time
net.ipv4.tcp_timestamps = 0
[root@dhcp208-213 ~]#

This time the issue was not reproducible.  However, we would like to get your test result after disabling TCP timestamps in your test setup.

Comment 12 William Reich 2010-08-23 12:25:50 UTC
We ( Ulticom ) set the value to zero with the following command:

# sysctl -w net.ipv4.tcp_timestamps=0
net.ipv4.tcp_timestamps = 0

...confirmed with the following command:
% sysctl net.ipv4.tcp_timestamps   
net.ipv4.tcp_timestamps = 0

It did NOT work.  

Both the client & server machines were confirmed in this manner when the latest test failed.

Comment 13 William Reich 2010-08-26 13:01:59 UTC
Interestingly, we have discovered that the problem is not observed when we use IPv6.  An overnight test ran successfully using IPv6 addresses.

Comment 14 William Reich 2010-08-26 15:49:42 UTC
Created attachment 441264 [details]
new echo_client.c that support IPV6

Attached is a new test program that reveals the problem described in this report.  This version supports IPv6 addresses.  Inside the file are comments on how to use it.

Comment 16 William Reich 2010-10-19 11:59:15 UTC
It appears that the RedHat Case number for this problem has been changed.  Maybe a new system was put into place?

Case 00332012


(It was previously RedHat case number 2030082.)

Comment 17 Neil Horman 2010-10-22 20:34:20 UTC
Can this be re-tested on site?  I'm trying with the latest RHEL-5 kernel here and it's working quite well for me now.

Comment 18 William Reich 2010-10-22 20:58:32 UTC
Please disclose the kernel version that you are using to test with.

If it so happens that this new kernel fails for us, what data do you wish to see?

Comment 19 Neil Horman 2010-10-25 17:05:37 UTC
I'm using kernel 2.6.18-228.el5 on both systems with your echoClient against the xinetd echo-stream service.  If it fails for you, I would be interested to know what you did to ensure it broke, because it's been running for hours here.

Comment 20 William Reich 2010-10-26 18:13:24 UTC
We did a 'yum update' and the kernel only changed in the fifth 'digit'
from 2.6.18-194.3 to

Comment 21 Neil Horman 2010-10-26 18:28:25 UTC
That's because the public beta hasn't been released yet.  It will be out soon and you can try it then.

FWIW, I've had this running overnight here on that kernel and it hasn't stopped running.

Comment 22 William Reich 2010-10-27 17:57:34 UTC
It appears that the RH6 beta does not show this problem.

% uname -a
Linux dynatac 2.6.32-44.1.el6.x86_64 #1 SMP Wed Jul 14 18:51:29 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

Comment 23 Neil Horman 2010-10-28 14:31:44 UTC
Ok, good, I concur.  I'm going to close this as WORKSFORME for now.  If the problem recurs, please reopen this bug and let me know.  Thanks!

Comment 24 William Reich 2010-10-28 15:07:04 UTC

I need this fixed on RH4 & RH 5 for my customers.
RH 6 is not the solution for my customers.

Comment 25 Neil Horman 2010-10-28 15:27:41 UTC
Shoot, I'm sorry.  Wasn't thinking.  For some reason I was thinking this was a RHEL6 bug.

Comment 26 Neil Horman 2010-10-28 15:29:42 UTC
Wait, I'm sorry again.  I was testing with RHEL5 kernels on both systems; you're the one who went with the RHEL6 public beta.  You need to wait for the RHEL5.6 release to test this.  That's what I meant in comments 19 & 21.

Comment 28 Neil Horman 2010-12-30 12:59:30 UTC
Justin, please see comments 19, 20 and 21.

Comment 29 William Reich 2011-02-01 12:49:27 UTC
We just tested a 5.6 kernel, and the test failed.
The reported problem still exists in the 5.6 kernel.

Linux boeing 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

RH Case 00332012 has been updated accordingly.
A new capture of tcpdump was added to that case.
I will upload it to this bug as well.

Comment 30 William Reich 2011-02-01 12:51:04 UTC
Created attachment 476387 [details]
tcpdump showing failure on 5.6 kernel

tcpdump -w capture -C 5 -W 10

Comment 31 Neil Horman 2011-02-09 14:24:27 UTC
I don't know what to tell you, it works fine for me.

Given that, I have to ask how you have these systems connected.  I'm looking at the two tcpdumps you've sent, and it appears you are losing multiple frames for every frame that gets through.  For example, look at capture 0, frames 5 through 11.  They are all identical, and I can only match them up with frame 13 in capture 1.  This pattern continues in both directions, with losses to varying degrees.  That would certainly explain why you eventually get 'stuck' writing on a socket and reading from it.  If you keep losing this much data, your socket queues are likely to remain full or close to full indefinitely.  Do you record any drop statistics on your hardware counters or in the stack?

Comment 32 William Reich 2011-02-09 21:09:11 UTC
The dumps from comment 30 were captured on the corporate LAN.
We will rerun the test on private LANs.

Comment 33 William Reich 2011-02-17 21:59:41 UTC
We have rerun the tests on private LANs.

It still fails.

Comment 34 William Reich 2011-02-17 22:01:25 UTC
Created attachment 479412 [details]
RH5.6 test using private LANs

retest using private LANs - tcpdump capture file

tcpdump -i eth3 -w <file> -C 5 -W 10

Comment 35 Neil Horman 2011-02-21 11:59:20 UTC
Note: I need to look more closely at these captures, but the peer that is sending to the echo service has a (note: relative) sequence value of 4294967209, which is almost 0xffffffff.  I wonder if we're encountering a rollover bug of some sort.

Note also that these captures appear to be missing frames.  It's just a few, but the client sequence value goes from xxxx7209 in capture0 to xxxx7281 in capture1, back to xxxx7209 in capture2, and back again to xxxx7281 in capture3.  Each capture also begins with the acking of one of these sequence values as lost frames or duplicate acks (which I expect is simply an artifact of the capture split).  I'll splice them back together and see if this makes more sense.

Comment 36 Thomas Graf 2011-03-15 12:38:54 UTC
I have the same problem as Neil.  I have been trying to reproduce this with RHEL5.6 between two KVM guests and have not been successful so far.

Any new insights?

Comment 37 William Reich 2011-03-15 12:52:16 UTC
The machines used for this test are HP DL585.
All CPUs are active.
We are not using virtualization.
As far as I know, my team has not seen this issue when using virtual machines.

Comment 38 William Reich 2011-03-15 13:07:40 UTC
In addition, it only happens when going over a real Ethernet network.  Loopback does NOT reproduce the problem.

Comment 39 William Reich 2011-03-15 13:13:44 UTC
regarding comment 35 and "missing frames", is there
a better set of parameters that we can use with 'tcpdump' in an attempt to avoid missing frames in the data capture ?

Comment 40 Thomas Graf 2011-03-15 14:02:45 UTC
Not sure why the frames are missing when rotating, but I don't regard this as a huge problem.  The problem description is clear; the current problem is reproducing it on our side.  I have shifted my test setup to two physical machines now.

Comment 46 Thomas Graf 2011-04-08 09:08:04 UTC
(In reply to comment #45)

> If that is the case, can you please have a look at the tcpdump provided by the
> customer with his tests with the RHEL 5.6 kernel and share your findings?

There is nothing special to be found in the capture files. From a capture file POV the stalling occurs suddenly without reason.

It would be interesting to see the output of "ss -t -a -i" at the point when the connection stalls, to see whether or not we have stopped sending because of the window.

Comment 47 William Reich 2011-04-27 12:17:23 UTC
We have rerun the test on the same machines as the original test (reported in May 2010).  However, this time we confirmed/double-checked that each machine is running RH 5.6.

We determined that this test passed.  The test ran for more than 8 hours successfully.

We went back to investigate why we reported a failure on Feb 1, 2011.  It turns out only one of the machines was upgraded to RH 5.6.  The other machine was running an older version of RH.

So, we can say that this issue is resolved in RedHat 5.6.  Case closed.

Comment 48 Thomas Graf 2011-04-27 12:28:57 UTC
Thank you for retesting!

I was starting to triple-check our testing procedures.
