Bug 127246 - Strange socket hangs on RHEL3 kernel
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686
OS: Linux
Target Milestone: ---
Assignee: David Miller
QA Contact:
Depends On:
Reported: 2004-07-05 06:46 UTC by Pasi Pirhonen
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2004-07-06 14:25:43 UTC

Attachments
The capture tcpdump data + strace info (2.28 KB, text/plain)
2004-07-05 06:48 UTC, Pasi Pirhonen

Description Pasi Pirhonen 2004-07-05 06:46:57 UTC
Description of problem:

RST is received, but sockets are hung until timeout.

Version-Release number of selected component (if applicable):

- RHEL3 Update1 (BEA certified level).
- Proliant DL380
- WebLogic 8.1 (jrockit)
- There is a load balancer in front of this

How reproducible:

Sooner or later there are so many of these hung sockets that the
service is blocked. It happens randomly.

Steps to Reproduce:
1. Wait and wait

Actual results:

JRockit runs out of free threads because they are stuck 'sending until timeout'.

Expected results:

The RST should kill the connection/socket.

Additional info:

Comment 1 Pasi Pirhonen 2004-07-05 06:48:23 UTC
Created attachment 101632
The capture tcpdump data + strace info

Comment 2 David Miller 2004-07-06 02:38:13 UTC
Either the checksum or something else about the RESET packet
makes it unacceptable.  That is why there are still ACKs
coming back from the machine.

Something, either aaa.bbb.ccc.ddd or some machine in between
(most likely, load balancing and firewall boxes are notorious
for corrupting TCP packets) is messing with the contents.

I don't think anything shown so far indicates that the RHEL3
machine is doing anything wrong.
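
A minimal sketch of how the checksum on a captured RST segment could be
verified offline, assuming the raw TCP segment bytes and both IP addresses
have been extracted from the capture. The function names and the
build_rst helper are hypothetical and only serve the illustration; this is
the standard Internet checksum over the IPv4 pseudo-header plus segment,
not the kernel's code.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """Complemented 16-bit one's-complement sum of data (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, tcp_len: int) -> bytes:
    """IPv4 pseudo-header that is included in the TCP checksum."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, socket.IPPROTO_TCP, tcp_len))


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """True if the segment (TCP header + payload) has a valid checksum."""
    # With a correct checksum field in place, the overall result is 0.
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return internet_checksum(data) == 0


def build_rst(src_ip: str, dst_ip: str, sport: int, dport: int, seq: int) -> bytes:
    """Hypothetical helper: a bare 20-byte RST with a correct checksum."""
    offset_flags = (5 << 12) | 0x04  # data offset of 5 words, RST flag set
    header = struct.pack("!HHIIHHHH", sport, dport, seq, 0,
                         offset_flags, 0, 0, 0)
    csum = internet_checksum(pseudo_header(src_ip, dst_ip, len(header)) + header)
    return header[:16] + struct.pack("!H", csum) + header[18:]
```

A segment that a middlebox rewrote without fixing up the checksum would
make tcp_checksum_ok return False. Note that captures taken on the sending
host can show bad checksums when checksum offload is in use, so the check
is most meaningful on a capture from the receiving side.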

Comment 3 David Miller 2004-07-06 02:57:15 UTC
Another thing of note in the traces is that as 'machine' is
still sending ACKs back, aaa.bbb.ccc.ddd is not sending a
RST packet back in response when it very well should.

Something is definitely amiss on the path from aaa.bbb.ccc.ddd
to the RHEL3 box.
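
Besides the checksum, the sequence number is one other thing that can make
a reset unacceptable to the receiver. The sketch below shows the RFC 793
window test for a zero-length segment such as a bare RST. It is an
illustration under that assumption, not the RHEL3 kernel's actual check,
and the function name is hypothetical.

```python
SEQ_SPACE = 1 << 32  # TCP sequence numbers wrap modulo 2**32


def rst_seq_acceptable(seg_seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """RFC 793 acceptability test for a zero-length segment (e.g. a bare RST)."""
    if rcv_wnd == 0:
        # With a closed window only the exact next expected number is valid.
        return seg_seq == rcv_nxt
    # RCV.NXT <= SEG.SEQ < RCV.NXT + RCV.WND, computed modulo 2**32.
    return (seg_seq - rcv_nxt) % SEQ_SPACE < rcv_wnd
```

Comparing the sequence number of the RST in the capture against the last
ACK number sent by the receiving host and its advertised window would show
whether the reset falls inside the window.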

Comment 4 Pasi Pirhonen 2004-07-06 14:25:43 UTC
OK. Customer and BEA asked me to make this bugzilla entry.

It must be some other equipment then (the firewall or the load
balancer). I have been kind of stupid. I have watched tools like nmap
send ACKs to a host as a probe and seen the RST come back. It's so obvious
when I read David's comment above.

I have to bug other people with this. Lowering the value for
will help those time out sooner, but there is still a risk that
WebLogic runs out of threads.

I am changing this entry to NOTABUG myself.
