Bug 714670 - TCP_CRR and concurrent TCP stream tests over IPv6 sometimes fail on RHEL 5.7
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.7
Hardware: x86_64 Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Jiri Benc
QA Contact: Adam Okuliar
Depends On:
Blocks: 742099 784372
Reported: 2011-06-20 08:04 EDT by Adam Okuliar
Modified: 2012-02-20 22:39 EST (History)
Fixed In Version: kernel-2.6.18-305.el5
Doc Type: Bug Fix
: 742099
Last Closed: 2012-02-20 22:39:58 EST

External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2012:0150
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Enterprise Linux 5.8 kernel update
Last Updated: 2012-02-21 02:35:24 EST

Comment 1 Jiri Benc 2011-10-10 13:02:12 EDT
Cannot reproduce using KVM in ~30 runs. Could you provide more details about the setup (or access to the machines showing the problem, if you still have them)?
Comment 2 Adam Okuliar 2011-10-11 09:53:39 EDT
Hi Jiri,

I prepared two affected systems for you. You can use

redclient-01.rhts.bos.redhat.com
redclient-02.rhts.bos.redhat.com

Please run:
sysctl net.ipv4.tcp_tw_reuse=1
to enable reuse of TIME_WAIT connections.

The test results on the redclient machines are as follows:
for i in `seq 1 30`; do netperf  -H fd20::2 -L fd20::1 -t TCP_CRR -l 60 -P0; done 
16384  87380  1        1       60.00    4365.31   
16384  87380 
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
16384  87380  1        1       60.00    4354.53   
16384  87380 
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
16384  87380  1        1       60.00    4353.37   
16384  87380 
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
16384  87380  1        1       60.00    4352.90   
16384  87380 
send_tcp_conn_rr: data recv error: Connection reset by peer
16384  87380  1        1       60.00    4357.50   
16384  87380 
16384  87380  1        1       60.00    4358.20   
16384  87380 
16384  87380  1        1       60.00    4363.02   
16384  87380 
16384  87380  1        1       60.00    4360.61   
16384  87380 
16384  87380  1        1       60.00    4356.58   
16384  87380 
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer
send_tcp_conn_rr: data recv error: Connection reset by peer

The redclient machines are reserved for 7 days, but please try to investigate this problem ASAP, because the Shacks team uses these machines for NFS testing.
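
To make it easier to compare runs like the one above, the loop output can be tallied rather than read by eye. The following is only a sketch: the function name summarize_crr is hypothetical, and it assumes the netperf -P0 layout shown in this comment, where a completed run prints a six-column result line whose last column is the transaction rate, and each failure prints a "Connection reset by peer" message.

```shell
#!/bin/sh
# summarize_crr (hypothetical helper): reads saved netperf TCP_CRR loop
# output on stdin and prints how many runs completed, how many
# "Connection reset by peer" errors were logged, and the mean
# transaction rate of the completed runs.
summarize_crr() {
    awk '
        # Error message printed by netperf when a connection is reset.
        /Connection reset by peer/ { resets++ }
        # Assumption: a completed run is a six-column line whose last
        # column is the transaction rate (as in the output above).
        NF == 6 && $6 + 0 > 0 { runs++; sum += $6 }
        END {
            printf "runs=%d reset_errors=%d mean_rate=%.2f\n",
                   runs, resets, (runs ? sum / runs : 0)
        }
    '
}

# Usage (crr.log is a hypothetical file holding the loop output):
#   summarize_crr < crr.log
```

The tally only counts messages, so it does not say which iteration of the loop a given reset belongs to.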
Comment 3 Adam Okuliar 2011-10-11 12:02:48 EDT
I believe that sysctl net.ipv4.tcp_tw_reuse=1 does not work for IPv6 connections.

for i in `seq 1 10`; do netperf  -H 192.168.1.2 -t TCP_CRR -l 60 -P0>/dev/null ; netstat -nta | grep TIME_WAIT | wc -l;done
19
17
25
17
18
18
30
15
19
21

During the IPv4 CRR test, the number of connections in the TIME_WAIT state stays small the whole time.

for i in `seq 1 10`; do netperf  -H fd20::2 -Lfd20::1 -t TCP_CRR -l 60 -P0>/dev/null ; netstat -nta | grep TIME_WAIT | wc -l;done
send_tcp_conn_rr: data recv error: Connection reset by peer
109
send_tcp_conn_rr: data recv error: Connection reset by peer
146
send_tcp_conn_rr: data recv error: Connection reset by peer
11553
send_tcp_conn_rr: data recv error: Connection reset by peer

During the IPv6 CRR test, the number of connections in the TIME_WAIT state rises rapidly until it exhausts the whole port range available for assigning TCP source ports.
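
The netstat pipeline above counts IPv4 and IPv6 sockets together. A possible way to watch the IPv6 side on its own, and to put the count next to the size of the local port range, is sketched below. The helper names count_time_wait and port_range_size are hypothetical. The sketch assumes the usual /proc/net/tcp and /proc/net/tcp6 layout, in which the fourth column ("st") holds the socket state in hex and 06 means TIME_WAIT, and it assumes /proc/sys/net/ipv4/ip_local_port_range holds the low and high port on one line.

```shell
#!/bin/sh
# count_time_wait (hypothetical helper): counts TIME_WAIT sockets in a
# /proc/net/tcp-style table read from stdin. The first line is the
# column header; column 4 is the state in hex, 06 = TIME_WAIT.
count_time_wait() {
    awk 'NR > 1 && $4 == "06" { n++ } END { print n + 0 }'
}

# port_range_size (hypothetical helper): reads "low high" on stdin, as
# found in ip_local_port_range, and prints how many ports that covers.
port_range_size() {
    awk '{ print $2 - $1 + 1 }'
}

# Usage on a live system:
#   count_time_wait < /proc/net/tcp     # IPv4 only
#   count_time_wait < /proc/net/tcp6    # IPv6 only
#   port_range_size < /proc/sys/net/ipv4/ip_local_port_range
```

If the IPv6 count approaches the port range size while the IPv4 count stays low, that would match the behaviour described in this comment.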
Comment 4 Jiri Benc 2011-10-17 15:58:31 EDT
> redclient-01.rhts.bos.redhat.com
> redclient-02.rhts.bos.redhat.com

The machines have been down today. I still haven't found the culprit; if I can get access to the machines again for a few hours, I'll gather as much data as possible and will continue the analysis off-line.

The TIME_WAIT reuse is not the source of the problem, as the sockets are opened with SO_REUSEADDR (but you're correct that tcp_tw_reuse is not supported on IPv6). Btw, the problem is highly timing-sensitive and some attempts to debug it make it irreproducible.
Comment 5 Jiri Benc 2012-01-03 14:58:51 EST
Successfully tested the fix from bug 742099 comment 10 on RHEL5.8 kernel.
Comment 7 RHEL Product and Program Management 2012-01-03 15:09:44 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 9 Jarod Wilson 2012-01-18 12:02:15 EST
Patch(es) available in kernel-2.6.18-305.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5/
Detailed testing feedback is always welcomed.
If you require guidance regarding testing, please ask the bug assignee.
Comment 11 Adam Okuliar 2012-01-20 09:14:27 EST
Reproduced on
Linux redclient-01.rhts.bos.redhat.com 2.6.18-268.el5 

Verified on 
Linux redclient-01.rhts.bos.redhat.com 2.6.18-305.el5
Comment 12 errata-xmlrpc 2012-02-20 22:39:58 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0150.html
