Bug 147807 - rhr2 network tests fail with "connection reset by peer" while doing ab (ApacheBench) through ssh
Status: CLOSED ERRATA
Product: Red Hat Ready Certification Tests
Classification: Retired
Component: rhr2
Version: 1.0
Hardware: ia64 Linux
Priority: medium Severity: medium
Assigned To: Rob Landry
Rob Landry
Depends On:
Blocks: 143442
Reported: 2005-02-11 11:00 EST by Syl DES
Modified: 2007-04-18 13:19 EDT (History)

Doc Type: Bug Fix
Last Closed: 2005-05-11 12:39:35 EDT
Description Syl DES 2005-02-11 11:00:04 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3)
Gecko/20041105 Firefox/1.0RC1 (Debian package 0.99+1.0RC1-4)

Description of problem:
Although the network tests worked perfectly with the previous version of
rhr2 (rhr2-rhel4-1.0-10) on our platform, the modification applied to the
ApacheBench (ab) command line leads to this error after about 40
minutes:
"Read from remote host xxxx: Connection reset by peer"

Systems are RHEL4 RC1.

The iptables service is stopped on both machines, and they are connected
directly through a crossover cable (both 100Mb cards), so there is no NAT
and no firewall.

Here is my attempt to explain the problem:
The command line in the previous version of tests was
ssh -l root -x xxxx 'ab -c 218 -k -n 256 xxxx/httptest.file'

and the new command line is:

ssh -l root -x xxxx 'ab -c 30 -k -n 2000 xxxx/httptest.file'

The number of requests is larger (-n option), so the ssh connection
has to stay open longer while the test completes.

But it seems that a timeout occurs after 30-40 minutes (lack of
keyboard/display interaction), closing the ssh connection.
Normally, ApacheBench displays a text message for every 10% of
completed requests (a side effect that should normally
prevent ssh from timing out), but the network connection on the apache
side is so heavily loaded that these messages are not received.

I found a workaround for this problem: adding a "-v 4" option (very
verbose mode) to the ab command line lets the tests complete
successfully. This -v option sends a lot of messages through the ssh
link, so some of them are correctly received by ssh (in spite of the
network load) and displayed, preventing ssh from timing out.
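
As a minimal sketch of the workaround described above, the helper below
only prints the new command line with "-v 4" added; it does not run
anything. The function name and its two arguments are hypothetical (the
report elides the real host names as xxxx), and the position of "-v 4"
among the other options is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print the ssh/ab command line with the "-v 4"
# workaround added. Nothing is executed.
#   $1 = host that ssh logs in to and that runs ab (elided as xxxx in the report)
#   $2 = host serving httptest.file (also elided as xxxx in the report)
build_ab_command() {
    printf "ssh -l root -x %s 'ab -v 4 -c 30 -k -n 2000 %s/httptest.file'\n" "$1" "$2"
}

# Print the command using the placeholders from the report.
build_ab_command xxxx xxxx
```

The printed line can be compared against the command the test suite
actually issues before changing anything.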

I saw that for some other users the new ab command line with "-c 30
-k -n 2000" resolved their problems, but this is not the case for
us: it worked perfectly before (no timeout).
(See bug 145570, bug 146826, bug 139965)

Version-Release number of selected component (if applicable):
rhr2-rhel4-1.0-14a

How reproducible:
Always

Steps to Reproduce:
1. Install a machine with needed packages
2. Run redhat-ready tests manually
3. Choose NETWORK tests
4. ApacheBench fails to complete

Actual Results:  After about 40 minutes, ssh/ab fails with "Read from
remote host xxxx: Connection reset by peer"

Expected Results:  ApacheBench should complete normally

Additional info:
Comment 1 Richard Li 2005-02-11 11:29:47 EST
Thank you for the detailed bug report. We will investigate this issue and fix
it. We will accept certifications on -10 or -14a or with the -v4 patch.
Comment 2 Syl DES 2005-02-17 11:48:11 EST
I made a mistake in the previous bug report: the ethernet adapter on
the machine where ab is executed is a Gigabit adapter. But as a result
of auto-negotiation, the adapter speed is downgraded to 100Mb/s (ethtool
shows that). The file transmitted (httptest.file) is 12MB in size.
So I think the fact that the adapter is Gigabit is not the reason
the tests fail.
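
One way to confirm the negotiated speed mentioned above is to pull the
"Speed:" line out of the ethtool output. This is only a sketch: the
function name is hypothetical, and eth0 in the usage comment is an
assumed interface name that the report does not give.

```shell
#!/bin/sh
# Hypothetical helper: read ethtool output on stdin and print only the
# value of its "Speed:" line (for example 100Mb/s).
link_speed() {
    sed -n 's/^[[:space:]]*Speed:[[:space:]]*//p'
}

# Usage (eth0 is an assumed interface name):
#   ethtool eth0 | link_speed
```

Reading from stdin keeps the helper usable on saved ethtool output as
well as on a live interface.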

Maybe this could help: I tried swapping the roles of the two machines,
and I got the same "Connection reset by peer" error.

When I launch the ab command manually (without using an ssh link),
ApacheBench completes successfully, which suggests that ssh is
timing out somewhere...
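
For a rough sense of how much longer the new command line keeps the ssh
session open, the figures in this report (12MB file, 100Mb/s link, -n 256
before and -n 2000 now) give a lower bound on the transfer time. This is
back-of-the-envelope arithmetic only: it assumes 1 MB = 8 Mb, ignores
protocol overhead, and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: lower bound on transfer time in whole seconds.
#   $1 = number of requests (ab -n)
#   $2 = file size in MB
#   $3 = link speed in Mb/s
min_transfer_seconds() {
    echo $(( $1 * $2 * 8 / $3 ))
}

min_transfer_seconds 256 12 100     # old command line: prints 245
min_transfer_seconds 2000 12 100    # new command line: prints 1920
```

Under these assumptions the old run needs at least about 4 minutes on
the wire and the new one at least 32 minutes.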
Comment 3 Richard Li 2005-02-17 11:51:08 EST
Just to clarify: the -v 4 option still enables the test to succeed in either case?
Comment 4 Syl DES 2005-02-18 08:52:15 EST
Yes, the -v 4 option enables the test to succeed in either case.
Comment 5 Richard Li 2005-02-18 11:34:44 EST
Thanks. I've added -v 4 to CVS, and this fix will go out in the next errata
release.
Comment 6 Syl DES 2005-02-22 06:54:37 EST
Ok, thanks!
Comment 7 Richard Li 2005-05-11 12:39:36 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-419.html
