From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041105 Firefox/1.0RC1 (Debian package 0.99+1.0RC1-4)

Description of problem:
Although the network tests worked perfectly with the previous version of rhr2 (rhr2-rhel4-1.0-10) on our platform, the change made to the ApacheBench (ab) command line leads to this error after about 40 minutes:

"Read from remote host xxxx: Connection reset by peer"

Both systems run RHEL4 RC1. The iptables service is stopped on both machines, and they are connected directly with a crossover cable (both cards are 100Mb). So there is no NAT and no firewall.

Here is my attempt to explain the problem. The command line in the previous version of the tests was:

ssh -l root -x xxxx 'ab -c 218 -k -n 256 xxxx/httptest.file'

and the new command line is:

ssh -l root -x xxxx 'ab -c 30 -k -n 2000 xxxx/httptest.file'

The number of requests (-n option) is larger, so the ssh connection has to stay open longer while the test completes. But a timeout seems to occur after 30-40 minutes (no keyboard or display interaction), which ends the ssh connection. Normally, ApacheBench prints a progress message for every 10% of completed requests (a side effect that should keep ssh from timing out), but the network connection on the Apache side is so heavily loaded that these messages are not received.

I found a workaround: adding a "-v 4" option (very verbose mode) to the ab command line lets the tests complete successfully. The -v option sends many messages over the ssh link, so some of them are received by ssh (in spite of the network load) and displayed, which prevents ssh from timing out.

I saw that for some other users the new ab command line with "-c 30 -k -n 2000" resolved their problems, but this is not the case for us: it worked perfectly before (no timeout).
(See bug 145570, bug 146826, bug 139965)

Version-Release number of selected component (if applicable):
rhr2-rhel4-1.0-14a

How reproducible:
Always

Steps to Reproduce:
1. Install a machine with needed packages
2. Run redhat-ready tests manually
3. Choose NETWORK tests
4. ApacheBench fails to complete

Actual Results:
After about 40 minutes, ssh/ab fails with "Read from remote host xxxx: Connection reset by peer"

Expected Results:
ApacheBench should complete normally

Additional info:
Thank you for the detailed bug report. We will investigate this issue and fix it. We will accept certifications on -10 or -14a or with the -v4 patch.
I made a mistake in the previous bug report: the Ethernet adapter on the machine where ab is executed is a Gigabit adapter. But as a result of auto-negotiation, the adapter speed is downgraded to 100Mb/s (ethtool shows that). The transmitted file (httptest.file) is 12MB in size. So I don't think the Gigabit adapter is the reason the tests fail. Maybe this could help: I tried swapping the roles of the two machines, and I got the same "Connection reset by peer" error. When I launch the ab command manually (without an ssh link), ApacheBench completes successfully, which suggests that ssh is timing out somewhere...
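One way to make the comparison between the ssh-wrapped run and the direct run more concrete is to record the exit status and elapsed time of each. The sketch below is only an illustration: run_timed is a hypothetical helper, not part of rhr2, and it assumes a date command that supports +%s (as GNU date does). The host xxxx in the commented usage lines is the placeholder from the report.

```shell
#!/bin/sh
# Hypothetical helper (not part of rhr2): run a command, then print its
# exit status and the elapsed wall-clock time in seconds.
# Assumes `date +%s` is available (GNU and BSD date support it).
run_timed() {
    start=$(date +%s)
    # Capture the status without aborting if the caller uses `set -e`.
    if "$@"; then status=0; else status=$?; fi
    end=$(date +%s)
    echo "exit=$status elapsed=$((end - start))s"
}

# Example usage, left commented out because it needs the real test hosts:
# run_timed ssh -l root -x xxxx 'ab -c 30 -k -n 2000 xxxx/httptest.file'
# run_timed ab -c 30 -k -n 2000 xxxx/httptest.file
```

If the ssh-wrapped run consistently ends with a non-zero status at around the same elapsed time while the direct run succeeds, that would be consistent with the timeout explanation given above, though it would not prove it.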
Just to clarify: the -v 4 option still enables the test to succeed in either case?
Yes, the -v 4 option enables the test to succeed in either case.
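For reference, the workaround confirmed above amounts to adding "-v 4" to the ab command line quoted in the original report. The sketch below only prints that command line for a given host. build_ab_cmd is a hypothetical helper name, and the position of "-v 4" among the other ab options is an assumption, since the thread does not show the exact line that was committed.

```shell
#!/bin/sh
# Hypothetical helper: print the ssh/ab command line from the report with
# the "-v 4" workaround added. The placement of "-v 4" before the other
# options is an assumption; the report only says it was added to the ab
# command line.
build_ab_cmd() {
    host="$1"
    printf "ssh -l root -x %s 'ab -v 4 -c 30 -k -n 2000 %s/httptest.file'\n" "$host" "$host"
}

# "xxxx" is the placeholder host name used in the report.
build_ab_cmd xxxx
```

Printing the command instead of running it keeps the sketch harmless on machines that are not part of the test setup.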
Thanks. I've added -v 4 to CVS, and this fix will go out in the next errata release.
Ok, thanks!
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2005-419.html