Bug 74539 - Kernel 2.4.18-10 - network nearly unusable
Summary: Kernel 2.4.18-10 - network nearly unusable
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: i686
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2002-09-26 11:07 UTC by Markus Doehr
Modified: 2008-08-01 16:22 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-09-30 15:39:57 UTC
Embargoed:



Description Markus Doehr 2002-09-26 11:07:23 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2b) Gecko/20020828

Description of problem:
I installed kernel 2.4.18-10 on a Red Hat 7.3 system serving a large website
(Apache 1.3.23-14, php-4.1.2-7.3.4). The system cannot serve more than 5
requests per second. I tried tuning many kernel parameters, but it didn't help.

After going back to 2.4.18-3bigmem, it serves about 1000 requests per second.

I saw many select() timeouts when strace'ing the Apache root process.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Install 2.4.18-10 on an i686 system

Additional info:

Comment 1 Arjan van de Ven 2002-09-26 11:09:09 UTC
What kind of network card is this?

Comment 2 Markus Doehr 2002-09-26 11:12:16 UTC
[root@www ~]# lsmod
Module                  Size  Used by    Not tainted
ipchains               46184   0
autofs                 12740   0  (autoclean) (unused)
e100                   77524   1
usb-ohci               22688   0  (unused)
usbcore                77024   1  [usb-ohci]
ext3                   70720   6
jbd                    53504   6  [ext3]
aic7xxx               125728   7
sd_mod                 12896  14
scsi_mod              115120   2  [aic7xxx sd_mod]
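
For reference, the network driver can be picked out of an lsmod listing like the one above with a short script. This is only a minimal sketch: the function name `loaded_nic_drivers` and the candidate set `KNOWN_NIC_DRIVERS` are assumptions made for illustration (they contain just the two drivers named in this report) and are not part of any Red Hat tool.

```python
# Hypothetical helper: report which known NIC driver modules appear in lsmod output.
# Assumption: only the two drivers discussed in this bug report are of interest.
KNOWN_NIC_DRIVERS = {"e100", "eepro100"}

# The listing posted in this comment, copied verbatim.
LSMOD_OUTPUT = """\
Module                  Size  Used by    Not tainted
ipchains               46184   0
autofs                 12740   0  (autoclean) (unused)
e100                   77524   1
usb-ohci               22688   0  (unused)
usbcore                77024   1  [usb-ohci]
ext3                   70720   6
jbd                    53504   6  [ext3]
aic7xxx               125728   7
sd_mod                 12896  14
scsi_mod              115120   2  [aic7xxx sd_mod]
"""


def loaded_nic_drivers(lsmod_text, candidates=KNOWN_NIC_DRIVERS):
    """Return candidate driver names found in the first column of lsmod output."""
    found = []
    for line in lsmod_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields and fields[0] in candidates:
            found.append(fields[0])
    return found


if __name__ == "__main__":
    print(loaded_nic_drivers(LSMOD_OUTPUT))
```

Run against the listing above, the sketch reports e100 as the only loaded driver from the candidate set.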



Comment 3 Markus Doehr 2002-09-26 20:20:49 UTC
This now also happens with the 'old' kernel:

Here are the results of

# http_load -parallel 50 -rate 50 -seconds 60 urls

where it fetches the index.html page.

1977 fetches, 1038 max parallel, 1.07245e+07 bytes, in 60.0051 seconds
5424.66 mean bytes/connection
32.9472 fetches/sec, 178727 bytes/sec
msecs/connect: 6857.95 mean, 45008.7 max, 3.437 min
msecs/first-response: 5345.73 mean, 32055 max, 5.173 min
1111 bad byte counts
HTTP response codes:
  code 200 -- 866

Only 32 fetches per second. Is this normal?
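
The rates in the http_load summary above follow directly from its first line, and the response-code count shows how many fetches actually returned 200. A minimal sketch of that arithmetic is below. The helper names `parse_summary` and `derived_rates` are hypothetical, and the regular expression assumes the summary line has exactly the layout shown in this comment.

```python
import re

# First line of the http_load output posted above, copied verbatim.
HTTP_LOAD_SUMMARY = (
    "1977 fetches, 1038 max parallel, 1.07245e+07 bytes, in 60.0051 seconds"
)
OK_RESPONSES = 866  # from the "code 200 -- 866" line


def parse_summary(line):
    """Extract (fetches, max_parallel, total_bytes, seconds) from the summary line."""
    match = re.match(
        r"(\d+) fetches, (\d+) max parallel, (\S+) bytes, in (\S+) seconds", line
    )
    if match is None:
        raise ValueError("unexpected http_load summary format")
    fetches, max_parallel, total_bytes, seconds = match.groups()
    return int(fetches), int(max_parallel), float(total_bytes), float(seconds)


def derived_rates(fetches, total_bytes, seconds, ok_responses):
    """Recompute the per-second and per-connection figures from the raw totals."""
    return {
        "fetches_per_sec": fetches / seconds,
        "bytes_per_sec": total_bytes / seconds,
        "mean_bytes_per_connection": total_bytes / fetches,
        "ok_fraction": ok_responses / fetches,
    }


if __name__ == "__main__":
    fetches, _, total_bytes, seconds = parse_summary(HTTP_LOAD_SUMMARY)
    for name, value in derived_rates(fetches, total_bytes, seconds, OK_RESPONSES).items():
        print(f"{name}: {value:.4f}")
```

With these inputs, fewer than half of the 1977 fetches (866) came back with code 200, and the 866 successful responses plus the 1111 bad byte counts add up to the total number of fetches.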

[root@www ~]# uname -a
Linux www.edonkey2000.com 2.4.18-3bigmem #1 SMP Thu Apr 18 07:17:10 EDT 2002
i686 unknown

Is there anything I can do?

Comment 4 Markus Doehr 2002-09-26 20:22:54 UTC
I forgot to mention that the system http_load is running on is connected via a
100 Mbit switch and uses the eepro100 driver.

Comment 5 Markus Doehr 2002-09-26 20:23:50 UTC
[root@www ~]# strace -p 846
select(0, NULL, NULL, NULL, {0, 280000}) = 0 (Timeout)
time(NULL)                              = 1033070689
kill(2253, SIGALRM)                     = 0
kill(2320, SIGALRM)                     = 0
kill(2341, SIGALRM)                     = 0
kill(2367, SIGALRM)                     = 0
wait4(-1, 0xbffff6d8, WNOHANG, NULL)    = 0
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
time(NULL)                              = 1033070690
kill(2215, SIGALRM)                     = 0
kill(2219, SIGALRM)                     = 0
kill(2246, SIGALRM)                     = 0
wait4(-1, 0xbffff6d8, WNOHANG, NULL)    = 0
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
time(NULL)                              = 1033070691
kill(2131, SIGALRM)                     = 0
kill(2214, SIGALRM)                     = 0
kill(2366, SIGALRM)                     = 0
wait4(-1, 0xbffff6d8, WNOHANG, NULL)    = 0
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
time(NULL)                              = 1033070692
kill(2302, SIGALRM)                     = 0
kill(2310, SIGALRM)                     = 0
kill(2312, SIGALRM)                     = 0
kill(2374, SIGALRM)                     = 0
kill(2390, SIGALRM)                     = 0
wait4(-1, 0xbffff6d8, WNOHANG, NULL)    = 0
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
time(NULL)                              = 1033070693
kill(2129, SIGALRM)                     = 0
kill(2154, SIGALRM)                     = 0
kill(2193, SIGALRM)                     = 0
wait4(-1, 0xbffff6d8, WNOHANG, NULL)    = 0
select(0, NULL, NULL, NULL, {1, 0}
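
A note on reading the trace above: every select() call is made with zero descriptors and only a timeout, so it cannot return for any reason other than the timeout expiring or a signal arriving. In that form select() acts as a sleep, and the "(Timeout)" result on its own does not point to a network fault. A minimal Python sketch of the same idiom on a POSIX system is below. The helper name `sleep_via_select` is hypothetical, and the sketch only illustrates the system call pattern. It is not Apache's actual code.

```python
import select
import time


def sleep_via_select(seconds):
    """Block for `seconds` using select() with no descriptors (POSIX only).

    Returns the three (empty) ready lists and the measured elapsed time.
    """
    start = time.monotonic()
    # No descriptors to watch: the call can only end via timeout or a signal.
    ready = select.select([], [], [], seconds)
    elapsed = time.monotonic() - start
    return ready, elapsed


if __name__ == "__main__":
    # 0.28 s mirrors the {0, 280000} timeout in the first traced call.
    ready, elapsed = sleep_via_select(0.28)
    print(ready, elapsed)
```

The three returned lists are always empty here, which corresponds to the `= 0 (Timeout)` return value shown by strace.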


Comment 6 Arjan van de Ven 2002-09-26 20:40:46 UTC
any idea if this gets fixed when using the "e100" driver instead?

Comment 7 Markus Doehr 2002-09-26 20:50:23 UTC
Misunderstanding:
www is running the e100 driver
sda is running the eepro100 driver

www is SLOW
sda is normal (running Red Hat 7.2 - 2.4.7-10)

So is the www server suffering because it uses e100? Should we try switching
to eepro100?

Comment 8 Markus Doehr 2002-09-26 20:59:45 UTC
could this be related to https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=73185



Comment 9 Arjan van de Ven 2002-09-26 21:02:51 UTC
not really; the card in that bug is new and not quite on the market yet

Comment 10 Bugzilla owner 2004-09-30 15:39:57 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/


