Red Hat Bugzilla – Bug 102792
Kernel 2.4.20-20.7 dies under very heavy network traffic
Last modified: 2015-01-04 17:03:02 EST
Description of problem: Kernel 2.4.20-20.7 installs fine and under a normal load
may work just fine for most users. But under very heavy network traffic the
kernel hangs. Probably a problem with the driver (gigabit tg3 I believe - Dell 2650).
Your Bugzilla wouldn't let me select "Kernel" as the component....
Version-Release number of selected component (if applicable): 2.4.20-20.7
How reproducible: every time!
Steps to Reproduce:
1. run a robot fetching 2+ million URLs a day and you'll see the bug
This same thing happened with an older kernel for 7.3, but I can't remember
which one. Kernel 2.4.18-27.7.xsmp works just fine. Load up the new Kernel
2.4.20-20.7 and it dies every time under heavy load.
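For a sense of scale, the "2+ million URLs a day" in the reproduction step can be turned into an average request rate. This is only back-of-the-envelope arithmetic: it assumes the fetches are spread evenly over 24 hours, which the report does not say, and the function name is made up for illustration. Real crawler traffic is likely burstier, and each fetch involves many packets.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def average_fetch_rate(urls_per_day):
    """Average fetches per second, assuming an even spread over the day."""
    return urls_per_day / SECONDS_PER_DAY


# 2 million URLs a day is the lower bound given in the report.
print(round(average_fetch_rate(2_000_000), 1))  # prints 23.1
```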
I see this bug too - I have a Dell PE2650 running Redhat Linux 7.3.
The machine is an NFS server that has been running for at least 6
months on kernel 2.4.18-27.7.xsmp quite happily.
I tested the new 2.4.20-20.7smp kernel on my test machine for a week
with no problems (admittedly I did not hammer it with NFS traffic) and
presumed it OK. When I then up2dated my production server to this
kernel, the machine crashed within one hour. I then rebooted and it
crashed after about 4 hours. Since rebooting into the old
2.4.18-27.7.xsmp kernel, the machine has been fine again.
The symptoms of the crash were that the server just hung/locked up -
it would not respond to pings. There were no messages in the log files.
Is anything being done about this ???
I too have been experiencing crashes under heavy network/disk io load.
I am (was) using kernel 2.4.20-20.7smp on a dell 2450 with a 10/100
ethernet card adaptor. When I get a crash the box is completely dead
to the world and no log messages related to the crash are written to
syslog or the console. Is there a way to increase verbosity so
something will get logged? Anyway, to work around the crashing issue I
now boot off the kernel that was stable for me before the update
(2.4.18-10smp). I am not using any binary only modules. I am taint
free. If you need more info from me please let me know.
Redhat isn't going to do anything about it. After all they can't wait to drop support for
7.3 December 31st and try and push you to Enterprise or something else. I'm
moving away from Redhat and compiling my own from sources at kernel.org. I can't
wait to get away from all the rpm crap anyway.
tg3 is not the problem, it's *aacraid* driver.
If you think it's the *aacraid* driver then I guess you can download the most recent
There's quite a bit of discussion here and to be honest with you, I'm not sure what to
do or how to fix my current box. Currently I'm running kernel 2.4.20-24.7smp without
a crash, but cpu is eaten by kscand which is the #1 cpu usage on the box running
almost constantly at 5% on a dual XEON(TM) CPU 1.80GHz.
If anyone has any suggestions on calming this beast down without breaking the box,
let me know.
Xose, why do you say it's the aacraid driver? Are there settings
where I can have the kernel print something out when things crash?
Currently I am not getting anything written to the console or to
The kscand bug is a separate issue, and that bug is already open in Bugzilla.
Dave Jones is going to release a new kernel errata very soon: try the
*beta* release http://people.redhat.com/davej/rhl-errata/2.4.20-27.7/
A colleague of mine has some Dell 2650s. She updated to the latest BIOS,
backplane firmware, RAID firmware and RHL kernel (2.4.20-20.x) and the
systems are stable.
Other people on the Dell mailing list have problems with the aacraid
driver, but the tg3 driver is usually stable. If you have any doubt try
bcm-5700 instead of tg3 (danger!! unsupported by Red Hat):
But I am sure that the problem is aacraid.
To catch the bug, see /usr/src/linux-2.4/Documentation/nmi_watchdog.txt
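As described in that kernel document, the NMI watchdog is enabled with a boot parameter (nmi_watchdog=1 for the I/O APIC mode, nmi_watchdog=2 for the local APIC mode), and you can tell it is active when the NMI line in /proc/interrupts is non-zero and increasing. A rough sketch of that check follows. The function names are made up for illustration, and the parsing assumes the usual "NMI: <count per CPU>" layout, so treat it as a starting point rather than a tested tool for this kernel.

```python
import time


def parse_nmi_counts(interrupts_text):
    """Return the per-CPU NMI counters found in /proc/interrupts content."""
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if sep and label.strip() == "NMI":
            counts = []
            for token in rest.split():
                if not token.isdigit():
                    break  # newer kernels append a text description
                counts.append(int(token))
            return counts
    return []  # no NMI line found


def watchdog_is_ticking(before, after):
    """True if any CPU's NMI counter increased between two samples."""
    return any(b < a for b, a in zip(before, after))


def sample_nmi_counts(path="/proc/interrupts", interval=5):
    """Read the NMI counters twice, `interval` seconds apart."""
    with open(path) as f:
        first = parse_nmi_counts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_nmi_counts(f.read())
    return first, second
```

Calling `watchdog_is_ticking(*sample_nmi_counts())` after booting with the parameter should return True if the watchdog is running. With the watchdog active, a hard lockup should produce an oops on the console instead of a silent hang.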
I tried using the beta errata kernel from Dave Jones and my system
crashed again after a couple of days. Same symptoms (no syslog, no
text printed to the console). I am using rh7.3, a dell 2450, 10/100
ethernet cards, no hardware raid only software raid.
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases,
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/