Bug 83789 - ksoftirqd_CPU0 hogs system for 2.4.18-24.8.0, not 2.4.18-24
Summary: ksoftirqd_CPU0 hogs system for 2.4.18-24.8.0, not 2.4.18-24
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 8.0
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
Reported: 2003-02-08 18:09 UTC by Allyn Dimock
Modified: 2007-04-18 16:50 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-09-30 15:40:30 UTC

Description Allyn Dimock 2003-02-08 18:09:04 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003

Description of problem:
ksoftirqd_CPU0 uses > 90% of CPU time.  Triggered by going to a page with
lots of images in mozilla, using an 802.11b pcmcia wireless card with the
orinoco_cs driver.  Reproducible on kernel 2.4.18-24.8.0 on my hardware.  Not
reproducible at all (so far) on kernel 2.4.18-14.

Hardware is a Dell Inspiron 8200, using a Netgear 802.11b wireless card in the first
pcmcia slot with the orinoco_cs driver as device "eth1".
Software for 2.4.18-14 is the Red Hat 8.0 distribution (with no tweaks or new drivers,
but with the ntfs file system module added).  2.4.18-24.8.0 is the update
done on 7-Feb-03 (no ntfs file system module, if it matters...).


Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. With the hardware and software described above, use mozilla to open pages
with many images (gardening catalogs in this case).
2. After loading a number of such pages, ksoftirqd_CPU0 will start to take > 85%,
often > 90%, of CPU time.
3. System remains unusable until rebooted.
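
To put a number on the CPU share described in the steps above, one option is to sample the utime and stime counters in /proc/<pid>/stat twice and divide the difference by the elapsed clock ticks. The sketch below is not part of the original report: the function names are hypothetical, the field positions follow proc(5), and the default of 100 ticks per second is an assumption (check `getconf CLK_TCK` on the machine in question).

```python
import os
import sys
import time


def parse_cpu_ticks(stat_line):
    """Return (comm, utime + stime) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so locate the last ')' instead of splitting
    # the whole line on whitespace.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    comm = stat_line[open_paren + 1:close_paren]
    rest = stat_line[close_paren + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(rest[11]), int(rest[12])
    return comm, utime + stime


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s=100):
    """CPU use between two samples, as a percentage of one CPU."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def read_ticks(pid):
    """Read the current utime + stime total for one process."""
    with open("/proc/%d/stat" % pid) as f:
        return parse_cpu_ticks(f.read())[1]


if __name__ == "__main__":
    # Usage (hypothetical script name): python softirq_cpu.py <pid>
    # where <pid> is the PID of ksoftirqd_CPU0 as shown by `ps ax`.
    pid = int(sys.argv[1])
    interval = 2.0
    before = read_ticks(pid)
    time.sleep(interval)
    after = read_ticks(pid)
    ticks = os.sysconf("SC_CLK_TCK")
    print("%.1f%% of one CPU" % cpu_percent(before, after, interval, ticks))
```

A reading that stays near 90% across several samples would match the symptom reported here; a short spike that falls back would not.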
    

Additional info:

Comment 1 jeffrey.buchsbaum 2003-02-17 14:47:59 UTC
This affects my dual Xeon 2.4 GHz system as well. I have the latest kernel from
RH (as of 2/17/03, non-beta). My system is a Dell 530 with 4 GB of RAM and
two 120 GB IDE disks. I use a PS/2 keyboard, a PS/2 mouse, and a few USB
peripherals.  I also see memory use climbing as it sits still (is there a memory
leak in the kernel or the gterm app?).

It makes typing and mouse use impossible!
For me, no browser is necessary...just a telnet terminal (gterm) is all that
is needed. It occurs randomly. My machine is typically up non-stop.  About once
every month I need to reboot because of this bug. Sometimes it happens two days
after a reboot. Sometimes two weeks....

See bug 80279 too. This is likely the same thing.
It would be great if RH addressed these two bugs as one bug and fixed the
problem or at least came up with a "patch" (some solution for us).  

I moved to linux for stability. Even XP doesn't do this kind of stuff. 


Please help!

Jeff



Comment 2 Allyn Dimock 2003-02-17 21:00:41 UTC
Two new comments:
(1) I _can_ trigger bug 83789 under 2.4.18.24 after all, but it requires a much
    larger transfer over wireless:  a 5M .rpm file being downloaded from
    ftp.research.bell-labs.com managed to hang wireless 4 times in a row
    this morning in 2.4.18.24.

(2) See http://www.uwsg.iu.edu/hypermail/linux/net/0211.3/0020.html
    Suresh Singh Keisam claims that he had a similar problem with the
    linux-2.4.20-rc3 kernel, and that the problem was fixed by updating to the
    orinoco-0.13beta1 driver.  I do not have experience with reinstalling
    drivers (that is why I get pre-packaged linux), but please check it out
    and see if this fixes the problem. My wireless card uses the orinoco
    driver, so this seems likely to be the cause.

Comment 3 Allyn Dimock 2003-02-17 22:27:53 UTC
I just got the orinoco-0.13b driver from
http://ozlabs.org/people/dgibson/dldwd/orinoco-0.13b/,
installed as per their instructions (including using
the -DMODVERSIONS -include $(KERNEL_SRC)/include/linux/modversions.h
flags as recommended in README.orinoco).
Rebooted, and was able to download the file that I had consistently been
unable to download previously:
ftp://ftp.research.bell-labs.com/dist/smlnj/release/110.0.7/RPMS/smlnj-110.0.7-4.i386.rpm

This certainly does not constitute a thorough test of the fix,
but gives some indication that the orinoco driver packaged with the
Red Hat 8.0 system _may_ have been at fault.

Comment 4 jeffrey.buchsbaum 2003-02-18 13:49:36 UTC
The above is very interesting...but I am running on a land line 100BaseT dhcp
set-up. While orinoco drivers "might" be loaded by the kernel, they should not
be in use....am I missing something?  My box is a Dell Precision Workstation 530.

Thanks.

Comment 5 Ivo Gough Eschrich 2003-03-24 18:53:20 UTC
Had an identical problem on a recent Dell 8200 with a DLink 650 (i.e. Prism2.0) wireless
card, running stock RH8.0 with the 2.4.18-26.8.0 kernel. ksoftirqd would consistently
bog down the machine whenever an X connection using wireless was up, even with
minimal traffic.

Using the updated orinoco driver as described above _fixed_ the problem. I used
the most recent one available at the time, orinoco-0.13c.

Ivo

Comment 6 David Eriksson 2003-04-17 16:43:50 UTC
I have had similar problems in RedHat 9 with kernel-2.4.20-9. I am trying 
orinoco-0.13c from http://ozlabs.org/people/dgibson/dldwd/ now.

This is the first indication in the log that something was wrong:

Apr 17 18:16:24 zion kernel: eth1: orinoco_reset failed in orinoco_pci_open()<3>eth1: error -110 reading Rx descriptor. Frame dropped.
Apr 17 18:16:24 zion kernel: eth1: Error -110 writing Tx descriptor to BAP
Apr 17 18:16:25 zion last message repeated 13 times
Apr 17 18:16:25 zion kernel: hermes @ MEM 0xd0957000: Timeout waiting for command completion.
Apr 17 18:16:25 zion kernel: hermes @ MEM 0xd0957000: Error -16 issuing command.
Apr 17 18:16:25 zion kernel: eth1: Error -110 writing Tx descriptor to BAP
Apr 17 18:16:25 zion last message repeated 30 times
Apr 17 18:16:25 zion kernel: hermes @ MEM 0xd0957000: Error -16 issuing command.
Apr 17 18:16:25 zion kernel: hermes @ MEM 0xd0957000: Error -16 issuing command.
Apr 17 18:16:25 zion kernel: eth1: Error -110 writing Tx descriptor to BAP
Apr 17 18:16:26 zion last message repeated 1102 times

This was repeated until I shut down the interface and unloaded the hermes,
orinoco and orinoco_pci modules.
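
The scale of the flood in the excerpt above is easy to miss because syslog collapses duplicates into "last message repeated N times" lines. The sketch below is one way to total them per message. It is not part of the original comment: the function names are hypothetical, the sample is a subset of the lines quoted above, and the errno names are the usual Linux values (16 = EBUSY, 110 = ETIMEDOUT), on the assumption that the driver reports negated errno codes.

```python
import re
from collections import Counter

# Linux errno values for the two codes seen in the excerpt
# (assumed to be reported negated by the driver).
ERRNO_NAMES = {16: "EBUSY", 110: "ETIMEDOUT"}

REPEAT_RE = re.compile(r"last message repeated (\d+) times")
KERNEL_RE = re.compile(r"kernel: (.*)")
ERROR_RE = re.compile(r"[Ee]rror -(\d+)")


def summarize(lines):
    """Count kernel messages, expanding 'last message repeated N times'."""
    counts = Counter()
    last = None
    for line in lines:
        repeat = REPEAT_RE.search(line)
        if repeat and last is not None:
            # Credit the repeats to the message logged just before.
            counts[last] += int(repeat.group(1))
            continue
        kernel = KERNEL_RE.search(line)
        if kernel:
            last = kernel.group(1).strip()
            counts[last] += 1
    return counts


def errno_name(message):
    """Return the symbolic name for an 'Error -N' code, or None if unknown."""
    match = ERROR_RE.search(message)
    if match:
        return ERRNO_NAMES.get(int(match.group(1)))
    return None


# A subset of the log lines quoted above.
SAMPLE = [
    "Apr 17 18:16:24 zion kernel: eth1: Error -110 writing Tx descriptor to BAP",
    "Apr 17 18:16:25 zion last message repeated 13 times",
    "Apr 17 18:16:25 zion kernel: hermes @ MEM 0xd0957000: Error -16 issuing command.",
    "Apr 17 18:16:25 zion kernel: eth1: Error -110 writing Tx descriptor to BAP",
    "Apr 17 18:16:25 zion last message repeated 30 times",
]

if __name__ == "__main__":
    for message, count in summarize(SAMPLE).most_common():
        print(count, errno_name(message), message)
```

Read this way, the Tx descriptor writes are timing out while the hermes command interface reports busy, which is at least consistent with the card no longer responding after the failed reset.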

(When I previously used RedHat 7.3 and 8 on the same machine, I used the
linux_wlan driver [http://www.linux-wlan.com/linux-wlan/] and it never did
anything like this.)

Comment 7 Bugzilla owner 2004-09-30 15:40:30 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/


