Red Hat Bugzilla – Bug 46941
Many D state hangs with 2.4.3-12
Last modified: 2008-08-01 12:22:51 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.77 [en] (Win95; U)
Description of problem:
A lightly loaded P3 server accumulates, on average, one additional process hung in D state roughly every two hours. Hangs have been seen while
running lilo, piping through pgp, gunzipping wtmp.1.gz (which emacs runs to determine boot time), and installing kernel source. Each of these
has been seen more than once in a 24-hour period before reverting to 2.4.2-2. ps lax shows wchan of "end " or "wait_o"; ps lnax shows the numeric
equivalents to be 8242d5 and 133178. P3/600MHz, 128MB RAM, VIA Apollo chipset, all drives s/w RAID-1 on IDE. Ran fine on 2.2.12-2
before and after 2.4.3-12. Used rpm -Va and also switched back to 2.2.12-2, erased 2.4.3-12, then reinstalled - problem is not a bad copy of a
Steps to Reproduce:
1. See examples in description section
Once processes are in this state, reboot is nearly impossible. Even "reboot -n -f" only works about a third of the time.
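For anyone trying to catch these hangs as they accumulate, the D-state check described above can be scripted rather than read off ps by eye. The following is only a minimal sketch, assuming a Linux /proc filesystem that exposes /proc/<pid>/stat and /proc/<pid>/wchan. The wchan file is an assumption about newer kernels and may not exist on the 2.4 kernels discussed here, in which case the sketch prints "?". The function names parse_stat and d_state_processes are hypothetical, not part of any existing tool.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left])
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """Yield (pid, comm, wchan) for each process in uninterruptible sleep."""
    for entry in sorted(os.listdir(proc_root)):
        if not entry.isdigit():
            continue  # not a per-process directory
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat was unreadable
        if state != "D":
            continue
        try:
            # Assumption: this file exists only on kernels that provide it.
            with open(os.path.join(base, "wchan")) as f:
                wchan = f.read().strip() or "-"
        except OSError:
            wchan = "?"
        yield pid, comm, wchan


if __name__ == "__main__":
    for pid, comm, wchan in d_state_processes():
        print(f"{pid:>7} {comm:<16} {wchan}")
```

Run from cron every few minutes with the output appended to a log, this would show roughly when each new process got stuck and what it was waiting on.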
I have also seen this using kernel-enterprise-2.4.3-12. I can't duplicate it on demand; sometimes processes start blocking after a few days, sometimes after a few
weeks, and sometimes never (~50 days). To date I haven't found a pattern.
As in your case, ps lax shows wchan of end and wait_on_buffer, but I also have several "down" entries.
Hardware: P3/1GHz, 1GB RAM, ASUS CUV4x-e motherboard (VIA 694x chipset), all partitions SW RAID-1 on SCSI (aic7xxx driver).
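Since both reports compare wchan names against numeric addresses, it may help to show how such a name is typically derived: the address is matched to the nearest symbol at or below it in a System.map-style table. The sketch below illustrates that lookup only. The map excerpt and every address in it are made up for illustration and are not taken from either machine; load_system_map and resolve are hypothetical helper names.

```python
import bisect


def load_system_map(lines):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return the name of the nearest symbol at or below address, or None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    return symbols[i][1] if i >= 0 else None


# Hypothetical System.map excerpt: all addresses here are invented.
SAMPLE = """\
c0100000 T _text
c0140000 T __wait_on_buffer
c0140100 T sync_buffers
c0300000 A _end
""".splitlines()

if __name__ == "__main__":
    table = load_system_map(SAMPLE)
    for addr in (0xC0140050, 0xC0900000):
        print(f"{addr:x} -> {resolve(table, addr)}")
```

With a lookup like this, any address past the last symbol in the table resolves to that last symbol. Whether that accounts for the "end" entries seen here is an assumption, not something either report confirms.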
Have you tried the more recent 2.4.9-12 kernel rpm? I'll be upgrading this evening (hopefully).
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.
The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases,
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/