From Bugzilla Helper:
User-Agent: Mozilla/4.77 [en] (Win95; U)

Description of problem:
A lightly loaded P3 server accumulates, on average, one additional process hung in D state roughly every two hours. Hangs have occurred while running lilo, piping through pgp, during the gunzip of wtmp.1.gz which emacs runs to determine boot time, and while installing kernel source. Each of these was seen more than once in a 24-hour period before reverting to 2.4.2-2. ps lax shows a wchan of "end " or "wait_o"; ps lnax shows the numeric equivalents to be 8242d5 and 133178.

P3/600MHz, 128MB RAM, VIA Apollo chipset, all drives s/w RAID-1 on IDE. Ran fine on 2.2.12-2 before and after 2.4.3-12. Used rpm -Va and also switched back to 2.2.12-2, erased 2.4.3-12, then reinstalled, so the problem is not a bad copy of a file.

How reproducible:
Sometimes

Steps to Reproduce:
1. See examples in description section

Additional info:
Once processes are in this state, reboot is nearly impossible. Even "reboot -n -f" only works about a third of the time.
I have also seen this using kernel-enterprise-2.4.3-12. I can't duplicate it on demand; sometimes processes start blocking after a few days, sometimes after a few weeks, and sometimes never (~50 days). To date I haven't found a pattern. Like you, I see ps lax showing a wchan of end and wait_on_buffer, but I also have several in "down". Hardware: P3/1GHz, 1GB RAM, ASUS CUV4x-e motherboard (VIA 694x chipset), all partitions SW RAID-1 on SCSI (aic7xxx driver). Have you tried the more recent 2.4.9-12 kernel rpm? I'll be upgrading this evening (hopefully).
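For anyone trying to find a pattern, a rough sketch of how the D-state processes and their wait channels could be collected for logging is below. It is not something either reporter ran. It assumes a procps ps that accepts "-eo pid,stat,wchan:32,comm"; the function names find_uninterruptible and list_hung_processes are made up for illustration.

```python
import subprocess


def find_uninterruptible(ps_output):
    """Return (pid, wchan, command) for each process whose state starts with 'D'.

    Expects text in the column order pid, stat, wchan, comm, with one header row.
    """
    hung = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # blank or malformed line
        pid, stat, wchan, comm = fields
        if stat.startswith("D"):  # D = uninterruptible sleep
            hung.append((int(pid), wchan, comm.strip()))
    return hung


def list_hung_processes():
    """Run ps (assumed: procps with -eo support) and return the D-state processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,stat,wchan:32,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_uninterruptible(result.stdout)
```

Calling list_hung_processes() from cron every few minutes and appending the result to a file with a timestamp would give a record of which commands hang and in which wait channel.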
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/