Bug 61120 - System freeze about every 4-6 weeks
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.2
Hardware: i686 Linux
Priority: medium Severity: medium
Assigned To: Arjan van de Ven
QA Contact: Brian Brock
Reported: 2002-03-13 16:49 EST by Michael St. Laurent
Modified: 2008-08-01 12:22 EDT (History)
Last Closed: 2004-09-30 11:39:26 EDT
Attachments:
Output from the dmesg command on the problem system (11.18 KB, text/plain)
2002-03-13 16:50 EST, Michael St. Laurent

Description Michael St. Laurent 2002-03-13 16:49:24 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)

Description of problem:
About every 4-6 weeks a NAS server system I built on Red Hat 7.2 locks up so 
badly that it will not respond to ctrl-alt-del and must be rebooted using the 
reset button or cycling the power.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1.  Install Red Hat 7.2 with three IDE hard drives in a software RAID-5 array
2.  Update with errata packages through the 2.4.9-21 kernel
3.  Create network shares using Samba and NFS
4.  Wait (the freeze occurs after about 4-6 weeks)

Additional info:

The system uses an Intel D850GBAL motherboard and three Maxtor 5T060H6 
hard drives in a software RAID-5 array.  I think this is an IDE driver problem 
because of the following entries logged in /var/log/messages around the time 
of the system freeze.

Mar 11 20:23:35 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 20:23:55 hart-nas kernel: hdg: lost interrupt
Mar 11 20:23:55 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 20:23:55 hart-nas kernel: hda: lost interrupt
Mar 11 20:23:55 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 20:23:55 hart-nas kernel: hdg: lost interrupt
Mar 11 20:23:55 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 20:23:55 hart-nas kernel: hde: lost interrupt
Mar 11 20:23:55 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 20:23:55 hart-nas kernel: hdg: lost interrupt
Mar 11 20:23:55 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 20:23:55 hart-nas kernel: hde: lost interrupt
Mar 11 20:23:55 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 20:28:55 hart-nas kernel: hde: lost interrupt
Mar 11 20:31:05 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 20:32:35 hart-nas kernel: hda: lost interrupt
Mar 11 20:33:25 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 20:36:35 hart-nas kernel: hdg: lost interrupt
Mar 11 20:38:45 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 20:40:55 hart-nas kernel: hde: lost interrupt
Mar 11 20:45:40 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 20:50:10 hart-nas kernel: hdg: lost interrupt
Mar 11 20:57:10 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 20:59:20 hart-nas kernel: hde: lost interrupt
Mar 11 21:00:30 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 21:03:40 hart-nas kernel: hda: lost interrupt
Mar 11 21:07:50 hart-nas kernel: ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
Mar 11 21:10:20 hart-nas kernel: hdg: lost interrupt

After this there are no further entries in the messages file until the first 
message from the reboot.
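
To see at a glance which drives are affected, the "lost interrupt" lines can be 
tallied per drive.  The following is only a rough sketch, not something that 
was run on the problem system: the function name and the SAMPLE list are 
made up for illustration, and the regular expression assumes the syslog line 
format shown above.

```python
import re
from collections import Counter

# Assumed line format (taken from the excerpt above):
#   Mar 11 20:23:55 hart-nas kernel: hdg: lost interrupt
LOST_IRQ = re.compile(r"kernel: (hd[a-z]): lost interrupt")


def count_lost_interrupts(lines):
    """Return a Counter mapping drive name -> number of 'lost interrupt' lines."""
    counts = Counter()
    for line in lines:
        match = LOST_IRQ.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample: four lines copied from the log excerpt.
SAMPLE = [
    "Mar 11 20:23:55 hart-nas kernel: hdg: lost interrupt",
    "Mar 11 20:23:55 hart-nas kernel: hda: lost interrupt",
    "Mar 11 20:23:55 hart-nas kernel: hde: lost interrupt",
    "Mar 11 20:23:55 hart-nas kernel: ide_dmaproc: chipset supported "
    "ide_dma_lostirq func only: 13",
]

if __name__ == "__main__":
    for drive, n in sorted(count_lost_interrupts(SAMPLE).items()):
        print(drive, n)
```

In practice the function would be fed the lines of /var/log/messages rather 
than SAMPLE.  In the excerpt above all three drives (hda, hde and hdg) report 
lost interrupts, so the problem does not appear to be limited to one disk.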

I've attached the output from the dmesg command to help with hardware details.  
Let me know what else you need.
Comment 1 Michael St. Laurent 2002-03-13 16:50:25 EST
Created attachment 48429 [details]
Output from the dmesg command on the problem system
Comment 2 Arjan van de Ven 2002-03-13 17:00:16 EST
Hmm, this does look like an IDE thing. It's a tossup between hardware and
software, but given that stuff works for 4 weeks..... 
Is speed very relevant for your workload? If not, it might be worth a try to use
udma33 instead of udma66/100 to see if that makes it go away.
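
One possible way to try this, sketched here and not tested on this hardware: 
hdparm's -X option takes 64 plus the UltraDMA mode number, and udma33 is 
UltraDMA mode 2 (udma66 is mode 4, udma100 is mode 5), so the value would be 
-X66.  The helper names below are made up for illustration, and the script 
only prints the commands instead of running them.  Note that the hdparm man 
page warns that -X should be used with care, and the setting does not survive 
a reboot unless it is reapplied from a startup script.

```python
def udma_xfer_mode(udma_mode):
    """Return the hdparm -X value for UltraDMA mode n, which is 64 + n."""
    if not 0 <= udma_mode <= 6:
        raise ValueError("UltraDMA mode must be between 0 and 6")
    return 64 + udma_mode


def hdparm_command(device, udma_mode):
    """Build (but do not run) an hdparm command that enables DMA (-d1)
    and sets the transfer mode to the given UltraDMA mode (-X)."""
    return ["hdparm", "-d1", "-X%d" % udma_xfer_mode(udma_mode), device]


if __name__ == "__main__":
    # Drive names taken from the log excerpt in the description.
    for dev in ("/dev/hda", "/dev/hde", "/dev/hdg"):
        # UltraDMA mode 2 is udma33
        print(" ".join(hdparm_command(dev, 2)))
```

The printed commands would need to be run as root on the affected system.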
Comment 3 Michael St. Laurent 2002-03-15 11:16:40 EST
Speed is an issue as this is a NAS fileserver system.  I suppose we could live 
with it for a while though.  ;-)  BTW, are there any bugfixes between the -21 
and -31 kernels that might have an effect on this problem?
Comment 4 Michael St. Laurent 2002-05-22 18:15:28 EDT
The hdparms for the system in question are:

/dev/hda:
 multcount    = 16 (on)
 I/O support  =  0 (default 16-bit)
 unmaskirq    =  0 (off)
 using_dma    =  1 (on)
 keepsettings =  0 (off)
 nowerr       =  0 (off)
 readonly     =  0 (off)
 readahead    =  8 (on)
 geometry     = 7476/255/63, sectors = 120103200, start = 0
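
As a sanity check on the geometry line, the sector count can be converted to a 
capacity.  This is a small sketch that assumes 512-byte sectors (hdparm does 
not print the sector size here); the variable and function names are 
illustrative only.

```python
SECTOR_BYTES = 512  # assumption: standard 512-byte sectors


def capacity_bytes(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a sector count to a capacity in bytes."""
    return sectors * sector_bytes


# Values copied from the hdparm output above.
cylinders, heads, sectors_per_track = 7476, 255, 63
total_sectors = 120103200

chs_sectors = cylinders * heads * sectors_per_track
size = capacity_bytes(total_sectors)

if __name__ == "__main__":
    print("CHS-addressable sectors:", chs_sectors)
    print("reported sectors:       ", total_sectors)
    print("capacity in GB (10^9 bytes): %.1f" % (size / 1e9))
```

With those assumptions the drive comes out at roughly 61.5 GB, which is in 
line with a nominal 60 GB disk, so the kernel appears to see the full drive.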

Comment 5 Bugzilla owner 2004-09-30 11:39:26 EDT
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/
