Bug 72806

Summary: (IDE)Crash caused by hdd
Product: [Retired] Red Hat Linux
Reporter: Steve Auerbach <steven.p.auerbach>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED CURRENTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 7.3
CC: lare, s.j.katzberg, steven.p.auerbach
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2004-09-30 15:39:52 UTC

Description Steve Auerbach 2002-08-27 22:43:15 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.0.0-10; Linux)

Description of problem:
System crashes, apparently caused by the CD/RW. I find the following in /var/log/messages:

legolas kernel: hdd: lost interrupt
legolas kernel: ide-scsi: CoD != 0 in idescsi_pc_int
legolas kernel: hdd: ATAPI reset complete
legolas kernel: hdd: status error: status=0x00 { }
legolas kernel: ide-scsi: Strange, packet command initiated yet DRQ isn't asserted.

This is repeated thousands of times, then the machine crashes. Only way to recover is to cycle the power. Crash happens at random times, no pattern apparent to me...
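
To gauge how often each of these messages recurs before a crash, the log can be tallied per message type. The following is a minimal Python sketch and not part of the original report: the message fragments are copied from the excerpt above, while the labels and the function names `count_signatures` and `count_in_file` are hypothetical names chosen for illustration. It assumes a plain-text syslog file such as /var/log/messages.

```python
import re
from collections import Counter

# Message fragments copied from the log excerpt in this report.
# The dictionary keys are hypothetical labels used only by this sketch.
SIGNATURES = {
    "lost_interrupt": re.compile(r"kernel: hd[a-z]: lost interrupt"),
    "cod_nonzero": re.compile(r"ide-scsi: CoD != 0 in idescsi_pc_int"),
    "atapi_reset": re.compile(r"kernel: hd[a-z]: ATAPI reset complete"),
    "status_error": re.compile(r"kernel: hd[a-z]: status error"),
    "drq_not_asserted": re.compile(r"ide-scsi: Strange, packet command initiated"),
}


def count_signatures(lines):
    """Return a Counter of how often each known message appears in lines."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
                break  # a log line is counted under the first signature it matches
    return counts


def count_in_file(path="/var/log/messages"):
    """Tally the signatures in a syslog file (reading it usually needs root)."""
    with open(path, errors="replace") as log:
        return count_signatures(log)
```

Calling `count_in_file()` and printing the result of `.most_common()` would show which message dominates; a steady climb in these counts between crashes would support the suspicion that the CD/RW on hdd is involved.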

How reproducible:
Couldn't Reproduce

Additional info:

The Red Hat support folks suggested passing the parameter ide1=nodma to the kernel, but this does not solve the problem. They also suggested changing the IDE cable, but this had no effect.

Comment 1 Need Real Name 2002-12-08 18:03:56 UTC
I may have had a similar problem with my just-purchased Pentium IV with an Intel
board and 256 MB.  I had been running g77 programs on a 2.4 gig machine with
Red Hat 7.2 installed with no problems; the Fortran code also runs fine on UNIX
platforms and under Cygwin Windows Fortran.  Problem: the program compiles fine
and starts running.  It does a lot of disk searching, repetitively in the same
file under the influence of variable data (matched filter).  After a short while
(short relative to the size of the data file and to the successful runs on the
other platforms) the hard drive goes unstable.  The program crashes and just
shuts down the terminal window.  Processed data is fine up to the point of the
crash.  Does this sound similar?

Comment 2 Steve Auerbach 2002-12-09 20:32:30 UTC
I can't tell if the problems are related. I believe that the real diagnostic 
is the message "ide-scsi: Strange, packet command initiated yet DRQ isn't 
asserted.". Note that, for my machine, hdd is a CD/RW; it looks like hdd is a 
hard drive in your machine, in which case the issues are probably unrelated.

Comment 3 Andrew Lare 2002-12-19 23:04:47 UTC
I am having the exact same problem that the original poster is having.  Mine is
also an IDE/ATAPI CD-RW.  This only started after upgrading to 8.0 (2.4.18-18.8)
from 7.2 (2.4.9-31).  It seems to start between 0400 and 0600 local time on
random days.  I don't even have to be using the CD-RW for it to happen. 
Sometimes I can go up to a week or more without an incident, while other times
it will occur a couple times in a single day.  Any updates on this?

Comment 4 Andrew Lare 2003-01-13 15:27:21 UTC
After trying several suggestions, the only successful workaround I was able
to come up with is to edit the grub.conf file to include two boot entries: (1) with
CDROM support (hdc=ide-cd) and (2) with CD-RW support (hdc=ide-scsi).
Fortunately, I do not need to burn CDs often.  When I do, I have to
reboot to get CD-RW support and then boot back to CDROM-only support.  Since
this problem did not exist in the 2.4.9 kernel series, it is only logical that
the bug lies somewhere in the 2.4.18 series.
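
With two boot entries that differ only in the hdc= option, it helps to confirm which one the running system was actually booted with. The following is a minimal Python sketch, not something from the commenter: it assumes the kernel command line is readable from /proc/cmdline, and the function names `ide_boot_options` and `read_ide_boot_options` are hypothetical. It picks out drive options such as hdc=ide-scsi and interface options such as the ide1=nodma parameter mentioned in the description.

```python
def ide_boot_options(cmdline):
    """Extract IDE-related options such as 'hdc=ide-scsi' or 'ide1=nodma'.

    Returns a dict mapping the drive or interface name to its value.
    """
    options = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        # hdX names a drive (hda, hdb, ...); ideN names an interface (ide0, ide1, ...).
        is_drive = len(name) == 3 and name.startswith("hd") and name[2].isalpha()
        is_iface = len(name) == 4 and name.startswith("ide") and name[3].isdigit()
        if sep and (is_drive or is_iface):
            options[name] = value
    return options


def read_ide_boot_options(path="/proc/cmdline"):
    """Read the running kernel's command line and return its IDE options."""
    with open(path) as f:
        return ide_boot_options(f.read())
```

A result of `{"hdc": "ide-scsi"}` would indicate the CD-RW entry is active, while `{"hdc": "ide-cd"}` would indicate the CDROM-only entry.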

Comment 5 Steve Auerbach 2003-01-14 00:22:36 UTC
Andrew Lare states that "Since this problem did not exist in the 2.4.9 kernel
series, it is only logical that the bug lies somewhere in the 2.4.18 series."
My notes from 3/15/02, around the time I first began to have this problem (on my
first Linux box, not my current Linux box) show that I was using kernel
2.4.9-31 (result of uname -r). So, I believe the problem has been around for
quite some time, and was not just introduced in the 2.4.18 kernel series.

Comment 6 Andrew Lare 2003-06-09 19:09:53 UTC
I might have found a solution to the problem... at least it works for me:

cp /etc/sysconfig/harddisks /etc/sysconfig/harddiskhdc

In /etc/sysconfig/harddiskhdc, the following should appear:

USE_DMA=0
EIDE_32BIT=1
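
To check that the copied file really carries these two values uncommented, the settings can be parsed and compared. The following is a minimal Python sketch under stated assumptions: the file is taken to consist of shell-style KEY=VALUE lines with '#' comments, trailing inline comments are not handled, and the names `parse_sysconfig`, `EXPECTED` and `mismatched_settings` are hypothetical. Only the two values come from the comment above.

```python
def parse_sysconfig(text):
    """Parse shell-style KEY=VALUE lines, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


# Values reported as working in the comment above.
EXPECTED = {"USE_DMA": "0", "EIDE_32BIT": "1"}


def mismatched_settings(text, expected=EXPECTED):
    """Return {key: (expected, found)} for each setting that differs.

    found is None when the key is absent or commented out.
    """
    found = parse_sysconfig(text)
    return {k: (v, found.get(k)) for k, v in expected.items() if found.get(k) != v}
```

Passing the contents of /etc/sysconfig/harddiskhdc to `mismatched_settings` would return an empty dict when both values are set as described.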

Comment 7 Bugzilla owner 2004-09-30 15:39:52 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/