Bug 41070

Summary: (IDE PIIX) IDE disk errors with 2.4 kernels
Product: [Retired] Red Hat Linux
Reporter: vladimir
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED CURRENTRELEASE
QA Contact: Brock Organ <borgan>
Severity: high
Priority: medium
Version: 7.1
CC: rkaa
Hardware: i686   
OS: Linux   
Last Closed: 2004-09-30 15:39:00 UTC

Description vladimir 2001-05-17 06:55:30 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.76 [en] (X11; U; Linux 2.4.3-2.14.10 i686)

Description of problem:
I experience severe filesystem corruption on the hda hard drive. It
is related to disk errors in the system log, such as "hda: status error:
status=0x58 { DriveReady SeekComplete DataRequest }" and "hda: drive not ready
for command".

How reproducible:
Always

Steps to Reproduce:
1. Boot Linux
2. Wait a few hours, doing usual work (e.g., running Gnome, lots of
applications)

Actual Results: After some time I see these messages in the system
log:
May 16 23:00:01 zeos kernel: hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
May 16 23:00:01 zeos kernel: hda: drive not ready for command
Then I usually run 'rpm -Va' and it produces a lot of errors. The system
needs to be rebooted and the drives checked. e2fsck always returns a lot of
severe filesystem errors.
Also, I see these errors for hdc:
May 16 17:35:27 zeos kernel: hdc: drive_cmd: error=0x02 { TrackZeroNotFound }
May 16 17:35:27 zeos kernel: hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
May 16 17:35:27 zeos kernel: hdc: dma_intr: error=0x02 { TrackZeroNotFound }
May 16 17:35:27 zeos kernel: hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
May 16 17:35:27 zeos kernel: hdc: dma_intr: error=0x02 { TrackZeroNotFound }
May 16 17:35:27 zeos kernel: hdc: DMA disabled
May 16 17:35:27 zeos kernel: hdd: DMA disabled
May 16 17:35:27 zeos kernel: ide1: reset: success
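The hex values in these messages are bit masks over the ATA status and error registers, and the names in braces are the bits that are set. A minimal sketch of that decoding follows. Only the names that actually appear in the logs above (DriveReady, SeekComplete, DataRequest, Error, TrackZeroNotFound) are taken from this report; the remaining labels are my assumption, based on the standard ATA register layout, and the function name `decode` is hypothetical.

```python
# Bit masks for the ATA status register. Names seen in the logs above are
# DriveReady, SeekComplete, DataRequest and Error; the others are assumed.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Bit masks for the ATA error register. Only TrackZeroNotFound appears in
# the logs above; the other labels are assumed.
ERROR_BITS = [
    (0x80, "BadSector"),
    (0x40, "UncorrectableError"),
    (0x20, "MediaChanged"),
    (0x10, "SectorIdNotFound"),
    (0x08, "MediaChangeRequested"),
    (0x04, "DriveStatusError"),
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode(value, table):
    """Return the names of all bits in `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


print(decode(0x58, STATUS_BITS))
print(decode(0x51, STATUS_BITS))
print(decode(0x02, ERROR_BITS))
```

Read this way, status=0x58 on hda means the drive reports ready with a data request still pending and no error bit set, whereas status=0x51 on hdc has the error bit set and the error register names the cause.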


Additional info:

Motherboard and chipset: Abit BH6, Intel BX chipset

  Bus  0, device   7, function  1:
    IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 1).
      Master Capable.  Latency=32.  
      I/O at 0xf000 [0xf00f].

hda: IC35L040AVER07-0 (IBM 40GB)
hdc: CASTLEWOOD ORB2-E (2.2GB)

Both hdb and hdd are CD-ROM drives.

I tried both the standard RH7.1 2.4.2 kernel and the 2.4.3-2.14.10 from
rawhide. Note that these disks work fine with RH6.1 and kernel 2.2.18. I
replaced the ORB with my old Seagate drive with RH6.1 and ran heavy disk
read/write tasks (copied the entire /usr partition 25 times to 5 different
places). Not a single error (I turn DMA on in both cases).

Comment 1 Arjan van de Ven 2001-05-17 09:03:45 UTC
Can you give me the output of "lspci -v" and "cat /proc/ide/*/model"?
And were you by chance listening to music off one of the CD-ROM drives?

Comment 2 vladimir 2001-05-18 03:32:19 UTC
OK, lspci -v gives:

00:00.0 Host bridge: Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge (rev 02)
	Flags: bus master, medium devsel, latency 32
	Memory at d0000000 (32-bit, prefetchable) [size=64M]
	Capabilities: [a0] AGP version 1.0

00:01.0 PCI bridge: Intel Corporation 440BX/ZX - 82443BX/ZX AGP bridge (rev 02) (prog-if 00 [Normal decode])
	Flags: bus master, 66Mhz, medium devsel, latency 64
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
	Memory behind bridge: d4000000-d7ffffff
	Prefetchable memory behind bridge: d8000000-d8ffffff

00:07.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 02)
	Flags: bus master, medium devsel, latency 0

00:07.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01) (prog-if 80 [Master])
	Flags: bus master, medium devsel, latency 32
	I/O ports at f000 [size=16]

00:07.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01) (prog-if 00 [UHCI])
	Flags: bus master, medium devsel, latency 32, IRQ 10
	I/O ports at e000 [size=32]

00:07.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 02)
	Flags: medium devsel, IRQ 9

00:09.0 Multimedia video controller: Brooktree Corporation Bt878 (rev 11)
	Subsystem: Hauppauge computer works Inc. WinTV/GO
	Flags: bus master, medium devsel, latency 32, IRQ 10
	Memory at da001000 (32-bit, prefetchable) [size=4K]
	Capabilities: [44] Vital Product Data
	Capabilities: [4c] Power Management version 2

00:09.1 Multimedia controller: Brooktree Corporation Bt878 (rev 11)
	Subsystem: Hauppauge computer works Inc. WinTV/GO
	Flags: bus master, medium devsel, latency 32, IRQ 10
	Memory at da000000 (32-bit, prefetchable) [size=4K]
	Capabilities: [44] Vital Product Data
	Capabilities: [4c] Power Management version 2

00:0b.0 Serial controller: US Robotics/3Com 56K FaxModem Model 5610 (rev 01) (prog-if 02 [16550])
	Subsystem: US Robotics/3Com USR 56k Internal Voice Modem (Model 2976)
	Flags: medium devsel, IRQ 11
	I/O ports at e400 [size=8]
	Capabilities: [dc] Power Management version 2

00:0d.0 Multimedia video controller: 3Dfx Interactive, Inc. Voodoo (rev 02)
	Flags: fast devsel
	Memory at d9000000 (32-bit, prefetchable) [size=16M]

01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200 AGP (rev 01) (prog-if 00 [VGA])
	Subsystem: Matrox Graphics, Inc. Millennium G200 AGP
	Flags: bus master, medium devsel, latency 32, IRQ 12
	Memory at d8000000 (32-bit, prefetchable) [size=16M]
	Memory at d4000000 (32-bit, non-prefetchable) [size=16K]
	Memory at d5000000 (32-bit, non-prefetchable) [size=8M]
	Expansion ROM at <unassigned> [disabled] [size=64K]
	Capabilities: [dc] Power Management version 1
	Capabilities: [f0] AGP version 1.0

Disk models I already posted, but in case CDROMs are important:

IC35L040AVER07-0
Pioneer CD-ROM ATAPI Model DR-A02S 0108
CASTLEWOOD ORB2-E
HITACHI GD-2000

I also tried this configuration (also gives errors):

IC35L040AVER07-0
Pioneer CD-ROM ATAPI Model DR-A02S 0108
ST36530A

I established that the hda errors often, but not always, occur when I start GNOME
(and packages such as sawfish or librep usually get damaged):

From /var/log/messages:

May 17 13:01:11 zeos modprobe: modprobe: Can't locate module char-major-145
May 17 13:01:11 zeos modprobe: modprobe: Can't locate module char-major-145
May 17 13:01:11 zeos kernel: Linux agpgart interface v0.99 (c) Jeff Hartmann
May 17 13:01:11 zeos kernel: agpgart: Maximum main memory to use for agp memory: 204M
May 17 13:01:11 zeos kernel: agpgart: Detected Intel 440BX chipset
May 17 13:01:11 zeos kernel: agpgart: AGP aperture is 64M @ 0xd0000000
May 17 13:01:11 zeos kernel: [drm] AGP 0.99 on Intel 440BX @ 0xd0000000 64MB
May 17 13:01:11 zeos kernel: [drm] Initialized mga 2.0.1 20000928 on minor 63
May 17 13:01:20 zeos gnome-name-server[852]: starting
May 17 13:01:20 zeos gnome-name-server[852]: name server starting
May 17 13:01:21 zeos kernel: hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
May 17 13:01:21 zeos kernel: hda: drive not ready for command
May 17 13:01:21 zeos kernel: hdb: ATAPI 24X CD-ROM drive, 128kB Cache, DMA
May 17 13:01:21 zeos kernel: Uniform CD-ROM driver Revision: 3.12
May 17 13:01:21 zeos kernel: hdd: ATAPI 20X DVD-ROM drive, 512kB Cache, DMA

There are many fragments in the log just like this one.
I don't listen to CDs.
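
Since the log holds many such fragments, counting them per drive and per status value can help show which device is affected and how often. The following is a rough sketch only: the regular expression is written against the message shapes quoted in this report (for example "hda: status error: status=0x58" and "hdc: dma_intr: status=0x51") and may miss other variants, and the function name `count_status_errors` is hypothetical.

```python
import re
from collections import Counter

# Matches kernel IDE messages of the form
#   "... kernel: hda: status error: status=0x58 { ..."
#   "... kernel: hdc: dma_intr: status=0x51 { ..."
# Lines that carry only "error=0x.." or no hex value are ignored.
PATTERN = re.compile(r"kernel: (hd[a-z]): [\w ]+: status=(0x[0-9a-fA-F]{2})")


def count_status_errors(lines):
    """Count (drive, status value) pairs found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            drive = match.group(1)
            status = int(match.group(2), 16)
            counts[(drive, status)] += 1
    return counts


# Sample lines copied from this report.
SAMPLE = [
    "May 17 13:01:21 zeos kernel: hda: status error: status=0x58 { DriveReady",
    "May 17 13:01:21 zeos kernel: hda: drive not ready for command",
    "May 16 17:35:27 zeos kernel: hdc: dma_intr: status=0x51 { DriveReady",
    "May 16 17:35:27 zeos kernel: hdc: dma_intr: error=0x02 { TrackZeroNotFound",
    "May 16 17:35:27 zeos kernel: hdc: dma_intr: status=0x51 { DriveReady",
]

for (drive, status), n in sorted(count_status_errors(SAMPLE).items()):
    print(f"{drive} status=0x{status:02x}: {n}")
```

To run it over a real log, pass an open file object (for instance one opened on /var/log/messages) instead of `SAMPLE`.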

Comment 3 Arjan van de Ven 2001-05-18 08:36:23 UTC
Could you try passing "ide=nodma" at the LILO boot prompt? That disables IDE DMA
and might prevent a lot of problems.
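
After rebooting, one way to confirm that the option actually reached the kernel is to look at the kernel command line. This is a minimal sketch, assuming /proc/cmdline is available and a modern Python interpreter; the helper names and the sample command line are hypothetical. Note that it only confirms the option was passed, not that DMA is really off for each drive.

```python
def has_boot_option(cmdline, option="ide=nodma"):
    """Return True if `option` appears as a whole token on the command line."""
    return option in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (assumes /proc is mounted)."""
    with open(path) as f:
        return f.read().strip()


# Hypothetical command line, for illustration only.
example = "auto BOOT_IMAGE=linux ro root=301 ide=nodma"
print(has_boot_option(example))
```

On a live system, `has_boot_option(read_cmdline())` performs the same check against the running kernel.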

Comment 4 vladimir 2001-05-24 06:54:11 UTC
Well, it is stable without DMA. Of course, performance went down the drain.

Comment 5 Need Real Name 2001-06-01 17:33:13 UTC
I'm having the same kind of problems on my Compaq Armada E500 laptop. I began to
think it was a hardware/hard disk problem, so I tried installing Red Hat 7.1 on
another laptop, also an Armada E500, and I'm getting the same errors on that
machine.

Comment 6 R.K.Aa. 2001-06-03 21:27:18 UTC
Same problem, at random, with an ASUS P2B-F. It seems to happen only when accessing a
disk on the second IDE controller (/dev/hdb). ide=nodma does not cure it
completely, but I have had fewer crashes since applying it. I haven't disabled DMA in
the BIOS though - perhaps that would do the trick.

However, I hit the bug most often when building mozilla on /dev/hdb.
Another thing I noticed: while the keyboard was dead and everything seemingly frozen, the
WinTV PCI card application (xawtv) still worked: the TV app was alive and well,
displaying images and playing sound as if nothing had happened.
Serious filesystem corruption shows up after a reboot though.

I have started to exit X and build in console mode, and have not seen the freeze
again. Like the first reporter I also use a Brooktree card.

 /sbin/lspci -v
00:00.0 Host bridge: Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge (rev 03)
	Flags: bus master, medium devsel, latency 64
	Memory at e4000000 (32-bit, prefetchable) [size=64M]
	Capabilities: [a0] AGP version 1.0

00:01.0 PCI bridge: Intel Corporation 440BX/ZX - 82443BX/ZX AGP bridge (rev 03) (prog-if 00 [Normal decode])
	Flags: bus master, 66Mhz, medium devsel, latency 64
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=64

00:04.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 02)
	Flags: bus master, medium devsel, latency 0

00:04.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01) (prog-if 80 [Master])
	Flags: bus master, medium devsel, latency 32
	I/O ports at d800 [size=16]

00:04.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01) (prog-if 00 [UHCI])
	Flags: bus master, medium devsel, latency 32, IRQ 5
	I/O ports at d400 [size=32]

00:04.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 02)
	Flags: medium devsel, IRQ 9

00:0a.0 VGA compatible controller: nVidia Corporation Riva TnT 128 [NV04] (rev 04) (prog-if 00 [VGA])
	Subsystem: Diamond Multimedia Systems Viper V550 with TV out
	Flags: bus master, 66Mhz, medium devsel, latency 248, IRQ 10
	Memory at e1000000 (32-bit, non-prefetchable) [size=16M]
	Memory at e3000000 (32-bit, prefetchable) [size=16M]
	Expansion ROM at 000c0000 [disabled] [size=64K]
	Capabilities: [60] Power Management version 1

00:0c.0 Multimedia video controller: Brooktree Corporation Bt848 TV with DMA push (rev 11)
	Flags: medium devsel, IRQ 11
	Memory at e2000000 (32-bit, prefetchable) [size=4K]

00:0d.0 Multimedia audio controller: Creative Labs SB Live! EMU10000 (rev 07)
	Subsystem: Creative Labs CT4832 SBLive! Value
	Flags: bus master, medium devsel, latency 32, IRQ 5
	I/O ports at d000 [size=32]
	Capabilities: [dc] Power Management version 1

00:0d.1 Input device controller: Creative Labs SB Live! (rev 07)
	Subsystem: Creative Labs Gameport Joystick
	Flags: bus master, medium devsel, latency 32
	I/O ports at b800 [size=8]
	Capabilities: [dc] Power Management version 1




Comment 7 Need Real Name 2001-06-05 08:29:40 UTC
When I remove the CD drive, I don't get the SeekComplete error which I usually get
when DMA is enabled:

May 30 09:16:22 pc5 kernel: hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
May 30 09:16:22 pc5 kernel: hda: drive not ready for command
May 30 09:16:22 pc5 kernel: hdb: ATAPI 24X CD-ROM drive, 512kB Cache, DMA
May 30 09:16:22 pc5 kernel: Uniform CD-ROM driver Revision: 3.12


Comment 8 R.K.Aa. 2001-06-10 22:24:43 UTC
My hdb is on the same IDE controller as hda. This seems to be some sort of
PCI-related conflict, and I discovered that my contribution to this bug may be a sidetrack:

When I installed 7.1 (or rather upgraded from 6.2), I installed the nVidia driver
(v.1.0-1251.rh71) immediately afterwards. Since the freeze seemed to happen only
when writing to /dev/hdb while using X, I switched to the nv
driver provided with XFree86 4.0.3-5.

Since then I haven't been able to force the freeze/collision to happen.
So in my case, the conflict is somehow triggered by the nVidia driver, but where
that leaves the actual bug I have no idea.
--
$ cat /proc/ide/*/model
ST313620A
ST313021A
PLEXTOR CD-R PX-W1210A
pci
pci
$

Comment 9 Arjan van de Ven 2001-06-11 08:26:51 UTC
dark: it leaves your bug in the "don't use the binary only nVidia
driver" category... there's nothing we can do about it

Comment 10 Bugzilla owner 2004-09-30 15:39:00 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/