Bug 40085 - "DriveReady SeekComplete Error" with K7, VIA chipset
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2001-05-10 14:45 UTC by Vidar Langseid
Modified: 2007-04-18 16:33 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2001-05-11 09:11:47 UTC
Embargoed:


Attachments
Here is my dmesg (6.23 KB, text/plain)
2001-05-10 14:47 UTC, Vidar Langseid
my dmesg, with ide=nodma (6.53 KB, text/plain)
2001-05-10 15:11 UTC, Vidar Langseid
my dmesg when the HPT370 is enabled in BIOS (6.13 KB, text/plain)
2001-05-11 09:03 UTC, Vidar Langseid
The letter I sent to IBM and the reply I got (3.76 KB, patch)
2001-05-11 11:36 UTC, Vidar Langseid

Description Vidar Langseid 2001-05-10 14:45:38 UTC
Description of Problem:
I have a computer with 6.2, 7.0 and 7.1. This problem only exists when
booting 7.1, probably because of the 2.4 kernel ?

I get messages like 
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
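
The two hex values in these messages are bitmasks of the IDE status and error registers. A minimal sketch of how they decode, assuming the bit names below match what the kernel prints (the tables are written from memory of the ATA register layout, not copied from the kernel source):

```python
# Assumed bit-to-name tables for the IDE status and error registers.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]
ERROR_BITS = [
    (0x80, "BadCRC"),
    (0x40, "UncorrectableError"),
    (0x20, "MediaChanged"),
    (0x10, "SectorIdNotFound"),
    (0x08, "MediaChangeRequested"),
    (0x04, "DriveStatusError"),
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in table if value & mask]


print(decode(0x51, STATUS_BITS))
print(decode(0x84, ERROR_BITS))
```

Here 0x51 is 0x40 + 0x10 + 0x01 and 0x84 is 0x80 + 0x04, which gives the same flag names as the messages above (the kernel prints the error flags in a different order).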

I thought that bug 39689 might be related but "ide=nodma" did not help
# cat /proc/cmdline
BOOT_IMAGE=redhat71 ro root=306
BOOT_FILE=/mnt/boot/hda6_RedHat-7.1_Drift/vmlinuz-2.4.2-2 ide=nodma
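
To confirm that an option really reached the running kernel, the contents of /proc/cmdline can be split into key/value pairs. A small sketch, where `kernel_args` is a hypothetical helper name and the sample string is the command line quoted above:

```python
def kernel_args(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    args = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        args[key] = value if sep else None
    return args


# Command line as reported in this bug; on a live system one would
# read it from /proc/cmdline instead.
sample = ("BOOT_IMAGE=redhat71 ro root=306 "
          "BOOT_FILE=/mnt/boot/hda6_RedHat-7.1_Drift/vmlinuz-2.4.2-2 ide=nodma")

args = kernel_args(sample)
print(args.get("ide"))
```

This only shows that the option was passed to the kernel, not that the kernel or a later init script honoured it.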


How Reproducible: Every time on my system


Steps to Reproduce:
1. Install rh7.1
2. Boot up your system
3. 

Actual Results:


Expected Results:


Additional Information:
System is
 - Abit KT7-RAID motherboard (IDE disk attached to primary IDE, not the RAID
controller)
 - IBM 70GXP harddrive

Comment 1 Vidar Langseid 2001-05-10 14:47:58 UTC
Created attachment 18026 [details]
Here is my dmesg

Comment 2 Arjan van de Ven 2001-05-10 14:52:59 UTC
BadCRC usually indicates you have a bad IDE cable.
2.4 kernels have higher performance on the IDE subsystem, but that breaks
if the cable is bad.
Also, it appears that the "ide=nodma" didn't stick, did you use a 
lilo.conf "append" line or did you type it at the lilo prompt ?

Comment 3 Vidar Langseid 2001-05-10 15:09:42 UTC
I tried it at the lilo prompt. It didn't help, so I rebooted again. That's why
the dmesg didn't have an "ide=nodma" option. Sorry if this was confusing.

Anyway, Linux seems to disable DMA on KT7 boards regardless of "ide=nodma"?
from dmesg : "KT7 series board detected. Disabling IDE DMA"

I'll attach a dmesg with "ide=nodma" too

I'll try to change the cable too and see if that helps




Comment 4 Vidar Langseid 2001-05-10 15:11:16 UTC
Created attachment 18027 [details]
my dmesg, with ide=nodma

Comment 5 Vidar Langseid 2001-05-10 17:10:24 UTC
A cable switch didn't work. I tried both a standard old-fashioned IDE cable
and an IDE66/100 cable.


This, however, seems to stop the error messages:
# hdparm -d 0 /dev/hda

And when I do this in rh7.0:
# hdparm -d 1 /dev/hda
I manage to reproduce the same error messages there as in rh7.1.


This indicates that dma must be turned off on my system. 
Any ideas why ?


Are "ide=nodma" and "hdparm -d 0 /dev/hda" supposed to do the same thing?

regards,
vidar

Comment 6 Arjan van de Ven 2001-05-10 17:17:55 UTC
Is this a fresh install or an upgrade ?



Comment 7 Arjan van de Ven 2001-05-10 17:19:05 UTC
and yes, ide=nodma is supposed to do the same thing, however it could be that
something turns DMA back on in the initscripts. I've seen that on a system
that was made to use DMA on 7.0 and then got upgraded to 7.1
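
One way to check whether a startup script turns DMA back on is to search its text for hdparm calls that pass `-d1`. A rough sketch, where `dma_enabling_lines` is a hypothetical helper and the sample script is invented for illustration. It will not catch calls that take the value from a shell variable:

```python
import re

# Matches an hdparm call that enables DMA, written as "-d1" or "-d 1".
DMA_ON = re.compile(r"\bhdparm\b.*\s-d\s?1\b")


def dma_enabling_lines(script_text):
    """Return (line number, line) pairs for hdparm calls that enable DMA."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # ignore shell comments
        if DMA_ON.search(code):
            hits.append((number, line.strip()))
    return hits


# Invented example script text, not taken from any real initscript.
sample = """\
# hdparm -d1 /dev/hda   (commented out, ignored)
/sbin/hdparm -d 1 /dev/hda
/sbin/hdparm -d0 /dev/hdb
"""

print(dma_enabling_lines(sample))
```

On a real system the text would come from files such as rc.sysinit, and the current setting can be read back with `hdparm -d /dev/hda`.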

Comment 8 Vidar Langseid 2001-05-10 17:27:56 UTC
It is a fresh install

The computer has rh6.2 and rh7.0 installed too, but on different partitions, so
that shouldn't have anything to do with it.

Comment 9 Arjan van de Ven 2001-05-10 17:43:45 UTC
CRC errors are 99.99% of the time a hardware issue, sorry about that.
However, there is a bug in certain VIA chipsets, for which we have a _partial_
workaround in the 2.4.2-2 kernel (the one you have) and the full workaround in
the kernel upgrade we are working on, a snapshot of that kernel can be found in
the rawhide directory on our ftp site. If you don't have a problem with
experimenting, you could try to see if the "official" chipset workaround will
fix your problem.

Comment 10 Vidar Langseid 2001-05-10 18:52:29 UTC
I tried both kernel-2.4.3-2.14.10.athlon.rpm and kernel-2.4.3-2.14.10.i686.rpm.
The problem still remains...

I also tried IBM Drive Fitness Test, a tool IBM have to test their discs. That
test didn't find anything wrong.

So what you are saying is that there is a hardware problem somewhere that forces
me to disable DMA and there is nothing you or I can do about it? That really
su... If so this should affect all KT7-RAID owners....


Anyway, thank you for the FAST response
This was kinda instant messaging

If you have any other suggestions, I am keen on testing it.

Best regards,
Vidar


Comment 11 Arjan van de Ven 2001-05-11 08:38:01 UTC
One more hunch: does your board have a HPT366 or HPT370 controller?
(lspci will tell you)

Comment 12 Vidar Langseid 2001-05-11 09:01:10 UTC
Hi
 
I found out why ide=nodma didn't work. I did a "hdparm -d /dev/hda" in the
beginning of rc.sysinit. It seems
that the kernel shipping in rh7.1 fails to disable dma on my hardware even
though it claims to do so. After installing the rawhide kernel
(kernel-2.4.3-2.14.10.athlon.rpm) it works.
 
So now I manage to boot my system in a clean way without nasty error messages.
However, I am not that happy since I have to disable dma....


About the HPT controller
This motherboard has a HPT370 controller

This does not show up in the dmesg I previously attached because I have disabled
the controller entirely (in BIOS).
This is an lspci with the HPT370 disabled:
[root@dozer /root]# lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 02)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 22)
00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 10)
00:07.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 10)
00:07.3 USB Controller: VIA Technologies, Inc. UHCI USB (rev 10)
00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 30)
00:0f.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74)
01:00.0 VGA compatible controller: nVidia Corporation Vanta [NV6] (rev 15)




When I now tried to enable the HPT370, I saw that something strange is going on.
I'll attach a dmesg showing that Linux detects the HPT370. But when I then
do an lspci I get this:

[root@dozer /root]# lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 02)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 22)
00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 10)
00:07.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 10)
00:07.3 USB Controller: VIA Technologies, Inc. UHCI USB (rev 10)
00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 30)
00:0f.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74)
00:13.0 Unknown mass storage controller: Triones Technologies, Inc. HPT366 (rev 03)
01:00.0 VGA compatible controller: nVidia Corporation Vanta [NV6] (rev 15)


It suddenly thinks I have a HPT366

This is an extract of /etc/sysconfig/hwconf
class: OTHER
bus: PCI
detached: 0
driver: unknown
desc: "Triones|HPT366"
vendorId: 1103
deviceId: 0004
subVendorId: 1103
subDeviceId: 0001
pciType: 1
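
A possible explanation for the "HPT366" label, offered as an assumption rather than a confirmed fact: the HPT366 and HPT370 appear to share PCI ID 1103:0004, and tools that look up names by ID alone report both as HPT366. The revision ("rev 03" in the lspci output above) is then what tells them apart. A sketch of that idea, where the revision threshold is recalled from the Linux hpt366 driver and `hpt_chip_name` is a hypothetical helper:

```python
HPT_VENDOR_ID = 0x1103      # vendorId from the hwconf extract
HPT36X_DEVICE_ID = 0x0004   # deviceId from the hwconf extract


def hpt_chip_name(vendor_id, device_id, revision):
    """Guess the HighPoint chip from the PCI IDs plus the revision byte."""
    if (vendor_id, device_id) != (HPT_VENDOR_ID, HPT36X_DEVICE_ID):
        return None
    # Assumption: revisions below 3 are HPT366-class parts,
    # revision 3 and above are HPT370 or newer.
    return "HPT370 or later" if revision >= 3 else "HPT366 class"


print(hpt_chip_name(0x1103, 0x0004, 0x03))
```

If the assumption holds, the lspci name is only cosmetic and the dmesg line that reports an HPT370 is the more reliable one.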


I don't know if this is related to my problem since I don't use the HPT370 at
all


Regards,
vidar

Comment 13 Vidar Langseid 2001-05-11 09:03:09 UTC
Created attachment 18126 [details]
my dmesg when the HPT370 is enabled in BIOS

Comment 14 Arjan van de Ven 2001-05-11 09:11:42 UTC
Oh ok. If the IBM drive was connected to the HPT controller, that would explain
stuff; there is a bad interaction between IBM drives and these controllers
(which we can work around so you can still do DMA on that). But as you
are not using it, this is not the problem here.

Comment 15 Vidar Langseid 2001-05-11 10:08:08 UTC
I am aware of that problem. My workstation has a HPT366, and the system froze
instantly when trying to access an IBM drive. My workaround there was to force
the IBM drive to ATA2 mode (UDMA33). I sent a bug report to Abit, HPT and IBM. It
took over three weeks before I got an answer. Nothing like bugzilla.redhat.com's
instant messaging...
:)

I can't use the HPT controller on this system because of the various OSes the
computer needs to run (I don't have 13 partitions on a system just for fun).

The computer works fine when DMA is selected in Win98. At least it doesn't
complain. Neither is it more unstable than Win98 usually is.

If you want to look more into the problem, I am there for you. If you instead
think that this is unresolvable you may change the resolution to WONTFIX or
something. You decide.

Best regards,
vidar



Comment 16 Arjan van de Ven 2001-05-11 10:11:04 UTC
We fixed our kernel to default to that mode for most/all IBM drives on HPT
controllers. If you could give me the output of "cat /proc/ide/hda/model"
I can check if your drive is in there as well...

Comment 17 Vidar Langseid 2001-05-11 10:32:23 UTC
On the system which I now have trouble on:
# cat /proc/ide/hda/model
IBM-DTLA-307030

On my workstation I have various IBM drives
[vl@rebel vl]$ cat /proc/ide/hda/model
IBM-DHEA-38451
[vl@rebel vl]$ cat /proc/ide/hde/model
IBM-DJNA-370910
[vl@rebel vl]$ cat /proc/ide/hdg/model
IBM-DTLA-307030

On my workstation it was hdg that caused the problem. I used a tool called "IBM
ATA-switch v1.40" in order to force ATA2 mode (UDMA33). As you can see, this is
the same type of disc as hda on my trouble computer. Forcing ATA2 mode on that
system makes no difference.

My hde works fine in ATA4 mode (UDMA66)


When you previously said that you had a fix for IBM drives on HPT, were you
talking about forcing it to ATA2, or do you have a workaround that prevents
system freeze in ATA4 and ATA5 mode?


Best regards
vidar

Comment 18 Arjan van de Ven 2001-05-11 10:39:42 UTC
Forcing it to ata2

Comment 19 Arjan van de Ven 2001-05-11 10:40:32 UTC
And the model for hdg was already in our list.
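
The workaround described in the last few comments amounts to a model list that caps the transfer mode on HPT controllers. A minimal sketch of that idea, not the kernel's actual code: the names are hypothetical, and the list holds only the one model this thread confirms is covered.

```python
# Hypothetical list: only the model confirmed in this thread is included.
CAPPED_MODELS = {"IBM-DTLA-307030"}

ATA2_UDMA_MODE = 2  # UDMA mode 2 is the 33 MB/s mode ("ATA2" in this thread)


def max_udma_mode(model, controller_max, on_hpt):
    """Return the highest UDMA mode to use for a drive.

    model          -- string as read from /proc/ide/hdX/model
    controller_max -- highest UDMA mode the controller supports
    on_hpt         -- True if the drive hangs off an HPT controller
    """
    if on_hpt and model.strip() in CAPPED_MODELS:
        return min(controller_max, ATA2_UDMA_MODE)
    return controller_max


print(max_udma_mode("IBM-DTLA-307030", 5, on_hpt=True))
print(max_udma_mode("IBM-DJNA-370910", 4, on_hpt=True))
```

With this list a DTLA-307030 on an HPT controller is held at UDMA mode 2, while the same drive on the VIA controller, or an unlisted drive, keeps the controller's maximum.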

Comment 20 Vidar Langseid 2001-05-11 11:34:50 UTC
I'll send you the email I sent to IBM and the response I got. There is actually
nothing new to add there, except that the IBM guy says a beta BIOS might
help. I didn't try the beta BIOS, since I already had a workaround (user/kernel
space software erasing your hard drive is one thing, but a beta BIOS that might
disable your computer entirely is not something I would like to risk. Of course
this is unlikely to happen, but still...)

It might be a possibility that with the betabios, it works in ATA4 and ATA5 too?
I know that linux doesn't usually use the system BIOS, but I don't know what
linux does regarding the HPT BIOS on the controller.

PS
Abit has not yet released a new stable BIOS containing the fix that the
beta BIOS is supposed to provide.

Comment 21 Vidar Langseid 2001-05-11 11:36:29 UTC
Created attachment 18146 [details]
The letter I sent to IBM and the reply I got

