Description of Problem:
I have a computer with 6.2, 7.0 and 7.1 installed. This problem only occurs when booting 7.1, probably because of the 2.4 kernel. I get messages like:

hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }

I thought that bug 39689 might be related, but "ide=nodma" did not help:

# cat /proc/cmdline
BOOT_IMAGE=redhat71 ro root=306 BOOT_FILE=/mnt/boot/hda6_RedHat-7.1_Drift/vmlinuz-2.4.2-2 ide=nodma

How Reproducible:
Every time on my system

Steps to Reproduce:
1. Install rh7.1
2. Boot up your system
3.

Actual Results:

Expected Results:

Additional Information:
System is
- Abit KT7-RAID motherboard (IDE disk attached to primary IDE, not the RAID controller)
- IBM 70GXP hard drive
Created attachment 18026 [details] Here is my dmesg
BadCRC usually indicates you have a bad IDE cable. 2.4 kernels get higher performance out of the IDE subsystem, but that breaks if the cable is bad. Also, it appears that the "ide=nodma" didn't stick. Did you use a lilo.conf "append" line, or did you type it at the lilo prompt?
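For reference, the status and error bytes in the reported messages can be decoded by hand. The following is a minimal Python sketch, assuming the bit names match the wording the 2.4 IDE driver prints; the function names `decode_status` and `decode_error` are made up for illustration and are not part of any kernel or tool.

```python
# Bit names are assumed to follow the wording printed by the 2.4 IDE driver.
STATUS_BITS = [
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the flag names set in an IDE status byte."""
    if status & 0x80:
        # While the drive is busy the other bits are not meaningful.
        return ["Busy"]
    return [name for bit, name in STATUS_BITS if status & bit]


def decode_error(error):
    """Return the flag names set in an IDE error byte."""
    names = []
    aborted = bool(error & 0x04)
    if aborted:
        names.append("DriveStatusError")
    if error & 0x80:
        # Bit 7 reads as a CRC error when the command was also aborted,
        # otherwise as a bad sector.
        names.append("BadCRC" if aborted else "BadSector")
    if error & 0x40:
        names.append("UncorrectableError")
    if error & 0x10:
        names.append("SectorIdNotFound")
    if error & 0x02:
        names.append("TrackZeroNotFound")
    if error & 0x01:
        names.append("AddrMarkNotFound")
    return names


print(decode_status(0x51))  # ['DriveReady', 'SeekComplete', 'Error']
print(decode_error(0x84))   # ['DriveStatusError', 'BadCRC']
```

The two printed lists match the flags shown in braces in the report, so the messages are a plain CRC failure on a DMA transfer rather than a media error.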
I tried it at the lilo prompt. It didn't help, so I rebooted again. That's why the dmesg didn't have an "ide=nodma" option. Sorry if this was confusing.

Anyway, linux seems to disable DMA on KT7 without regard to "ide=nodma"? From dmesg: "KT7 series board detected. Disabling IDE DMA"

I'll attach a dmesg with "ide=nodma" too. I'll also try to change the cable and see if that helps.
Created attachment 18027 [details] my dmesg, with ide=nodma
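Whether the option actually reached the kernel can be checked from the command line string itself. Below is a small Python sketch, assuming the string has already been read (for example from /proc/cmdline); `kernel_options` and `dma_disabled_at_boot` are hypothetical helper names, and the sample string is the one quoted in the original report.

```python
def kernel_options(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options


def dma_disabled_at_boot(cmdline):
    """True if the command line carries ide=nodma."""
    return kernel_options(cmdline).get("ide") == "nodma"


# Command line quoted in the original report.
reported = ("BOOT_IMAGE=redhat71 ro root=306 "
            "BOOT_FILE=/mnt/boot/hda6_RedHat-7.1_Drift/vmlinuz-2.4.2-2 ide=nodma")

print(dma_disabled_at_boot(reported))
```

This only shows that the option was passed. It says nothing about whether the kernel or the initscripts honoured it, which is the open question in this report.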
A cable switch didn't work. I tried with both a standard old-fashioned IDE cable and an IDE66/100 cable.

This, however, seems to stop the error messages:

# hdparm -d 0 /dev/hda

And when I do this in rh7.0:

# hdparm -d 1 /dev/hda

then I manage to reproduce the same error messages there as in rh7.1. This indicates that DMA must be turned off on my system. Any ideas why? Are "ide=nodma" and "hdparm -d 0 /dev/hda" supposed to do the same thing?

regards,
vidar
Is this a fresh install or an upgrade ?
And yes, ide=nodma is supposed to do the same thing. However, it could be that something turns DMA back on in the initscripts. I've seen that on a system that was set up to use DMA on 7.0 and then got upgraded to 7.1.
It is a fresh install. The computer has rh6.2 and rh7.0 installed too, but on different partitions, so that shouldn't have anything to do with it.
CRC errors are 99.99% of the time a hardware issue, sorry about that. However, there is a bug in certain VIA chipsets, for which we have a _partial_ workaround in the 2.4.2-2 kernel (the one you have) and the full workaround in the kernel upgrade we are working on. A snapshot of that kernel can be found in the rawhide directory on our ftp site. If you don't mind experimenting, you could check whether the "official" chipset workaround fixes your problem.
I tried both kernel-2.4.3-2.14.10.athlon.rpm and kernel-2.4.3-2.14.10.i686.rpm. The problem still remains...

I also tried IBM Drive Fitness Test, a tool IBM has for testing their disks. That test didn't find anything wrong.

So what you are saying is that there is a hardware problem somewhere that forces me to disable DMA, and there is nothing you or I can do about it? That really su... If so, this should affect all KT7-RAID owners....

Anyway, thank you for the FAST response. This was kinda instant messaging. If you have any other suggestions, I am keen on testing them.

Best regards,
Vidar
One more hunch: does your board have a HPT366 or HPT370 controller? (lspci will tell you)
Hi

I found out why ide=nodma didn't work. I did a "hdparm -d /dev/hda" at the beginning of rc.sysinit. It seems that the kernel shipping in rh7.1 fails to disable DMA on my hardware even though it claims to do so. After installing the rawhide kernel (kernel-2.4.3-2.14.10.athlon.rpm) it works. So now I manage to boot my system in a clean way without nasty error messages. However, I am not that happy, since I have to disable DMA....

About the HPT controller:
This motherboard has a HPT370 controller. It does not show up in the dmesg I previously attached because I have disabled the controller entirely (in the BIOS). This is an lspci with the HPT370 disabled:

[root@dozer /root]# lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 02)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 22)
00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 10)
00:07.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 10)
00:07.3 USB Controller: VIA Technologies, Inc. UHCI USB (rev 10)
00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 30)
00:0f.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74)
01:00.0 VGA compatible controller: nVidia Corporation Vanta [NV6] (rev 15)

When I now tried to enable the HPT370, I saw that something strange is going on. I'll attach a dmesg showing that linux detects the HPT370. But when I then do an lspci I get this:

[root@dozer /root]# lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 02)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 22)
00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 10)
00:07.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 10)
00:07.3 USB Controller: VIA Technologies, Inc. UHCI USB (rev 10)
00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 30)
00:0f.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74)
00:13.0 Unknown mass storage controller: Triones Technologies, Inc. HPT366 (rev 03)
01:00.0 VGA compatible controller: nVidia Corporation Vanta [NV6] (rev 15)

It suddenly thinks I have a HPT366.

This is an extract of /etc/sysconfig/hwconf:

class: OTHER
bus: PCI
detached: 0
driver: unknown
desc: "Triones|HPT366"
vendorId: 1103
deviceId: 0004
subVendorId: 1103
subDeviceId: 0001
pciType: 1

I don't know if this is related to my problem, since I don't use the HPT370 at all.

Regards,
vidar
Created attachment 18126 [details] my dmesg when the HPT370 is enabled in BIOS
Oh ok. If the IBM drive was connected to the HPT controller, that would explain stuff; there is a bad interaction between IBM drives and these controllers (which we can work around so you can still do DMA on that). But as you are not using it, this is not the problem here.
I am aware of that problem. My workstation has a HPT366, and the system froze instantly when trying to access an IBM drive. My workaround there was to force the IBM drive to ATA2 mode (DMA33). I sent a bug report to Abit, HPT and IBM. It took over three weeks before I got an answer. Nothing like bugzilla.redhat.com's instant messaging... :)

I can't use the HPT controller on this system because of the various OSes the computer needs to run (I don't have 13 partitions on a system just for fun).

The computer works fine when DMA is selected in Win98. At least it doesn't complain. Neither is it more unstable than Win98 usually is.

If you want to look more into the problem, I am there for you. If you instead think that this is unresolvable, you may change the resolution to WONTFIX or something. You decide.

Best regards,
vidar
We fixed our kernel to default to that mode for most/all IBM drives on HPT controllers. If you could give me the output of "cat /proc/ide/hda/model" I can check if your drive is in there as well...
On the system which I now have trouble with:

# cat /proc/ide/hda/model
IBM-DTLA-307030

On my workstation I have various IBM drives:

[vl@rebel vl]$ cat /proc/ide/hda/model
IBM-DHEA-38451
[vl@rebel vl]$ cat /proc/ide/hde/model
IBM-DJNA-370910
[vl@rebel vl]$ cat /proc/ide/hdg/model
IBM-DTLA-307030

On my workstation it was hdg that caused the problem. I used a tool called "IBM ATA-switch v1.40" in order to force ATA2 mode (UDMA33). As you can see, this is the same type of disk as hda on my trouble computer. Forcing ATA2 mode on that system makes no difference. My hde works fine in ATA4 mode (UDMA66).

When you previously said that you had a fix for IBM drives on HPT, were you talking about forcing them to ATA2, or do you have a workaround that prevents the system freeze in ATA4 and ATA5 mode?

Best regards
vidar
Forcing it to ATA2.
And the model for hdg was already in our list.
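The check described here, matching a drive's model string against a list of drives to be limited to ATA2, can be sketched roughly as below. This is an illustration only: `FORCE_ATA2_MODELS` and `needs_ata2_limit` are hypothetical names, and the list holds just the one model this thread confirms is covered, not the kernel's real list.

```python
# Hypothetical quirk list. Only the model confirmed in this thread is included;
# the real kernel list is not reproduced here.
FORCE_ATA2_MODELS = {"IBM-DTLA-307030"}


def needs_ata2_limit(model, quirks=FORCE_ATA2_MODELS):
    """True if the model string (as read from /proc/ide/hdX/model) is listed."""
    # The proc file ends with a newline, so strip whitespace before comparing.
    return model.strip() in quirks


print(needs_ata2_limit("IBM-DTLA-307030\n"))
```

On a live system the input would come from reading /proc/ide/hdg/model. A lookup like this only matters for drives attached to the HPT controller, which is not the case for hda on the machine in this report.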
I'll send you the email I sent to IBM and the response I got. There is actually nothing new to add there, except that the IBM guy says a beta BIOS might help. I didn't try the beta BIOS, since I already had a workaround. (User/kernel space software erasing your hard drive is one thing; a beta BIOS that might disable your computer entirely is not something I would like to risk. Of course this is unlikely to happen, but still...)

It might be possible that with the beta BIOS it works in ATA4 and ATA5 too? I know that linux doesn't usually use the system BIOS, but I don't know what linux does regarding the HPT BIOS on the controller.

PS Abit has not yet released a new stable BIOS with the fix that the beta BIOS is supposed to contain.
Created attachment 18146 [details] The letter I sent to IBM and the reply I got