Bug 29709 - Installation failure on ABIT KT7A-RAID motherboard
Status: CLOSED WORKSFORME
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Michael K. Johnson
QA Contact: Aaron Brown
Reported: 2001-02-26 22:32 EST by jbowman
Modified: 2007-04-18 12:31 EDT (History)
Doc Type: Bug Fix
Last Closed: 2001-03-09 11:19:55 EST
Attachments
lspci -vxxx output from a RedHat 7.0 install on my machine (12.81 KB, text/plain)
2001-03-08 09:19 EST, jbowman
lspci output (10.94 KB, text/plain)
2001-03-08 15:44 EST, Red Hat Bugzilla
Output of /proc/pci (1.96 KB, text/plain)
2001-03-08 15:45 EST, Red Hat Bugzilla
Description Red Hat Bugzilla 2001-02-26 22:32:38 EST
Thought I'd report that installation crashes and fails after the formatting
filesystems stage when I try to install onto the primary IDE hard drive of
my KT7-RAID. When I turn off the motherboard's HPT370 controller, I can
install without issues. This is during a 'custom' install.

I use disk druid to partition my primary drive, and I can see the drive I
have connected to the HPT370 controller just fine in disk druid when I have
the controller active. I can't see that drive in disk druid with the
controller off (obviously :).

Unfortunately I didn't manage to nab the installer crash report before I
reset to retry the install without the HPT370 controller enabled. I'm more
than happy to generate it again, as it was reproducible through the two or
three attempts I made before switching off the controller.

System hardware is:
ABIT KT7-RAID motherboard
1 GHz Thunderbird Athlon
256 MB RAM
Nvidia GeForce DDR video card
Western Digital 27 GB hard drive as the primary master IDE device (/dev/hda)
Western Digital 40 GB hard drive as the quaternary master (/dev/hdg)
Pioneer 10x DVD drive as secondary master (/dev/hdc)
HP CDR/W 32x/8x/4x drive as secondary slave (/dev/hdd)
Comment 1 Red Hat Bugzilla 2001-02-27 11:48:23 EST
Assigning to kernel team.
Comment 2 Red Hat Bugzilla 2001-02-28 11:01:39 EST
I've tried several more installs now. I discovered upon rebooting that my
working install had somehow managed to eat itself, though in light of the error
messages I managed to dig up I guess I was just lucky in getting it to boot the
first time.

The installs seem to either die randomly in different places, or succeed and then
result in an unusable system (since the one 'fluke' install I made that worked,
I've done another half-dozen or so without any luck).

The common error seems to be the following. It shows up both on the F4 console
of failed installs and during the boot process of "successful" installs:

hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
end_request: I/O error, dev 03:0a (hda), sector 528392

This shows up with and without the HPT370 controller enabled, so it appears my
original suspicions were wrong. This system had been previously running RedHat
7.0 without a hint of trouble using the latest RedHat-issued 2.2 kernel.
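
For readers who want to read the two hex values in the log above, here is a minimal Python sketch that decodes them into the flag names the kernel printed. The bit positions are the standard ATA status and error register layout as I understand it, and the function names are hypothetical; none of this is taken from the reporter's system or from the kernel source verbatim.

```python
# Status register bits, highest bit first (assumed standard ATA layout).
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

ABRT = 0x04  # command aborted, printed as "DriveStatusError"
ICRC = 0x80  # interface CRC error (shares its bit with the older "bad block" flag)


def decode_status(value):
    """Return the names of all status bits set in value."""
    return [name for mask, name in STATUS_BITS if value & mask]


def decode_error(value):
    """Return the names for the two error bits seen in this report."""
    names = []
    if value & ABRT:
        names.append("DriveStatusError")
    if value & ICRC:
        # With the abort bit also set, the 0x80 bit is read as a CRC error
        # on the transfer rather than a bad sector on the media.
        names.append("BadCRC" if value & ABRT else "BadSector")
    return names


print(decode_status(0x51))  # ['DriveReady', 'SeekComplete', 'Error']
print(decode_error(0x84))   # ['DriveStatusError', 'BadCRC']
```

Only the two error bits that appear in this report are handled; the remaining error-register bits are left out on purpose.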


Comment 3 Red Hat Bugzilla 2001-02-28 15:53:46 EST
We have added KT7R to the list of motherboards we will be looking to
obtain so that we can reasonably debug this.
Comment 4 Red Hat Bugzilla 2001-03-02 13:06:07 EST
Our installer team-lead thinks we should really fix this before next release.
Comment 5 Red Hat Bugzilla 2001-03-07 15:54:23 EST
Cannot reproduce problem using a KT7-RAID board and IDE chain (disk, CDROM).
Installation ran completely using Disk Druid and fdisk, and even checking for
bad blocks. Resulting systems ran fine. Need more information...

1. Does it make any difference whether your disk setup is RAID or not?

2. Are you using the DVD or CDRW for installation (I'm guessing DVD)? Are you
booting the CD or a floppy?

3. Can you try removing some hardware to see if it is involved with the problem?
Does the failure occur if the 2nd HD disk is absent? If the DVD is absent? Does
it occur if you use a CDROM drive instead of DVD or CDRW? 

4. Does using your CDROM drive in a different IDE interface make any difference?

Comment 6 Red Hat Bugzilla 2001-03-07 16:02:05 EST
1.) I don't use the RAID support at all, as Linux doesn't have support for the
HPT-370's RAID functionality yet.

2) I'm booting from an ISO CD in the DVD drive, which was also how I successfully
installed 7.0 on it.

3) and 4) I'll try this weekend and get back to you on. I've tried disabling the
HPT370 controller in the BIOS (which "should" and does cause the second hard
drive to not be seen by the OS), but will also try physically disconnecting the
second hard drive as well.

Comment 7 Red Hat Bugzilla 2001-03-07 17:03:49 EST
kbarrett was testing with a KT7A-RAID, which is a slightly newer revision
and all we could find available now.  We've had other problems like this
go away with BIOS updates -- are any BIOS updates available for your
motherboard?

Also, could you post here the lspci -vxxx output for that system?
Comment 8 Red Hat Bugzilla 2001-03-07 17:21:26 EST
My apologies, I am in fact using a KT7A-RAID motherboard, not the KT7-RAID. I 
didn't realize there were two different models.

I'm looking into the BIOS updates. I believe I have the most recent applied, 
but I will double-check to be sure. I've also added the lspci -l output to my 
list of "things to check" this weekend when I have the time to break my system 
again. :)
Comment 9 Red Hat Bugzilla 2001-03-08 04:11:00 EST
lspci -vxxx (done as root) would be most appreciated as that might allow
us to find differences between working and broken systems.
Comment 10 Red Hat Bugzilla 2001-03-08 09:19:47 EST
Created attachment 12084
lspci -vxxx output from a RedHat 7.0 install on my machine
Comment 11 Red Hat Bugzilla 2001-03-08 09:23:45 EST
I found a BIOS update from ABIT and applied it, going from the "UL" BIOS to the
"WW" BIOS. Apparently no change in the results, as I managed to finish the
install but still got a 'dead' system after the first reboot again.

The error message on boot was:
EXT2-fs error (device ide0(3,8)): ext2_read_inode: unable to read inode block - inode=2, block=4
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 236k freed
Warning: unable to open an initial console
Kernel panic: No init found. Try passing init= option to kernel.

This is only a single test case, as I didn't have time for more (pulling
hardware, etc...), but it seems to rule out a BIOS update fixing the problem.
I've also attached the lspci -vxxx output from the RedHat 7.0 install that I
managed to get on there before going to bed.
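
The device identifiers in these messages ("ide0(3,8)" here and "dev 03:0a" in comment 2) can be mapped back to device nodes. The sketch below assumes the conventional Linux IDE numbering: majors 3, 22, 33 and 34 for ide0 through ide3, with 64 minor numbers per drive. The helper name is hypothetical, and it assumes "03:0a" is hexadecimal while "(3,8)" is decimal.

```python
# Drive letters served by each IDE major number (assumed conventional layout).
IDE_MAJORS = {3: "ab", 22: "cd", 33: "ef", 34: "gh"}


def ide_device_name(major, minor):
    """Translate an IDE (major, minor) pair into a /dev/hdXN style name."""
    if major not in IDE_MAJORS or not 0 <= minor < 128:
        raise ValueError("not an IDE disk device number")
    drive = IDE_MAJORS[major][minor >> 6]  # 0 = master, 1 = slave
    partition = minor & 0x3F               # 0 means the whole disk
    return f"/dev/hd{drive}{partition if partition else ''}"


print(ide_device_name(3, 8))        # /dev/hda8
print(ide_device_name(0x03, 0x0a))  # /dev/hda10
print(ide_device_name(34, 0))       # /dev/hdg
```

Under those assumptions both failures point at partitions on the primary master, /dev/hda, not at the drive on the HPT370 controller.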
Comment 12 Red Hat Bugzilla 2001-03-08 10:09:38 EST
Keith, could you please attach the lspci -vxxx output for our
KT7A-RAID motherboard?
Comment 13 Red Hat Bugzilla 2001-03-08 15:44:10 EST
Created attachment 12126
lspci output
Comment 14 Red Hat Bugzilla 2001-03-08 15:45:03 EST
Created attachment 12127
Output of /proc/pci
Comment 15 Red Hat Bugzilla 2001-03-08 15:47:50 EST
Please note...

On our setup, the KT7 will only boot the primary IDE0 device (and floppy), even
if you rearrange settings, cables, etc. Because of this, I cannot directly boot
a CDROM and perform an installation that will result in an HD that boots.
Booting a floppy to perform a CDROM installation to an IDE0 pri disk works fine.
The attachments above are from that resulting system.

Comment 16 Red Hat Bugzilla 2001-03-08 16:20:43 EST
Are you meaning that the only way you can boot and get a working install is via
the floppy disk, or that you cannot boot from the cdrom at all?

I'll try doing a floppy-based install and see how that works out.

One thing I noticed in comparing my lspci output to yours is that you seem to
have a newer revision of the motherboard, as the various VIA components all have
higher revisions on yours (i.e. the PCI bridge, the USB controller). I wonder if
this could be a significant difference?
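
A comparison like this can be done mechanically. The following is a rough Python sketch that pulls the "(rev NN)" field out of two short `lspci` listings and reports the slots whose revisions differ. It assumes the one-line-per-device format, tolerates entries wrapped onto a second line, and uses hypothetical function names. The second listing in the usage example is invented for illustration and is not the reporter's actual output.

```python
import re

# A device entry starts with a bus:device.function address.
ENTRY = re.compile(r"^[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f] ")
REV = re.compile(r"\(rev ([0-9a-f]{2})\)")


def parse_revisions(text):
    """Map PCI slot -> revision string (or None) from short lspci output."""
    entries = []
    for line in text.splitlines():
        if ENTRY.match(line):
            entries.append(line)
        elif entries and line.strip():
            # Continuation of a wrapped entry, e.g. "(rev" / "40)".
            entries[-1] += " " + line.strip()
    revisions = {}
    for entry in entries:
        slot = entry.split(" ", 1)[0]
        match = REV.search(entry)
        revisions[slot] = match.group(1) if match else None
    return revisions


def diff_revisions(a, b):
    """Return {slot: (rev_a, rev_b)} for slots present in both with different revs."""
    return {s: (a[s], b[s]) for s in sorted(a.keys() & b.keys()) if a[s] != b[s]}


working = parse_revisions(
    "00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev\n"
    "40)\n"
    "00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)\n"
)
# Hypothetical older board, made up for this example only.
older = parse_revisions(
    "00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 22)\n"
    "00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)\n"
)
print(diff_revisions(working, older))  # {'00:07.0': ('40', '22')}
```

A revision difference found this way only narrows down where to look; it does not by itself show that the older revision is at fault.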
Comment 17 Red Hat Bugzilla 2001-03-08 16:31:16 EST
>Are you meaning that the only way you can boot and get a working install
>is via the floppy disk, or that you cannot boot from the cdrom at all?

I can boot the CDROM if I make it primary 0. The installation runs to
completion, but the system doesn't see the HD as bootable. When I make the HD
primary 0 and the CDROM a slave and boot a floppy, the installation runs fine
and the HD boots. Obviously I have to change BIOS settings when I make these changes.

I am trying to determine why this is -- it could simply be my fault, or the way
we have things set up in the lab.

>I'll try doing a floppy-based install and see how that works out.

Good idea. If you have the same behavior, and it did not happen on 7.0, then it
means we may have something to look at. I am in the process of seeing if a DVD
or CDRW affect things here.

Comment 18 Red Hat Bugzilla 2001-03-08 21:00:51 EST
>Good idea. If you have the same behavior, and it did not happen on 7.0, then it
>means we may have something to look at. I am in the process of seeing if a DVD
>or CDRW affect things here.

Well, it looks like we have something to look at, as even with a floppy I still
get the same behaviour. This time around it died right before formatting during
the install with a bunch of DMA BAD_CRC errors again, along with a complaint
about not being able to find the drive. RedHat 7.0 continues to work
beautifully.

I won't have time to start playing around with moving hardware around until the
weekend, but I'm more than happy to provide any debugging info you need.
Comment 19 Red Hat Bugzilla 2001-03-09 11:02:27 EST
I had no problems installing full installations of Wolverine and the latest
build (as of 03/08/01).
I ran "lspci" and "lspci -v" for both installations, and am appending the
results.  The only thing I could add about this is that the actual
bios-motherboard setup is not very user friendly.  It's very easy to
misconfigure this board, due to the non-intuitive BIOS setup. 

Wolverine:
--------

00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev
40)
00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
00:07.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16)
00:07.3 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16)
00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev
40)
00:0d.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev
21)
00:13.0 Unknown mass storage controller: Triones Technologies, Inc. HPT366 (rev
03)
01:00.0 VGA compatible controller: nVidia Corporation GeForce 256 DDR (rev 10)

00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
        Subsystem: ABIT Computer Corp.: Unknown device a401
        Flags: bus master, medium devsel, latency 0
        Memory at d8000000 (32-bit, prefetchable) [size=64M]
        Capabilities: [a0] AGP version 2.0
        Capabilities: [c0] Power Management version 2

00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
(prog-if 00 [Normal decode])
        Flags: bus master, 66Mhz, medium devsel, latency 0
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        Memory behind bridge: dc000000-ddffffff
        Prefetchable memory behind bridge: d0000000-d7ffffff

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev
40)
        Subsystem: VIA Technologies, Inc. VT82C686/A PCI to ISA Bridge
        Flags: bus master, stepping, medium devsel, latency 0
        Capabilities: [c0] Power Management version 2

00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06) (prog-if
8a [Master SecP PriP])
        Flags: bus master, medium devsel, latency 32
        I/O ports at c000 [size=16]
        Capabilities: [c0] Power Management version 2

00:07.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16) (prog-if 00
[UHCI])
        Subsystem: Unknown device 0925:1234
        Flags: bus master, medium devsel, latency 32, IRQ 9
        I/O ports at c400 [size=32]
        Capabilities: [80] Power Management version 2

00:07.3 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16) (prog-if 00
[UHCI])
        Subsystem: Unknown device 0925:1234
        Flags: bus master, medium devsel, latency 32, IRQ 9
        I/O ports at c800 [size=32]
        Capabilities: [80] Power Management version 2

00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev
40)
        Flags: medium devsel
        Capabilities: [68] Power Management version 2

00:0d.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev
21)
        Subsystem: Cogent Data Technologies, Inc. ANA-6911A/TX Fast Ethernet
        Flags: bus master, medium devsel, latency 32, IRQ 10
        I/O ports at cc00 [size=128]
        Memory at df000000 (32-bit, non-prefetchable) [size=128]
        Expansion ROM at <unassigned> [disabled] [size=256K]

00:13.0 Unknown mass storage controller: Triones Technologies, Inc. HPT366 (rev
03)
        Subsystem: Triones Technologies, Inc.: Unknown device 0001
        Flags: bus master, 66Mhz, medium devsel, latency 120, IRQ 11
        I/O ports at d000 [size=8]
        I/O ports at d400 [size=4]
        I/O ports at d800 [size=8]
        I/O ports at dc00 [size=4]
        I/O ports at e000 [size=256]
        Expansion ROM at <unassigned> [disabled] [size=128K]
        Capabilities: [60] Power Management version 2

01:00.0 VGA compatible controller: nVidia Corporation GeForce 256 DDR (rev 10)
(prog-if 00 [VGA])
        Subsystem: Guillemot Corporation 3D Prophet DDR-DVI
        Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 5
        Memory at dc000000 (32-bit, non-prefetchable) [size=16M]
        Memory at d0000000 (32-bit, prefetchable) [size=128M]
        Expansion ROM at <unassigned> [disabled] [size=64K]
        Capabilities: [60] Power Management version 1
        Capabilities: [44] AGP version 2.0

qa0308.0 tree:
----------

00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev
40)
00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
00:07.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16)
00:07.3 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16)
00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev
40)
00:0d.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev
21)
00:13.0 Unknown mass storage controller: Triones Technologies, Inc. HPT366 (rev
03)
01:00.0 VGA compatible controller: nVidia Corporation GeForce 256 DDR (rev 10)

00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
        Subsystem: ABIT Computer Corp.: Unknown device a401
        Flags: bus master, medium devsel, latency 0
        Memory at d8000000 (32-bit, prefetchable) [size=64M]
        Capabilities: [a0] AGP version 2.0
        Capabilities: [c0] Power Management version 2

00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
(prog-if 00 [Normal decode])
        Flags: bus master, 66Mhz, medium devsel, latency 0
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        Memory behind bridge: dc000000-ddffffff
        Prefetchable memory behind bridge: d0000000-d7ffffff

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev
40)
        Subsystem: VIA Technologies, Inc. VT82C686/A PCI to ISA Bridge
        Flags: bus master, stepping, medium devsel, latency 0
        Capabilities: [c0] Power Management version 2

00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06) (prog-if
8a [Master SecP PriP])
        Flags: bus master, medium devsel, latency 32
        I/O ports at c000 [size=16]
        Capabilities: [c0] Power Management version 2

00:07.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16) (prog-if 00
[UHCI])
        Subsystem: Unknown device 0925:1234
        Flags: bus master, medium devsel, latency 32, IRQ 9
        I/O ports at c400 [size=32]
        Capabilities: [80] Power Management version 2

00:07.3 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16) (prog-if 00
[UHCI])
        Subsystem: Unknown device 0925:1234
        Flags: bus master, medium devsel, latency 32, IRQ 9
        I/O ports at c800 [size=32]
        Capabilities: [80] Power Management version 2

00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev
40)
        Flags: medium devsel, IRQ 11
        Capabilities: [68] Power Management version 2

00:0d.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev
21)
        Subsystem: Cogent Data Technologies, Inc. ANA-6911A/TX Fast Ethernet
        Flags: bus master, medium devsel, latency 32, IRQ 10
        I/O ports at cc00 [size=128]
        Memory at df000000 (32-bit, non-prefetchable) [size=128]
        Expansion ROM at <unassigned> [disabled] [size=256K]

00:13.0 Unknown mass storage controller: Triones Technologies, Inc. HPT366 (rev
03)
        Subsystem: Triones Technologies, Inc.: Unknown device 0001
        Flags: bus master, 66Mhz, medium devsel, latency 120, IRQ 11
        I/O ports at d000 [size=8]
        I/O ports at d400 [size=4]
        I/O ports at d800 [size=8]
        I/O ports at dc00 [size=4]
        I/O ports at e000 [size=256]
        Expansion ROM at <unassigned> [disabled] [size=128K]
        Capabilities: [60] Power Management version 2

01:00.0 VGA compatible controller: nVidia Corporation GeForce 256 DDR (rev 10)
(prog-if 00 [VGA])
        Subsystem: Guillemot Corporation 3D Prophet DDR-DVI
        Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 5
        Memory at dc000000 (32-bit, non-prefetchable) [size=16M]
        Memory at d0000000 (32-bit, prefetchable) [size=128M]
        Expansion ROM at <unassigned> [disabled] [size=64K]
        Capabilities: [60] Power Management version 1
        Capabilities: [44] AGP version 2.0

Comment 20 Red Hat Bugzilla 2001-03-09 11:19:50 EST
Hmmmm. That's two apparently same-revision boards that are reporting no problems
and are of newer revision than my own (see the revision numbers on all the
motherboard components in the lspci output). I'm going to take this up with
ABIT, as it may be that older revisions of the motherboard have issues in this
area. If that's the case, I can probably swap my old board for a newer-revision
one and see if that fixes the problem. I'll also muck about with my hardware as
well, just to be sure.
Comment 21 Red Hat Bugzilla 2001-03-10 21:16:02 EST
Ah-ha! After digging around on the kernel mailing list for a while, I managed to
dig up the solution to my problem (which I've tested to be sure). As it turns
out, I apparently had a bad IDE cable which was generating crosstalk. I've
replaced the cable to the primary IDE channel with a new one, and my problems
have vanished.

Still very odd, though, that this problem only showed up under 2.4.x Linux.
2.2.x-series Linux and Windows didn't show a single bit of difficulty.
Comment 22 Red Hat Bugzilla 2001-03-12 15:41:38 EST
The failure was presumably triggered by the driver now using DMA which
caused different usage patterns.  Thanks for letting us know!
Comment 23 Red Hat Bugzilla 2001-03-12 16:44:35 EST
> The failure was presumably triggered by the driver now using DMA which
> caused different usage patterns.  Thanks for letting us know!

Actually, the failure I kept experiencing with the 7.1 beta was different from
what I experienced when running a 7.0 install upgraded to the 2.4.2 kernel.
Under 2.4.2, when I activated DMA and began accessing disk, the usual DMA BAD_CRC
errors would crop up, and then after the IDE reset occurred DMA would be flagged
back to mdma2 mode and things would work nicely (if a bit slowly compared to
what it should have been). In the installer for 7.1 beta and the "successful"
installs as well, I'd get what appeared to be a complete loss of disk
accessibility, rather than a reset down to a useful dma mode (or even a
disabling of DMA altogether, which I think would have been a valid option in
this case).
Comment 24 Red Hat Bugzilla 2001-03-12 17:34:29 EST
Our QA department was wondering if you would like to send them the
flaky cable so that they can try to use it to reproduce bug reports.  :-)
Comment 25 Red Hat Bugzilla 2001-03-12 23:08:38 EST
Unfortunately, it's already on its way to the local landfill, otherwise I'd be
happy to send it to you. And of course, now that I want to reference it I can't
find the original LKML email from Andre Hedrick I was referring to. Whee! :\

Ah well, sorry I can't be more help, and thank you for your patience in dealing
with this silly hardware problem. :)
Comment 26 Red Hat Bugzilla 2001-03-12 23:57:44 EST
:-)

It's been useful; this exchange prompted us to write up documentation
for our support department to give people with similar problems later...
