Bug 36519

Summary: megaraid doesn't work on HP boards
Product: [Retired] Red Hat Linux
Reporter: ville.sulko
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED CURRENTRELEASE
QA Contact: Brock Organ <borgan>
Severity: high
Priority: high
Version: 7.1
CC: bbrock, howanitz, mduncan, rtodd, tim_clymo
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2003-06-06 00:26:41 UTC

Description ville.sulko 2001-04-18 17:24:16 UTC
Just installed (fresh install) RH71 on :
   HP netserver lp2000r / 2*PIII/1000 [SMP]
   1 GB RAM
   2 * Symbios 53c1010 Ultra3 SCSI Adapter (not used!)
   American Megatrends Inc. MegaRAID (2M)
     18 GB RAID1 (2 * 18 GB ...)

Install succeeded, but after first boot one couldn't help
noticing massive filesystem corruption... Even "man" wouldn't
run because of a corrupted /usr/bin/tbl. "rpm -Va" wouldn't run either,
failing with
  error: cannot open Packages index using db3 - Input/output error (5)
And of course there were kernel messages like the following :


init_special_inode: bogus imode (71537)
init_special_inode: bogus imode (30143)
init_special_inode: bogus imode (32040)
init_special_inode: bogus imode (71661)
init_special_inode: bogus imode (110632)
init_special_inode: bogus imode (244)
EXT2-fs error (device sd(8,5)): ext2_readdir: bad entry in directory 
#368224: rec_len % 4 != 0 - offset=0, inode=1701667150, rec_len=2570, 
name_len=32
EXT2-fs error (device sd(8,5)): ext2_readdir: bad entry in directory 
#351484: rec_len % 4 != 0 - offset=0, inode=1196314761, rec_len=2573, 
name_len=26
EXT2-fs error (device sd(8,5)): ext2_readdir: bad entry in directory 
#160800: directory entry across blocks - offset=0, inode=174416750, 
rec_len=8224, name_len=105
init_special_inode: bogus imode (32462)
init_special_inode: bogus imode (72557)
EXT2-fs error (device sd(8,1)): ext2_readdir: bad entry in directory 
#46773: rec_len % 4 != 0 - offset=0, inode=35586, rec_len=2594, name_len=0
init_special_inode: bogus imode (24)
init_special_inode: bogus imode (72162)
init_special_inode: bogus imode (71145)
init_special_inode: bogus imode (112553)
...
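
The two ext2_readdir complaints above can be reproduced from the logged numbers alone. The following is a minimal Python sketch, not the kernel code: the 4096-byte block size is an assumption (the report does not state it), the function names are hypothetical, and the two checks whose messages do not appear in this log are included only as an assumption about how such validation is usually ordered.

```python
# Hypothetical restatement of ext2 directory-entry sanity checks.
# BLOCK_SIZE is an assumption; the report does not state the block size.
BLOCK_SIZE = 4096


def min_rec_len(name_len):
    """Smallest legal record: 8-byte entry header plus name, rounded up to 4."""
    return (name_len + 8 + 3) & ~3


def check_dir_entry(offset, rec_len, name_len, block_size=BLOCK_SIZE):
    """Return the first failed check as a string, or None if the entry looks sane."""
    if rec_len < min_rec_len(1):
        return "rec_len is smaller than minimal"      # assumed check, not in this log
    if rec_len % 4 != 0:
        return "rec_len % 4 != 0"
    if rec_len < min_rec_len(name_len):
        return "rec_len is too small for name_len"    # assumed check, not in this log
    if offset + rec_len > block_size:
        return "directory entry across blocks"
    return None


# Values copied from the kernel messages above.
print(check_dir_entry(0, 2570, 32))    # rec_len % 4 != 0
print(check_dir_entry(0, 8224, 105))   # directory entry across blocks
```

Record lengths such as 2570 or 8224 are not plausible for real directory entries, which suggests the directory blocks themselves contain foreign data rather than slightly damaged entries.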

Fortunately this system wasn't a production system, so I may simply
reinstall it with different version of RedHat or kernel... However,
I was left with a feeling that kernel 2.4.2 isn't ready for
wider use. Things like these give distributions a bad name...

Secondly, about upgrading from RH70 to RH71 on a similar (but not
identical) system: the upgrade simply crashes when determining which
packages to upgrade... I wasn't fast enough to see whether there
was any detailed error message, but it said something about 'abnormal
installation termination' anyway. Same thing in both graphical and
text-only upgrade modes...

One thing worth noting is that the RH70 on the latter system seems
to run just fine.

Here are the boot messages about the raid controller :

megaraid: v1.14g (Release Date: Feb 5, 2001; 11:42)
megaraid: found 0x101e:0x1960:idx 0:bus 3:slot 0:func 0
scsi2 : Found a MegaRAID controller at 0xf883f000, IRQ: 5
scsi2 : Enabling 64 bit support
megaraid: [^G^AH :^B^AG ] detected 1 logical drives
scsi2 : AMI MegaRAID ^G^AH  254 commands 16 targs 2 chans 40 luns
scsi2: scanning channel 1 for devices.
  Vendor: SDR       Model: GEM318            Rev: 0
  Type:   Processor                          ANSI SCSI revision: 02
  Vendor: SDR       Model: GEM318            Rev: 0
  Type:   Processor                         ANSI SCSI revision: 02
scsi2: scanning channel 2 for devices.
scsi2: scanning virtual channel for logical drives.
  Vendor: MegaRAID  Model: LD0 RAID1 17365R  Rev:   H
  Type:   Direct-Access                      ANSI SCSI revision: 02
Attached scsi disk sda at scsi2, channel 2, id 0, lun 0
SCSI device sda: 35563520 512-byte hdwr sectors (18209 MB)
Partition check:
 sda: sda1 sda2 < sda5 sda6 sda7 sda8 >
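
As a quick sanity check on the reported capacity, the sector count in the boot log can be converted by hand. This is a small Python sketch; it assumes the "MB" figure means decimal megabytes rounded to the nearest whole number, which matches the number printed here but is not a claim about the kernel's exact formula. The function name is hypothetical.

```python
SECTOR_BYTES = 512  # "512-byte hdwr sectors" in the boot log


def sectors_to_mb(sectors, sector_bytes=SECTOR_BYTES):
    """Decimal megabytes (10**6 bytes), rounded to the nearest integer."""
    total_bytes = sectors * sector_bytes
    return (total_bytes + 500_000) // 1_000_000


print(sectors_to_mb(35563520))  # 18209
```

The result agrees with the 18209 MB in the log and with the single 18 GB RAID1 logical drive described above.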

Comment 1 ville.sulko 2001-04-19 10:56:27 UTC
More information about the bug.

I have now tried to install RH71 on two similar HP lp2000r machines, and on both
machines the install either hangs, or completes, but then the system won't boot or
boots but is corrupt (either disk fs or in-memory kernel, it's hard to tell). The
first time I managed to complete the install and boot, all critical bootup components
were intact, and I managed to get to the shell, but that's just about it; the filesystem was
too badly corrupted to do any sensible work. On the next install(s) I managed
to hang the install totally (no visible panics/oopses). Then I managed to complete the
install, but the system refused to boot (hung before loading system services). On the next
try the install completed, and the system booted (one service failed, most likely due
to corrupted binary-file) but trying to log in caused oops...

So, most likely this is a kernel bug maybe related to the megaraid driver or fs operations.
Note that I used both UP (install + boot) and SMP (boot) kernels, with similar results.
And on both of these machines RH70 installs just fine, so I think kernel 2.2.x is just fine.


Comment 2 Arjan van de Ven 2001-04-19 11:04:11 UTC
We've tested a megaraid controller in our QA lab extensively and found no
problems. But that doesn't mean _all_ versions of the megaraid controller work.
What version(s) did you try exactly ?

> However,
> I was left with a feeling that kernel 2.4.2 isn't ready for
> wider use.

We've put a LOT of effort in the kernel to make it usable for wider use;    
just look at the number of patches for bugfixes in our kernel.






Comment 3 ville.sulko 2001-04-19 12:15:50 UTC
> We've tested a megaraid controller in our QA lab extensively and found no
> problems. But that doesn't mean _all_ versions of the megaraid controller work.
> What version(s) did you try exactly ?

Kernel boot-information was on my first mail, here's lspci -v :

03:00.0 RAID bus controller: American Megatrends Inc.: Unknown device 1960 (rev 20)
        Subsystem: Hewlett-Packard Company: Unknown device 60e8
        Flags: bus master, fast Back2Back, medium devsel, latency 64, IRQ 16
        Memory at f8000000 (32-bit, prefetchable)
        Capabilities: [80] Power Management version 2

and here's just about all information I was able to dig out from BIOS-screens etc :

HP NetRaid-2M (bios version G.01.02)
Firmware version H.01.07
  2* RAID processors(?) SDR GEM318
  Disks : 2 * HP 80-8C42

> We've put a LOT of effort in the kernel to make it usable for wider use;    
> just look at the number of patches for bugfixes in our kernel.

Yes, I'm sure you have done excellent work trying to stabilize 2.4.x. I was
mostly referring to Linus 2.4.2, which has had several problems in it
(at least according to pre-patch / ac-patch changenotes).


Comment 4 Arjan van de Ven 2001-04-19 12:54:47 UTC
I'm not sure if it is related, but have you seen:
http://netserver.hp.com/netserver/support/hot_news/bpn04056.asp

Comment 5 ville.sulko 2001-04-19 13:08:14 UTC
> I'm not sure if it is related, but have you seen:
> http://netserver.hp.com/netserver/support/hot_news/bpn04056.asp

I was just browsing HPs site for possible information about the problem,
and came across this one too. I haven't checked the controller yet, since
they are packed quite tightly in a rack... However, the servers are brand new
and were (physically) installed just last week by HP, so I suppose they would
have known if the servers had had faulty components on them. I think I don't
dare to open the case myself, but I might verify this from HP the next time they
are around.


Comment 6 Brian Brock 2001-04-24 20:56:23 UTC
I've looked at the HP NetServer LH3000, with megaraid.  There's also a
symbios/lsi/ncr 53c896 chipset onboard, but I don't see it otherwise reported by the
system.

/etc/modules.conf contains:
alias eth0 eepro100
alias scsi_hostadapter megaraid
alias scsi_hostadapter1 aic7xxx
alias parport_lowlevel parport_pc
alias scsi_hostadapter2 aic7xxx


The AMI chip is labeled:
AMI
9942LRM
HP
Proteus2
Version B.0

lspci on that machine reports the following SCSI controllers:
01:03.0 PCI bridge: Intel Corporation 80960RP [i960 RP Microprocessor/Bridge]
(rev 01)
01:03.1 SCSI storage controller: Intel Corporation 80960RP [i960RP
Microprocessor] (rev 01)
02:07.0 SCSI storage controller: Adaptec AIC-7880U (rev 02)

The Adaptec controller does not have drives attached, and the aic7xxx module is
unused (and unloaded without bad consequences to the running system).

System boot reports this:
SCSI subsystem driver Revision: 1.00
megaraid: v1.14g (Release Date: Feb 5, 2001; 11:42)
megaraid: found 0x8086:0x1960:idx 0:bus 1:slot 3:func 1
scsi0 : Found a MegaRAID controller at 0xc8825000, IRQ: 20
megaraid: [:^A^BB ] detected 2 logical drives
scsi0 : AMI MegaRAID  254 commands 16 targs 2 chans 8 luns
scsi0: scanning channel 1 for devices.
  Vendor: HP        Model: SAFTE; U160/M BP  Rev: 1020
  Type:   Processor                          ANSI SCSI revision: 02
scsi0: scanning channel 2 for devices.
  Vendor: HP        Model: SAFTE; U160/M BP  Rev: 1020
  Type:   Processor                          ANSI SCSI revision: 02
scsi0: scanning virtual channel for logical drives.
  Vendor: MegaRAID  Model: LD0 RAID0  8677R  Rev:   E 
  Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: MegaRAID  Model: LD1 RAID0  8677R  Rev:   E 
  Type:   Direct-Access                      ANSI SCSI revision: 02
Attached scsi disk sda at scsi0, channel 2, id 0, lun 0
Attached scsi disk sdb at scsi0, channel 2, id 0, lun 1
SCSI device sda: 17770496 512-byte hdwr sectors (9098 MB)
Partition check:
 sda: sda1
SCSI device sdb: 17770496 512-byte hdwr sectors (9098 MB)
 sdb: sdb1
(scsi1) <Adaptec AIC-7880 Ultra SCSI host adapter> found at PCI 2/7/0
(scsi1) Wide Channel, SCSI ID=7, 16/255 SCBs
(scsi1) Downloading sequencer code... 436 instructions downloaded
scsi1 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.2.4/5.2.0
       <Adaptec AIC-7880 Ultra SCSI host adapter>
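
The two "megaraid: found" lines in this report (the lp2000r log in the Description and the LH3000 log above) can be compared field by field. Below is a small Python sketch, assuming the line format is exactly as printed in these logs; parse_found is a hypothetical helper, not part of the driver or of any tool.

```python
import re

# Pattern for the driver's "found" line as it appears in the boot logs above.
FOUND_RE = re.compile(
    r"megaraid: found 0x([0-9a-fA-F]{4}):0x([0-9a-fA-F]{4}):"
    r"idx (\d+):bus (\d+):slot (\d+):func (\d+)"
)


def parse_found(line):
    """Return the PCI IDs and location from a 'megaraid: found' line, or None."""
    match = FOUND_RE.search(line)
    if match is None:
        return None
    vendor, device = (int(x, 16) for x in match.group(1, 2))
    idx, bus, slot, func = (int(x) for x in match.group(3, 4, 5, 6))
    return {"vendor": vendor, "device": device, "idx": idx,
            "bus": bus, "slot": slot, "func": func}


lp2000r = parse_found("megaraid: found 0x101e:0x1960:idx 0:bus 3:slot 0:func 0")
lh3000 = parse_found("megaraid: found 0x8086:0x1960:idx 0:bus 1:slot 3:func 1")

# Same device ID, different vendor ID.
print(lp2000r["device"] == lh3000["device"])   # True
print(lp2000r["vendor"] == lh3000["vendor"])   # False
```

The vendor IDs line up with the lspci output quoted in this report: American Megatrends for the lp2000r card and Intel (i960RP) for the LH3000, so the two machines are not exercising the same controller.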


Comment 7 Brian Brock 2001-04-24 21:02:23 UTC
The previous HP NS LH3k was running for 13 days before reboot (in the test lab),
with no report of FS corruption.  No FS problems were detected on system reboot
with fsck forced, either.  It has also successfully run/survived the kernel stress
testing used in the test lab without error.

Comment 8 ville.sulko 2001-04-25 06:42:40 UTC
The raid-controller in LH3k seems to be different than the one we have in our lp2k(s).
Just browsed hp's website, and it seems that LH3k has integrated 2-channel raid-
controller, but it didn't say if it's HP NetRAID-2M. At least the bootup sequence would
suggest it isn't, since at least the raid processors seem to be different. In lp2k the
raid is not integrated, but sold as an addon.

Don't know what other differences there might be between LH3k and lp2k, but lp2k
is quite a new model (as is lp1k), so there might be somewhat newer hardware inside.
And of course the problem might be somewhere else than in the raid controller?

> The previous HP NS LH3k was running for 13 days before reboot (in the test lab),
> with no report of FS corruption. 

As I said before, the fs corruption I had with these machines was visible immediately
after reboot, so the fs was corrupted already during the install process. However, since
I also experienced a couple of failed installs (hangs), it might be some other kernel-
related problem as well.

BTW, I e-mailed about the problem to HP too, and the reply was that they hadn't
tested RH71 on lp2k yet.


Comment 9 ville.sulko 2001-04-25 06:50:41 UTC
>> http://netserver.hp.com/netserver/support/hot_news/bpn04056.asp
>
> I haven't checked the controller yet, since they are packed quite tightly in a rack...

Asked about this one too, and the controllers were in fact changed, but before I
tried to install RH71. So this is not it...


Comment 10 Luke Hutchison 2001-04-28 03:47:55 UTC
This is NOT an HP-specific problem.  Please see 
http://lwn.net/2001/0412/kernel.php3 .  I have personally experienced it 5 
times even under moderate load on a stock K7-1200-266 / Asus A7M266 / Seagate 
Barracuda ATAIII and have reinstalled 5 times.  I'm pulling out my hair.

Luke Hutchison.


Comment 11 Arjan van de Ven 2001-04-28 08:14:28 UTC
lukeh.nz: the bug mentioned there was one we fixed before we released
Red Hat Linux 7.1.
However, you have a VIA chipset and VIA recently announced that their chipset
has a bug (and that they knew about it for months). The safest thing to do
is to use "ide=nodma" as a kernel option (during the installation and in
lilo.conf) at all times. This is totally unrelated to the real problem of this
bug.

Comment 12 Rob Todd 2001-04-30 02:55:51 UTC
I've experienced the same problem using a PIII with a ServerWorks OSB4 chipset 
and the stock 2.4.2-2 RH71 kernel.

Here is the dmesg:
Linux version 2.4.2-2 (root.redhat.com) (gcc version 2.96 20000731 (
Red Hat Linux 7.1 2.96-79)) #1 Sun Apr 8 20:41:30 EDT 2001
BIOS-provided physical RAM map:
 BIOS-e820: 000000000009fc00 @ 0000000000000000 (usable)
 BIOS-e820: 0000000000000400 @ 000000000009fc00 (reserved)
 BIOS-e820: 0000000000020000 @ 00000000000e0000 (reserved)
 BIOS-e820: 0000000007f00000 @ 0000000000100000 (usable)
 BIOS-e820: 0000000000001000 @ 00000000fec00000 (reserved)
 BIOS-e820: 0000000000001000 @ 00000000fec01000 (reserved)
 BIOS-e820: 0000000000001000 @ 00000000fee00000 (reserved)
 BIOS-e820: 0000000000080000 @ 00000000fff80000 (reserved)
On node 0 totalpages: 32768
zone(0): 4096 pages.
zone DMA has max 32 cached pages.
zone(1): 28672 pages.
zone Normal has max 224 cached pages.
zone(2): 0 pages.
zone HighMem has max 1 cached pages.
Kernel command line: auto BOOT_IMAGE=linux ro root=301 BOOT_FILE=/boot/vmlinuz-2
.4.2-2
Initializing CPU#0
Detected 999.556 MHz processor.                 
Console: colour VGA+ 80x25
Calibrating delay loop... 1992.29 BogoMIPS
Memory: 126472k/131072k available (1365k kernel code, 4212k reserved, 92k data,
236k init, 0k highmem)
Dentry-cache hash table entries: 16384 (order: 5, 131072 bytes)
Buffer-cache hash table entries: 8192 (order: 3, 32768 bytes)
Page-cache hash table entries: 32768 (order: 6, 262144 bytes)
Inode-cache hash table entries: 8192 (order: 4, 65536 bytes)
VFS: Diskquotas version dquot_6.5.0 initialized
CPU: Before vendor init, caps: 0383fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 0383fbff 00000000 00000000 00000000
CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
CPU: Common caps: 0383fbff 00000000 00000000 00000000
CPU: Intel Pentium III (Coppermine) stepping 06
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.37 (20001109) Richard Gooch (rgooch.au)     
mtrr: detected mtrr type: Intel
PCI: PCI BIOS revision 2.10 entry at 0xfdbb1, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Discovered primary peer bus 01 [IRQ]
PCI: Using IRQ router ServerWorks [1166/0200] at 00:0f.0
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.14)
Starting kswapd v1.8
Detected PS/2 Mouse Port.
pty: 256 Unix98 ptys configured
block: queued sectors max/low 83898kB/27966kB, 256 slots per queue
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ServerWorks OSB4: IDE controller on PCI bus 00 dev 79
ServerWorks OSB4: chipset revision 0
ServerWorks OSB4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio 
    ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:pio, hdd:pio
hda: ST320414A, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: 39102336 sectors (20020 MB) w/2048KiB Cache, CHS=2434/255/63, UDMA(33)
Partition check:
 hda:<5>apm: get_event: Interface not connected
 hda1 hda2 hda3
Floppy drive(s): fd0 is 1.44M
FDC 0 is a National Semiconductor PC87306
Serial driver version 5.02 (2000-08-09) with MANY_PORTS MULTIPORT SHARE_IRQ SERI
AL_PCI ISAPNP enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
Real Time Clock Driver v1.10d
md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md.c: sizeof(mdp_super_t) = 4096
autodetecting RAID arrays
autorun ...
... autorun DONE.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 1024 buckets, 8Kbytes
TCP: Hash tables configured (established 8192 bind 8192)
Linux IP multicast router 0.06 plus PIM-SM                     
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 236k freed
Adding Swap: 2097136k swap-space (priority -1)
usb.c: registered new driver usbdevfs
usb.c: registered new driver hub
PCI: Found IRQ 10 for device 00:0f.2
usb-ohci.c: USB OHCI at membase 0xc8915000, IRQ 10
usb-ohci.c: usb-00:0f.2, PCI device 1166:0220 (ServerWorks)
usb.c: new USB bus registered, assigned bus number 1
hub.c: USB hub found
hub.c: 4 ports detected
Winbond Super-IO detection, now testing ports 3F0,370,250,4E,2E ...
SMSC Super-IO detection, now testing Ports 2F0, 370 ...
ip_conntrack (1024 buckets, 8192 max)
eepro100.c:v1.09j-t 9/29/99 Donald Becker http://cesdis.gsfc.nasa.gov/linux/driv
ers/eepro100.html
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@sa
w.sw.com.sg> and others
PCI: Found IRQ 11 for device 00:06.0
eth0: Intel Corporation 82557 [Ethernet Pro 100], 00:30:48:21:84:D0, I/O at 0xd8
00, IRQ 11.
  Receiver lock-up bug exists -- enabling work-around.        
  Board assembly 000000-000, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x04f4518b).
  Receiver lock-up workaround activated.
EXT2-fs error (device ide0(3,1)): ext2_readdir: bad entry in directory #1619184:
 directory entry across blocks - offset=0, inode=0, rec_len=46320, name_len=24
EXT2-fs error (device ide0(3,1)): ext2_readdir: bad entry in directory #1619187:
 rec_len % 4 != 0 - offset=0, inode=0, rec_len=46323, name_len=24
EXT2-fs error (device ide0(3,1)): ext2_readdir: bad entry in directory #1619184:
 directory entry across blocks - offset=0, inode=0, rec_len=46320, name_len=24
EXT2-fs error (device ide0(3,1)): ext2_readdir: bad entry in directory #1619187:
 rec_len % 4 != 0 - offset=0, inode=0, rec_len=46323, name_len=24
EXT2-fs error (device ide0(3,1)): ext2_readdir: bad entry in directory #1619184:
 directory entry across blocks - offset=0, inode=0, rec_len=46320, name_len=24
EXT2-fs error (device ide0(3,1)): ext2_readdir: bad entry in directory #1619187:
 rec_len % 4 != 0 - offset=0, inode=0, rec_len=46323, name_len=24               
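
The errors at the end of this dmesg repeat the same two directories over and over, which is easier to see once they are counted. The following Python sketch groups them by device, directory inode, and reason; the regular expression is an assumption based only on the message layout shown in this report, and summarize is a hypothetical helper.

```python
import re
from collections import Counter

# Layout assumed from the messages quoted in this report.
ERR_RE = re.compile(
    r"EXT2-fs error \(device (?P<dev>[^)]*\))\): ext2_readdir: "
    r"bad entry in directory #(?P<dir>\d+): (?P<reason>.+?) - "
    r"offset=(?P<offset>\d+), inode=(?P<inode>\d+), "
    r"rec_len=(?P<rec_len>\d+), name_len=(?P<name_len>\d+)"
)


def summarize(log_text):
    """Count ext2_readdir errors per (device, directory inode, reason)."""
    # Kernel messages are wrapped across lines in the paste, so flatten first.
    flat = " ".join(log_text.split())
    counts = Counter()
    for match in ERR_RE.finditer(flat):
        key = (match.group("dev"), int(match.group("dir")), match.group("reason"))
        counts[key] += 1
    return counts


sample = """\
EXT2-fs error (device ide0(3,1)): ext2_readdir: bad entry in directory #1619184:
 directory entry across blocks - offset=0, inode=0, rec_len=46320, name_len=24
EXT2-fs error (device ide0(3,1)): ext2_readdir: bad entry in directory #1619187:
 rec_len % 4 != 0 - offset=0, inode=0, rec_len=46323, name_len=24
"""

for key, count in summarize(sample).items():
    print(key, count)
```

Run against the full excerpt above, this would show only two distinct failing directories on ide0(3,1), each reported three times.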

I noticed in the kernel source RPM that the OSB4 support is disabled because it 
is known to cause data corruption and I have not changed that option.  I've 
since compiled a 2.4.4 kernel with the same config file used to compile the 
2.4.2-2 RH71 kernel and while DMA no longer works reliably, data corruption is 
no longer a problem.

Rob Todd

Comment 13 Rob Todd 2001-04-30 02:59:36 UTC
I forgot to add:
  We have 32 of these exact machines... this behavior has occurred on 12 of 
them ranging from the installation problem described above to data corruption 
and overall FS weirdness during operation.  At one point even data recovered 
during a swap operation was corrupted (obviously crashing the application that 
had swapped the data).  

Robert

Comment 14 Rob Todd 2001-04-30 05:53:14 UTC
Whoops... I spoke too soon.  It appears as if the same problem exists in 2.4.4 
also.

Robert

Comment 15 Arjan van de Ven 2001-04-30 08:26:08 UTC
PLEASE open a separate bug for the ServerWorks problem.
This bug is about the megaraid driver, which is TOTALLY unrelated to
ServerWorks. (And IDE on ServerWorks doesn't work. It's a chipset bug.
Try using ide=nodma; it might work around the chipset bug.)

Comment 16 Tim Clymo 2001-04-30 23:03:41 UTC
Just to add my $0.02 worth... I have had identical experience to the original
poster with an LP1000r and a Netraid-1M (single channel version of the AMI
sourced Netraid-2M).

This system is a 2-way PIII/933 with 750MB RAM, 3x HP 18GB drives on the Netraid
and an HP DDS-4 drive on the internal Symbios SCSI (which uses the sym53c8xx
driver).

The Netraid-1M has firmware H.01.07

I have tried setting up hardware RAID1 with hot spare and also simply presenting
the 3 drives as 3 separate LUNs to use with software RAID1. In either case
there are serious problems. Using software RAID set up with Disk Druid, the
installer gets all the way to the end where, instead of the expected "performing
post configuration tasks" progress bar I get "Installer terminated abnormally".
Installing to a conventional partition (with or without hardware RAID) sometimes
gets an apparently successful install which will last an absolute maximum of 2
reboots before it dies a spectacular death due to massive filesystem corruption.
On other occasions the installer will hang part way through.

It is generally noticeable that all does not seem well during the install. The
I/O "feels" really choppy, and there are frequent pauses for thought. It takes
in excess of 30min as reported by the installer to do a custom "install
everything", whereas I usually expect to see no more than around 20min.

This problem is readily reproduced on each of 2 identical LP1000r's

I also removed the disks from the Netraid-1M and connected them instead to the
Symbios SCSI - guess what, it worked... The Netraid was still physically
installed and the megaraid driver loaded, just not connected to any disks. The
install "felt" much cleaner, and took almost exactly 20min

Comment 17 Joseph Lazzaro 2001-05-04 00:05:23 UTC
I'm also having problems with the megaraid controller. I'm using the megaraid 
1600 on a tyan thunder LE motherboard with two 18GB drives on a RAID-1 and a 
single 9GB as a RAID-0. During installation, the megaraid driver is loaded, but 
I'm warned of an invalid partition table on all real and RAID devices connected 
to the megaraid controller. I've repartitioned the drives several times using 
fdisk and disk druid, but each time it claims the partition table is corrupt 
and makes me try again.

I'm a newbie, so if there is any more technical information that would help 
you, you'll have to tell me how to find it.

Comment 18 Joseph Lazzaro 2001-05-04 01:36:16 UTC
According to bug # 37531, others have been having problems with the megaraid 
controller that have been solved by the latest firmware update. It seems to 
have fixed my problems, might it be worth a shot for you?

Comment 19 ville.sulko 2001-05-04 09:11:35 UTC
> According to bug # 37531, others have been having problems with the megaraid 
> controller that have been solved by the latest firmware update. It seems to 

Does anyone know if HP NetRaids (1M/2M) are identical to some AMI model, or
do they have some HP-specific HW/FW in them?


Comment 20 Keith Howanitz 2001-06-13 19:25:13 UTC
I am having the same problem, installation failing, on an HP lp1000r with an HP netraid 1M 
card.  Had one good install, but then massive file system corruption. I have 
verified the newest BIOS in the SCSI drives (two HP hot-swap Fujitsu drives, firmware 
F612, trying for RAID1) and the card (H.01.07  G.01.02).

The installation seems to go OK, and installs a MegaRAID driver for the netraid 
card (RedHat 7.1).  Just before it should make the boot floppy, the installation 
stops; I reboot manually, and the system will not boot.  I got the install to go once, 
but had ext2 errors about an hour later.  Running RAID1.  

Borrowed disks from a friend, no change, swapped SCSI cable, no change.  Do I 
need a different RAID driver?  

TIA -Keith
howanitz

Comment 21 Keith Howanitz 2001-06-13 20:22:58 UTC
Just tried installing with fdisk instead of disk druid.  System formatted 
the drives, then crashed while it was copying install image to HDD.  (Before it 
started to install any packages.)  I have always been using the expert install.

-Keith

Comment 22 Keith Howanitz 2001-06-14 16:12:20 UTC
Contacted HP support, they asked me to try the RAID driver for 7.0 available on 
this page:
http://www.hp.com/cposupport/swindexes/hpnetserve28162_swen.html
 going to attempt a clean install after lunch.

-Keith

Comment 23 Gordon Biner 2001-06-14 18:03:57 UTC
Flashed one of my NetRaid 1M cards with the latest AMI BIOS for the MegaRaid Express 500 (it matched the Series 475 silkscreened on the card).
It is "New Release A159" and can be grabbed from www.ami.com

I can finally install, and reboot without any apparent failures.   

HP tells me that they are aware of the problem, and a fix is due out 'Any Day Now'.
Until then, I will be running the AMI bios.

Gordon


Comment 24 Keith Howanitz 2001-06-14 19:18:57 UTC
The HP drivers for 7.0 did not work for me.  I have flashed my 1M board with 
the same AMI bios as Gordon.  (I believe mine said 471, at least during the 
flash).  Everything appears to be working now.  I am calling to ask HP to carry 
better support for RedHat with their hardware and to close my case number 
(1428411937), but I would feel better if I knew someone at RedHat called also 
(I do not know if my polite insistence will mean as much.)  Thanks Gordon!!!

-Keith

Comment 25 Gordon Biner 2001-06-15 05:04:20 UTC
Now that my NetRaid 1M is working, I have noticed that the automatic partitioning during install only uses 12GB of my 18GB available
(2 x 18GB in RAID 1).  I have no idea if this is related to the 1M card or if it is a problem in Disk Druid.  Anyone have any ideas?

I tried to partition manually using Disk Druid, but it would not allow me to use the remaining disk space.

I remember this kind of thing from ages ago, and back then, I was able to use 'fdisk' to gain the lost space back.
I will be trying that tomorrow morning.

BTW - we spend major $$$ with HP.  Big iron, Intel servers, PC, printers, and network hubs/switches.
I spoke with a couple of techs in the nearby HP office, and they were *very* helpful; they stated that the AMI BIOS would flash without any issues.
They stated that HP is a Linux friendly company, and that they specifically support RedHat.
Polite insistence is the best policy for a problem that appears to have a temporary band-aid.

Just my $.02 worth.

Gordon

Comment 26 Need Real Name 2001-06-27 05:47:35 UTC
Just wanted to let you know that I have a Netserver LC2000 with Netraid 1M and 
had problems with 7.1. (no problems on a Netraid 1Si)
Found this bug report and as Gordon suggested downloaded the ami bios for the 
express 500 and installed it.
RH 7.1 installation worked like a charm.



Comment 27 ville.sulko 2001-07-17 19:18:31 UTC
Does anyone know if the newer megaraid driver (version > 1.16, from
patch-2.4.6-ac5) fixes the NetRaid 1M/2M filesystem corruption problem? The
following is from the driver changelog (megaraid.c) :

+ * Check added for HP 1M/2M controllers if having firmware H.01.07 or
+ * H.01.08. If found, disable 64 bit support since these firmware have
+ * limitations for 64 bit addressing

And BTW, is the H.01.08 firmware available from HP? Their support-pages
seem to offer H.01.07 as the latest firmware.
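
The rule described in that changelog entry can be written out as a small decision function. This is only an illustrative Python sketch of the rule as worded in the changelog, not the driver's actual code: how the real driver recognizes an HP 1M/2M board is not shown in this report, so is_hp_1m_2m is a hypothetical flag supplied by the caller, and the function name is invented for this example.

```python
# Firmware versions named in the changelog excerpt quoted above.
LIMITED_64BIT_FIRMWARE = {"H.01.07", "H.01.08"}


def use_64bit_addressing(is_hp_1m_2m, firmware_version,
                         controller_supports_64bit=True):
    """Decide whether 64-bit support should be enabled for a controller.

    is_hp_1m_2m is a hypothetical flag; the detection method is not
    described in this report.
    """
    if not controller_supports_64bit:
        return False
    if is_hp_1m_2m and firmware_version.strip() in LIMITED_64BIT_FIRMWARE:
        # Changelog: these firmware versions have limitations for 64-bit addressing.
        return False
    return True


print(use_64bit_addressing(True, "H.01.07"))   # False
print(use_64bit_addressing(False, "H.01.07"))  # True
```

This is relevant to the original report because the boot log in the Description shows "Enabling 64 bit support" on a NetRaid-2M whose firmware was later identified as H.01.07.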