The megaraid2 RAID driver in RHEL 3 provided RAID status for our Dell PowerEdge 2850 machines in /proc/megaraid/hba0/raiddrives-0-9. This normally looks like this:

  diger.uio.no# cat /proc/megaraid/hba0/raiddrives-0-9
  Logical drive: 0:, state: optimal
  Span depth: 1, RAID level: 1, Stripe size: 64, Row size: 2
  Read Policy: Adaptive, Write Policy: Write back, Cache Policy: Direct IO
  diger.uio.no#

When we started using RHEL 4 on these machines, the kernel driver changed to megaraid_mbox, and the status information is no longer available. We depend on the information in /proc/ to monitor the RAID status and to pass information about failed RAIDs to our error reporting system (Palantir). Without easily available status info, the risk of using the hardware RAID system increases significantly, as we no longer automatically detect failed disks. We use home-made scripts to check the RAID status every 5 minutes.

This is the PCI ID of the hardware controller used in our Dell PE2850:

  02:0e.0 Class 0104: 1028:0013 (rev 06)
  02:0e.0 RAID bus controller: Dell PowerEdge Expandable RAID controller 4 (rev 06)

Please update the kernel driver used by this hardware to provide textual information about the RAID status in /proc/ or /sys/. If possible, try to make sure all RAID drivers provide status information using the same format and in files located in a similar path structure in /proc/ or /sys/. The latter would make it easier for us to write the scripts to detect RAID failure.
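To illustrate the kind of check that depends on this file, here is a minimal sketch of a parser for the RHEL 3 output quoted above. It is not our actual monitoring script. The function names are made up for the example, and the only state string confirmed in this report is "optimal"; the sketch simply treats any other state as a problem.

```python
import re

# Matches lines such as "Logical drive: 0:, state: optimal" from the
# megaraid2 /proc output quoted in this report.
DRIVE_RE = re.compile(r"Logical drive:\s*(\d+):?,\s*state:\s*(.+)", re.IGNORECASE)


def parse_raiddrives(text):
    """Return a list of (drive_number, state) tuples found in text."""
    return [(int(num), state.strip().lower()) for num, state in DRIVE_RE.findall(text)]


def degraded_drives(text):
    """Return the drives whose state is anything other than 'optimal'.

    Assumption: every non-optimal state string means trouble. The exact
    strings the driver prints for failed arrays are not shown in this report.
    """
    return [(num, state) for num, state in parse_raiddrives(text) if state != "optimal"]


def read_status(path="/proc/megaraid/hba0/raiddrives-0-9"):
    """Read the status file and return the list of non-optimal drives."""
    with open(path) as handle:
        return degraded_drives(handle.read())
```

A cron job could call read_status() every 5 minutes and raise an alert whenever the returned list is not empty.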
I see this in Fedora Core 3 as well. Where can people get the megaraid card status now?
The lack of RAID status info makes the machine dangerous to use. We get no warning when disks fail, and might end up with a complete disk crash if enough disks in the RAID set fail. Is anyone working on addressing this issue?
As you may know, there is a movement away from using /proc for this sort of thing in the 2.6 kernel. It would be ideal if there were a commonly agreed upon set of values that RAID devices report in sysfs. So far, unfortunately, I have not seen any such proposals made upstream.

Currently it looks like your best bet is to get a monitoring utility from the LSI Logic web page. The MegaRAID Configuration Utility (MEGARC), for example, looks like it does what you want:

  # ./megarc.bin -dispCfg -a0

  **********************************************************************
  MEGARC MegaRAID Configuration Utility(LINUX)-1.11(12-07-2004)
  By LSI Logic Corp.,USA
  **********************************************************************
  [Note: For SATA-2, 4 and 6 channel controllers, please specify
  Ch=0 Id=0..15 for specifying physical drive(Ch=channel, Id=Target)]

  Type ? as command line arg for help

  Finding Devices On Each MegaRAID Adapter...
  Scanning Ha 0, Chnl 3 Target 15

  **********************************************************************
  Existing Logical Drive Information
  By LSI Logic Corp.,USA
  **********************************************************************
  [Note: For SATA-2, 4 and 6 channel controllers, please specify
  Ch=0 Id=0..15 for specifying physical drive(Ch=channel, Id=Target)]

  Logical Drive : 0( Adapter: 0 ): Status: OPTIMAL
  ---------------------------------------------------
  SpanDepth :01 RaidLevel: 5 RdAhead : No Cache: DirectIo
  StripSz :064KB Stripes : 4 WrPolicy: WriteThru

  Logical Drive 0 : SpanLevel_0 Disks
  Chnl  Target  StartBlock  Blocks      Physical Target Status
  ----  ------  ----------  ------      ----------------------
  3     00      0x00000000  0x010f1800  ONLINE
  3     01      0x00000000  0x010f1800  ONLINE
  3     03      0x00000000  0x010f1800  ONLINE
  3     04      0x00000000  0x010f1800  ONLINE
Why couldn't you put it back in /proc until a better solution is ready, instead of removing a feature that is essential to me while no better solution exists?
The MegaRAID Configuration Utility solution (or something like it from LSI Logic) does not work for you?
I run scripts to check RAID status. If it's possible to use the Configuration Utility easily from a script, it would be fine. The /proc interface was very convenient (with the eXtremeRAID driver, I could also initiate commands such as 'rebuild'). As Red Hat AS 4 ships now, it includes no way to monitor or manage megaraid. Perhaps this is normal, and one should expect to add on something like MegaRC or Dell's OpenManage?
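For what it's worth, here is a rough sketch of how the megarc output quoted earlier in this thread might be checked from a script. It is untested against a real controller. The function names are invented for the example, the regular expressions are based only on the one sample output shown in this thread, and the only status strings seen there are OPTIMAL and ONLINE, so anything else is reported as a problem. The command line is the one used in that sample.

```python
import re
import subprocess

# "Logical Drive : 0( Adapter: 0 ): Status: OPTIMAL"
LOGICAL_RE = re.compile(
    r"Logical Drive\s*:\s*(\d+)\(\s*Adapter:\s*(\d+)\s*\):\s*Status:\s*(\S+)"
)
# Physical disk rows: channel, target, start block, block count, status,
# e.g. "3 00 0x00000000 0x010f1800 ONLINE"
DISK_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+0x[0-9a-fA-F]+\s+0x[0-9a-fA-F]+\s+(\S+)\s*$",
    re.MULTILINE,
)


def find_problems(output):
    """Return human-readable messages for every non-healthy drive or disk.

    Assumption: OPTIMAL and ONLINE are the only healthy states; the other
    strings megarc can print are not shown in this thread.
    """
    problems = []
    for drive, adapter, status in LOGICAL_RE.findall(output):
        if status != "OPTIMAL":
            problems.append(f"adapter {adapter} logical drive {drive}: {status}")
    for channel, target, status in DISK_RE.findall(output):
        if status != "ONLINE":
            problems.append(f"disk channel {channel} target {target}: {status}")
    return problems


def run_megarc(command=("./megarc.bin", "-dispCfg", "-a0")):
    """Run megarc with the arguments shown in this thread and return its stdout."""
    result = subprocess.run(list(command), capture_output=True, text=True, check=False)
    return result.stdout
```

A monitoring script would call find_problems(run_megarc()) and alert on a non-empty list. Whether megarc behaves well enough when run unattended every few minutes is a separate question.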
The vendor programs I've seen for extracting RAID status have had issues when used from scripts, and I have never found a satisfying solution using them. This is why I prefer to have a text file in /proc/ or /sys/ instead. When such information isn't available in /proc/ or /sys/, I recommend against using the hardware RAID in question. It is sad to have to recommend against using Dell PowerEdge 2850 with RHEL 4, when it worked just fine with RHEL 3. :/

People following this bug might find the information available from <URL: http://developer.skolelinux.no/info/prosjektet/delprosjekt/hw-raid-info.html > interesting. It is a summary of some of the features of hardware RAIDs on Linux.