Bug 13635 - Raid 5 config is causing kernel panic
Product: Red Hat Linux
Classification: Retired
Component: raidtools
Hardware: i686 Linux
Priority: high, Severity: medium
Assigned To: Erik Troan
Reported: 2000-07-09 18:49 EDT by eballweber
Modified: 2008-05-01 11:37 EDT
Doc Type: Bug Fix
Last Closed: 2000-07-30 20:04:48 EDT
Description eballweber 2000-07-09 18:49:24 EDT
System config.
N440BX motherboard, 5 IBM 9.1 GB model DNES-309170 SCSI disk drives
connected to the on-board Fast-Wide SCSI interface, 1 PIII 550 MHz processor,
256 MB RAM

Red Hat 6.2 installed with kernel version 2.2.14-12 per Update/Errata on
Red Hat support page.

Raid 5 configured as follows (from /etc/raidtab)
raiddev			/dev/md0
raid-level		5
nr-raid-disks		4	
chunk-size		64
parity-algorithm	left-symmetric
nr-spare-disks		1

device			/dev/sdb4
raid-disk		0

device			/dev/sdc1
raid-disk		1

device			/dev/sdd1
raid-disk		2

device			/dev/sde1
raid-disk		3

device			/dev/sda1
spare-disk		0
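
The raidtab above can be sanity-checked before running mkraid. The following is only a minimal sketch, assuming a single-array raidtab in which each `device` line is followed by its `raid-disk` or `spare-disk` line; the helper names `parse_raidtab` and `check_counts` are hypothetical and are not part of raidtools.

```python
# Sketch: check that the declared disk counts in a raidtab match the devices listed.
# parse_raidtab and check_counts are hypothetical helpers, not raidtools commands.
RAIDTAB = """\
raiddev			/dev/md0
raid-level		5
nr-raid-disks		4
chunk-size		64
parity-algorithm	left-symmetric
nr-spare-disks		1

device			/dev/sdb4
raid-disk		0

device			/dev/sdc1
raid-disk		1

device			/dev/sdd1
raid-disk		2

device			/dev/sde1
raid-disk		3

device			/dev/sda1
spare-disk		0
"""


def parse_raidtab(text):
    """Return (settings, raid_disks, spare_disks) for a single-array raidtab."""
    settings, raid_disks, spare_disks = {}, {}, {}
    current_device = None
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and lines without a value.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        key, value = fields[0], fields[1]
        if key == "device":
            current_device = value
        elif key == "raid-disk":
            raid_disks[int(value)] = current_device
        elif key == "spare-disk":
            spare_disks[int(value)] = current_device
        else:
            settings[key] = value
    return settings, raid_disks, spare_disks


def check_counts(settings, raid_disks, spare_disks):
    """True when nr-raid-disks and nr-spare-disks match the listed devices."""
    return (len(raid_disks) == int(settings["nr-raid-disks"])
            and len(spare_disks) == int(settings["nr-spare-disks"]))


settings, raid_disks, spare_disks = parse_raidtab(RAIDTAB)
print(check_counts(settings, raid_disks, spare_disks))
```

For the configuration in this report the counts are consistent: four raid disks and one spare.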


Raid5 partition created as follows
mkraid /dev/md0
./mke2fs -b4096 -R stride=16 -N1000000 -m1 /dev/md0
mount /dev/md0 /rdb0
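
The stride=16 passed to mke2fs is consistent with the raidtab: the chunk-size of 64 is in kilobytes, and with 4096-byte filesystem blocks one chunk holds 16 blocks. A short sketch of that arithmetic follows; the function name `ext2_stride` is hypothetical, used only for illustration.

```python
def ext2_stride(chunk_size_kb, block_size_bytes):
    """Number of filesystem blocks per RAID chunk (the mke2fs stride value)."""
    chunk_bytes = chunk_size_kb * 1024
    if chunk_bytes % block_size_bytes != 0:
        raise ValueError("chunk size is not a whole multiple of the block size")
    return chunk_bytes // block_size_bytes


# chunk-size 64 (KB) from /etc/raidtab, -b4096 from the mke2fs command line
print(ext2_stride(64, 4096))
```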

Boot partition is on /dev/sdb1  ~ 10M
Primary partition on /dev/sdb2  ~ 1000M
Swap partition is on /dev/sdb3  ~ 128M

Problem occurs while running iozone v3_24 benchmark tool and now seems to
happen sooner with the upgrade to kernel version 2.2.14-12.

Error output follows.

[root@tisk01 bench]# ./iozone -s250m -r4 -f/rdb0/benchtest -i0 -i2 -a
	Iozone: Performance Test of File I/O
	        Version $Revision: 3.24 $
		Compiled for 32 bit mode.

	Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
	             Al Slater, Scott Rhine, Mike Wisner, Ken Goss
	             Steve Landherr, Brad Smith.

	Run began: Mon Jul 10 06:24:11 2000

	File size set to 256000 KB
	Record Size 4 KB
	Auto Mode
	Time Resolution = 0.000001 seconds.
	Processor cache size set to 1024 Kbytes.
	Processor cache line size set to 32 bytes.
	File stride size set to 17 * record size.
random    bkwd  record  stride                                   
              KB  reclen   write rewrite    read    reread    read   write    read rewrite    read   fwrite frewrite   fread  freread
          256000       4
Message from syslogd@tisk01 at Mon Jul 10 06:24:31 2000 ...
tisk01 kernel: Kernel panic: VFS: LRU block list corrupted
Comment 1 eballweber 2000-07-10 14:37:32 EDT
I have tried a raid 0 configuration with the same equipment and get the same
kernel panic.  This error seems to happen under high load and/or large files. 
Running benchmark tests on smaller files (1-20 MB) does not produce this error.
Comment 2 Doug Ledford 2000-07-13 22:55:56 EDT
On my web page (http://people.redhat.com/dledford) there is a memory test script
that will detect faulty RAM on a computer far better than any other test we've
found to date.  Can you please try this on your machine?  The description given
here, between the two different reports, more or less clears RAID5 (since one
machine is RAID5 and the other RAID0) and our experience has been that these
problems are almost always hardware related instead of RAID software related. 
The fact that both of you say you have N440BX based machines also makes it sound
hardware related.  You might want to check and see if ECC error correction is
enabled in the BIOS on your motherboard (if you have ECC RAM, you might not). 
If you do have ECC RAM and ECC is enabled in the BIOS, then one of the best ways
to see if you have RAM related problems is to check the event log in the BIOS
and see if it reports any ECC errors.
Comment 3 eballweber 2000-07-30 20:04:46 EDT
I have looked more closely into this problem and it looks like the SCSI chipset
on the N440BX board is not fully supported by Red Hat 6.2.  The SCSI controller
is a "Symbios Logic 53C876 Dual Channel Ultra (one wide, one narrow)" and is
listed as Tier 3 supported.  Could this level of support be related to this

Comment 4 Erik Troan 2000-08-05 09:52:08 EDT
This isn't a problem with raid, but with poorly supported hardware.
