Description of Problem:

Creating a raid 5 device causes long boot times. All of the testing was done on workstation-class systems with one hard drive. The system booted as expected up until the message "Freeing unused kernel memory" was displayed on the screen; the remainder of the boot process took about 15 minutes. I verified this with /opt and /usr mounted on the raid device. I was able to read and write to the raid device without any issues.

How Reproducible:

100% -- I will verify this against a system with multiple drives, but the only other system that was available for testing was the Compaq.

Steps to Reproduce:
1. Create a raid5 device on a single drive.
2. Reboot.

Actual Results:

Expected Results:

Additional Information:

I received a kernel panic when I recreated the above scenario on the BigSur located at my desk. /proc/cpuinfo indicates that it is running with a revision 4 processor.

cpuinfo:
    1 CPU
    vendor   : GenuineIntel
    family   : IA-64
    model    : Itanium
    revision : 6

scsi info:
    QLogic PCI to SCSI Adapter for ISP 1280/12160:
        Firmware version: 8.13.08, Driver version 3.24 Beta
    SCSI Host Adapter Information: QLA1280
    Request Queue = 0x0000000008bb8000, Response Queue = 0x0000000008bc8000
    Request Queue count = 0x100, Response Queue count = 0x10
    Number of pending commands = 0x8b
    Number of queued commands = 0x0
    Number of free request entries = 38
    Attached devices:
    Host: scsi0 Channel: 00 Id: 02 Lun: 00
      Vendor: QUANTUM  Model: ATLAS IV 9 SCA  Rev: 0B0B

mdstat:
    Personalities : [raid5]
    read_ahead 1024 sectors
    md0 : active raid5 sda9[1] sda8[0] sda7[2]
          2056064 blocks level 5, 64k chunk, algorithm 0 [3/3] [UUU]
          [=============>.......] resync = 66.5% (684384/1028032) finish=12.5min speed=456K/sec
    unused devices: <none>
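For reference, a raidtab equivalent to the md0 shown above would look roughly like this. This is a sketch only -- the array was actually created by the installer, so the exact configuration it generates is an assumption; the member partitions, chunk size, and parity algorithm (algorithm 0 = left-asymmetric) are read off the mdstat output.

    raiddev /dev/md0
            raid-level              5
            nr-raid-disks           3
            nr-spare-disks          0
            persistent-superblock   1
            chunk-size              64
            parity-algorithm        left-asymmetric
            # all three member partitions live on the same physical disk,
            # per the mdstat output above
            device                  /dev/sda8
            raid-disk               0
            device                  /dev/sda9
            raid-disk               1
            device                  /dev/sda7
            raid-disk               2

Running "mkraid /dev/md0" against a raidtab like that would then initialize the array and kick off the initial resync.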
adding Bill to Cc:
This sounds like pretty bad news. Ben, what was the result of a two- or three-disk raid5? A one-disk raid5 isn't a real-world test.
build qa0522.0

I performed a raid 5 install on the Compaq in the test lab this morning. The system has 3 drives on a cpqarray. Raid 5 doesn't look very good here either, though at least the system boot time wasn't abnormally long. Based on the information from dmesg, the raid setup is not working as expected: /dev/md0 (/opt) was mounted after the system came up, but apparently there were "not enough operational devices for md0 (2/3 failed)."

From dmesg:

    cpqarray: Device 0x1000 has been found at bus 25 dev 8 func 0
    Compaq SMART2 Driver (v 2.4.4)
    Found 1 controller(s)
    cpqarray: Finding drives on ida0 (Integrated Array)
    cpqarray ida/c0d0: blksz=512 nr_blks=35553120
    cpqarray ida/c0d1: blksz=512 nr_blks=35561280
    cpqarray ida/c0d2: blksz=512 nr_blks=35561280
    cpqarray: Starting firmware's background processing
    Partition check:
     ida/c0d0: p1 p2 p3 < p5 p6 >
     ida/c0d1: p1 p2
     ida/c0d2: p1 p2 p3
    raid5: measuring checksumming speed
       ia64      :    81.920 MB/sec
    raid5: using function: ia64 (81.920 MB/sec)
    raid5 personality registered as nr 4
    autodetecting RAID arrays
    (read) ida/c0d0p2's sb offset: 3072128 [events: 00000002]
    autorun ...
    considering ida/c0d0p2 ...
      adding ida/c0d0p2 ...
    created md0
    bind<ida/c0d0p2,1>
    running: <ida/c0d0p2>
    ida/c0d0p2's event counter: 00000002
    md0: former device ida/c0d1p2 is unavailable, removing from array!
    md0: former device ida/c0d2p1 is unavailable, removing from array!
    md: md0: raid array is not clean -- starting background reconstruction
    md0: max total readahead window set to 512k
    md0: 2 data-disks, max readahead per data-disk: 256k
    raid5: device ida/c0d0p2 operational as raid disk 0
    raid5: not enough operational devices for md0 (2/3 failed)
    RAID5 conf printout:
     --- rd:3 wd:1 fd:2
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ida/c0d0p2
     disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
     disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
    raid5: failed to run raid set md0
    pers->run() failed ...
    do_md_run() returned -22
    md0 stopped.
    unbind<ida/c0d0p2,0>
    export_rdev(ida/c0d0p2)

FWIW, I was able to read and write to /dev/md0 (/opt). I know that there were some issues with cpqarray for RC1, so let me know if you would like this tested on another multi-disk system. (Is there another semi-recent piece of IA64 hardware with multiple drives to test on?)
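One thing that might be worth ruling out on the Compaq (a guess, not a confirmed diagnosis): the kernel only autodetects RAID superblocks on partitions of type 0xfd (Linux raid autodetect), and the autorun above only ever considers ida/c0d0p2. If ida/c0d1p2 and ida/c0d2p1 were left with some other partition type, they would be dropped from the array exactly as shown. Something along these lines would check it (device and partition names taken from the dmesg above):

    # list the partition tables; the Id column should read "fd"
    # for every raid member partition
    sfdisk -l /dev/ida/c0d1
    sfdisk -l /dev/ida/c0d2

    # if a member shows some other type, change it, e.g.:
    sfdisk --change-id /dev/ida/c0d1 2 fd
    sfdisk --change-id /dev/ida/c0d2 1 fd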
You can use the Dell box, if you can find caddies & drives for it.
No luck with the Dell. The system hard-locks while installing packages. (qa0522.0)
I crammed a couple more drives into the Big Sur in the test lab. The raid 5 install completed successfully! System boot time was just a few seconds shy of normal -- many times faster than the first attempt.

From dmesg:

    <snip>
    raid5: measuring checksumming speed
       ia64      :   114.688 MB/sec
    raid5: using function: ia64 (114.688 MB/sec)
    raid5 personality registered as nr 4
    autodetecting RAID arrays
    (read) sda2's sb offset: 2048192 [events: 00000002]
    (read) sdb5's sb offset: 2048192 [events: 00000002]
    (read) sdc1's sb offset: 2048192 [events: 00000002]
    autorun ...
    considering sdc1 ...
      adding sdc1 ...
      adding sdb5 ...
      adding sda2 ...
    created md0
    bind<sda2,1>
    bind<sdb5,2>
    bind<sdc1,3>
    running: <sdc1><sdb5><sda2>
    sdc1's event counter: 00000002
    sdb5's event counter: 00000002
    sda2's event counter: 00000002
    md: md0: raid array is not clean -- starting background reconstruction
    md0: max total readahead window set to 512k
    md0: 2 data-disks, max readahead per data-disk: 256k
    raid5: device sdc1 operational as raid disk 1
    raid5: device sdb5 operational as raid disk 2
    raid5: device sda2 operational as raid disk 0
    raid5: allocated 12656kB for md0
    raid5: raid level 5 set md0 active with 3 out of 3 devices, algorithm 0
    raid5: raid set md0 not clean; reconstructing parity ...

Is this a typical completion time for the resync on a 4 GB raid5 set? It doesn't appear to be speeding up; I've checked on it several times in the last few minutes.

cat /proc/mdstat:

    Personalities : [raid5]
    read_ahead 1024 sectors
    md0 : active raid5 sdc1[1] sdb5[2] sda2[0]
          4096384 blocks level 5, 64k chunk, algorithm 0 [3/3] [UUU]
          [==>..................] resync = 12.9% (265216/2048192) finish=69.9min speed=424K/sec
    unused devices: <none>
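The finish estimate is at least self-consistent: (2048192 - 265216) KB remaining at 424 KB/sec is roughly 70 minutes, which matches the reported finish=69.9min. That 424 KB/sec looks more like the resync being throttled down near the kernel's guaranteed-minimum rate than running flat out. If it stays that slow on an otherwise idle box, the stock 2.4 md tunables are worth a look -- a sketch, using the standard /proc knobs rather than anything taken from this report:

    # per-device resync throttle values, in KB/sec
    cat /proc/sys/dev/raid/speed_limit_min
    cat /proc/sys/dev/raid/speed_limit_max

    # raise the guaranteed minimum so the resync isn't starved
    echo 10000 > /proc/sys/dev/raid/speed_limit_min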
Raid 5 worked without issue for the gold release. Marking this resolved NOTABUG, since the problem did not surface once a realistic RAID 5 scenario (multiple physical disks) was created.
Closing