Red Hat Bugzilla – Bug 82687
Raid5 mke2fs is exceptionally slow and occasionally hangs the system (DEFER RHEL3)
Last modified: 2007-11-30 17:06:52 EST
Description of problem: This is a collaborative-effort defect report. Recently
we received a shipment of HP2300 disk array controllers, each of which houses
1-14 SCSI drives. The array is controlled by the LSI1030 chip (mptscsih|mptbase).
What we are seeing is that when we have sw-RAID5 partitions set up on the drives
controlled by the array, it takes an exceedingly long time to create file
systems. Exceedingly slow == over 10 minutes to make a 300GB file system (7
drives in the RAID5 partition, 2 used as spares, 73GB per drive). Comparable
times for a RAID0 and RAID1 mke2fs are ~1 minute.
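As a sanity check on the sizes quoted above, here is a minimal sketch of the usable-capacity arithmetic. It assumes the 2 spares are counted among the 7 drives (so 5 active members), which is the reading that matches the ~290-300GB reported; the function name is hypothetical.

```python
def raid5_usable_gb(total_drives, spares, drive_gb):
    """Usable RAID5 capacity: one active member's worth of space holds parity."""
    active = total_drives - spares
    if active < 3:
        raise ValueError("RAID5 needs at least three active members")
    return (active - 1) * drive_gb


# Assumption: 7 drives total, 2 of them spares, 73GB each.
print(raid5_usable_gb(7, 2, 73))
```

If the 2 spares were instead in addition to 7 active drives, the same formula would give 6 x 73GB, which does not match the reported size.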
It does seem isolated to RAID5. Watching the progress on VC5 (during a
text-mode install) shows that mke2fs progresses in "bursts": out of 2188 block
groups to format, anywhere from 200-300 block groups at a time are formatted
(the numbers increase monotonically and quite rapidly), and then it hangs for
at least 1-2 minutes.
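The block-group count can be cross-checked against the file system size. The sketch below assumes a 4096-byte block size (not stated in this report) and the usual ext2 layout in which one bitmap block covers 8 x block_size blocks per group; the helper names are hypothetical. The result is an upper bound, since the last group may be partial.

```python
def blocks_per_group(block_size=4096):
    # One bitmap block has block_size bytes, i.e. 8 * block_size bits,
    # and each bit tracks one block in the group.
    return 8 * block_size


def max_fs_bytes(groups, block_size=4096):
    """Upper bound on file system size for a given number of block groups."""
    return groups * blocks_per_group(block_size) * block_size


size = max_fs_bytes(2188)
print(f"{size / 1e9:.1f} GB upper bound for 2188 block groups")
# A burst of 200-300 groups would then correspond to roughly this much space:
print(f"{max_fs_bytes(200) / 1e9:.1f} to {max_fs_bytes(300) / 1e9:.1f} GB per burst")
```

Under that assumption, 2188 groups is consistent with the ~290-300GB RAID5 device described above.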
Version-Release number of selected component (if applicable):
Stock AS2.1 (kernel-2.4.18-e.12)
How reproducible: 100% (always)
... although I haven't yet tried this with an HP2100 4-drive array enclosure.
I will try one and report the results soon.
Steps to Reproduce:
1. Create RAID5 partition on HP disk array using LSI1030 controller
2. Watch mke2fs generate the file system sporadically, slowly, in bursts
This is likely NOT fixable for the first errata kernel, but really needs to be
addressed for the following errata kernel (schedule date removed so this defect
can be made public if necessary).
Other timing information observed, all with the HP2300 14-drive disk array and
14 73GB HP drives (mptscsih|mptbase):
14 drives w/ RAID 0 (~980GB): 02:48
----
 7 drives w/ RAID 0 (~490GB): 01:16
 7 drives w/ RAID 5 (~290GB): 09:22
... the ---- delineates two different installations. If it matters, I
formatted the RAID5 as /usr and RAID0 was /extra in the second install.
/extra was used for the RAID0 device in the first.
FWIW, I also saw "bursts" of progress with RAID0 installs but the time
lapse between the 'spurts of busy-ness' was noticeably shorter. I know
RAID0 does not use parity, but it still seems that an 8x multiplier
for SW-RAID5 is a bit excessive.
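For reference, the multiplier can be worked out from the 7-drive timings listed above. This is a minimal sketch; the sizes are the approximate figures from the report and the helper name is hypothetical.

```python
def to_seconds(mmss):
    """Convert an 'MM:SS' timing string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Timings and approximate sizes as reported for the 7-drive configurations.
raid0_secs, raid0_gb = to_seconds("01:16"), 490
raid5_secs, raid5_gb = to_seconds("09:22"), 290

# Wall-clock ratio, ignoring the difference in file system size.
wall_clock_ratio = raid5_secs / raid0_secs

# Ratio of seconds spent per GB of file system created.
per_gb_ratio = (raid5_secs / raid5_gb) / (raid0_secs / raid0_gb)

print(f"wall-clock: {wall_clock_ratio:.2f}x, per GB: {per_gb_ratio:.2f}x")
```

The wall-clock ratio comes out a little under 8x; normalised by size the gap is larger, because the RAID5 file system is the smaller of the two.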
Also, due to /usr being RAID5 in the second install, the install was
twice as slow as the first (both installs were TUI/NFS installs) --
first install was 26:35 and the second install was 52:38.
A team-mate has noticed that making a file system on a RAID-5 device, OUTSIDE
the realm of anaconda, is also quite slow. He has created a 12-drive RAID5
device (plus 2 spare drives).
The mke2fs has been running for over 3 hours now, and the system has very poor
response time... it's as if the mke2fs is taking all available CPU. If you
receive this notice soon (i.e., in the next hour or so), please let me know
what information would be helpful to obtain to help troubleshoot this.
------- Additional Comment #2 From Tim Burke on 2003-05-05 10:20 -------
Weekly project meeting - should retry on RHEL3 where we are focusing more on
raid and other storage management. Defer from AS2.1 errata.
Raid5 is more expensive than 0 & 1.
FeatureZilla 90006 Closed=WONTFIX
Closing this Bugzilla as well to mirror FeatureZilla.