Red Hat Bugzilla – Bug 494681
[Brocade 5.6 bug] Very slow boot over SAN with RHEL 5.3 and DM 4.2
Last modified: 2010-11-09 07:33:53 EST
Description of problem:
Boot over SAN is very slow on RHEL 5.3 with DM 4.2 when a large number of LUNs is present.
Version-Release number of selected component (if applicable):
RHEL 5.3 with DM 4.2
Steps to Reproduce:
Device Mapper 4.2
Two Brocade HBA ports (at 8G), each seeing 14 target ports, with a total of 161 LUNs behind them (1 boot LUN + 160 normal LUNs).
With this setup, boot over SAN is very slow (12-24 hours). Boot-up stalls at “Starting Logical Volume Manager…”.
After boot up, the system works perfectly fine.
The FC trace and driver trace show only that I/Os are issued with long delays between them (6-7 minutes). No I/O errors or aborts are seen.
This is reproducible every time.
The same setup works fine with local boot. The same test with RHEL 5.2 (boot over SAN) also works fine.
Also, if we reduce the configuration to a smaller number of LUNs (4 target ports with 31 LUNs), the issue is not seen.
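The difference in scale between the failing and working configurations can be estimated with a rough upper bound. The sketch below assumes (the report does not state this) that every LUN is visible through every target port on both HBA ports, and that the smaller configuration also uses two HBA ports. The function name is hypothetical, and the real path count may be lower.

```python
def max_scsi_paths(hba_ports: int, target_ports_per_hba: int, luns: int) -> int:
    """Upper bound on SCSI paths, assuming every LUN is exposed
    through every target port on every HBA port (an assumption)."""
    return hba_ports * target_ports_per_hba * luns


# Failing configuration from the report: 2 HBA ports, 14 target ports, 161 LUNs.
large = max_scsi_paths(2, 14, 161)

# Working configuration: 4 target ports, 31 LUNs (2 HBA ports assumed).
small = max_scsi_paths(2, 4, 31)

print(f"large config: up to {large} paths")
print(f"small config: up to {small} paths")
print(f"ratio: {large / small:.1f}x")
```

Under these assumptions the failing setup presents up to 4508 paths against 248 for the working one, so any per-device scan at boot has roughly 18 times more devices to visit.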
Actual results:
Very slow boot over SAN.
Expected results:
No major delays with boot over SAN.
Additional info:
The problem is seen only with RHEL 5.3 booting over SAN. It is not seen with RHEL 5.2, and it is not seen with local boot.
You might start with this:
"Why are LVM2 commands, such as vgscan, taking a very long time to complete?"
although I am not sure why RHEL 5.3 would behave any differently from RHEL 5.2.
Also see the dm-multipath release note discussed here:
(A modification to /etc/udev/rules.d/40-multipath.rules.)
We have already tried the workaround suggested in the 5.3 release notes, and it did not help. Sorry, I forgot to mention this in the initial defect report.
The original problem was seen with 2 GB of memory. It did not occur when we increased the memory to 10 GB, and by iterating we found that it does not occur with 6 GB or more.
Please test with RHEL 5.4 GA and the forthcoming 5.5 Beta.
Is there still a problem when using 5.5?
Closing because of no activity. If the problem still occurs, please reopen the bug and provide evidence.