Red Hat Bugzilla – Bug 198666
Startup race: could not find filesystem '/dev/root'
Last modified: 2007-12-12 16:11:56 EST
Description of problem:
When doing a reboot test on an sx1000-based Superdome using storage located off
a 4Gb Emulex card, I received the attached kernel panic.
Version-Release number of selected component (if applicable):
I've only seen it once so far and that was 7 hours into a 12 hour reboot test.
Steps to Reproduce:
1. Install rhel5a1 onto an ia64-based system using a 4Gb Emulex card for storage
(I used a Superdome with an AD167A)
2. Set up the machine to reboot continuously for 24 hours
3. Come back the next day and see the panic
12 hours of successful reboots
This exact same config was tested with rhel4u4b2 the day before without any issues.
Created attachment 132315
Boot output with kernel panic
Can you give the latest Beta2 kernel a try? 2.6.18-1.2685.el5
Bryan, any news on trying this?
This might be a problem specific to Superdome or other cell-based systems. I tried to
reproduce it on an rx6600 with the rhel5a1 code and it didn't have any issues.
I just reserved some time on a superdome again and will attempt to reproduce
this there and then I'll try the newer kernel after that.
Bryan, thanks for the update... keep us posted!
Created attachment 142417
Diff between a good and bad boot of rhel5b2
While it's not the same panic, it appears there's a race in the
initialization code for FC drives in rhel5b2... See the attached diff
that shows the difference between a good and bad boot. Note: none of
the partitions were lvm.
Have you seen this race on startup?
James Smart from Emulex says that the panic reported in comment #1 is fixed in
RHEL 5 Beta 2.
The problem reported in Comment #7 is new, and has not been seen previously.
I'm changing the summary to reflect the new problem.
Bill, Jeremy, Peter, it looks like we are getting "Creating root device" before
the device is configured by the kernel. Any thoughts?
(In reply to comment #9)
> Bill, Jeremy, Peter, it looks like we are getting "Creating root device" before
> the device is configured by the kernel. Any thoughts?
I expect it's related to bug 213039 (ie, there's no way for userspace to know
when the kernel is actually done scanning).
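The ordering problem described here can be shown with a small simulation. This is only a sketch in Python, not kernel or mkinitrd code: `FakeHost`, `load_driver`, `finish_scan` and the device name `sda` are hypothetical names made up for illustration. The "driver load" returns as soon as the scan has been started, so a check made right afterwards misses the root device, while a check made after an explicit "scan done" signal finds it.

```python
import threading


class FakeHost:
    """Toy stand-in for a SCSI host whose bus scan runs asynchronously."""

    def __init__(self):
        self.devices = set()
        self.scan_done = threading.Event()
        # Lets the demo decide when the simulated scan completes.
        self._release = threading.Event()

    def load_driver(self):
        # Returns immediately, like a module load that only *starts* the scan.
        thread = threading.Thread(target=self._scan, daemon=True)
        thread.start()
        return thread

    def _scan(self):
        self._release.wait()      # simulated link-up / discovery latency
        self.devices.add("sda")   # hypothetical root disk
        self.scan_done.set()      # the signal userspace would like to have

    def finish_scan(self):
        self._release.set()


def root_present(host, root="sda"):
    return root in host.devices


def demo():
    host = FakeHost()
    thread = host.load_driver()
    early = root_present(host)       # checked right after the load returns
    host.finish_scan()
    host.scan_done.wait(timeout=5)   # wait for the "scan finished" signal
    late = root_present(host)
    thread.join(timeout=5)
    return early, late
```

In this toy model `demo()` sees the root device only on the second check. The real system has no equivalent of `scan_done` visible to userspace, which is the gap this comment points at.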
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release. Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release. This request is not yet committed for
James Smart does not have the privileges to write to this BZ. He also cannot read
213039. I'll try to fix that. In the meantime he replies:
I mentioned to Tom about an effort at HP to async scan SCSI adapters.
It's intended to parallelize some of the delays, but a side effect
is that it must coordinate when scan is done before it attempts to
mount root. It's still in the development stage in the scsi midlayer
(is in scsi-misc-2.6) but sounds like a good overlap. Unfortunately,
it may be a little late for RHEL5 inclusion. (and lpfc has yet to finish
support for it)
Yes, it is too late for RHEL 5. We are going to have to make do with some
hard-coded delays in mkinitrd (or wherever).
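For comparison with a fixed delay, here is a minimal sketch of a bounded polling wait. It is not the mkinitrd change (this thread only says that a hard-coded delay was used); the function name `wait_for_path` and its parameters are assumptions made for illustration. Note that seeing one device node appear does not prove the scan is complete, so this does not replace a real "scan finished" signal from the kernel.

```python
import os
import time


def wait_for_path(path, timeout=30.0, interval=0.1,
                  exists=os.path.exists, sleep=time.sleep,
                  clock=time.monotonic):
    """Poll until `path` exists or `timeout` seconds have passed.

    Returns True if the path showed up, False on timeout. The `exists`,
    `sleep` and `clock` hooks exist only so the loop can be exercised
    without real devices or real waiting.
    """
    deadline = clock() + timeout
    while True:
        if exists(path):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would use something like `wait_for_path("/dev/root", timeout=60)` and fall back to an error path on False. A fixed sleep always pays the full delay; the polling form returns as soon as the node appears but still needs a worst-case timeout.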
(In reply to comment #12)
> James Smart does not have the privileges to write to this BZ. He also cannot read
> 213039. I'll try to fix that. In the meantime he replies:
> I mentioned to Tom about an effort at HP to async scan SCSI adapters.
> It's intended to parallelize some of the delays, but a side effect
> is that it must coordinate when scan is done before it attempts to
> mount root. It's still in the development stage in the scsi midlayer
> (is in scsi-misc-2.6) but sounds like a good overlap. Unfortunately,
> it may be a little late for RHEL5 inclusion. (and lpfc has yet to finish
> support for it)
> (End quote)
> Yes, it is too late for RHEL 5. We are going to have to make do with some
> hard-coded delays in mkinitrd (or wherever).
Tom, are you definitely going the userspace hard-coded delay route for RHEL5?
I was working on slimmed down versions of Mathew's code for RHEL5. If upstream
added new callouts to the scsi_host_template, should I at least send patches to
rh-kernel to add them to RHEL5's host_template, or did you say host template
additions will not break KABI?
I'm open to better solutions. It is really late though.
If you have something to post, then by all means go ahead. The list is the best
place to ask the question about kabi as well.
You are right. I am not going to be able to test every driver for a kernel change.
The userspace delay was added in 5.0 (bug 213039). This will hopefully avoid
the problem in most situations, while the upstream kernel solution solidifies. I
am setting this BZ to 5.1, so we can add the proper fix there. Mike, when the
kernel fix is available, please ask the anaconda team to consider undoing the
hack in bug 213039.
I just did a 24 hour reboot test on a 2 cell Superdome partition with rhel5rc
and didn't see a kernel panic. So the delay does appear to prevent the problem.
This is being deferred to RHEL 5.2 due to resource/time constraints and
priorities. The workaround still seems to work in the meantime.
As we continue to review this, we are of the opinion that the userspace timer
sufficiently addresses the original issue and do not plan to pursue this any
further. Please confirm.
(In reply to comment #29)
> As we continue to review this, we are of the opinion that the userspace timer
> sufficiently addresses the original issue and do not plan to pursue this any
> further. Please confirm.
FYI, Bryan is not currently working with HP so I will answer this...
I have not seen exactly what code was used to address this problem. I only know
that it was considered a workaround, not a real fix. Can someone point me to
the patch for this?
(In reply to comment #30)
> (In reply to comment #29)
> > Bryan,
> > As we continue to review this, we are of the opinion that the userspace timer
> > sufficiently addresses the original issue and do not plan to pursue this any
> > further. Please confirm.
> FYI, Bryan is not currently working with HP so I will answer this...
> I have not seen exactly what code was used to address this problem. I only know
> that it was considered a workaround, not a real fix. Can someone point me to
I am not sure which is better, or which counts as a real fix.
The kernel fix for async scanning is basically just a wait in the kernel. For
qla2xxx async scanning we wait for loop_reset_delay and then we have the
possible race with the rport addition and actual scanning (qla2xxx_scan_finished
reports when transport scanning is done or timed out, but not when the SCSI device
scanning is done, so it may return before scsi_devices are actually added). With
the kernel fix, though, we could just do synchronous scanning, but that is
probably a hack still. Either way we sit around for loop_reset_delay seconds and
at that time if we have devices we have them.
The userspace fix is just a wait in userspace. It affects all drivers, though, so
it works for other FC drivers, but it also affects drivers that may not need it.
I think Peter Jones had an idea that is better than both the existing kernel fix and
the userspace fix we are using, but that seemed like a ways off.
> the patch for this?
I do not think we have a patch. I think some code was just merged in the
mkinitrd release that you tested for bz 213039.
Should bug 209160 be marked as a duplicate of this?
(In reply to comment #32)
> Should bug 209160 be marked as a duplicate of this?
What is 209160 for? Was it for multipath bugs that were a result of async
scanning? I thought there were two bugs with multipath boot:
1. Async scanning causes a device's name (/dev/sX) and major/minor numbers to
change between boots. This was bad for the initial multipath boot code back in
5.0 beta, because the multipath boot code was relying on major/minor numbers being
the same. I think Peter Jones or someone fixed that by having multipath assemble
devices for boot using UUID, as is done with the non-boot multipath setup.
2. Previously, userspace assumed that when a module was done loading the devices
were added and ready to go, but async scanning causes the module loading to
return before devices are found. This causes multipath boot not to find devices.
I thought this was fixed with the wait fix in this bugzilla.
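To illustrate item 1, here is a toy model of why a name handed out in discovery order is not stable across boots while a UUID lookup is. It is a sketch only: `assign_names`, `find_by_uuid` and the strings `uuid-root` and `uuid-data` are hypothetical, and real device naming involves more than discovery order.

```python
def assign_names(discovery_order):
    """Name disks sda, sdb, ... in the order the scan happened to find them."""
    return {
        "sd" + chr(ord("a") + index): uuid
        for index, uuid in enumerate(discovery_order)
    }


def find_by_uuid(names, uuid):
    """Return the device name currently carrying `uuid`, or None."""
    for name, current_uuid in names.items():
        if current_uuid == uuid:
            return name
    return None


# Two boots where the async scan found the same disks in a different order.
boot1 = assign_names(["uuid-root", "uuid-data"])
boot2 = assign_names(["uuid-data", "uuid-root"])
```

Here `boot1["sda"]` and `boot2["sda"]` refer to different disks, whereas `find_by_uuid(boot1, "uuid-root")` and `find_by_uuid(boot2, "uuid-root")` both resolve to the root disk, under whatever name it received on that boot.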
To recap: RHEL 5.0 implements a delay in userspace to allow the drivers to
finish configuring devices. Testing of this with as many as 2K devices, plus
some time in the field, indicates that this workaround seems to be adequate.
This has remained true throughout the introduction of multipath boot in 5.1.
Longer term, there is some work being done upstream to interlock the kernel and
userspace configuration actions, so we will not need to depend on a hard-coded delay.
I propose that we leave RHEL 5 as it is, and expect to inherit the improved
implementation in RHEL 6. If problems eventually arise on RHEL 5, we will
consider backporting the functionality then.