Red Hat Bugzilla – Bug 476472
mkinitrd produces initrd unusable with disks serviced by aic7xxx module
Last modified: 2008-12-15 04:17:53 EST
Description of problem:
Mystery!! There are no problems with disk access during an installation or upgrade, yet the resulting system does not boot because the disks cannot be found. Again, there is not a whiff of a problem when booting the same system in "rescue" mode.
After prolonged head scratching and hacking of initrd images, the situation turns out to be as follows. The disks are serviced by an Adaptec AIC7XXX controller, and dmesg shows the following picture:
<6>aic7xxx 0000:01:03.0: PCI INT A -> GSI 28 (level, low) -> IRQ 28
input: PS/2 Generic Mouse as /devices/platform/i8042/serio1/input/input3
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
<Adaptec aic7892 Ultra160 SCSI adapter>
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsi 0:0:0:0: Direct-Access FUJITSU MAP3367NC 0108 PQ: 0 ANSI: 3
scsi0:A:0:0: Tagged Queuing enabled. Depth 4
scsi target0:0:0: Beginning Domain Validation
scsi target0:0:0: wide asynchronous
scsi: waiting for bus probes to complete ...
scsi target0:0:0: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 127)
scsi target0:0:0: Ending Domain Validation
That "Domain Validation" happens regardless of whether it is turned off in the BIOS, and it takes a rather long time. The controller is also in no hurry at all to get to the "Beginning Domain Validation" point. In the meantime initrd attempts to create the disk block devices and mount /. The disks are not available yet, and will not be for quite a while, so this fails miserably.
Anaconda does a number of other things after 'aic7xxx' is inserted, so by the time it attempts to find the disks they are already there and everything is fine.
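The race described above can be illustrated with a small polling loop. This is only a userspace sketch of the idea of "wait until the disk shows up, but give up after a timeout"; it is not what mkinitrd, nash, or Anaconda actually do, and the function name wait_for_path is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# wait_for_path PATH TIMEOUT_SECONDS
# Polls once per second until PATH exists or the timeout runs out.
# Returns 0 if PATH appeared, 1 if the timeout expired first.
wait_for_path() {
    path="$1"
    remaining="$2"
    while [ ! -e "$path" ]; do
        # Give up once the budget is used up.
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

A caller could use it as `wait_for_path /dev/sda 30 || echo "disk did not appear"`, where /dev/sda and the 30 second budget are placeholders. Mounting / without such a wait is exactly the step that fails when the controller is still busy with its bus scan.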
While looking at how to hack the /sbin/mkinitrd script to produce something usable, I found that there is a 'scsi_wait_scan' module and that it can be configured in with the help of a file, /etc/sysconfig/mkinitrd, which did not exist on this system. As it turned out, this solves the problem. Now initrd waits for the disks to show up before attempting to use them (for a possible resume device and for the root file system). I would not mind seeing that documented instead of having to figure it out from the /sbin/mkinitrd shell code.
Would it not be better to default to scsi_wait_scan="yes" if any SCSI disk modules are involved? An extra boot delay in some cases surely beats a non-booting machine. I strongly suspect that this is not the only controller, or the only situation, that needs this. A configuration option could turn scsi_wait_scan off if it is really not required.
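For anyone who lands here with the same symptom, the workaround can be sketched as below. The config line itself is the 'MODULES=scsi_wait_scan' line used in this report; the helper function enable_wait_scan is made up for this example, and it only appends the line when it is not already present.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# enable_wait_scan CONFIG_FILE
# Appends the MODULES line to CONFIG_FILE unless an identical line exists.
enable_wait_scan() {
    conf="$1"
    line='MODULES=scsi_wait_scan'
    # -q quiet, -x whole-line match, -F fixed string; a missing file
    # counts as "line not present" and the file is created by the append.
    if ! grep -qxF "$line" "$conf" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf"
    fi
}

# Intended use, as root (not executed here):
#   enable_wait_scan /etc/sysconfig/mkinitrd
#   mkinitrd -f /boot/initrd-$(uname -r).img $(uname -r)
```

The initrd has to be regenerated afterwards for the setting to take effect; the mkinitrd invocation in the comment is the usual form for rebuilding the image of the running kernel, and should be checked against the local image name before use.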
Version-Release number of selected component (if applicable):
BTW - Fedora 8, from which this system was upgraded, was booting "straight" - without any extra mkinitrd tricks.
Please try the new mkinitrd build in Bug #470628.
With updates to mkinitrd-6.0.71-3.fc10.i386 and nash-6.0.71-3.fc10.i386 I see the following sequence in 'init' from an initrd image produced with the help of those:
modprobe -q scsi_transport_spi
echo "Loading aic7xxx module"
modprobe -q aic7xxx
regardless of whether /etc/sysconfig/mkinitrd with a 'MODULES=scsi_wait_scan' line exists or not. This is really equivalent to what I eventually forced with mkinitrd-6.0.71-2.fc10 so, yes, I am convinced that this works. I am afraid that I will not risk rebooting at this moment, as the stricken machine is remote for me.
Actually with /etc/sysconfig/mkinitrd present I got two extra lines after 'modprobe -q aic7xxx'. These:
echo "Loading scsi_wait_scan module"
modprobe -q scsi_wait_scan
but that amounts to the same as above so this will be fixed in the future update.
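A quick way to make this kind of comparison is to list the modules an 'init' script loads, in order. The sketch below assumes the script has already been extracted from the initrd image into a plain file; the function name list_modprobes is made up for this example, and it only recognizes lines of the 'modprobe -q NAME' form quoted above.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# list_modprobes INIT_FILE
# Prints the module name from every "modprobe -q NAME" line, in file order.
list_modprobes() {
    sed -n 's/^[[:space:]]*modprobe -q \([^[:space:]]*\).*$/\1/p' "$1"
}
```

Run against a file holding the five lines quoted above, `list_modprobes ./init` prints scsi_transport_spi, aic7xxx and scsi_wait_scan, one per line (./init is a placeholder path).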
It appears that I ran the upgrade too early.
This bug has already been reported, so I'm closing this as a dup of the earlier report.
*** This bug has been marked as a duplicate of bug 466607 ***