Bug 481470 - rc.sysinit does not wait for udev loaded scsi adapters to finish scanning their busses
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: initscripts
Version: 10
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Bill Nottingham
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 474846 484112
Depends On:
Blocks:
Reported: 2009-01-25 11:48 UTC by acount closed by user
Modified: 2014-03-17 03:17 UTC (History)
CC List: 12 users

Fixed In Version: 8.86.3-1
Clone Of:
Environment:
Last Closed: 2009-04-22 20:23:23 UTC
Type: ---
Embargoed:


Attachments
dmesg (31.57 KB, text/plain)
2009-01-25 11:48 UTC, acount closed by user
boot.log (1.90 KB, text/plain)
2009-01-25 11:49 UTC, acount closed by user
fdisk -l (1.01 KB, text/plain)
2009-01-29 20:07 UTC, acount closed by user
lsmod (1.43 KB, text/plain)
2009-02-04 15:54 UTC, acount closed by user
ls-sd-block.txt (3.59 KB, text/plain)
2009-02-04 15:54 UTC, acount closed by user

Description acount closed by user 2009-01-25 11:48:17 UTC
Created attachment 329934 [details]
dmesg

hi,

mount and fsck are unable to find a SCSI partition at boot.

boot.log:
Mounting local filesystems:  mount: special device LABEL=/datos does not exist

# tune2fs -l /dev/sdg1 | grep volume
Filesystem volume name:   /datos

fstab:
LABEL=/datos            /datos                  ext3    defaults,noatime       1 0

lspci:
01:06.0 SCSI storage controller: Adaptec AIC-7892B U160/m (rev 02)



-thanks-

Comment 1 acount closed by user 2009-01-25 11:49:48 UTC
Created attachment 329935 [details]
boot.log

Comment 2 Chuck Ebbert 2009-01-29 06:55:43 UTC
Is your root partition on the same disk controller as the missing disk?

Comment 3 acount closed by user 2009-01-29 19:59:02 UTC
(In reply to comment #2)


> Is your root partition on the same disk controller as the missing disk?

NO.

root disk is [0:0:0:0]    disk    ATA      WDC WD3200AAJS-2 01.0  /dev/sda

querida:~ # lsscsi
[0:0:0:0]    disk    ATA      WDC WD3200AAJS-2 01.0  /dev/sda
[1:0:0:0]    disk    ATA      HDS722516VLSA80  V34O  /dev/sdb
[2:0:0:0]    cd/dvd  HL-DT-ST DVDRAM GH15F     EG00  /dev/sr0
[6:0:0:0]    disk    IBM      DPSS-336950N     S96H  /dev/sdg
[7:0:0:0]    disk    Generic  USB SD Reader    1.00  /dev/sdc
[7:0:0:1]    disk    Generic  USB CF Reader    1.01  /dev/sdd
[7:0:0:2]    disk    Generic  USB SM Reader    1.02  /dev/sde
[7:0:0:3]    disk    Generic  USB MS Reader    1.03  /dev/sdf

Comment 4 acount closed by user 2009-01-29 20:07:25 UTC
Created attachment 330398 [details]
fdisk -l

Comment 5 Hans de Goede 2009-02-03 11:51:24 UTC
Can you please tell us:
1) Which disk is on which controller
2) On which disk the /datos partition sits?

Comment 6 acount closed by user 2009-02-03 21:37:43 UTC
(In reply to comment #5)

> Can you please tell us:
> 1) Which disk is on which controller
> 2) On which disk the /datos partition sits?

IBM DPSS-336950N /dev/sdg1 (/datos) is attached to the Adaptec AIC-7892B U160 (aic7xxx driver).

The others, WDC WD3200AAJS-2 /dev/sda and HDS722516VLSA80 /dev/sdb, are attached to the MOBO nVidia MCP73 SATA controller (ahci driver).

Comment 7 Hans de Goede 2009-02-04 08:38:22 UTC
(In reply to comment #6)
> (In reply to comment #5)
> 
> > Can you please tell us:
> > 1) Which disk is on which controller
> > 2) On which disk the /datos partition sits?
> 
> IBM DPSS-336950N /dev/sdg1 (/datos) is attached to the Adaptec AIC-7892B
> U160 (aic7xxx driver).
> 
> The others, WDC WD3200AAJS-2 /dev/sda and HDS722516VLSA80 /dev/sdb, are attached to
> the MOBO nVidia MCP73 SATA controller (ahci driver).

Thanks, can you please attach (or copy) the output of lsmod and attach the output of dmesg?

Can you also copy the output of:
ls /sys/block
ls /dev/sd*

Thanks.

Comment 8 acount closed by user 2009-02-04 15:53:55 UTC
dmesg already attached !

Comment 9 acount closed by user 2009-02-04 15:54:28 UTC
Created attachment 330873 [details]
lsmod

Comment 10 acount closed by user 2009-02-04 15:54:50 UTC
Created attachment 330874 [details]
ls-sd-block.txt

Comment 11 Hans de Goede 2009-02-11 11:15:47 UTC
Ok,

I've figured out the problem. dmesg tells the story pretty clearly. First the aic7xxx driver gets loaded, then the microcode update gets run, and only after that the disk gets found.

Since the updating of the microcode happens after switching to runlevel 3 or 5, this means that rc.sysinit has completed before the disk is found, so indeed fsck during rc.sysinit will not be able to find the disk.

Changing summary and component to initscripts

Bill, we no longer load all storage drivers from the initrd, only those needed to get the root fs; the others get loaded by udev at the beginning of rc.sysinit. This means that when doing fsck it is possible that the disk to fsck has not yet been probed. So rc.sysinit needs to wait for disk probing to be complete before starting fsck. This is easier said than done, though.

The best I can come up with is to do an ls of /sys/class/scsi_host and store the output in a variable before starting udev, then after udev has started (and settled) do another ls of /sys/class/scsi_host, compare the two, and if they are different do:
modprobe scsi_wait_scan
rmmod scsi_wait_scan
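
The before/after comparison described above could be sketched roughly as follows. This is only an illustration of the idea, not the code that eventually went into initscripts: the function names and the SCSI_HOST_DIR and WAIT_SCAN_CMD variables are made up for this example so that the logic can be tried without root.

```shell
#!/bin/sh
# Sketch only. SCSI_HOST_DIR and WAIT_SCAN_CMD are hypothetical knobs added
# here so the logic can be tried without root; rc.sysinit has no such variables.
SCSI_HOST_DIR="${SCSI_HOST_DIR:-/sys/class/scsi_host}"

# Print the current SCSI host entries, one per line (nothing if none exist).
list_scsi_hosts() {
    ls "$SCSI_HOST_DIR" 2>/dev/null
}

# Loading scsi_wait_scan blocks until pending SCSI bus scans have finished;
# unload it afterwards so the same trick can be used again later.
wait_for_scsi_scan() {
    modprobe scsi_wait_scan >/dev/null 2>&1
    rmmod scsi_wait_scan >/dev/null 2>&1
}

# $1: host list captured before udev was started.
# Waits only if the set of SCSI hosts changed since then.
settle_new_scsi_hosts() {
    hosts_before="$1"
    hosts_after="$(list_scsi_hosts)"
    if [ "$hosts_before" != "$hosts_after" ]; then
        ${WAIT_SCAN_CMD:-wait_for_scsi_scan}
    fi
    return 0
}
```

In a boot script this would presumably be used by capturing hosts_before="$(list_scsi_hosts)" just before udev is started and calling settle_new_scsi_hosts "$hosts_before" once udev has settled.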

---

Xose, can you try to add these 2 lines:

modprobe scsi_wait_scan
rmmod scsi_wait_scan

To /etc/rc.d/rc.sysinit above this line:
# Start any MD RAID arrays that haven't been started yet

That should fix your issue.

Comment 12 acount closed by user 2009-02-11 11:46:16 UTC
(In reply to comment #11)

> Xose, can you try to add these 2 lines:
> 
> modprobe scsi_wait_scan
> rmmod scsi_wait_scan
> 
> To /etc/rc.d/rc.sysinit above this line:
> # Start any MD RAID arrays that haven't been started yet
> 
> That should fix your issue.


done, and it works.

-thanks-

Comment 13 Bill Nottingham 2009-03-20 21:06:27 UTC
*** Bug 484112 has been marked as a duplicate of this bug. ***

Comment 14 Hans de Goede 2009-03-23 15:55:53 UTC
(In reply to comment #13)
> *** Bug 484112 has been marked as a duplicate of this bug. ***  

Bill, any progress on a fix for this?

Shall I take a shot at writing a patch for this?

Comment 15 Bill Nottingham 2009-03-23 17:04:44 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > *** Bug 484112 has been marked as a duplicate of this bug. ***  
> 
> Bill, any progress on a fix for this?

I can put the stupid fix that's stupid due to the stupid stupid kernel interface in once the RAID stuff is sorted. But mostly, I just want to beat the upstream SCSI stack about the head.

Comment 16 Bill Nottingham 2009-04-02 14:14:30 UTC
*** Bug 474846 has been marked as a duplicate of this bug. ***

Comment 17 Bill Nottingham 2009-04-02 14:27:17 UTC
http://git.fedorahosted.org/git/?p=initscripts.git;a=commitdiff;h=a91d9f003d0afca33cf89b83ba40ac161229852e

Will be cherry-picked back to F-10 as well.

Comment 18 Fedora Update System 2009-04-02 18:04:09 UTC
initscripts-8.86.1-1 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/initscripts-8.86.1-1

Comment 19 Hans de Goede 2009-04-02 18:48:33 UTC
(In reply to comment #17)
> http://git.fedorahosted.org/git/?p=initscripts.git;a=commitdiff;h=a91d9f003d0afca33cf89b83ba40ac161229852e
> 

Thanks!

1 remark and 1 question.

Remark: The first rmmod is not necessary AFAIK, and if it is not removed it
should at least be silenced when it errors out because no such module is loaded.

Question: Are we sure that udev is done modprobing drivers when we do this?
If the driver isn't loaded yet, this is of little use.

Comment 20 Bill Nottingham 2009-04-02 19:38:33 UTC
A new bit pushed that silences the rmmod/modprobe calls. The rmmod is there just in case it happens to have stayed loaded somewhere.

As for whether udev is done, given that there's a 'udevadm settle' call in start_udev, yes, it should at least be done loading modules. As mentioned in the changelog, USB's still a crapshoot, but there's not much we can do about that.
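
For readers following along, the ordering discussed here (settle udev first, then sync on the SCSI scan) might look something like the sketch below. It is not the actual initscripts code: the sync_storage name, the RUN dry-run variable, and the 30 second timeout are assumptions made for this example, and only stderr is silenced so that a dry run stays visible.

```shell
#!/bin/sh
# Sketch only. RUN is a hypothetical dry-run switch for this example:
# set RUN=echo to print the commands instead of executing them.
RUN="${RUN:-}"

sync_storage() {
    # Drain the udev event queue first, so that the driver modprobes
    # triggered by udev have been issued before we wait on SCSI scans.
    $RUN udevadm settle --timeout=30 2>/dev/null
    # Drop a stale instance in case the module happened to stay loaded.
    $RUN rmmod scsi_wait_scan 2>/dev/null
    # Loading the module blocks until pending SCSI scans are complete.
    $RUN modprobe scsi_wait_scan 2>/dev/null
    # Unload again so a later caller can repeat the wait.
    $RUN rmmod scsi_wait_scan 2>/dev/null
    return 0
}
```

Running it with RUN=echo lists the four commands in order without touching the system. As noted above, none of this helps with USB storage, which may still show up later.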

Comment 21 Fedora Update System 2009-04-03 04:15:51 UTC
initscripts-8.86.2-1 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/initscripts-8.86.2-1

Comment 22 Fedora Update System 2009-04-06 20:29:49 UTC
initscripts-8.86.2-1 has been pushed to the Fedora 10 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update initscripts'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F10/FEDORA-2009-3351

Comment 23 Fedora Update System 2009-04-09 16:12:31 UTC
initscripts-8.86.3-1 has been pushed to the Fedora 10 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update initscripts'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F10/FEDORA-2009-3351

Comment 24 Fedora Update System 2009-04-22 20:22:16 UTC
initscripts-8.86.3-1 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 25 Frank Samuelson 2009-04-26 21:14:07 UTC
I just installed initscripts-8.86.3-1.  It fixes the problem that I filed (https://bugzilla.redhat.com/show_bug.cgi?id=474846).  Thanks.

Comment 26 acount closed by user 2009-12-09 23:12:53 UTC
This line should be deleted from initscripts:

# Sync waiting for storage.
{ rmmod scsi_wait_scan ; modprobe scsi_wait_scan ; rmmod scsi_wait_scan ; } >/dev/null 2>&1


It's already done by dracut in the "init" script from the initramfs.

-thanks-

Comment 27 Bill Nottingham 2009-12-10 03:56:25 UTC
Please open a separate bug; I'd like to have some discussion on that.

