Bug 578377 - not all partitions created in /dev/sd* at bootup, causes md1 not to come online, volume group not to be activated
Keywords:
Status: CLOSED DUPLICATE of bug 543749
Alias: None
Product: Fedora
Classification: Fedora
Component: util-linux-ng
Version: 12
Hardware: i686
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Karel Zak
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-03-31 05:20 UTC by Rich Rauenzahn
Modified: 2010-04-29 08:01 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-04-29 08:01:26 UTC
Type: ---
Embargoed:


Attachments
boot log and misc output before and after rc.local (61.16 KB, text/plain)
2010-03-31 05:46 UTC, Rich Rauenzahn

Description Rich Rauenzahn 2010-03-31 05:20:42 UTC
Description of problem:

After upgrading to FC12, one of my volume groups doesn't activate after boot.  Working backwards, the md device doesn't come up, and further back still, it appears that some of the sd drives don't get scanned properly -- no partitions are picked up for them.

If I add this to rc.local, then the system comes up as I want it:

---------------8<-------------
#!/bin/sh

dmesg  > /tmp/boot.log
cat /proc/mdstat >> /tmp/boot.log
vgdisplay -v VolGroupRAID >> /tmp/boot.log
ll /dev/sd* >> /tmp/boot.log

mailx -s 'vgraid scan' rich < /tmp/boot.log

partprobe > /dev/null 2>&1 
for i in $(blkid | grep 'UUID="23a8685c-42e8-7d52-3a53-d40b51cce092"' | perl -ne 'm{^(/dev/sd[a-z]1)} && print "$1\n"')
do
   mdadm --re-add /dev/md1 "$i"
done

mdadm -R /dev/md1
vgchange -a y VolGroupRAID
service autofs restart
service backuppc start
---------------8<-------------
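
The member-selection step of the workaround above can be sketched as a small filter. This is only an illustration: it assumes blkid prints lines of the form /dev/sdX1: UUID="..." TYPE="...", and the function name members_for_uuid is hypothetical, not something from the original script.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original rc.local): read blkid-style
# lines on stdin and print the first-partition device nodes (/dev/sdX1)
# whose UUID matches the one given as $1.
members_for_uuid() {
    awk -v uuid="$1" '
        index($0, "UUID=\"" uuid "\"") && match($0, /^\/dev\/sd[a-z]1/) {
            print substr($0, RSTART, RLENGTH)
        }'
}

# Usage on a live system (needs root, as in rc.local):
#   blkid | members_for_uuid 23a8685c-42e8-7d52-3a53-d40b51cce092 |
#   while read -r dev; do mdadm --re-add /dev/md1 "$dev"; done
```

Reading the device names line by line avoids word-splitting surprises, but the matching is the same as the grep and perl pipeline in the script.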

It also seems odd -- or maybe related -- that FC doesn't consistently give the drives the same letters at bootup anymore.  This doesn't cause any problems directly, as I'm using labels and volume groups -- but I wonder if it is related.




Version-Release number of selected component (if applicable):

 Source RPM: kernel-2.6.32.10-90.fc12.src.rpm

How reproducible:

every time

Steps to Reproduce:
1.
2.
3.
  
Actual results:

md1 doesn't come online 

Expected results:

md1 should be activated during boot.

Additional info:

attaching output from rc.local script...

Comment 1 Rich Rauenzahn 2010-03-31 05:46:01 UTC
Created attachment 403658
boot log and misc output before and after rc.local

Output of dmesg, ls /dev/sd*, /proc/mdstat and vgdisplay during boot, before and after the rc.local workaround.

Comment 2 Rich Rauenzahn 2010-03-31 05:49:05 UTC
Minor nit/fix to my rc.local workaround -- I replaced ll with ls -l

Comment 3 Chuck Ebbert 2010-04-21 21:58:39 UTC
Can you try to get debug messages from udev by adding 'rdudevdebug' to the kernel boot options?

Are the sgX devices present when the partitions are missing?  E.g. when you see:

  sd 4:0:0:0: Attached scsi generic sg4 type 0
  sd 4:0:0:0: [sde] Write Protect is off

is /dev/sg4 present even when /dev/sde1 is missing?

Can you create the partitions manually when they're missing and then access them?

  # /sbin/MAKEDEV -x /dev/sde1

Comment 4 Rich Rauenzahn 2010-04-22 03:17:38 UTC
I can get the debug messages -- but how do I capture them?  I gave up after about 15 minutes of text scrolling across the boot because I'm fairly sure they weren't being logged....

I can create (scan) them manually with partprobe, and then they are there.  See my rc.local script above -- I'm doing this manually now at the end of the boot process.

Yes, to sg4:

[rrauenza@tendo ~]$ dmesg | grep sg4
sd 4:0:0:0: Attached scsi generic sg4 type 0
[rrauenza@tendo ~]$ 

[rrauenza@tendo ~]$ ll /dev/sd*
brw-rw---- 1 root disk 8,  0 2010-04-21 20:08 /dev/sda
brw-rw---- 1 root disk 8, 16 2010-04-21 20:08 /dev/sdb
brw-rw---- 1 root disk 8, 17 2010-04-21 20:08 /dev/sdb1
brw-rw---- 1 root disk 8, 32 2010-04-21 20:08 /dev/sdc
brw-rw---- 1 root disk 8, 33 2010-04-21 20:08 /dev/sdc1
brw-rw---- 1 root disk 8, 34 2010-04-21 20:08 /dev/sdc2
brw-rw---- 1 root disk 8, 35 2010-04-21 20:08 /dev/sdc3
brw-rw---- 1 root disk 8, 48 2010-04-21 20:08 /dev/sdd
brw-rw---- 1 root disk 8, 49 2010-04-21 20:08 /dev/sdd1
brw-rw---- 1 root disk 8, 50 2010-04-21 20:08 /dev/sdd2
brw-rw---- 1 root disk 8, 51 2010-04-21 20:08 /dev/sdd3
brw-rw---- 1 root disk 8, 64 2010-04-21 20:08 /dev/sde
brw-rw---- 1 root disk 8, 80 2010-04-21 20:08 /dev/sdf
brw-rw---- 1 root disk 8, 81 2010-04-21 20:08 /dev/sdf1
brw-rw---- 1 root disk 8, 96 2010-04-21 20:08 /dev/sdg
brw-rw---- 1 root disk 8, 97 2010-04-21 20:08 /dev/sdg1
[rrauenza@tendo ~]$ 


[root@tendo ~]# /sbin/MAKEDEV -x /dev/sde1    
[root@tendo ~]# ll /dev/sd*
brw-rw---- 1 root disk 8,  0 2010-04-21 20:08 /dev/sda
brw-rw---- 1 root disk 8, 16 2010-04-21 20:08 /dev/sdb
brw-rw---- 1 root disk 8, 17 2010-04-21 20:08 /dev/sdb1
brw-rw---- 1 root disk 8, 32 2010-04-21 20:08 /dev/sdc
brw-rw---- 1 root disk 8, 33 2010-04-21 20:08 /dev/sdc1
brw-rw---- 1 root disk 8, 34 2010-04-21 20:08 /dev/sdc2
brw-rw---- 1 root disk 8, 35 2010-04-21 20:08 /dev/sdc3
brw-rw---- 1 root disk 8, 48 2010-04-21 20:08 /dev/sdd
brw-rw---- 1 root disk 8, 49 2010-04-21 20:08 /dev/sdd1
brw-rw---- 1 root disk 8, 50 2010-04-21 20:08 /dev/sdd2
brw-rw---- 1 root disk 8, 51 2010-04-21 20:08 /dev/sdd3
brw-rw---- 1 root disk 8, 64 2010-04-21 20:08 /dev/sde
brw-r----- 1 root disk 8, 65 2010-04-21 20:12 /dev/sde1  <========
brw-rw---- 1 root disk 8, 80 2010-04-21 20:08 /dev/sdf
brw-rw---- 1 root disk 8, 81 2010-04-21 20:08 /dev/sdf1
brw-rw---- 1 root disk 8, 96 2010-04-21 20:08 /dev/sdg
brw-rw---- 1 root disk 8, 97 2010-04-21 20:08 /dev/sdg1
[root@tendo ~]# 

sda is still missing its partitions.
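
A rough way to spot the affected disks from a listing like the ones above, as a sketch only: the helper name disks_without_partitions is made up for illustration, and a disk that legitimately has no partition table would be reported too.

```shell
#!/bin/sh
# Hypothetical helper: read device nodes on stdin (one per line, either
# plain paths from `ls /dev/sd*` or long lines from `ls -l /dev/sd*`) and
# print every whole disk (/dev/sdX) that has no partition node (/dev/sdXN)
# in the same listing.
disks_without_partitions() {
    awk '
        $NF ~ /^\/dev\/sd[a-z]+$/ { order[++n] = $NF; next }
        $NF ~ /^\/dev\/sd[a-z]+[0-9]+$/ {
            parent = $NF
            sub(/[0-9]+$/, "", parent)   # /dev/sdc2 -> /dev/sdc
            has_part[parent] = 1
        }
        END {
            for (i = 1; i <= n; i++)
                if (!(order[i] in has_part)) print order[i]
        }'
}

# Usage on a live system:
#   ls /dev/sd* | disks_without_partitions
```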

Doing a partprobe...

[root@tendo ~]# partprobe [I think the warnings are fine to ignore... -- Rich]
Warning: The kernel was unable to re-read the partition table on /dev/sdb (Device or resource busy).  This means Linux won't know anything about the modifications you made until you reboot.  You should reboot your computer before doing anything with /dev/sdb.
Warning: The kernel was unable to re-read the partition table on /dev/sdc (Device or resource busy).  This means Linux won't know anything about the modifications you made until you reboot.  You should reboot your computer before doing anything with /dev/sdc.
Warning: The kernel was unable to re-read the partition table on /dev/sdd (Device or resource busy).  This means Linux won't know anything about the modifications you made until you reboot.  You should reboot your computer before doing anything with /dev/sdd.
Warning: The kernel was unable to re-read the partition table on /dev/sdf (Device or resource busy).  This means Linux won't know anything about the modifications you made until you reboot.  You should reboot your computer before doing anything with /dev/sdf.
Warning: The kernel was unable to re-read the partition table on /dev/sdg (Device or resource busy).  This means Linux won't know anything about the modifications you made until you reboot.  You should reboot your computer before doing anything with /dev/sdg.
[root@tendo ~]# ll /dev/sd*
brw-rw---- 1 root disk 8,  0 2010-04-21 20:08 /dev/sda
brw-rw---- 1 root disk 8,  1 2010-04-21 20:13 /dev/sda1  <==============
brw-rw---- 1 root disk 8, 16 2010-04-21 20:08 /dev/sdb
brw-rw---- 1 root disk 8, 17 2010-04-21 20:08 /dev/sdb1
brw-rw---- 1 root disk 8, 32 2010-04-21 20:08 /dev/sdc
brw-rw---- 1 root disk 8, 33 2010-04-21 20:08 /dev/sdc1
brw-rw---- 1 root disk 8, 34 2010-04-21 20:08 /dev/sdc2
brw-rw---- 1 root disk 8, 35 2010-04-21 20:08 /dev/sdc3
brw-rw---- 1 root disk 8, 48 2010-04-21 20:08 /dev/sdd
brw-rw---- 1 root disk 8, 49 2010-04-21 20:08 /dev/sdd1
brw-rw---- 1 root disk 8, 50 2010-04-21 20:08 /dev/sdd2
brw-rw---- 1 root disk 8, 51 2010-04-21 20:08 /dev/sdd3
brw-rw---- 1 root disk 8, 64 2010-04-21 20:08 /dev/sde
brw-rw---- 1 root disk 8, 65 2010-04-21 20:12 /dev/sde1
brw-rw---- 1 root disk 8, 80 2010-04-21 20:08 /dev/sdf
brw-rw---- 1 root disk 8, 81 2010-04-21 20:08 /dev/sdf1
brw-rw---- 1 root disk 8, 96 2010-04-21 20:08 /dev/sdg
brw-rw---- 1 root disk 8, 97 2010-04-21 20:08 /dev/sdg1
[root@tendo ~]# 

Related or not, my disks are also assigned different drive letters than they were in FC11, and the letters appear to change slightly across boots as well.

Can I rerun udev in debug after a boot to see why it isn't picking them up?

Comment 5 Harald Hoyer 2010-04-28 07:02:15 UTC
Haha :) funny output in the log :)

 sde: sdc1 sdc2 sdc3
sd 2:0:1:0: [sdd] 234441648 512-byte logical blocks: (120 GB/111 GiB)
sd 2:0:1:0: [sdd] Write Protect is off
sd 2:0:1:0: [sdd] Mode Sense: 00 3a 00 00
sd 2:0:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sdd: sde1
sd 4:0:0:0: [sde] Attached SCSI disk
 sdb1
sd 1:0:0:0: [sdb] Attached SCSI disk
 sdd1 sdd2 sdd3

.....

ok, now to the real problem:

/dev/sda might be recognized as a RAID member as a whole disk (the disk itself, rather than one of its partitions). In that case all of its partitions are removed.

Please provide the output of:

# blkid -o udev -p /dev/sda
# blkid -o udev -p /dev/sde

ID_FS_TYPE should not be "linux_raid_member" or "isw_raid_member". But in your case, I suspect it is.
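
The check described above could be scripted roughly as follows. This is a sketch under assumptions: it relies on `blkid -o udev -p` printing KEY=value lines as requested above, and the function name is_raid_member is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `blkid -o udev -p <disk>` output on stdin and
# succeed (exit status 0) if ID_FS_TYPE marks the device as a RAID member.
is_raid_member() {
    grep -Eq '^ID_FS_TYPE=(linux_raid_member|isw_raid_member)$'
}

# Usage on a live system (needs root): flag whole disks that probe as
# RAID members even though only their partitions should be.
#   for d in /dev/sd[a-z]; do
#       blkid -o udev -p "$d" | is_raid_member && echo "$d: probes as RAID member"
#   done
```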

Comment 6 Rich Rauenzahn 2010-04-28 16:19:25 UTC
I've done them all --

%%%%%%%%%%%%%%%%%%%%%%%% /dev/sda
%%%%%%%%%%%%%%%%%%%%%%%% /dev/sdb
%%%%%%%%%%%%%%%%%%%%%%%% /dev/sdc
ID_FS_VERSION=0.90.0
ID_FS_UUID=16117fc1-adc8-2110-37a6-da05623d0241
ID_FS_UUID_ENC=16117fc1-adc8-2110-37a6-da05623d0241
ID_FS_TYPE=linux_raid_member
ID_FS_USAGE=raid
%%%%%%%%%%%%%%%%%%%%%%%% /dev/sdd
%%%%%%%%%%%%%%%%%%%%%%%% /dev/sde
ID_FS_VERSION=0.90.0
ID_FS_UUID=16117fc1-adc8-2110-37a6-da05623d0241
ID_FS_UUID_ENC=16117fc1-adc8-2110-37a6-da05623d0241
ID_FS_TYPE=linux_raid_member
ID_FS_USAGE=raid
%%%%%%%%%%%%%%%%%%%%%%%% /dev/sdf
ID_FS_UUID=zH6jW8-19fR-rPNz-069z-IhmZ-fKoh-vyqFnB
ID_FS_UUID_ENC=zH6jW8-19fR-rPNz-069z-IhmZ-fKoh-vyqFnB
ID_FS_VERSION=LVM2\x20001
ID_FS_TYPE=LVM2_member
ID_FS_USAGE=raid
%%%%%%%%%%%%%%%%%%%%%%%% /dev/sdg
%%%%%%%%%%%%%%%%%%%%%%%% /dev/sdh

The drives/partitions that are actually in RAID arrays are:

Personalities : [raid1] [raid6] [raid5] [raid4] 
md1 : active raid6 sdc1[0] sdh1[4] sdf1[3] sde1[2] sdd1[1]
      1465150464 blocks super 1.1 level 6, 512k chunk, algorithm 2 [5/5] [UUUUU]
      
md0 : active raid1 sda3[0] sdb3[1]
      116430976 blocks [2/2] [UU]
      
unused devices: <none>
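
For comparing the active members against the blkid results, the member list can be pulled out of mdstat output like the above. A minimal sketch, assuming the usual "mdN : state [level] dev[role] ..." line layout; md_members is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mdstat-style text on stdin and print the
# member devices of the array named in $1, one per line, without the
# "[N]" role suffix.
md_members() {
    awk -v md="$1" '
        $1 == md && $2 == ":" {
            # Members are the fields that carry a "[role]" suffix; state and
            # level words such as "active" or "raid6" have none.
            for (i = 3; i <= NF; i++) {
                p = index($i, "[")
                if (p > 1) print substr($i, 1, p - 1)
            }
        }'
}

# Usage on a live system:
#   md_members md1 < /proc/mdstat
```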

Here are the partition tables:


Disk /dev/sda: 120.0 GB, 120033041920 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000cf695

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          33      265041   83  Linux
/dev/sda2              34          98      522112+  82  Linux swap / Solaris
/dev/sda3              99       14593   116431087+  fd  Linux raid autodetect

Disk /dev/sdb: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0007e0d2

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          33      265041   83  Linux
/dev/sdb2              34          98      522112+  82  Linux swap / Solaris
/dev/sdb3              99       14593   116431087+  fd  Linux raid autodetect

Disk /dev/sdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1       60801   488384001   fd  Linux raid autodetect

Disk /dev/sdd: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xa66accf0

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1       60801   488384001   fd  Linux raid autodetect

Disk /dev/sde: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1       60801   488384001   fd  Linux raid autodetect

Disk /dev/sdf: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1               1       60801   488384001   fd  Linux raid autodetect

Disk /dev/sdg: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x9e49d4c1

   Device Boot      Start         End      Blocks   Id  System
/dev/sdg1               1      121601   976760001   8e  Linux LVM

Disk /dev/sdh: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xa194b1dc

   Device Boot      Start         End      Blocks   Id  System
/dev/sdh1               1       60801   488384001   fd  Linux raid autodetect


So, yes, I think you are right -- the drive is listed as a raid member instead of the partition.

As an aside -- isn't the right way to do RAID on Linux to make a partition and assign it a partition type?  It seemed the safest way to keep something else from mucking with it.  Or is whole disk the recommended way to do it now?  I guess I could have done whole-disk RAID with a whole-disk LVM on top.

Comment 7 Harald Hoyer 2010-04-28 17:57:07 UTC
Personally, I would have done the RAID with the whole disk...

ok, reassigning to util-linux-ng... I think Karel already has a fix for blkid.

Comment 8 Karel Zak 2010-04-29 08:01:26 UTC

*** This bug has been marked as a duplicate of bug 543749 ***

