Bug 597300 - F13 fails to boot at "dracut: Autoassembling MD Raid"
Status: CLOSED CANTFIX
Product: Fedora
Classification: Fedora
Component: mdadm
Version: 14
Hardware: All
OS: Linux
Priority: low
Severity: medium
Assigned To: Doug Ledford
QA Contact: Fedora Extras Quality Assurance
Reported: 2010-05-28 11:37 EDT by Wolfgang Denk
Modified: 2011-10-07 10:06 EDT
CC: 6 users

Doc Type: Bug Fix
Last Closed: 2011-10-07 10:06:07 EDT

Attachments
Console logs of a failing boot with F13 kernel/initrd and of a working boot with F12 kernel/initrd on the same system (9.86 KB, application/x-gzip)
2010-05-28 11:37 EDT, Wolfgang Denk
Console log of failing Fedora 14 boot (23.65 KB, application/octet-stream)
2010-12-28 08:10 EST, Wolfgang Denk

Description Wolfgang Denk 2010-05-28 11:37:55 EDT
Created attachment 417662 [details]
Console logs of a failing boot with F13 kernel/initrd and of a working boot with F12 kernel/initrd on the same system

Description of problem:

System fails to boot after upgrade from F12 to F13; the old F12 kernel/ramdisk combo continues to work.

Version-Release number of selected component (if applicable):

dracut-005-3.fc13.noarch ?

How reproducible:

The effect is reliable; this is the first system I upgraded, so I cannot say much more yet.

Steps to Reproduce:
1. Upgrade from F12 to F13; root file system is on a software raid1
2. Reboot
3.
  
Actual results:

Booting stops with these error messages:

    dracut: Autoassembling MD Raid
    No root device found
    No root device found
    Boot has failed, sleeping forever.

Expected results:

Successful boot :-)

Additional info:

Please see attached log files that show the failing boot with the F13 kernel/ramdisk, and a successful boot with the same system, just using the old F12 kernel/ramdisk.
Comment 1 Thomas Moschny 2010-05-28 13:04:07 EDT
Very similar here, although sometimes root is found, but then it hangs either while trying to mount it, or later when trying to remount it read-write. Sometimes I even see the 120-second timeout message from the kernel.

So I am not sure whether it is OK to jump on this bug, as it is not as 'reliable' as for you; but it is reliable in the sense that I never managed to boot using the f13 kernel and initramfs, while booting with the f12 kernel and initramfs works fine.

Meanwhile I tend to think it is a kernel problem, not a dracut problem.
Comment 2 Wolfgang Denk 2010-06-11 07:41:12 EDT
No, this is not a kernel problem, but a dracut issue.

Compare the successful boot log:

...
md: raid1 personality registered for level 1
md: md0 stopped.
md: bind<sdb2>
md: bind<sda2>
raid1: raid set md0 active with 2 out of 2 mirrors
md0: detected capacity change from 0 to 492093702144
 md0: unknown partition table
XFS mounting filesystem dm-0
SELinux:  Disabled at runtime.
type=1404 audit(1276254698.641:2): selinux=0 auid=4294967295 ses=4294967295
                Welcome to Fedora
                Press 'I' to enter interactive startup.
Starting udev: ^[%Gudevd-work[228]: '/usr/bin/vmmouse_detect' unexpected exit with status 0x000b

udevd-work[233]: '/usr/bin/vmmouse_detect' unexpected exit with status 0x000b

[  OK  ]
Setting hostname sirius.denx.de:  [  OK  ]
Setting up Logical Volume Management:   3 logical volume(s) in volume group "misc" now active
  3 logical volume(s) in volume group "virt" now active
  4 logical volume(s) in volume group "sirius" now active
[  OK  ]
Checking filesystems
Checking all file systems.
...

against the broken one:
...
dracut: Reading all physical volumes. This may take a while...
dracut: Found volume group "misc" using metadata type lvm2
dracut: Found volume group "virt" using metadata type lvm2
dracut: 3 logical volume(s) in volume group "misc" now active
dracut: 3 logical volume(s) in volume group "virt" now active
dracut: Autoassembling MD Raid
ESC%GESC%G
Boot has failed, sleeping forever.
...

As you can see, in the successful case we have 3 volume groups ("sirius", "virt", and "misc"); in the failing dracut case we have only two ("virt" and "misc"). The "sirius" volume group is needed because the root file system is on an LV of this group:

# mount
/dev/mapper/sirius-root on / type xfs (rw,noatime)
...

And VG "sirius" is located on physical volume /dev/md0.

It seems to me that dracut starts the auto-assembly of the /dev/md0 raid array too late, so the raid array is not yet ready when the LVs get activated, which results in a system without a root file system.

This looks like a timing / synchronization issue in dracut or in one of the udev scripts.
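
One way to test this theory is to break into the dracut shell (the rdshell option, which a later comment also uses) and inspect the md and LVM state by hand. A minimal sketch, using the array and volume group names from this report:

dracut:/# cat /proc/mdstat     (is the raid1 personality loaded and md0 listed?)
dracut:/# ls -l /dev/md0       (does the device node exist at all?)
dracut:/# lvm pvs              (is /dev/md0 visible as a physical volume?)
dracut:/# lvm vgs              (are "misc", "virt" and "sirius" all present?)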
Comment 3 Wolfgang Denk 2010-06-11 10:10:47 EDT
Actually it seems I'm wrong - probably it's not a timing issue. When dropping into a debug shell I cannot see the /dev/md0 array at all: there is no /dev/md0 device node, there is nothing in /proc/mdstat, not even the raid1 personality has been loaded.

But I do see the "Autoassembling MD Raid" message printed, and by adding another echo I can also see that it terminates without apparent errors.

The old ramdisks built by the fc11 mkinitrd tool still work fine.

I have no idea what's going on here.
Comment 4 lsllll1 2010-08-14 01:05:21 EDT
I believe I may be able to shed some more light on this problem.

Using rdshell I was able to figure out that the partitions that held the RAID container were not available to MD.

In my setup, I have sdb and sdd, which are SSDs.  I created a small alignment partition at the beginning and then one partition on each (sdb1 and sdd1) for RAID0.

Here's the output from 'fdisk -l' with the irrelevant disks cut out:

Disk /dev/sdb: 64.0 GB, 64023257088 bytes
64 heads, 32 sectors/track, 61057 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00010920

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1           1        1008   83  Linux
/dev/sdb2               2       61057    62521344   fd  Linux raid autodetect

Disk /dev/sdd: 64.0 GB, 64023257088 bytes
64 heads, 32 sectors/track, 61057 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00012512

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1           1        1008   83  Linux
/dev/sdd2               2       61057    62521344   fd  Linux raid autodetect

However, here's the output from 'cat /proc/partitions':

major minor  #blocks  name

   8        0  244198584 sda
   8        1     512000 sda1
   8        2    4099026 sda2
   8        3  239583510 sda3
   8       16   62522712 sdb
   8       32  244198584 sdc
   8       33  244187968 sdc1
   8       48   62522712 sdd

It seems as though the boot process does not see the partitions on sdb and sdd.  I don't know if that is because I have changed the number of Heads/Sectors from 255/63 to 64/32, but it definitely appears that MD cannot see the partitions it needs in order to assemble the RAID0 container that holds root.

mdadm --assemble --scan does nothing.  Oddly enough, partprobe sees the correct partitions, but does not update /proc/partitions.
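
If the kernel really has not registered those partitions, forcing a re-read from the rdshell is worth trying. A hedged sketch, using the device names from this setup (partx and mdadm are both available in the dracut shell, as a later comment in this bug also shows):

dracut:/# partx -a /dev/sdb            (register sdb1/sdb2 with the kernel)
dracut:/# partx -a /dev/sdd            (register sdd1/sdd2)
dracut:/# cat /proc/partitions         (the partitions should now be listed)
dracut:/# mdadm --assemble --scan      (retry the assembly once the members exist)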
Comment 5 Harald Hoyer 2010-08-16 02:51:24 EDT
(In reply to comment #4)
> I believe I may be able to shed some more light on this problem.
> 
> Using rdshell I was able to figure out that the partitions that held the RAID
> container were not available to MD.
> 
> In my set up, I have sdb and sdd which are SSDs.  I have partitioned a small
> alignment partition at the beginning and then one partition on each (sdb1 and
> sdd1) for RAID0.
> 
> Here's the output from 'fdisk -l' with irrelavent disks cut out:
> 
> Disk /dev/sdb: 64.0 GB, 64023257088 bytes
> 64 heads, 32 sectors/track, 61057 cylinders
> Units = cylinders of 2048 * 512 = 1048576 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00010920
> 
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1               1           1        1008   83  Linux
> /dev/sdb2               2       61057    62521344   fd  Linux raid autodetect
> 
> Disk /dev/sdd: 64.0 GB, 64023257088 bytes
> 64 heads, 32 sectors/track, 61057 cylinders
> Units = cylinders of 2048 * 512 = 1048576 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00012512
> 
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdd1               1           1        1008   83  Linux
> /dev/sdd2               2       61057    62521344   fd  Linux raid autodetect
> 
> However, here's the output from 'cat /proc/partitions':
> 
> major minor  #blocks  name
> 
>    8        0  244198584 sda
>    8        1     512000 sda1
>    8        2    4099026 sda2
>    8        3  239583510 sda3
>    8       16   62522712 sdb
>    8       32  244198584 sdc
>    8       33  244187968 sdc1
>    8       48   62522712 sdd
> 
> It seems as though the boot process does not see the partitions on sdb and sdd.
>  I don't know if that is because I have changed the number of Heads/Sectors
> from 255/63 to 64/32, but it definitely appears that MD cannot see the
> partitions it needs in order to assemble the RAID0 container that holds root.
> 
> mdadm --assemble --scan does nothing.  Oddly enough, partprobe sees the correct
> partitions, but does not update /proc/partitions.

What is the output of:

# blkid -o udev -p /dev/sdb | grep ID_FS_TYPE
# blkid -o udev -p /dev/sdd | grep ID_FS_TYPE

You might have used /dev/sdb and /dev/sdd as a raid device once (without partitions) and have stale raid signatures on them.
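
Running the same probe against the partitions as well makes it easier to see whether a raid signature sits on the whole disk, on the partition, or on both. A short sketch (the partition names are taken from the fdisk output above):

# blkid -o udev -p /dev/sdb1 | grep ID_FS_TYPE
# blkid -o udev -p /dev/sdb2 | grep ID_FS_TYPE
# blkid -o udev -p /dev/sdd1 | grep ID_FS_TYPE
# blkid -o udev -p /dev/sdd2 | grep ID_FS_TYPE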
Comment 6 Dennis Schafroth 2010-08-16 04:10:00 EDT
Seeing the same on a system with LVM only.
Comment 7 Wolfgang Denk 2010-09-23 06:17:39 EDT
Note: The problem is still present with recent kernel versions, up to and including kernel-2.6.34.7-56.fc13.x86_64
Comment 8 Wolfgang Denk 2010-12-28 08:08:57 EST
The problem is still present in F14; verified with kernel 2.6.35.10-74.fc14.x86_64

I will attach a full log of the failing boot under F14.  What catches my eye is this:

With an F12 kernel, I see this:

...
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) 
ata1.00: ATA-6: ST3500320NS, SN04, max UDMA/133
ata1.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access     ATA      ST3500320NS      SN04 PQ: 0 ANSI: 5
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB) 
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 
sd 0:0:0:0: [sda] Attached SCSI disk 
input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input4
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) 
ata2.00: ATA-6: ST3500320NS, SN04, max UDMA/133
ata2.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata2.00: configured for UDMA/133
scsi 1:0:0:0: Direct-Access     ATA      ST3500320NS      SN04 PQ: 0 ANSI: 5
sd 1:0:0:0: [sdb] 976773168 512-byte logical blocks: (500 GB/465 GiB) 
sd 1:0:0:0: Attached scsi generic sg1 type 0
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sdb: sdb1 sdb2 
sd 1:0:0:0: [sdb] Attached SCSI disk 
...
md: raid1 personality registered for level 1
md: md0 stopped.
md: bind<sdb2>
md: bind<sda2>
raid1: raid set md0 active with 2 out of 2 mirrors
md0: detected capacity change from 0 to 492093702144

Here the partitions on disks sda and sdb get detected, and the raid1 consisting of partitions sda2 and sdb2 gets assembled - this is the root file system we need.

With a recent kernel, it stops at "dracut: Autoassembling MD Raid". Checking, I see this:

dracut:/# cat /proc/partitions
major minor  #blocks  name

   8        0  488386584 sda
   8       16  488386584 sdb
   8       32     250368 sdc
   8       33     250352 sdc1
   8       48  976519168 sdd
   8       64  976519168 sde
dracut:/# ls -l /dev/disk/by-id
total 0
lrwxrwxrwx 1 0 root  9 Dec 28 12:46 ata-ST3500320NS_9QM0HT95 -> ../../sda
lrwxrwxrwx 1 0 root  9 Dec 28 12:46 ata-ST3500320NS_9QM0KW95 -> ../../sdb
lrwxrwxrwx 1 0 root  9 Dec 28 12:46 ata-Transcend_256M_SSSC256M04Z27A25906T -> ../../sdc
lrwxrwxrwx 1 0 root 10 Dec 28 12:46 ata-Transcend_256M_SSSC256M04Z27A25906T-part1 -> ../../sdc1
lrwxrwxrwx 1 0 root  9 Dec 28 12:46 scsi-1AMCC_5ND2NSGH6F2EA100131A -> ../../sde
lrwxrwxrwx 1 0 root  9 Dec 28 12:46 scsi-1AMCC_5ND2NTSQ6F2E9C002385 -> ../../sdd
lrwxrwxrwx 1 0 root  9 Dec 28 12:46 scsi-SATA_ST3500320NS_9QM0HT95 -> ../../sda
lrwxrwxrwx 1 0 root  9 Dec 28 12:46 scsi-SATA_ST3500320NS_9QM0KW95 -> ../../sdb
lrwxrwxrwx 1 0 root  9 Dec 28 12:46 scsi-SATA_Transcend_25SSSC256M04Z27A25906T -> ../../sdc
lrwxrwxrwx 1 0 root 10 Dec 28 12:46 scsi-SATA_Transcend_25SSSC256M04Z27A25906T-part1 -> ../../sdc1
lrwxrwxrwx 1 0 root  9 Dec 28 12:46 wwn-0x5000c50009adc710 -> ../../sda
lrwxrwxrwx 1 0 root  9 Dec 28 12:46 wwn-0x5000c50009ae3ab4 -> ../../sdb

It appears as if the partitions on /dev/sda and /dev/sdb were not recognized, even though the kernel boot messages do show them?

dracut:/# blkid -o udev -p /dev/sda
ID_FS_UUID=3c490619-37ae-ee35-0b8f-d36e9667770c
ID_FS_UUID_ENC=3c490619-37ae-ee35-0b8f-d36e9667770c
ID_FS_UUID_SUB=23ad74bb-8b6d-5a29-9718-f79ed21de330
ID_FS_UUID_SUB_ENC=23ad74bb-8b6d-5a29-9718-f79ed21de330
ID_FS_LABEL=0
ID_FS_LABEL_ENC=0
ID_FS_VERSION=1.0
ID_FS_TYPE=linux_raid_member
ID_FS_USAGE=raid
dracut:/# blkid -o udev -p /dev/sdb
ID_FS_UUID=3c490619-37ae-ee35-0b8f-d36e9667770c
ID_FS_UUID_ENC=3c490619-37ae-ee35-0b8f-d36e9667770c
ID_FS_UUID_SUB=861525a2-6bdc-63aa-b304-391de17ad83b
ID_FS_UUID_SUB_ENC=861525a2-6bdc-63aa-b304-391de17ad83b
ID_FS_LABEL=0
ID_FS_LABEL_ENC=0
ID_FS_VERSION=1.0
ID_FS_TYPE=linux_raid_member
ID_FS_USAGE=raid

So I guess Harald's hint "You might have used /dev/sdb and /dev/sdd as a raid device once (without partitions) and have stale raid signatures on them." might actually apply here, too.  If so, how can I fix that?
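
One way to see where that stale-looking signature actually lives is to compare mdadm's view of the whole disk with its view of the partition; with 0.90 metadata sitting at the end of the device the two can overlap, which is exactly the ambiguity discussed further down. A sketch, using the devices from this report:

# mdadm --examine /dev/sda     (whole disk)
# mdadm --examine /dev/sda2    (the actual raid1 member)
# mdadm --examine /dev/sdb
# mdadm --examine /dev/sdb2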
Comment 9 Wolfgang Denk 2010-12-28 08:10:41 EST
Created attachment 470952 [details]
Console log of failing Fedora 14 boot
Comment 10 Wolfgang Denk 2010-12-28 08:17:18 EST
It seems I can work around the problem like this:

dracut:/# partx -a /dev/sda
dracut:/# partx -a /dev/sdb
dracut:/# mdadm_auto
+ info Autoassem[ 1760.377314] dracut: Autoassembling MD Raid
bling MD Raid
+ check_quiet
+ [ -z  ]
+ DRACUT_QUIET=yes
+ getarg rdinfo
+ set +x
+ return 1
+ getarg quiet
+ set +x
+ return 1
+ DRACUT_QUIET=yes
+ echo <6>dracut: Autoassembling MD Raid
+ [ yes != yes ]
+ /sbin/mdadm -As+  --auto=yes --run
vinfo
+ read line
[ 1760.430807] md: md0 stopped.
[ 1760.436138] md: bind<sdb2>
[ 1760.440224] md: bind<sda2>
[ 1760.446281] md: raid1 personality registered for level 1
[ 1760.452798] md/raid1:md0: active with 2 out of 2 mirrors
[ 1760.459142] md0: detected capacity change from 0 to 492093702144
+ info[ 1760.466371] dracut: mdadm: /dev/md0 has been started with 2 drives.
[ 1760.466957]  md0: unknown partition table
 mdadm: /dev/md0 has been started with 2 drives.
+ check_quiet
+ [ -z yes ]
+ echo <6>dracut: mdadm: /dev/md0 has been started with 2 drives.
+ [ yes != yes ]
+ read line
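
Stripped of the shell tracing, the manual workaround boils down to a handful of commands in the dracut shell. A sketch, assuming the same device layout; whether the boot can simply be resumed afterwards depends on the dracut version:

dracut:/# partx -a /dev/sda
dracut:/# partx -a /dev/sdb
dracut:/# mdadm -As --auto=yes --run    (assemble whatever arrays a scan finds)
dracut:/# lvm vgchange -ay              (activate the volume groups, including "sirius")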
Comment 11 Harald Hoyer 2011-01-11 05:57:48 EST
(In reply to comment #8)

> So I guess Harald's hint "You might have used /dev/sdb and /dev/sdd as a raid
> device once (without partitions) and have stale raid signatures on them." might
> actually apply here, too.  If so, how can I fix that?

# mdadm --zero-superblock <disk/partition>
Comment 12 Wolfgang Denk 2011-01-11 08:52:01 EST
Is this a safe (i.e. non-destructive) operation?  I have active RAID arrays running on these disks.
Comment 13 Harald Hoyer 2011-01-11 08:56:58 EST
Reassigning to mdadm in the hope that comment 12 can be answered.
Comment 14 Doug Ledford 2011-01-12 22:57:45 EST
Depends on your raid device setup.  Can you post the contents of /proc/mdstat on a system that's fully up and running?
Comment 15 Wolfgang Denk 2011-01-13 14:55:12 EST
# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [raid1] 
md1 : active raid1 sde1[0] sdf1[1]
      256896 blocks [2/2] [UU]
      
md2 : active raid1 sde3[0] sdf3[1]
      484118656 blocks [2/2] [UU]
      
md0 : active raid5 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      735334848 blocks level 5, 16k chunk, algorithm 2 [4/4] [UUUU]
      
unused devices: <none>
Comment 16 Doug Ledford 2011-01-13 15:50:09 EST
You have the old version 0.90 superblocks on your raid devices.  Depending on the exact layout of the partitions on the drive, the kernel may not be able to tell the difference between a superblock that belongs to sda and one that belongs to sda1 (they might both occupy the same position at the end of the disk).  So, no, zeroing the superblock is not necessarily safe in your situation.  However, you can get a more detailed breakdown of your raid array's makeup (e.g. mdadm -E /dev/sda1 for the full details) and then recreate your raid array using the version 1.0 superblock instead.  You would need to use a create command that mimics your current setup exactly and passes the --assume-clean flag as well.

However, this could be related to a race condition in the f13 mdadm.  Try booting the f13 system using the f12 kernel/initrd, then update the f13 mdadm using yum, then run dracut to rebuild a new f13 initramfs, and finally try booting from that initramfs to see if it solves the problem.
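
For reference, the recreate procedure described above would look roughly like the following for a two-disk raid1. The device names are placeholders taken from the earlier logs, and every parameter must match the output of mdadm -E for the real array, since a mismatched --create is destructive:

# mdadm -E /dev/sda2                    (record level, device order, size, current metadata version)
# mdadm --stop /dev/md0
# mdadm --create /dev/md0 --metadata=1.0 --level=1 --raid-devices=2 --assume-clean /dev/sda2 /dev/sdb2
# mdadm --detail /dev/md0               (check that size and members are unchanged)

The array UUID changes on re-creation, so /etc/mdadm.conf would need updating and the initramfs regenerating (dracut -f) afterwards.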
Comment 17 Wolfgang Denk 2011-01-13 16:08:37 EST
I updated to F14 long ago; the ramdisk has been rebuilt using the current F14 tools. This did not help.
Comment 18 Doug Ledford 2011-07-14 20:05:55 EDT
Does this still happen, and if so what version of mdadm tools are you using?
Comment 19 Doug Ledford 2011-10-07 10:06:07 EDT
As per comment #16, the problem here is the combination of the old version 0.90 superblocks with the new udev way of doing things.  Specifically, udev now checks for a raid signature on the bare scsi disk before it checks for partitions on that disk (udev has actually done that all along; the change came when udev took over disk scanning).  The kernel used to handle partitions itself and create the partition devices, so by the time mdadm was ever asked to handle a disk, the partitions were already allocated.  With udev we first allocate the whole-disk device, then scan it, and find an mdadm superblock that we mistakenly think belongs to the whole device, because the old 0.90 format cannot distinguish a whole-disk superblock from a partition superblock.  Only if the disk is not claimed by an existing raid superblock do we then read the partition table and process partitions.  The user needs to rebuild his arrays using the newer 1.0 superblock format to resolve the issue.  As such, closing out as CANTFIX.
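
To check whether an existing array is affected, it is enough to look at the metadata version mdadm reports; anything showing 0.90 falls into the case described above. A short sketch:

# mdadm --detail /dev/md0 | grep -i version
# mdadm -E /dev/sda2 | grep -i version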
