Bug 465542 - mkinitrd drops md raid members after first (F) or (S) disk
Summary: mkinitrd drops md raid members after first (F) or (S) disk
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: mkinitrd
Version: 10
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Peter Jones
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 469611
Reported: 2008-10-03 19:42 UTC by Alexandre Oliva
Modified: 2008-12-05 04:02 UTC (History)
CC List: 3 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2008-12-05 04:02:48 UTC
Type: ---
Embargoed:


Attachments
Patches are nicer as attachments :-) (827 bytes, patch)
2008-10-03 19:43 UTC, Alexandre Oliva
Improved version of the previous patch, covering write-mostly (W) too (828 bytes, patch)
2008-10-18 00:17 UTC, Alexandre Oliva

Description Alexandre Oliva 2008-10-03 19:42:38 UTC
When /proc/mdstat lists spare (S) or failed (F) raid members before active ones, mkinitrd won't see the members listed after them.  If those members are necessary to bring up the root device, you lose.  This is a big problem, especially when the member became failed because of bug 465539, or became spare because something went wrong in the current udev(?)-based incremental bring-up of array members (which often causes members brought in a bit too late to come up as spares needing a resync).

This patch strips the (S) and (F) markers from the list of md raid components, so that all member devices can be probed successfully.  Without it, findstoragedriver returns as soon as it fails to locate a /sys/block/$device entry for, say, sda1(S) or sda1(F).  I found it odd that it would return rather than continue on to the next argument, so I fixed that too.  If there's a stronger reason to return, please drop that part of the patch.
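
The difference between returning and continuing can be sketched in a few lines of shell.  This is only an illustration, not mkinitrd code: probe_all and SYSROOT are hypothetical names, and SYSROOT stands in for /sys/block so the sketch can run on any machine.

```shell
#!/bin/sh
# Hypothetical stand-in for a per-device sysfs lookup loop.
# SYSROOT replaces /sys/block so the sketch does not depend on real hardware.
probe_all() {
    mode=$1; shift
    for device in "$@"; do
        if [ ! -d "$SYSROOT/$device" ]; then
            # "return" gives up on every remaining device;
            # "continue" only skips the one that could not be found.
            [ "$mode" = return ] && return
            continue
        fi
        printf '%s ' "$device"
    done
}

SYSROOT=$(mktemp -d)
mkdir "$SYSROOT/sda1" "$SYSROOT/sdb1"

probe_all return   'sdc1(S)' sda1 sdb1; echo   # prints an empty line
probe_all continue 'sdc1(S)' sda1 sdb1; echo   # prints "sda1 sdb1 "
```

With "return", a single unresolvable name at the front of the list hides every device after it, which matches the symptom in the summary.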

--- /tmp/mkinitrd.orig	2008-10-03 05:56:07.000000000 -0300
+++ /tmp/mkinitrd	2008-10-03 16:40:46.000000000 -0300
@@ -384,7 +384,7 @@
                 sysfs=$(for x in /sys/block/* ; do findone -type d $x/ -name $device; done)
             fi
         fi
-        [ -z "$sysfs" ] && return
+        [ -z "$sysfs" ] && continue
         qpushd $sysfs
         findstoragedriverinsys
         qpopd
@@ -582,7 +582,7 @@
     fi
 
     levels=$(awk "/^$1[	 ]*:/ { print\$4 }" /proc/mdstat)
-    devs=$(gawk "/^$1[	 ]*:/ { print gensub(\"\\\\[[0-9]*\\\\]\",\"\",\"g\",gensub(\"^md.*raid[0-9]*\",\"\",\"1\")) }" /proc/mdstat)
+    devs=$(gawk "/^$1[	 ]*:/ { print gensub(\"\\\\[[0-9]*\\\\](\\\\([SF]\\\\))?\",\"\",\"g\",gensub(\"^md.*raid[0-9]*\",\"\",\"1\")) }" /proc/mdstat)
 
     for level in $levels ; do
         case $level in
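
To see what the second hunk changes, the stripping can be tried on a sample line outside mkinitrd.  The sketch below uses sed -E rather than the gawk gensub call from the patch, md_members is a hypothetical helper name, and the sample line is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of mkinitrd: print the member devices of one
# /proc/mdstat array line, with the [N] slot index and any (S)/(F) flag removed.
md_members() {
    # $1 is a single array line such as the sample below.
    printf '%s\n' "$1" |
        sed -E -e 's/^md[^:]*:.*raid[0-9]+ *//' \
               -e 's/\[[0-9]*\](\([SF]\))?//g'
}

# Made-up sample: a spare and a failed member listed before the active one.
sample='md0 : active raid1 sdc1[2](S) sdb1[1](F) sda1[0]'
md_members "$sample"   # prints "sdc1 sdb1 sda1"
```

Every name in the output is then a plain device name that can be looked up under /sys/block.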

Comment 1 Alexandre Oliva 2008-10-03 19:43:24 UTC
Created attachment 319410
Patches are nicer as attachments :-)

Comment 2 Alexandre Oliva 2008-10-18 00:17:06 UTC
Created attachment 320733
Improved version of the previous patch, covering write-mostly (W) too

Can we please have this patch installed?  Without it, write-mostly is broken, and spares may get you in trouble unless you notice, before a reboot, that they took over.  This is not just for failure cases; write-mostly is present in regular operation.
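
The effect on write-mostly members can be illustrated as follows.  The attachment itself is not reproduced in this report, so this is only a sketch under the assumption that the fix amounts to accepting (W) alongside (S) and (F); strip_flags is a hypothetical helper and the member list is made up.

```shell
#!/bin/sh
# Hypothetical helper: strip the [N] slot index plus any flag letter listed
# in $1 (a bracket-expression body such as SF or SFW) from the list in $2.
strip_flags() {
    printf '%s\n' "$2" | sed -E "s/\[[0-9]*\](\([$1]\))?//g"
}

# Made-up member list with one write-mostly member and one spare.
members='sda1[0] sdb1[1](W) sdc1[2](S)'
strip_flags SF  "$members"   # prints "sda1 sdb1(W) sdc1"
strip_flags SFW "$members"   # prints "sda1 sdb1 sdc1"
```

With only S and F in the class, sdb1(W) survives as a name that has no /sys/block entry, so a healthy write-mostly member hits the same failed lookup.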

Comment 3 Alexandre Oliva 2008-11-03 05:45:10 UTC
Could this trivial patch be integrated, pretty please?

Comment 4 Peter Jones 2008-11-03 22:30:52 UTC
Patch will be applied in 6.0.70-1.

Comment 5 Bug Zapper 2008-11-26 03:31:35 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 6 Alexandre Oliva 2008-12-05 04:02:48 UTC
Confirmed fixed in F-10 GOLD, thanks.

