Red Hat Bugzilla – Bug 447818
mdadm --incremental doesn't start partitionable arrays
Last modified: 2008-11-29 09:51:06 EST
Description of problem:
With Fedora <= 8, rc.sysinit called "mdadm -A -s" to start all arrays in the
mdadm.conf file. Fedora 9 removed that code and now starts arrays using udev
rules that attempt to run "mdadm --incremental" as each new block device is
recognized. Unfortunately, "--incremental" doesn't create partitionable arrays
correctly whereas "--assemble" does.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Create a partitionable array
mdadm --create /dev/md_d0 -n 2 -l 0 -c 32 --auto=mdp /dev/sdb /dev/sdc
Create mdadm.conf entries
ARRAY /dev/md_d0 level=raid0 num-devices=2 auto=mdp2
Check that 5 device nodes were created
md_d0, md_d0p1, md_d0p2, md_d0p3, md_d0p4
2. Stop the array: mdadm --stop /dev/md_d0
Delete the 5 device nodes.
mdadm --incremental /dev/sdb
mdadm --incremental /dev/sdc
A single device node is created for md_d0 with a major number of 9 which is an
Device nodes md_d0, md_d0p1, md_d0p2, md_d0p3, md_d0p4 should have been
created with a major number of 254 (or another high number).
I think no one noticed before because mdadm --assemble works fine and that's how
it had been done in the past.
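A minimal sketch for checking the major numbers mentioned above. It assumes the kernel lists the non-partitionable driver as "md" (major 9) and the partitionable one as "mdp" in /proc/devices, the latter with a dynamically assigned major, which would explain the "254 (or another high number)" wording. The helper name md_majors is hypothetical; it reads /proc/devices-style text on stdin so it can be tried on sample input.

```shell
# Print the block majors registered for md and mdp.
# Input: text in the format of /proc/devices, on stdin.
md_majors() {
    awk '
        /^Block devices:/     { in_block = 1; next }
        /^Character devices:/ { in_block = 0; next }
        # Only look at the block section; a character device may reuse major 9.
        in_block && ($2 == "md" || $2 == "mdp") { print $2 "=" $1 }
    '
}
```

On a live system this would be run as `md_majors < /proc/devices`, and the result compared against the major shown by `ls -l` for the md_d0 node.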
Created attachment 306322 [details]
Patch to Incremental.c to fix.
*** Bug 446998 has been marked as a duplicate of this bug. ***
Oops, there was a typo in the steps.
With auto=mdp2 in the mdadm.conf, there should only be 3 device nodes.
md_d0 md_d0p1 md_d0p2
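A small sketch of the node-count arithmetic in this correction. It assumes the digits after auto=mdp give the number of partition nodes, and that a bare mdp means 4 (consistent with the five nodes listed in the original steps). expected_nodes is a hypothetical helper, not an mdadm command.

```shell
# List the device node names expected for an array.
# $1: base node name (e.g. md_d0)   $2: value of auto= (e.g. mdp2, mdp, md)
expected_nodes() {
    base=$1
    auto=$2
    echo "$base"
    case "$auto" in
        mdp*) count=${auto#mdp} ;;   # digits after "mdp", possibly empty
        *)    return 0 ;;            # not partitionable: only the base node
    esac
    [ -n "$count" ] || count=4       # assumed default when no count is given
    i=1
    while [ "$i" -le "$count" ]; do
        echo "${base}p${i}"
        i=$((i + 1))
    done
}
```

For example, `expected_nodes md_d0 mdp2` lists md_d0, md_d0p1 and md_d0p2, one per line.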
I have this problem as well. Seems like the md associated with the root
partition is started fine, but others with raid10 are not.
Right now I have a md device that isn't automatically assembling on boot in
Fedora 9. What is the advice here? Do I apply the patch to the SRPM and rebuild
an RPM? Do I switch back to mdadm --assemble in /etc/rc.*? Do I wait for a new
mdadm to be on updates/9 or updates/testing/9? The last update looks to be from
May 21st. Thanks for any advice. Cheers.
In response to comment #4, for now I would add the line to rc.sysinit to
assemble the devices in the /etc/mdadm.conf file until a new package is released.
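One way the workaround might look, as a sketch only. The function name assemble_from_conf and its two optional arguments are illustrative assumptions; the -A -s invocation is the one the report says rc.sysinit used in Fedora 8 and earlier.

```shell
# Assemble every array listed in the config file, if both the file and
# the mdadm binary are present. Does nothing (and succeeds) otherwise.
# $1: config file (default /etc/mdadm.conf)   $2: mdadm binary (default /sbin/mdadm)
assemble_from_conf() {
    conf=${1:-/etc/mdadm.conf}
    mdadm_bin=${2:-/sbin/mdadm}
    if [ -f "$conf" ] && [ -x "$mdadm_bin" ]; then
        "$mdadm_bin" -A -s --config="$conf"
    fi
}
```

Called with no arguments it uses the default paths. The arguments exist mainly so the function can be tried against a stand-in binary before it is added to a boot script.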
Same problem w/brand new Fedora 9 clean install. Mdadm segfaults with "error 4"
apparently due to "mdadm --incremental". Looking at /proc/mdstat shows that
raid10 arrays are defunct, but they start fine (in repair mode) with "mdadm
The bad news is that until this is fixed, Fedora 9 cannot do first boot.
I added the --scan option to the udev rule so that mdadm will look for the array
in /etc/mdadm.conf when starting it, and if found, it will get information about
that array (such as it's a partitioned array) from that file. This should solve
your problem.
mdadm-2.6.7-1.fc9 has been submitted as an update for Fedora 9
(In reply to comment #7)
> I added the --scan option to the udev rule so that mdadm will look for the array
> in /etc/mdadm.conf when starting it, and if found, it will get information about
> that array (such as it's a partitioned array) from that file. This should solve
> your problem.
Adding --scan (and the necessary --run) to --incremental doesn't solve the
problem by itself. The device nodes for the partitions are still not created.
What specifically did you have in mind?
Can you paste your mdadm.conf file here for me to look at?
ARRAY /dev/md_d0 level=raid0 num-devices=2 auto=mdp2
Thanks, I'll see if I can reproduce here.
mdadm-2.6.7-1.fc9 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update mdadm'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-5804
Sorry but mdadm-2.6.7-1.fc9 does not fix the problem.
1. The udev rule will only fire if the array is composed of partitions with a
type of 'fd'. I.E. The block devices sdb and sdc each have a partition table
with a partition set to 'fd', then the array is composed of sdb1 and sdc1. The
udev rule does NOT fire if the array is composed of the sdb and sdc devices
directly because the rule looks for a fs type of "linux_raid*".
2. If I create partitions for sdb1 and sdc1 with the types set to 'fd', and
create the array out of them, the rule fires but the nodes for the array
partitions (md_d0p1 and md_d0p2) are still not created.
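To make the matching condition in point 1 concrete, here is a sketch that applies the same glob (linux_raid*) to KEY=VALUE device properties read from stdin. would_match_raid_rule is a hypothetical helper, and it only models the fs-type match, not the rest of the udev rule.

```shell
# Return 0 if the properties on stdin include an ID_FS_TYPE matching
# linux_raid*, and 1 otherwise (including when ID_FS_TYPE is absent).
would_match_raid_rule() {
    while IFS='=' read -r key value; do
        if [ "$key" = "ID_FS_TYPE" ]; then
            case "$value" in
                linux_raid*) return 0 ;;
            esac
        fi
    done
    return 1
}
```

On a current udev version the input could come from `udevadm info --query=property --name=/dev/sdb`. If no ID_FS_TYPE is reported for the whole-disk device at all, the function returns 1, which corresponds to the rule not firing.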
(In reply to comment #14)
> Sorry but mdadm-2.6.7-1.fc9 does not fix the problem.
> 1. The udev rule will only fire if the array is composed of partitions with a
> type of 'fd'. I.E. The block devices sdb and sdc each have a partition table
> with a partition set to 'fd', then the array is composed of sdb1 and sdc1. The
> udev rule does NOT fire if the array is composed of the sdb and sdc devices
> directly because the rule looks for a fs type of "linux_raid*".
I don't have a spare disk to remove the partition table on and test this, but I
did test this by trying to see how udev responded to loopback devices that had
no disk partition on them. In that case, udev was smart enough to know it was a
linux raid device *without* a partition table, and without a partition table
type of 0xfd. Instead, it was reading from the disk and looking for a raid
superblock (I specifically tested version 0.90 and 1.1 superblocks and it found
both types, and it even changed ID_FS_VERSION to match the raid superblock type).
You can duplicate my tests yourself by running:
and checking the output. If you could please run the above command on a whole
disk device under your setup and verify whether or not it correctly identifies
the whole disk device, I would appreciate it.
Also, generally speaking, it's best to have the mdadm.conf file not specify any
device names. So, this would be a better mdadm.conf file in terms of being
resilient to possible device name changes in the event of addition/removal of a
drive, that sort of thing:
ARRAY /dev/md_d0 level=raid0 num-devices=2 auto=mdp2 UUID=<uuid of array>
In addition, I'm not positive but I don't think mdadm will automatically
associate an array entry in the mdadm.conf file with an array being started
unless there is something like UUID specified in the array line (I think it will
accept UUID or super-minor, but as super-minor won't work with a partitioned
array, UUID may be the only thing it will accept, I'm relatively certain it does
*not* accept the array device name). What this means is that the --scan portion
of the rule likely wouldn't work with your mdadm.conf file because there is not
an iron-clad match between the array being started and your array line in your
mdadm.conf file. Can you change your array line to be the output of `mdadm -Db
/dev/md_d0` and change the DEVICE line to partitions and then see if when the
rule fires, it picks up the fact that the array is supposed to be partitioned?
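A quick way to spot ARRAY lines that lack an identifying keyword, sketched under two assumptions: each ARRAY entry sits on a single line, and keywords are compared case-insensitively. arrays_without_identity is a hypothetical helper that only inspects the text and does not consult mdadm.

```shell
# Print ARRAY lines that carry neither uuid= nor super-minor=.
# Input: mdadm.conf-style text on stdin.
arrays_without_identity() {
    awk '
        toupper($1) == "ARRAY" {
            line = tolower($0)
            if (line !~ /[ \t]uuid=/ && line !~ /[ \t]super-minor=/)
                print $0
        }
    '
}
```

Run as `arrays_without_identity < /etc/mdadm.conf`. Any line it prints is a candidate for replacement with the output of `mdadm -Db` for that array, as suggested above.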
Still 2 strikes but I've got some additional info...
The udev rule does not fire for raw block devices I think because nothing ever
triggers vol_id to run. When I do a udevadm --test /block/sdb/sdb1, I can see
in the output that vol_id is run and ID_FS_TYPE is set correctly to
linux_raid_member. I can also see the mdadm rule run. When I do the same test
on /block/sdb, vol_id is never run so ID_FS_TYPE is not set and mdadm is not
run. It looks like 60-persistent-storage.rules only runs vol_id for partitions.
So, I also changed the mdadm.conf as you suggested but --incremental still
doesn't create the correct device nodes.
Thanks for checking on that. So, for the first issue, I don't think that's an
mdadm or md raid bug if udev is configured not to run vol_id on whole block
devices. I know the upstream maintainer of mdadm/md kernel driver actually
prefers running his raid arrays on the bare block devices, so I think that's
something that needs to be supported. However, I could also imagine that udev
might have problems with whole disk devices and hitting false positive matches,
which might be why they only do partitions. I think it's something that needs a
new bug under the udev component, so I'll go ahead and clone this bug for that
As to the --incremental mode still not working with the corrected mdadm.conf,
thanks for that, I'll replicate here and see if I can get it fixed.
Just a note from your friendly bug triager -- we don't use FAILS_QA state in Fedora
bugs. I think that the correct status of this bug according to
https://fedoraproject.org/wiki/BugZappers/BugStatusWorkFlow is ASSIGNED. Please
correct this bug to the right state if I am wrong.
I removed one disk for replacement and F9 no longer assembles raid arrays and
For now I added mdadm -As to rc.sysinit.
ARRAY /dev/md0 level=raid1 num-devices=2
ARRAY /dev/md1 level=raid1 num-devices=2
ARRAY /dev/md2 level=raid1 num-devices=2
ARRAY /dev/md8 level=raid1 num-devices=2
The patch required to fix this has been written, but it's rather larger than I would like. However, it's passed my testing of incremental assembly of both partitioned and normal devices with standard and non-standard names.
mdadm-2.6.7.1-1.fc9 has been submitted as an update for Fedora 9.
mdadm-2.6.7.1-1.fc9 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update mdadm'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-9325
mdadm-2.6.7.1-1.fc9 has been pushed to the Fedora 9 stable repository. If problems still persist, please make note of it in this bug report.
I am not sure if this is my problem, but I have upgraded to Fedora 10 from F9 (where I also had problems and had to do the F8->F9 upgrade with /home missing from fstab) and my RAID5 /home array is not working. Well, it is working now, in a degraded state, but basically on bootup the third partition (sdc1) of the three drive array of partitions (sd[abc]1) is being bound/registered as "md_d0" when it should be part of "md0". I had a nightmare getting the system booting at all here, but I think that was due to my fstab having the UUID. Now that I have /dev/RAID5/RAID5 in fstab the system boots, but with sdc1 being bound to md_d0.
If I unmount /home, then do "mdadm --stop /dev/md0", and "mdadm --stop /dev/md_d0", then "mdadm -As", the third disk is recognised properly, or if I just do "mdadm --stop /dev/md_d0" then "mdadm --add /dev/md0 /dev/sdc1", it brings the third disk back online and does a 3hr resync to that disk.
I can see in dmesg a similar mdadm segfault (error 4) as discussed in Comment #6:
mdadm: segfault at 10e ip 00007f0ff8aade20 sp 00007fff00dcd5f8 error 4 in
I appear to be running mdadm-2.6.7.1-1.fc10.x86_64
Upon looking through the boot information (dmesg), it looks as though the whole disk (sdc) is being recognised as part of a raid system, when it should only be the partition:
[root@mediaxp etc]# dmesg|grep bind|grep md
(this is a problem state. Then I stop /dev/md_d0 and add /dev/sdc1 to /dev/md0, as below)
Any thoughts? Did I do my RAID wrong all that time ago? :)
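The dmesg check above can be narrowed to whole-disk members with a sketch like the following. It assumes kernel messages of the form "md: bind<sdc1>" and sdX-style names in which a partition name ends in a digit; whole_disk_binds is a hypothetical helper.

```shell
# From dmesg-style text on stdin, print devices bound to an md array
# whose names contain no digits (i.e. whole disks under sdX naming).
whole_disk_binds() {
    sed -n 's/.*md: bind<\([a-z]*\)>.*/\1/p'
}
```

Run as `dmesg | whole_disk_binds`. For an array that is supposed to be built from partitions only, any output is a hint that a whole disk is being claimed as a member.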
Created attachment 324722 [details]
output of mdadm --examine of all three whole disks and all three partitions
Attached is the output of mdadm --examine of each disk and each partition.
Phew! That was scary! I zeroed the superblock on /dev/sdc, but then when I rebooted mdadm claimed sda1 and sdb1(only 1% rebuilt) for md0 and claimed sdb for md_d0. Then I zeroed the superblock on sdb and rebooted and everything is hunky-dory!
Sorry for wasting your time. I guess when I created the array I must have begun by using the whole disks, then changed my mind to using partitions. Something has changed in the way F10 recognises disks/partitions for assembling arrays and it was grabbing some of the whole disks for additional arrays.
I suppose it's not a bug then.
Sorry, I meant sda1 and sdc1(only 1% rebuilt), so the array was 2/3 failed disks.
I had the same problem as Carl as soon as I upgraded mdadm to 2.6.7.1-1.fc9.i386:
"on bootup the third partition (sdc1) of the three drive array of partitions (sd[abc]1) is being bound/registered as 'md_d0' when it should be part of 'md0'"
I believe this binding issue is happening due to the changes to udev rules in mdadm-2.6.7. See https://bugzilla.redhat.com/show_bug.cgi?id=444237 for more info about 70-mdadm.rules.
In my case, I was having to stop md_d0 and 'mdadm /dev/md0 --add /dev/sdc1' every time I rebooted the machine. Before I saw Carl's workaround about zeroing the superblock(s), I came across this thread discussing partitionable vs non-partitionable raid devices:
The fix for me was to edit my /etc/mdadm.conf and /etc/fstab and replace /dev/md0 with /dev/md_d0. All was good after a reboot. I don't know if zeroing the superblocks and calling the array /dev/md0 is better than renaming the device in a couple places and calling it /dev/md_d0. It seems like both will do the job.
Carl, I don't think you used a disk and 2 partitions when you created the array. I say that because I let the F9 installer create my array; I'm sure I told it to use 3 partitions, yet I wound up with /dev/md0 and the same problem you saw.
I don't know if it's worthwhile to reopen this bug, but I think mdadm and/or udev rules should be able to correctly identify array members regardless of the device name.
Hi Kevin. In my case, the output of mdadm --examine on the disks as a whole showed that there was a superblock on /dev/sdb and /dev/sdc, but not /dev/sda. There shouldn't have been superblocks on any of the disks as a whole. mdadm --examine on /dev/sd[abc]1 (the partitions), showed superblocks on the partitions as expected. I am not sure if it is a good idea to just zero valid superblocks, but the superblocks that I wiped out were ones that weren't supposed to be there anyway, since I know that my raid5 volume is made up of /dev/sda1, /dev/sdb1 and /dev/sdc1 - i.e. the partitions. I think there's a good chance that I did something wrong though because I didn't really know what I was doing when I created the array and the lvm volume on it all that time ago.
It's very likely that a minor update to your mdadm.conf file will resolve these issues. Specifically, most array lines in the mdadm.conf file don't contain an auto= item. This means that mdadm is guessing about whether or not your array is a partitioned array, and it's obviously getting it wrong. Adding a specific item of auto=md (for a regular, non-partitioned array) or auto=mdp (for a partitioned array) will cause mdadm to only try and create the correct type of array.
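As a rough aid for choosing the item to add, the sketch below guesses an auto= value from the device name alone. It assumes the naming used in this thread (md_dN for partitioned arrays, mdN for regular ones); suggest_auto is a hypothetical helper, and the name is only a hint, so the result should be checked against how the array is actually meant to be used.

```shell
# Suggest an auto= item for an ARRAY line based on the device name.
# $1: array device (e.g. /dev/md0 or /dev/md_d0)
suggest_auto() {
    case "${1##*/}" in
        md_d[0-9]*) echo "auto=mdp" ;;   # partitioned-array naming
        md[0-9]*)   echo "auto=md" ;;    # regular array naming
        *)          echo "unknown"; return 1 ;;
    esac
}
```

For example, `suggest_auto /dev/md0` prints auto=md, which would then be appended to the matching ARRAY line in mdadm.conf.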