Bug 447818

Summary: mdadm --incremental doesn't start partitionable arrays
Product: Fedora
Reporter: George Joseph <g.devel>
Component: mdadm
Assignee: Doug Ledford <dledford>
Status: CLOSED NEXTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: 9
CC: carl, dev, jon.fairbairn, mark, mcepl, nerijus, notting
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2008-11-19 14:47:46 UTC
Bug Depends On: 453686
Attachments:
  Patch to Incremental.c to fix.
  output of mdadm --examine of all three whole disks and all three partitions

Description George Joseph 2008-05-21 22:13:41 UTC
Description of problem:

With Fedora <= 8, rc.sysinit called "mdadm -A -s" to start all arrays in the
mdadm.conf file.  Fedora 9 removed that code and now starts arrays using udev
rules that attempt to run "mdadm --incremental" as each new block device is
recognized.  Unfortunately, "--incremental" doesn't create partitionable arrays
correctly whereas "--assemble" does.

Version-Release number of selected component (if applicable):

mdadm-2.6.4-4.fc9

How reproducible:

Always


Steps to Reproduce:
1.  Create a partitionable array
      mdadm --create /dev/md_d0 -n 2 -l 0 -c 32 --auto=mdp /dev/sdb /dev/sdc
    Create mdadm.conf entries
      DEVICE /dev/sd[bc]
      ARRAY /dev/md_d0 level=raid0 num-devices=2 auto=mdp2
        devices=/dev/sdb,/dev/sdc
    Check that 5 device nodes were created
    md_d0, md_d0p1, md_d0p2, md_d0p3, md_d0p4

2.  Stop the array: mdadm --stop /dev/md_d0
    Delete the 5 device nodes.

3.  Run 
      mdadm --incremental /dev/sdb
      mdadm --incremental /dev/sdc
  
Actual results:

A single device node is created for md_d0 with a major number of 9, i.e. a
non-partitionable array.

Expected results:

    Device nodes md_d0, md_d0p1, md_d0p2, md_d0p3, md_d0p4 should have been
created with a major number of 254 (or another high number).
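
A quick way to compare the actual and expected results is to list the device
nodes and their major numbers (device names here follow the steps above):

    ls -l /dev/md_d0*
    cat /proc/mdstat

Major 9 is the regular, non-partitionable md driver; a partitionable array
shows up under a dynamically allocated major such as 254.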


Additional info:

I think no one noticed before because mdadm --assemble works fine and that's how
it had been done in the past.

Comment 1 George Joseph 2008-05-21 22:13:41 UTC
Created attachment 306322 [details]
Patch to Incremental.c to fix.

Comment 2 George Joseph 2008-05-21 22:15:31 UTC
*** Bug 446998 has been marked as a duplicate of this bug. ***

Comment 3 George Joseph 2008-05-21 22:18:03 UTC
Oops, there was a typo in the steps.
With auto=mdp2 in mdadm.conf, there should be only 3 device nodes:
md_d0, md_d0p1, and md_d0p2.


Comment 4 Mark Mielke 2008-06-16 00:01:55 UTC
I have this problem as well.  It seems the md device associated with the root
partition is started fine, but the others, using raid10, are not.

Right now I have an md device that isn't automatically assembling on boot in
Fedora 9. What is the advice here? Do I apply the patch to the SRPM and rebuild
an RPM? Do I switch back to mdadm --assemble in /etc/rc.*? Do I wait for a new
mdadm to be on updates/9 or updates/testing/9? The last update looks to be from
May 21st. Thanks for any advice. Cheers.

Comment 5 Doug Ledford 2008-06-17 13:39:39 UTC
In response to comment #4, for now I would add the line to rc.sysinit to
assemble the devices in the /etc/mdadm.conf file until a new package is released.
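
The line in question, as rc.sysinit used it through Fedora 8 (see the
description above), is essentially:

    mdadm -A -s

with the exact path and placement inside /etc/rc.d/rc.sysinit possibly
differing slightly from release to release.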

Comment 6 josip 2008-06-21 19:36:40 UTC
Same problem w/brand new Fedora 9 clean install.  Mdadm segfaults with "error 4"
apparently due to "mdadm --incremental".  Looking at /proc/mdstat shows that
raid10 arrays are defunct, but they start fine (in repair mode) with "mdadm
--assemble".

The bad news is that until this is fixed, Fedora 9 cannot do first boot.

Comment 7 Doug Ledford 2008-06-26 23:55:05 UTC
I added the --scan option to the udev rule so that mdadm will look for the array
in /etc/mdadm.conf when starting it, and if found, it will get information about
that array (such as it's a partitioned array) from that file.  This should solve
your problem.
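
For reference, the udev rule in question then looks roughly like the
following; this is only a sketch, and the exact match keys and options in the
rule shipped in the mdadm package (70-mdadm.rules) may differ:

    SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid*", \
        RUN+="/sbin/mdadm --incremental --scan $env{DEVNAME}"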

Comment 8 Fedora Update System 2008-06-27 00:07:27 UTC
mdadm-2.6.7-1.fc9 has been submitted as an update for Fedora 9

Comment 9 George Joseph 2008-06-27 20:51:16 UTC
(In reply to comment #7)
> I added the --scan option to the udev rule so that mdadm will look for the array
> in /etc/mdadm.conf when starting it, and if found, it will get information about
> that array (such as it's a partitioned array) from that file.  This should solve
> your problem.

Adding --scan (and the necessary --run) to --incremental doesn't solve the
problem by itself.  The device nodes for the partitions are still not created. 
What specifically did you have in mind?

Comment 10 Doug Ledford 2008-06-27 20:57:51 UTC
Can you paste your mdadm.conf file here for me to look at?

Comment 11 George Joseph 2008-06-27 21:01:43 UTC
You bet...

DEVICE /dev/sd[bc]
ARRAY /dev/md_d0 level=raid0 num-devices=2 auto=mdp2
   devices=/dev/sdb,/dev/sdc

Comment 12 Doug Ledford 2008-06-27 21:12:02 UTC
Thanks, I'll see if I can reproduce here.

Comment 13 Fedora Update System 2008-06-28 22:15:41 UTC
mdadm-2.6.7-1.fc9 has been pushed to the Fedora 9 testing repository.  If
problems still persist, please make note of it in this bug report.  If you want
to test the update, you can install it with

    su -c 'yum --enablerepo=updates-testing update mdadm'

You can provide feedback for this update here:
http://admin.fedoraproject.org/updates/F9/FEDORA-2008-5804

Comment 14 George Joseph 2008-06-29 04:39:19 UTC
Sorry but mdadm-2.6.7-1.fc9 does not fix the problem.

1.  The udev rule only fires if the array is composed of partitions with a
type of 'fd', i.e. the block devices sdb and sdc each have a partition table
with a partition type set to 'fd' and the array is composed of sdb1 and sdc1.
The udev rule does NOT fire if the array is composed of the sdb and sdc devices
directly, because the rule looks for an fs type of "linux_raid*".

2.  If I create partitions sdb1 and sdc1 with the type set to 'fd' and create
the array out of them, the rule fires, but the nodes for the array partitions
(md_d0p1 and md_d0p2) are still not created.


Comment 15 Doug Ledford 2008-07-01 21:09:42 UTC
(In reply to comment #14)
> Sorry but mdadm-2.6.7-1.fc9 does not fix the problem.
> 
> 1.  The udev rule only fires if the array is composed of partitions with a
> type of 'fd', i.e. the block devices sdb and sdc each have a partition table
> with a partition type set to 'fd' and the array is composed of sdb1 and sdc1.
> The udev rule does NOT fire if the array is composed of the sdb and sdc devices
> directly, because the rule looks for an fs type of "linux_raid*".

I don't have a spare disk to remove the partition table from and test this, but
I did check how udev responds to loopback devices that have no partition table
on them.  In that case, udev was smart enough to know it was a linux raid
device *without* a partition table, and without a partition type of 0xfd.
Instead, it read from the disk and looked for a raid superblock (I specifically
tested version 0.90 and 1.1 superblocks and it found both types, and it even
changed ID_FS_VERSION to match the raid superblock type).

You can duplicate my tests yourself by running:

/lib/udev/vol_id /dev/<devicename>

and checking the output.  If you could please run the above command on a whole
disk device under your setup and verify whether or not it correctly identifies
the whole disk device, I would appreciate it.
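
For a whole-disk raid member, the output should look roughly like the
following (the values here are purely illustrative):

    ID_FS_USAGE=raid
    ID_FS_TYPE=linux_raid_member
    ID_FS_VERSION=0.90.0
    ID_FS_UUID=...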

Comment 16 Doug Ledford 2008-07-01 21:20:50 UTC
Also, generally speaking, it's best to have the mdadm.conf file not specify any
device names.  So, this would be a better mdadm.conf file in terms of being
resilient to possible device name changes in the event of addition/removal of a
drive, that sort of thing:


DEVICE partitions
ARRAY /dev/md_d0 level=raid0 num-devices=2 auto=mdp2 UUID=<uuid of array>

In addition, I'm not positive, but I don't think mdadm will automatically
associate an array entry in the mdadm.conf file with an array being started
unless something like a UUID is specified in the array line.  (I think it will
accept UUID or super-minor, but as super-minor won't work with a partitioned
array, UUID may be the only thing it will accept; I'm relatively certain it does
*not* accept the array device name.)  This means that the --scan portion of the
rule likely wouldn't work with your mdadm.conf file, because there is no
iron-clad match between the array being started and the array line in your
mdadm.conf file.  Can you change your array line to be the output of `mdadm -Db
/dev/md_d0`, change the DEVICE line to "partitions", and then see whether, when
the rule fires, it picks up the fact that the array is supposed to be
partitioned?
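
For illustration, `mdadm -Db /dev/md_d0` prints a one-line summary along these
lines (the UUID shown is a placeholder, and the exact fields vary with the
mdadm version); keep the auto=mdp2 item alongside it:

    ARRAY /dev/md_d0 level=raid0 num-devices=2
        UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx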

Comment 17 George Joseph 2008-07-01 22:15:42 UTC
Still 2 strikes but I've got some additional info...

The udev rule does not fire for raw block devices, I think because nothing ever
triggers vol_id to run.  When I do a udevadm --test on /block/sdb/sdb1, I can see
in the output that vol_id is run and ID_FS_TYPE is set correctly to
linux_raid_member.  I can also see the mdadm rule run.  When I do the same test
on /block/sdb, vol_id is never run so ID_FS_TYPE is not set and mdadm is not
run.  It looks like 60-persistent-storage.rules only runs vol_id for partitions.
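
Roughly, the comparison was (the annotations just summarize the udevadm output
described above):

    udevadm --test /block/sdb/sdb1   -> vol_id runs, ID_FS_TYPE=linux_raid_member, mdadm rule fires
    udevadm --test /block/sdb        -> vol_id never runs, ID_FS_TYPE unset, mdadm rule never fires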

So, I also changed the mdadm.conf as you suggested but --incremental still
doesn't create the correct device nodes.

Comment 18 Doug Ledford 2008-07-01 22:24:43 UTC
Thanks for checking on that.  So, for the first issue, I don't think that's an
mdadm or md raid bug if udev is configured not to run vol_id on whole block
devices.  I know the upstream maintainer of mdadm/md kernel driver actually
prefers running his raid arrays on the bare block devices, so I think that's
something that needs to be supported.  However, I could also imagine that udev
might have problems with whole disk devices and hitting false positive matches,
which might be why they only do partitions.  I think it's something that needs a
new bug under the udev component, so I'll go ahead and clone this bug for that
purpose.

As to the --incremental mode still not working with the corrected mdadm.conf,
thanks for that, I'll replicate here and see if I can get it fixed.

Comment 19 Matěj Cepl 2008-07-02 07:00:57 UTC
Just a note from your friendly bug triager -- we don't use the FAILS_QA state in
Fedora bugs.  I think the correct status of this bug according to
https://fedoraproject.org/wiki/BugZappers/BugStatusWorkFlow is ASSIGNED.  Please
correct this bug to the right state if I am wrong.

Comment 20 Nerijus Baliūnas 2008-07-15 02:43:49 UTC
I removed one disk for replacement and F9 no longer assembles raid arrays and
cannot boot.
For now I added mdadm -As to rc.sysinit.
mdadm.conf:
DEVICE partitions
ARRAY /dev/md0 level=raid1 num-devices=2
UUID=ac076057:ef44f505:144dc698:81306580
ARRAY /dev/md1 level=raid1 num-devices=2
UUID=e63c97da:6d0e4d90:17dfe16b:5c5fccf9
ARRAY /dev/md2 level=raid1 num-devices=2
UUID=3aa88406:7acc86d2:bcc1fee4:559a377d
...
ARRAY /dev/md8 level=raid1 num-devices=2
UUID=1667179c:e7ae8626:9a9c6758:a015e6ea


Comment 21 Doug Ledford 2008-10-29 18:26:51 UTC
The patch required to fix this has been written, but it's rather larger than I would like.  However, it's passed my testing of incremental assembly of both partitioned and normal devices with standard and non-standard names.

Comment 22 Fedora Update System 2008-10-30 13:55:03 UTC
mdadm-2.6.7.1-1.fc9 has been submitted as an update for Fedora 9.
http://admin.fedoraproject.org/updates/mdadm-2.6.7.1-1.fc9

Comment 23 Fedora Update System 2008-10-31 10:26:05 UTC
mdadm-2.6.7.1-1.fc9 has been pushed to the Fedora 9 testing repository.  If
problems still persist, please make note of it in this bug report.  If you want
to test the update, you can install it with

    su -c 'yum --enablerepo=updates-testing update mdadm'

You can provide feedback for this update here:
http://admin.fedoraproject.org/updates/F9/FEDORA-2008-9325

Comment 24 Fedora Update System 2008-11-19 14:47:26 UTC
mdadm-2.6.7.1-1.fc9 has been pushed to the Fedora 9 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 25 Carl Farrington 2008-11-26 14:23:53 UTC
I am not sure if this is my problem, but I have upgraded to Fedora 10 from F9 (where I also had problems and had to do the F8->F9 upgrade with /home missing from fstab), and my RAID5 /home array is not working. Well, it is working now, in a degraded state, but basically on bootup the third partition (sdc1) of the three-drive array of partitions (sd[abc]1) is being bound/registered as "md_d0" when it should be part of "md0". I had a nightmare getting the system booting at all here, but I think that was due to my fstab having the UUID. Now that I have /dev/RAID5/RAID5 in fstab the system boots, but with sdc1 being bound to md_d0.

If I unmount /home, then do "mdadm --stop /dev/md0", and "mdadm --stop /dev/md_d0", then "mdadm -As", the third disk is recognised properly; or if I just do "mdadm --stop /dev/md_d0" then "mdadm --add /dev/md0 /dev/sdc1", it brings the third disk back online and does a 3hr resync to that disk.
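
In command form, the two recovery sequences are roughly:

    umount /home
    mdadm --stop /dev/md0
    mdadm --stop /dev/md_d0
    mdadm -As

or, alternatively:

    mdadm --stop /dev/md_d0
    mdadm --add /dev/md0 /dev/sdc1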

I can see in dmesg a similar mdadm segfault (error 4) as discussed in Comment #6:
mdadm[1605]: segfault at 10e ip 00007f0ff8aade20 sp 00007fff00dcd5f8 error 4 in 
libc-2.9.so[7f0ff8a2d000+168000]
md: bind<sdc>

I appear to be running mdadm-2.6.7.1-1.fc10.x86_64

Upon looking through the boot information (dmesg), it looks as though the whole disk (sdc) is being recognised as part of a raid system, when it should only be the partition:
[root@mediaxp etc]# dmesg|grep bind|grep md
md: bind<sdc>
md: bind<sdb1>
md: bind<sda1>
(this is a problem state. Then I stop /dev/md_d0 and add /dev/sdc1 to /dev/md0, as below)
md: unbind<sdc>
md: bind<sdc1>


Any thoughts? Did I do my RAID wrong all that time ago? :)

Comment 26 Carl Farrington 2008-11-26 14:37:49 UTC
Created attachment 324722 [details]
output of mdadm --examine of all three whole disks and all three partitions

Attached is the output of mdadm --examine of each disk and each partition.
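
For reference, the output was gathered with something along the lines of:

    mdadm --examine /dev/sd[abc] /dev/sd[abc]1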

Comment 27 Carl Farrington 2008-11-26 14:53:08 UTC
Phew! That was scary! I zeroed the superblock on /dev/sdc, but then when I rebooted, mdadm claimed sda1 and sdb1 (only 1% rebuilt) for md0 and claimed sdb for md_d0. Then I zeroed the superblock on sdb and rebooted, and everything is hunky dory!
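
For the record, the stray whole-disk superblocks were removed with mdadm's
standard option for this, e.g. (run while the device is not part of a running
array):

    mdadm --zero-superblock /dev/sdc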

Sorry for wasting your time. I guess when I created the array I must have begun by using the whole disks, then changed my mind to using partitions. Something has changed in the way F10 recognises disks/partitions for assembling arrays, and it was grabbing some of the whole disks for additional arrays.

I suppose it's not a bug then.

cheers

Comment 28 Carl Farrington 2008-11-26 15:00:31 UTC
Sorry, I meant sda1 and sdc1 (only 1% rebuilt), so the array effectively had 2 of its 3 disks failed.

Comment 29 Kevin Monroe 2008-11-27 16:11:37 UTC
I had the same problem as Carl as soon as I upgraded mdadm to 2.6.7.1-1.fc9.i386:

"on bootup the third partition (sdc1) of the three drive array of partitions (sd[abc]1) is being bound/registered as 'md_d0' when it should be part of 'md0'"

I believe this binding issue is happening due to the changes to udev rules in mdadm-2.6.7. See https://bugzilla.redhat.com/show_bug.cgi?id=444237 for more info about 70-mdadm.rules.

In my case, I was having to stop md_d0 and run 'mdadm /dev/md0 --add /dev/sdc1' every time I rebooted the machine. Before I saw Carl's workaround about zeroing the superblock(s), I came across this thread discussing partitionable vs non-partitionable raid devices:

http://www.linuxfromscratch.org/pipermail/livecd/2007-September/004937.html

The fix for me was to edit my /etc/mdadm.conf and /etc/fstab and replace /dev/md0 with /dev/md_d0. All was good after a reboot. I don't know if zeroing the superblocks and calling the array /dev/md0 is better than renaming the device in a couple places and calling it /dev/md_d0. It seems like both will do the job.

Carl, I don't think you used a disk and 2 partitions when you created the array. I say that because I let the F9 installer create my array; I'm sure I told it to use 3 partitions, yet I wound up with /dev/md0 and the same problem you saw.

I don't know if it's worthwhile to reopen this bug, but I think mdadm and/or udev rules should be able to correctly identify array members regardless of the device name.

Comment 30 Carl Farrington 2008-11-27 16:24:34 UTC
Hi Kevin. In my case, the output of mdadm --examine on the disks as a whole showed that there was a superblock on /dev/sdb and /dev/sdc, but not /dev/sda. There shouldn't have been superblocks on any of the disks as a whole. mdadm --examine on /dev/sd[abc]1 (the partitions) showed superblocks on the partitions as expected. I am not sure if it is a good idea to just zero valid superblocks, but the superblocks that I wiped out were ones that weren't supposed to be there anyway, since I know that my raid5 volume is made up of /dev/sda1, /dev/sdb1 and /dev/sdc1 - i.e. the partitions. I think there's a good chance that I did something wrong, though, because I didn't really know what I was doing when I created the array and the lvm volume on it all that time ago.

Comment 31 Doug Ledford 2008-11-29 14:51:06 UTC
It's very likely that a minor update to your mdadm.conf file will resolve these issues.  Specifically, most array lines in the mdadm.conf file don't contain an auto= item.  This means that mdadm is guessing about whether or not your array is a partitioned array, and it's obviously getting it wrong.  Adding a specific item of auto=md (for a regular, non-partitioned array) or auto=mdp (for a partitioned array) will cause mdadm to try to create only the correct type of array.
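
For example (the UUIDs are placeholders; the levels and device counts here just
follow the arrays discussed earlier in this bug):

    ARRAY /dev/md0   level=raid1 num-devices=2 auto=md  UUID=<uuid of array>
    ARRAY /dev/md_d0 level=raid0 num-devices=2 auto=mdp UUID=<uuid of array>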