I have two machines, each with three RAID1 arrays for boot, the system root and data. Originally they had:

boot:  /dev/md0
/:     /dev/md1
/data: /dev/md2

Due to a dist-upgrade in the past they went to md125, md126, md127; I accepted that and changed /etc/mdadm.conf. After the recent upgrade to Fedora 20, which device maps to which partition changed yet again, as you can see in the snippet of my rebuild script, and frankly /etc/mdadm.conf seems to be ignored, because the UUIDs it contains are not honored. Interestingly, I have 4 other physical machines with the same partitioning, originally installed with Fedora 14, which never changed away from /dev/md0,1,2.

For me those are two issues:
* I want the mdX-to-partition mapping to be stable
* I want my /dev/md0, /dev/md1 and /dev/md2 back
_________________________________________________________
/dev/md126  ext4  9,5G  1,2G  8,3G  13%  /
/dev/md127  ext4  488M   73M  407M  16%  /boot
/dev/md125  ext4  907G   47G  851G   6%  /tmp
_________________________________________________________
Personalities : [raid1]
md125 : active raid1 sda2[1] sdb2[2]
      965989244 blocks super 1.1 [2/2] [UU]
      bitmap: 1/8 pages [4KB], 65536KB chunk

md126 : active raid1 sda3[1] sdb3[2]
      10238908 blocks super 1.1 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md127 : active raid1 sda1[2] sdb1[3]
      524276 blocks super 1.0 [2/2] [UU]
_________________________________________________________
part of my script, run after replacing a drive:

# start RAID recovery
mdadm /dev/md125 --add ${BAD_DISK}2
mdadm /dev/md126 --add ${BAD_DISK}1
mdadm /dev/md127 --add ${BAD_DISK}3
_________________________________________________________
[root@localhost:~]$ cat /etc/mdadm.conf
MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md126 level=raid1 num-devices=2 UUID=4a6f983f:bd1e7e97:e1d3d286:96812c9a
ARRAY /dev/md127 level=raid1 num-devices=2 UUID=a69fd22b:1a9b9b26:119d5887:18062cab
ARRAY /dev/md125 level=raid1 num-devices=2 UUID=d9bb7fc7:9df30300:7df27a6f:f2b875b8
_________________________________________________________
[root@localhost:~]$ /sbin/mdadm
--detail /dev/md127
/dev/md127:
        Version : 1.0
  Creation Time : Mon Mar 19 13:48:16 2012
     Raid Level : raid1
     Array Size : 524276 (512.07 MiB 536.86 MB)
  Used Dev Size : 524276 (512.07 MiB 536.86 MB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Fri Aug 15 11:19:49 2014
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : localhost.localdomain:0
           UUID : 4a6f983f:bd1e7e97:e1d3d286:96812c9a
         Events : 595

    Number   Major   Minor   RaidDevice State
       3       8       17        0      active sync   /dev/sdb1
       2       8        1        1      active sync   /dev/sda1
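Since the mdX numbers move between upgrades, a rebuild script can avoid hard-coding them by resolving the device from the array UUID. A minimal sketch; the scan output that `mdadm --detail --scan` would produce is inlined here (device names and UUIDs taken from the mdadm.conf above) so the example is self-contained, and `find_md_by_uuid` is an illustrative helper, not an mdadm feature:

```shell
#!/bin/sh
# Map an md array UUID to its current device node. On a live system the
# scan output would come from `mdadm --detail --scan`; here it is inlined
# (values taken from this report's mdadm.conf) so the sketch is runnable.
scan_output='ARRAY /dev/md126 metadata=1.1 UUID=4a6f983f:bd1e7e97:e1d3d286:96812c9a
ARRAY /dev/md127 metadata=1.0 UUID=a69fd22b:1a9b9b26:119d5887:18062cab
ARRAY /dev/md125 metadata=1.1 UUID=d9bb7fc7:9df30300:7df27a6f:f2b875b8'

find_md_by_uuid() {
    # print the device field of the ARRAY line carrying the given UUID
    printf '%s\n' "$scan_output" | awk -v u="UUID=$1" 'index($0, u) { print $2 }'
}

# a recovery step could then say, e.g.:
#   mdadm "$(find_md_by_uuid d9bb7fc7:9df30300:7df27a6f:f2b875b8)" --add ${BAD_DISK}2
find_md_by_uuid 4a6f983f:bd1e7e97:e1d3d286:96812c9a   # → /dev/md126
```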
This is not a bug. You are attempting to use static numbering with a version 1.0 array. Any version 1.x array is name based: the consistent name of the array is stored in /dev/md/, and the md device number used is variable, starting at 127 and counting backwards.

I suggest you rename the arrays to something like boot, root, and data, and then use /dev/md/boot, /dev/md/root, and /dev/md/data as your access method. I would change the mdadm.conf entries to reflect the new names, and use mdadm --grow to change the name.
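Under that scheme the mdadm.conf entries would take roughly this shape (a sketch only: the boot/root/data names are illustrative, and each UUID should be taken from `mdadm --detail` on the corresponding array, not from this example):

```
# name-based entries; the stable device nodes live under /dev/md/
ARRAY /dev/md/boot level=raid1 num-devices=2 UUID=<uuid of the boot array>
ARRAY /dev/md/root level=raid1 num-devices=2 UUID=<uuid of the root array>
ARRAY /dev/md/data level=raid1 num-devices=2 UUID=<uuid of the data array>
```

fstab, grub and any rebuild scripts would then reference /dev/md/root etc., which stay stable regardless of which md number gets assigned at assembly time.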
How can that be not a bug if it worked with F16 through all mdadm and other system upgrades? And besides this case with 3 RAID1 arrays: on 4 other machines with /boot as RAID1, the /dev/md0 did not get lost over 6 dist-upgrades.
(In reply to Harald Reindl from comment #2)
> how can that be not a bug if it worked with F16 over all mdadm and other
> system upgrades and besides the case with 3 RAID1 arrays on 4 machines with
> /boot as RAID1 it did not lose the /dev/md0 over 6 dist-upgrades

Because if you installed the system long enough ago, the raid is still a version 0.90 superblock raid device. It was the old version 0.90 superblocks that used the /dev/md0, /dev/md1, etc. numbering. The machine whose mdadm --detail output you posted was not installed using version 0.90 arrays, it was installed using version 1.x arrays, and they use names instead of numbers.

For a period of time there was a back compatibility hack that caused version 1.x arrays with the special name of just a number to be assembled using that number for the array. That hack might or might not still be in place, I'm not sure. You can try setting the homehost to an empty string and the name to just 0 and see if the array assembles as /dev/md0. But even if that works, understand that it is a back compatibility hack that may go away in the future. If you are using version 1.x arrays, you really need to be using names.
The whole world switches from names and labels to UUIDs, and mdadm.conf stops supporting "UUID=4a6f983f:bd1e7e97:e1d3d286:96812c9a", which is pretty clear and unique? Seriously?
________________________________
"you really need to be using names" - where?

ARRAY /dev/md126 level=raid1 num-devices=2 UUID=4a6f983f:bd1e7e97:e1d3d286:96812c9a

The "/dev/md126" is a name, and "4a6f983f:bd1e7e97:e1d3d286:96812c9a" is clear and unique.
> Because if you installed the system long enough ago, then
> the raid is still a version 0.90 superblock raid device

no, see below

# mdadm.conf written out by anaconda
MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md0 level=raid1 num-devices=4 UUID=0ee23bc1:9e046a9e:4ded3b32:7c9abd42
ARRAY /dev/md1 level=raid10 num-devices=4 UUID=87032cb1:f737b2cc:246cbf64:47ce0f8b
ARRAY /dev/md2 level=raid10 num-devices=4 UUID=08abb623:db8da3e0:dd1c29ed:fc5347dc

/dev/md0:
        Version : 1.0
  Creation Time : Mon Dec 10 16:59:16 2012
     Raid Level : raid1
     Array Size : 511988 (500.07 MiB 524.28 MB)
  Used Dev Size : 511988 (500.07 MiB 524.28 MB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

Personalities : [raid1] [raid10]
md1 : active raid10 sdd2[2] sdc2[5] sdb2[0] sda2[4]
      40956928 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md2 : active raid10 sdd1[2] sdc1[5] sda1[4] sdb1[0]
      1904636928 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      bitmap: 2/15 pages [8KB], 65536KB chunk

md0 : active raid1 sda3[4] sdd3[2] sdb3[0] sdc3[5]
      511988 blocks super 1.0 [4/4] [UUUU]
I take it you haven't read the man page in a while:

mdadm --create <device> --name <name> --homehost <homehost>

The UUID is used to get the right devices into the right array, and you can specify the array by UUID and dracut will know to assemble that array. The name and homehost are used in concert with each other to provide consistent, predictable, human readable naming of arrays, whether they are assembled on their native host (in which case they get /dev/md/<name>) or on a guest host (in which case they get /dev/md/<homehost>:<name>). You can set HOMEHOST to <any> in the mdadm.conf file if you want mdadm to ignore the homehost for naming purposes.

But if you create an array like so:

mdadm -C /dev/md/root -l1 -n2 --name root --homehost `hostname -s` /dev/sda1 /dev/sdb1

you can then do:

mdadm -Db /dev/md/root >> /etc/mdadm.conf

and you are good to go. In that case, the ARRAY line created by mdadm -Db will say /dev/md/root and the UUID of the array. But even without an ARRAY line, the array would still automatically assemble as /dev/md/root because of the --name entry you used when creating it.
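For reference, the ARRAY line that `mdadm -Db` appends would look roughly like this (the metadata version, host name and UUID here are placeholders, not values from this report):

```
ARRAY /dev/md/root metadata=1.0 name=myhost:root UUID=<uuid printed by mdadm -Db>
```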
How does that help in the case of 18 *existing* arrays on 6 machines (2 of them remote and hundreds of kilometers away)? The current behavior is a regression, and you don't want to modify arrays, with the risk of damaging something, just because someone changed behavior.
You have some machines honoring the old back compatibility hack of creating /dev/md0 arrays because of a name of 0, and some not. This is likely due to conflicting hostname entries in the arrays, in the mdadm.conf file, or in your startup environment. You cut out the full detail of md0 in comment #5, so I don't know why that one is being honored and the other one isn't. Regardless, when using version 1.x superblocks, getting /dev/md# names is a back compatibility hack; it relies on several things being just exactly so, and as such it is not what I would call reliable. Little things you wouldn't think of as mattering do matter. So if you want stability and reliability in your device naming, you should use names.

There are two options I would suggest:

1) If you really want to stick with your /dev/md0 style naming, then modify your /etc/mdadm.conf file to include this line:

HOMEHOST <any>

then use mdadm --grow to change the name and homehost of each array to this:

--homehost '' --name '<number you want the array to have, without the md prefix>'

then remove the ARRAY lines from /etc/mdadm.conf and simply leave them out (they aren't needed with version 1.x arrays). Change your fstab or any other files that reference /dev/md127 to go back to the old /dev/md0 entries instead. Change any needed kernel boot arguments in the grub config file. Rebuild your initramfs using dracut. Then reboot. Things should come up as you want.

2) Switch to using names. Use mdadm --grow to change the homehost and name setting of your arrays. The homehost is optional and only really matters if you move arrays from machine to machine, like if you have an external drive tower that you plug into multiple machines at different times. The name is whatever name in /dev/md/ you want the device to have. The rest of the instructions are identical to option 1.

You said you have 6 machines, 2 remote.
That means you have 4 chances to perfect this into the exact specific series of steps that makes the machines come up exactly the way you want. You have your option of which way you would like it to go. BTW, on the machine that is assembling the arrays as /dev/md127, what do you see if you ls /dev/md/?
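With option 1 applied, the reporter's mdadm.conf from comment #0 would reduce to something like this sketch (ARRAY lines dropped, since version 1.x arrays auto-assemble by name):

```
MAILADDR root
HOMEHOST <any>
AUTO +imsm +1.x -all
```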
well, the machines have different ages and superblocks

> BTW, on the machine that is assembling the arrays
> as /dev/md127, what do you see if you ls /dev/md/?

[root@localhost:~]$ ls /dev/md/
total 0
lrwxrwxrwx 1 root root 8 2014-08-15 11:10 localhost.thelounge.net:0 -> ../md127
lrwxrwxrwx 1 root root 8 2014-08-15 11:10 localhost.thelounge.net:1 -> ../md125
lrwxrwxrwx 1 root root 8 2014-08-15 11:10 localhost.thelounge.net:2 -> ../md126

"localhost" is a replaced real hostname

Both machines where that happens are clones, meaning:

* sent a blank disk
* shutdown via SSH
* asked a normal user to pull one of the two RAID1 disks
* replace it with the blank one
* rebuilt the array
* sent the pulled disk to my office
* inserted it in the new machine, booted and rebuilt the arrays
* customized some things

That works with RAID10 and two disks too; it's faster than dd over the network with both machines booted from Live media - been there, done that both ways with 4x2 TB :-)
(In reply to Harald Reindl from comment #9)
> well, the machines have different ages and superblocks
>
> > BTW, on the machine that is assembling the arrays
> > as /dev/md127, what do you see if you ls /dev/md/?
>
> [root@localhost:~]$ ls /dev/md/
> total 0
> lrwxrwxrwx 1 root root 8 2014-08-15 11:10 localhost.thelounge.net:0 -> ../md127
> lrwxrwxrwx 1 root root 8 2014-08-15 11:10 localhost.thelounge.net:1 -> ../md125
> lrwxrwxrwx 1 root root 8 2014-08-15 11:10 localhost.thelounge.net:2 -> ../md126
>
> "localhost" is a replaced real hostname

This is why your devices are not coming up as /dev/md0, etc. Their homehost value in the superblock does not match the output of hostname or hostname -s when the machine is booting up (meaning at initramfs time), so the array is seen as foreign and it is assembled using the /dev/md/<homehost>:<name> naming and assigned a random /dev/md# number. This isn't a regression; this is the fact that the back compatibility feature in mdadm requires specific things to be met, and the homehost entry no longer meets the requirements.

> both machines where that happens are clones, meaning:
>
> * sent a blank disk
> * shutdown via SSH
> * asked a normal user to pull one of the two RAID1 disks
> * replace it with the blank one
> * rebuilt the array
> * sent the pulled disk to my office
> * inserted it in the new machine, booted and rebuilt the arrays
> * customized some things

Probably somewhere in this "customized some things" step we got new arrays with new names, or the homehost got changed, or *something* happened to break the triggers for the back compatibility hack. This is one of the reasons I strongly suggest using names instead.
> that works too with RAID10 and two disks, it's faster than dd
> over network and booted both from Live media, been there done
> that both ways with 4x2 TB :-)

Cloning a raid disk and shipping the clone around is a valid replication method, but there are some dragons lurking in there, as you are very aptly demonstrating in this bug ;-)
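The local-vs-foreign naming rule described above can be sketched as a tiny shell function. This is only an illustration of the rule as explained in this thread, not mdadm's actual code:

```shell
#!/bin/sh
# Sketch of the naming rule for version 1.x arrays as described above:
# if the homehost stored in the superblock matches the hostname at
# assembly time, the array is "local" and gets /dev/md/<name>;
# otherwise it is "foreign" and gets /dev/md/<homehost>:<name>
# (plus a variable /dev/md# number counting down from 127).
md_symlink() {
    sb_homehost=$1; sb_name=$2; current_hostname=$3
    if [ "$sb_homehost" = "$current_hostname" ]; then
        printf '/dev/md/%s\n' "$sb_name"
    else
        printf '/dev/md/%s:%s\n' "$sb_homehost" "$sb_name"
    fi
}

# At initramfs time the hostname is still localhost.localdomain, so the
# arrays from this report come up foreign:
md_symlink localhost.thelounge.net 0 localhost.localdomain
# → /dev/md/localhost.thelounge.net:0

# Once mdadm knows the eventual homehost (e.g. via HOMEHOST in the
# mdadm.conf on the initramfs), the same array comes up local:
md_symlink localhost.thelounge.net 0 localhost.thelounge.net
# → /dev/md/0
```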
>> "localhost" is a replaced real hostname
>
> This is why your devices are not coming up as /dev/md0, etc.
> Their homehost value in the superblock does not match
> the output of hostname or hostname -s when the machine is
> booting up (meaning at initramfs time)

nope - that would possibly be true for the clone, not for the origin

For the machine from which I posted the data, it is the name "hostname" says; it even has that name in DNS and PTR, and it has had that name from the first moment. Hence why "/sbin/mdadm --detail" reports the lines below. I only replaced the hostname with "localhost" because the machine is publicly reachable.

Name : localhost.thelounge.net:1  (local to host localhost.thelounge.net)
Name : localhost.thelounge.net:2  (local to host localhost.thelounge.net)
Name : localhost.thelounge.net:0  (local to host localhost.thelounge.net)
I had a hard time getting someone else to understand this distinction before; let's hope this goes easier ;-)

Your machine's hostname is not set from "the first moment". The hostname is set during the bootup sequence, and the point at which it is set is after the point at which you assemble your arrays. At the point in time the initramfs assembles your arrays, your hostname is localhost.localdomain.

If you want to use your post-bootup hostname on arrays that are assembled early in the boot process, and you want to set the homehost to your hostname, then you need a HOMEHOST <hostname> line in your /etc/mdadm.conf file, and that file needs to be on the initramfs. That way, even though the system hostname is still localhost.localdomain, mdadm on the initramfs will know that your eventual host name will be <hostname>, and it will treat arrays with that HOMEHOST entry in their superblock as local arrays. Right now, your arrays are not being treated as local arrays.
Looks like https://bugzilla.redhat.com/show_bug.cgi?id=1015204 still led to mdadm.conf not being put into the initrd; thanks for the hint.

* added "HOMEHOST localhost.thelounge.net" to /etc/mdadm.conf
* added mdadmconf="yes" to /etc/dracut.conf.d/92-mdadm.conf
* dracut -f
* voila, the config below is respected again

[root@localhost:~]$ cat /etc/mdadm.conf
MAILADDR root
HOMEHOST localhost.thelounge.net
AUTO +imsm +1.x -all
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=4a6f983f:bd1e7e97:e1d3d286:96812c9a
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=a69fd22b:1a9b9b26:119d5887:18062cab
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=d9bb7fc7:9df30300:7df27a6f:f2b875b8