| Summary: | after update to mdadm-3.2.3-3.fc15.x86_64 server does not boot and goes to maintenance mode | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Thomas <bugzilla> |
| Component: | mdadm | Assignee: | Doug Ledford <dledford> |
| Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | high | Priority: | unspecified |
| Version: | 15 | CC: | agk, dledford, Jes.Sorensen, mbroz |
| Hardware: | x86_64 | OS: | Linux |
| Doc Type: | Bug Fix | Last Closed: | 2012-02-07 09:43:20 UTC |
| Attachments: | /etc/mdadm.conf file (attachment 559780) | | |
Description
Thomas
2012-02-06 21:15:04 UTC
Jes, we evidently need to coordinate with the systemd people on moving the md-device-map file from /dev/md to /run/mdadm. I didn't realize they had any hooks on that file in systemd. Of course, that line is also supposed to be a failsafe; there is also the question of why the array didn't simply come up during the udev device scans. We really need answers to both of those questions, I think.

Thomas, can you please verify that you have a /lib/udev/rules.d/65-md-incremental.rules file on your system. Then can you please post the output of fdisk -l /dev/sdb here, as well as the output of blkid -o udev /dev/sdb and, if you have any partitions on /dev/sdb, blkid -o udev /dev/sdb? as well. That should get us started trying to figure out why your system didn't boot up properly.

Jes, since an update is possibly rendering machines unbootable, this is going to need to be a very high priority, and we likely should file a release-engineering ticket to remove the existing update from the repos and push a new update ASAP.

In fact Jes, I think we need to move the location of the md-device-map file back to /dev/md/ for f15 and f16. I think that change is too risky for either of those releases. The reason, which just came to mind, is that if people need early boot md work and they don't rebuild all their existing initramfs images, then the old mdadm in the initramfs and the new mdadm on the root filesystem won't agree about where the md-device-map file should be, and we could get races and other bad things happening.

Doug, I was just checking this before going to bed. mdadm-3.2.3-3 does *not* have the move of /dev/md/md-device-map; that is only in mdadm-3.2.3-5, which I haven't pushed upstream yet. Why there is no /dev/md/md-device-map showing up on Thomas' system will be due to something else.

Thomas, did you run dracut -f "" after updating the mdadm package? Otherwise you will end up with a different version of mdadm/mdmon in the initramfs compared to what is on the root file system.
Jes

Thomas, could you please provide the output of /proc/mdstat and mdadm --detail /dev/md<X> of the raid, from the case where it booted correctly.
Thanks,
Jes

Jes, whew, that's good to hear ;-)

Yes, there is a udev rules file:
# ll /lib/udev/rules.d/65-md-incremental.rules
-rw-r--r-- 1 root root 3348 Nov 22 11:32
/lib/udev/rules.d/65-md-incremental.rules
Here's the output of fdisk -l for both raid1 disks:
# fdisk -l /dev/sdb
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x7cf498aa
Device Boot Start End Blocks Id System
/dev/sdb1 * 63 409662 204800 83 Linux
/dev/sdb2 409663 1953520064 976555201 fd Linux raid autodetect
# fdisk -l /dev/sdc
Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0005be44
Device Boot Start End Blocks Id System
/dev/sdc1 63 409662 204800 83 Linux
/dev/sdc2 409663 1953520064 976555201 fd Linux raid autodetect
On sdb and sdc there is an old boot partition. For a long time I had only these two disks. After adding sda as the boot disk I set up FC15 from scratch and used sdb/sdc as a raid1 for my data. The OS resides entirely on sda.
The blkid output:
# blkid -o udev /dev/sdb
(no output)
blkid for the partitions:
# blkid -o udev /dev/sdb1
ID_FS_UUID=5d2bece0-e614-46bc-8bae-bad30faa6795
ID_FS_UUID_ENC=5d2bece0-e614-46bc-8bae-bad30faa6795
ID_FS_SEC_TYPE=ext2
ID_FS_TYPE=ext3
# blkid -o udev /dev/sdb2
ID_FS_UUID=5cf496a1-e554-585d-bfe7-8010bc810f04
ID_FS_UUID_ENC=5cf496a1-e554-585d-bfe7-8010bc810f04
ID_FS_TYPE=linux_raid_member
# blkid -o udev /dev/sdc1
ID_FS_UUID=a403ca3a-4f43-4986-8c8d-aee3f6d3564d
ID_FS_UUID_ENC=a403ca3a-4f43-4986-8c8d-aee3f6d3564d
ID_FS_SEC_TYPE=ext2
ID_FS_TYPE=ext3
# blkid -o udev /dev/sdc2
ID_FS_UUID=5cf496a1-e554-585d-bfe7-8010bc810f04
ID_FS_UUID_ENC=5cf496a1-e554-585d-bfe7-8010bc810f04
ID_FS_TYPE=linux_raid_member
Doug, during my investigations I tried /sbin/mdadm -IRs on the maintenance console. This command didn't assemble the raid1; mdadm --auto-detect does the job. As far as I know the options -IRs mean incremental, run, scan. Why didn't the scan work while --auto-detect does? Maybe that helps.
Jes, I didn't run dracut -f after updating mdadm; it was included in a yum update. I searched through the changes after having problems during boot and noticed a change in mdadm. I went back to the prior version and everything works again.
The answers to your questions:
# cat /proc/mdstat
Personalities : [raid1]
md127 : active raid1 sdb2[0] sdc2[1]
976555136 blocks [2/2] [UU]
unused devices: <none>
# mdadm --detail /dev/md127
/dev/md127:
Version : 0.90
Creation Time : Fri Oct 23 17:51:12 2009
Raid Level : raid1
Array Size : 976555136 (931.32 GiB 999.99 GB)
Used Dev Size : 976555136 (931.32 GiB 999.99 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 127
Persistence : Superblock is persistent
Update Time : Mon Feb 6 23:13:41 2012
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
UUID : 5cf496a1:e554585d:bfe78010:bc810f04
Events : 0.1963
Number Major Minor RaidDevice State
0 8 18 0 active sync /dev/sdb2
1 8 34 1 active sync /dev/sdc2
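The superblock version can also be read directly from the member partitions with mdadm --examine, which works even when the array is not assembled; a minimal sketch, assuming the members are /dev/sdb2 and /dev/sdc2 as in the fdisk output above:

# read the metadata version and array UUID from each raid member
mdadm --examine /dev/sdb2 | grep -E 'Version|UUID'
mdadm --examine /dev/sdc2 | grep -E 'Version|UUID'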
Thomas, we think we know what might be wrong. Can you provide a copy of your /etc/mdadm.conf?
Thanks,
Jes

Created attachment 559780 [details]: /etc/mdadm.conf file
The requested mdadm.conf file.
Thomas, we have it figured out. Your system is configured not to boot. We think that the recent mdadm update might have started honoring something that the old mdadm was not, and hence the change. However, here's the deal. You have an old version 0.90 superblock array. In the mdadm.conf file, the line AUTO +imsm +1.x -all tells mdadm -I invocations to start any imsm or version 1.x superblock arrays, but not ddf or version 0.90 arrays. Now, if you had an ARRAY line in the mdadm.conf file with your array's uuid, it would override the fact that the conf file is telling mdadm -I not to assemble version 0.90 arrays, and your version 0.90 array would be assembled. The fact that mdadm --auto-detect works is because you are using version 0.90 arrays, and the kernel will start those itself. But that support is going away in the future, so it is best to have your array started properly by mdadm, not via the kernel. Actually, the big question in my mind, and the one we'll work on here, is why your system *ever* worked.

So, in order to get your system booting properly again, there are three options (listed in order of preference):

1) Recreate your array with a version 1.0 superblock (do not use 1.1 or 1.2; those superblocks would not occupy the same space on disk as the version 0.90 superblock does and would in fact overwrite some of your data -- switching to one of them means you must do a backup and restore of your data). It can be done over your existing superblocks without damaging your data (raid1 is easy to do this with; other raid levels work too, but you have to be sure to match things like chunk sizes). You should use the size option if you do this, just to make sure that the new superblock doesn't try to utilize more space than the 0.90 superblock did (i.e. check the current size of the md raid device, and when you recreate the raid1 array pass --size= to match it; that forces mdadm to use exactly the same number of data blocks on the new array as on the old). You can also pass --assume-clean to avoid a resync. Once this is done, your array should start coming up automatically on each boot by virtue of it being a version 1.x array. If you do this you should also change the partition type for the raid partitions to type 0xda (non-fs data); the old raid autodetect partition type is being deprecated along with raid autodetection. Something like the following should do the trick:

mdadm -S /dev/md127
fdisk /dev/sd[bc]   (change the raid partitions to type 0xda)
mdadm -C /dev/md/home -e1.0 --size=976555136 -l1 -n2 --assume-clean /dev/sd[bc]2

(/dev/md/home is just an example; version 1.x arrays are named instead of numbered, so you would also want to update fstab to refer to the array by name rather than by number.)

2) Add an ARRAY line to mdadm.conf for your existing array (and you might want to do this step even if you also do the first option). mdadm -Db /dev/md127 >> /etc/mdadm.conf will do the trick.

3) Enable auto-assembly of unknown version 0.90 arrays by changing the AUTO line in the mdadm.conf file to read AUTO +all -ddf or AUTO +imsm +1.x +0.90 -all (example lines below).

Any of those options should solve your problem. In the meantime, we need to figure out why your system went from working to not working and make sure that this isn't going to bite more people than we think it is.
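For reference, a minimal sketch of what the relevant /etc/mdadm.conf lines might look like after option 2) or 3); the UUID is the one reported by mdadm --detail earlier in this report, and only one of the two approaches is needed:

# option 2: an explicit ARRAY line for the existing 0.90 array
# (roughly what "mdadm -Db /dev/md127 >> /etc/mdadm.conf" appends)
ARRAY /dev/md127 UUID=5cf496a1:e554585d:bfe78010:bc810f04

# option 3: widen the AUTO policy so unknown 0.90 arrays are assembled as well
AUTO +imsm +1.x +0.90 -all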
Doug, thanks for the analysis and the options for fixing the problem. I think 3) is the choice I will use. I have some jobs running at the moment and will test the solution with the current mdadm package after the jobs have completed. I will update the bug with my results.

I checked an old config backup of my server (a backup taken before installing Fedora 15 on it). The /etc/mdadm.conf contained the same lines as today. A little bit of history which perhaps helps to figure out whether there may be more people having the same problem: I created the raid1 array with a Fedora 13 release. I had only the two disks in my server which form the raid, and the raid1 was created by the Fedora installer. A couple of months ago I added a new ssd disk to my server and made it the boot disk holding the OS; from that time on I used the raid1 only for my data. During the install of Fedora 15 the raid1 was not touched by the installer, but it was detected automatically until the last update of mdadm.

Doug, another question: is there an explanation why yum downgrade mdadm results in version 3.1.5 of mdadm? I did not find the previous version mdadm-3.2.2-15.fc15.x86_64 in the updates repository; I had to search the net for this particular package and install it locally after downloading it.

Doug, here are the results of my tests. Adding +0.90 in the mdadm.conf fixes the problem for me; my system now boots with the current version of mdadm. Here's a suggestion to solve the problem: in the post section of the package, check whether there are existing raid arrays on the system and check the superblock of each array. From the superblock versions you can decide how to change the mdadm.conf to work with the raid arrays currently present. My suggestion assumes that mdadm will remain backward compatible and able to work with older superblock versions in the future.

Hi Thomas,
Glad to hear it is indeed working for you with the update. Detecting arrays in the post install script and then modifying people's mdadm.conf is risky, as some things could have been put in there on purpose. I suspect the best thing we can do here is to provide documentation, and possibly emit a warning in the post install script if a 0.90 array without an ARRAY line is detected.
Cheers,
Jes
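A rough sketch of the kind of warning check Jes describes; this is not the actual %post scriptlet, just an illustration, and it assumes assembled arrays appear as /dev/md* block devices and that mdadm --detail reports "Version : 0.90" for old-format arrays, as shown above:

# warn about assembled 0.90-superblock arrays that have no ARRAY line in mdadm.conf
for md in /dev/md[0-9]*; do
    [ -b "$md" ] || continue
    detail=$(mdadm --detail "$md" 2>/dev/null) || continue
    echo "$detail" | grep -q 'Version : 0.90' || continue
    uuid=$(echo "$detail" | awk '/UUID :/ {print $3}')
    grep -qs "$uuid" /etc/mdadm.conf || \
        echo "warning: $md uses a 0.90 superblock but has no ARRAY line in /etc/mdadm.conf" >&2
done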
Thomas,
I have determined the exact culprit for the problems you were seeing.
Basically, mdadm-3.2.2 had a bug where any auto information in mdadm.conf
was not respected. This was fixed by the commit below, which is included
in the mdadm-3.2.3 update that I pushed into Fedora 15 and Fedora 16.
Given that the actual bug was that mdadm didn't behave as documented, and
it does so now, I am reluctant to make further changes to it; instead I
will make a note about this in the documentation.
Cheers,
Jes
commit b451aa4846c5ccca5447a6b6d45e5623b8c8e961
Author: NeilBrown <neilb>
Date: Thu Oct 6 13:00:28 2011 +1100
Fix handling for "auto" line in mdadm.conf
Two problems.
1/ pol_merge was ignoring the pol_auto tag so any 'auto' information
was lost
2/ If a device had no path (e.g. loop devices) or if there were no
path-based policies, we didn't bother looking for policy at all.
So path-independent policies were ignored.
Reported-by: Christian Boltz <suse-beta>
Signed-off-by: NeilBrown <neilb>
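For anyone hitting the same symptom after an mdadm update, a short sketch of the checks suggested earlier in this report: verify which mdadm is installed, then rebuild the initramfs so the early-boot copy of mdadm and mdadm.conf matches the root filesystem. It assumes the dracut and lsinitrd tools are available:

# confirm the installed mdadm version
rpm -q mdadm
# rebuild the initramfs for the running kernel so it picks up the updated mdadm and mdadm.conf
dracut -f
# optionally check that mdadm shows up in the rebuilt image
lsinitrd | grep -i mdadm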