Bug 543746 - System can be rendered unbootable by adding an extra device to the rootfs RAID1 array
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: mdadm
Version: 12
Hardware: All Linux
Priority: low
Severity: high
Assigned To: Doug Ledford
QA Contact: Fedora Extras Quality Assurance
Reported: 2009-12-02 19:33 EST by David Howells
Modified: 2010-04-27 23:09 EDT (History)
Fixed In Version: initscripts-9.09-1.fc13
Doc Type: Bug Fix
Last Closed: 2010-04-27 23:09:33 EDT


Description David Howells 2009-12-02 19:33:27 EST
Description of problem:

If you have a filesystem on an MD raid1 array (mirrored), in particular the rootfs, you can render your system unbootable by adding an extra live device to the array.

Version-Release number of selected component (if applicable):

mdadm-3.0.3-1.fc12.x86_64
dracut-002-13.4.git8f397a9b.fc12.noarch
kernel-2.6.31.5-127.fc12.x86_64

How reproducible:

Very.

Steps to Reproduce:

I have my desktop machine's rootfs on a raid1 (mirror) array that is normally assembled from two live devices and a spare device.  Before upgrading to F12, I made the spare live to take a backup of the rootfs, but I forgot to remove it before rebooting into the installer.

The installer was running off an installation CD, installing over the network, and was able to mount the rootfs with no problems.

However, when I booted into the system after upgrading it, it failed to boot.  It appeared to find all the drives and appeared to assemble and stop all the MD arrays.  It then just said that it wasn't able to boot and was stopping.  I could then reboot it with C-A-D.  No obvious reason was given.

Investigating further showed that pulling out the spare drive caused the rootfs MD array (md1) not to appear.  I could see the other two MD arrays being detected and assembled, but md1 wasn't mentioned at all.  The kernel now reported that the rootfs could not be found.

I put the spare disk back in, booted into the installation CD's rescue mode, extracted the initramfs image and attempted to assemble md1 manually.  At that point, mdadm reported that it wasn't even going to try assembling the array because the config file said the array had two devices, but the on-disk metadata said three.

I then got mdadm to remove the spare device from the array and rebooted, and all was fine.
  
Actual results:

The system failed to boot with no reason given.

Expected results:

The system should either report why it's refusing to assemble the MD array, or, if the devices are consistent, it should boot anyway and warn that the MD array is running in an unexpected state.  It should not just fail to boot with no reason given.
Comment 1 Doug Ledford 2009-12-02 20:04:30 EST
This is because whenever you have an ARRAY line in mdadm.conf and the UUID on that line matches the UUID of an array, *all* items on the line must match, or else the match fails and the device is not assembled.  For items that might change through the use of mdadm's --grow option or by adding disks to an array, you do not want to specify them on the ARRAY line, or else such changes render the array unassemblable.  This has already been changed in mdadm so that mdadm -Eb or mdadm -Db no longer puts changeable items into the ARRAY line it outputs and instead sticks to just the minimal data needed to positively identify the array.  Unfortunately, on upgrade, the ARRAY lines are not rewritten.  So I'm not sure there is much that can be done about this other than to caution people to edit their mdadm.conf files and reduce the ARRAY lines to just the needed data.  However, I'm positive that whatever else might be done would need to be in anaconda, not mdadm.
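
As a rough illustration of the matching rule described in this comment, the following Python sketch models an ARRAY line as a set of key=value items and rejects the match as soon as any item disagrees with the on-disk metadata.  It is not mdadm's actual code: the function names, the dictionary representation of the metadata and the placeholder UUID are all hypothetical, chosen only to mirror the two-versus-three device mismatch from this report.

```python
# Simplified, hypothetical model of strict ARRAY-line matching.
# This is NOT how mdadm is implemented; it only illustrates the rule.

# Placeholder value, not a real array UUID.
EXAMPLE_UUID = "00000000:00000000:00000000:00000000"


def parse_array_line(line):
    """Split an mdadm.conf-style ARRAY line into (device, {key: value}).

    Assumes whitespace-separated words of the form key=value after the
    device name; keys are lower-cased so UUID= and uuid= compare equal.
    """
    words = line.split()
    if len(words) < 2 or words[0].upper() != "ARRAY":
        raise ValueError("not an ARRAY line: %r" % line)
    items = {}
    for word in words[2:]:
        key, _, value = word.partition("=")
        items[key.lower()] = value
    return words[1], items


def array_line_matches(items, metadata):
    """Return (matched, mismatched_keys).

    Once the UUID identifies the array, every other item on the line
    must also agree with the metadata, otherwise the match fails.
    """
    if items.get("uuid") != metadata.get("uuid"):
        return False, ["uuid"]
    mismatched = sorted(
        key for key, value in items.items() if metadata.get(key) != value
    )
    return not mismatched, mismatched


# On-disk state after a third device was made live (values as strings).
on_disk = {"uuid": EXAMPLE_UUID, "level": "raid1", "num-devices": "3"}

# An overly strict line written when the array had two devices.
_, strict_items = parse_array_line(
    "ARRAY /dev/md1 level=raid1 num-devices=2 UUID=" + EXAMPLE_UUID
)
print(array_line_matches(strict_items, on_disk))

# A minimal line that only identifies the array.
_, minimal_items = parse_array_line("ARRAY /dev/md1 UUID=" + EXAMPLE_UUID)
print(array_line_matches(minimal_items, on_disk))
```

In this model the strict line fails on num-devices alone even though the UUID matches, while the minimal line still matches after the array has grown.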
Comment 2 David Howells 2009-12-03 07:01:55 EST
At the very least, the boot process must _say_ why it is refusing to assemble the array.  Just silently refusing to assemble the array isn't very nice.

The way I found out was to mount the system under a live CD, extract the initramfs image and attempt to assemble the array myself.  I only did that because I noticed an anomaly when I pulled out the hot-swap drive.
Comment 3 Doug Ledford 2010-02-19 14:35:29 EST
The latest mdadm (in rawhide now, will build here too) resolves this issue, but not by issuing the warning you suggest.  Instead, the standard way of creating an ARRAY line using mdadm is to do one of:

mdadm -Db /dev/md? or mdadm -Eb /dev/sd?? >> /etc/mdadm.conf

which will create a new ARRAY line for your device.  This method used to print out information that could legitimately change (such as the number of devices).  The output of this operation has been reduced to just the information that should never change, so that if a person later grows an array to a slightly different configuration, it will no longer be rendered broken by an overly strict ARRAY line in the config file.
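
For an mdadm.conf written by an older mdadm, the existing ARRAY lines may still carry items such as num-devices.  A minimal Python sketch of reducing such lines is given here.  It assumes that only num-devices needs to be dropped (the one item this report shows changing); the function names, the droppable parameter and the sample config with its placeholder UUID are hypothetical.  The sketch only transforms text and does not touch /etc/mdadm.conf, and it ignores continuation lines, so treat it as an illustration rather than a migration tool.

```python
# Hypothetical helper: strip changeable key=value items from ARRAY lines.
# Assumption: "num-devices" is the only item to drop by default; callers
# may pass other keys they consider changeable.


def reduce_array_line(line, droppable=("num-devices",)):
    """Return the line without the droppable items if it is an ARRAY line.

    Any other line (comment, DEVICE, MAILADDR, blank) is returned as-is.
    Continuation lines are not handled in this sketch.
    """
    words = line.split()
    if not words or words[0].upper() != "ARRAY":
        return line
    drop = {key.lower() for key in droppable}
    kept = [w for w in words if w.partition("=")[0].lower() not in drop]
    return " ".join(kept)


def reduce_config(text, droppable=("num-devices",)):
    """Apply reduce_array_line to every line of a config file's text."""
    reduced = "\n".join(
        reduce_array_line(line, droppable) for line in text.splitlines()
    )
    # Preserve a trailing newline if the input had one.
    return reduced + "\n" if text.endswith("\n") else reduced


# Sample config with a placeholder UUID (not taken from a real system).
sample = (
    "DEVICE partitions\n"
    "ARRAY /dev/md1 level=raid1 num-devices=2 "
    "UUID=00000000:00000000:00000000:00000000\n"
)
print(reduce_config(sample), end="")
```

The set of dropped keys is a parameter rather than a fixed list because which items count as changeable depends on how the array may later be altered; anything beyond num-devices is the caller's assumption.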
Comment 4 Fedora Update System 2010-02-19 19:02:30 EST
mdadm-3.1.1-0.gcd9a8b5.3.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/mdadm-3.1.1-0.gcd9a8b5.3.fc12
Comment 5 Fedora Update System 2010-02-19 19:02:43 EST
mdadm-3.1.1-0.gcd9a8b5.3.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/mdadm-3.1.1-0.gcd9a8b5.3.fc13
Comment 6 Fedora Update System 2010-02-19 22:49:01 EST
mdadm-3.1.1-0.gcd9a8b5.3.fc13 has been pushed to the Fedora 13 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update mdadm'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F13/FEDORA-2010-1714
Comment 7 Fedora Update System 2010-02-20 02:35:11 EST
mdadm-3.1.1-0.gcd9a8b5.3.fc12 has been pushed to the Fedora 12 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update mdadm'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2010-1891
Comment 8 Fedora Update System 2010-04-09 16:13:15 EDT
mdadm-3.1.2-9.fc13,initscripts-9.09-1.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/mdadm-3.1.2-9.fc13,initscripts-9.09-1.fc13
Comment 9 Fedora Update System 2010-04-12 21:40:44 EDT
mdadm-3.1.2-9.fc13, initscripts-9.09-1.fc13 has been pushed to the Fedora 13 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update mdadm initscripts'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/mdadm-3.1.2-9.fc13,initscripts-9.09-1.fc13
Comment 10 Fedora Update System 2010-04-27 23:08:43 EDT
initscripts-9.09-1.fc13, mdadm-3.1.2-10.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.
