Bug 471741 - Can't install Fedora on precreated software RAID
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: mdadm
Version: 10
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: Doug Ledford
QA Contact: Fedora Extras Quality Assurance
Duplicates: 471739
Reported: 2008-11-15 08:25 EST by m.cencora
Modified: 2009-07-26 20:37 EDT
Fixed In Version: 2.6.9-1.fc10
Doc Type: Bug Fix
Last Closed: 2009-07-22 18:04:56 EDT
Attachments
strace of incremental add of /dev/sda2 on boot (242.47 KB, text/plain)
2008-11-21 13:27 EST, Bill Nottingham
strace of incremental add of /dev/sda3 on boot (10.55 KB, text/plain)
2008-11-21 13:27 EST, Bill Nottingham
Patch to make udev create md[0-9] style raid devices (323 bytes, patch)
2008-12-13 11:07 EST, Leo Bergolth

Description m.cencora 2008-11-15 08:25:28 EST
Description of problem:
I'm unable to install Fedora on precreated software RAID1.

Steps to Reproduce:
1. Create software RAID devices before starting Fedora installation.
2. Run installer.
  
I created software RAID1 devices in Ubuntu, then ran the Fedora installation. At first it didn't even recognize those devices. I found out that I need to mark the partitions that are part of these RAID devices as 0xfd to make them show up in anaconda, but then another problem arose.

When partitions are marked as 0xfd (Linux software RAID autodetection), the kernel sets up the RAID devices automatically as /dev/md_d[0-n], but anaconda tries to use /dev/md[0-n] names.

I tried disassembling the kernel-created md devices, then selected custom layout in anaconda and chose a RAID partition as root. After confirming the layout, anaconda rereads the partition tables (even though the partition layout didn't change), which causes the kernel to recreate and assemble the RAID devices under /dev/md_d[0-n] names.
At the same time anaconda is trying to assemble the same RAID devices under /dev/md[0-n] names.

Example:
my RAID partitions (created before running installation):
/dev/md0 RAID1 (/dev/sda1, /dev/sdb1)
/dev/md1 RAID1 (/dev/sda2, /dev/sdb2)
/dev/md2 RAID1 (/dev/sda5, /dev/sdb5)
/dev/md3 RAID1 (/dev/sda6, /dev/sdb6)

After confirming the custom layout in anaconda it looks like this:
/dev/md0 RAID1 (/dev/sda1, missing)
/dev/md1 RAID1 (/dev/sda2, missing)
/dev/md2 RAID1 (/dev/sda5, missing)
/dev/md3 RAID1 (/dev/sda6, missing)

/dev/md_d0 RAID1 (missing, /dev/sdb1)
/dev/md_d1 RAID1 (missing, /dev/sdb2)
/dev/md_d2 RAID1 (missing, /dev/sdb5)
/dev/md_d3 RAID1 (missing, /dev/sdb6)

So the RAID arrays assembled by anaconda are degraded and the installation is aborted.
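
For anyone trying to confirm the same split, here is a minimal sketch that lists arrays with a missing member. It assumes the usual /proc/mdstat layout, where each array has a header line starting with its name and a status string such as [U_] in which "_" marks a missing device. The function name degraded_arrays is made up for illustration.

```shell
#!/bin/sh
# List md arrays whose member status string (e.g. [U_]) shows a missing device.
# Reads /proc/mdstat-formatted text on stdin, so it also works on a saved copy.
# Assumption: header lines start with the array name ("md0 : active raid1 ...").
degraded_arrays() {
    awk '
        /^md/ { name = $1 }                        # remember the current array name
        /\[[U_]*_[U_]*\]/ && name != "" {          # status string containing "_"
            print name
            name = ""
        }
    '
}

# Usage: degraded_arrays < /proc/mdstat
```

In the situation described above, this would be expected to print both the md[0-n] and the md_d[0-n] halves of each array.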

I tried adding raid=noautodetect to the kernel boot options, but the kernel still created /dev/md_d[0-n] devices automatically. And when I mark the partitions that are part of RAID devices as e.g. 0x83 or 0x82, they don't even show up in anaconda during custom partition layout creation. So in the end I'm unable to install Fedora on software RAID devices.

Expected behaviour:
1. If kernel RAID autoassembly is enabled, anaconda should use the device names given by the kernel.
2. Anaconda should allow choosing RAID devices during custom layout even when the partitions that are part of those RAID devices are not marked as 0xfd (detect whether they are part of any RAID device with mdadm -E /dev/sd[a-z][0-n])
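
The detection suggested in point 2 could look roughly like the sketch below. This is not anaconda code; it assumes that mdadm --examine (the long form of -E) exits non-zero when it finds no md superblock on a device. The helper names is_raid_member and list_raid_members are hypothetical.

```shell
#!/bin/sh
# Return success if the given block device carries an md superblock.
# Assumption: `mdadm --examine` exits non-zero when no superblock is found.
is_raid_member() {
    mdadm --examine "$1" >/dev/null 2>&1
}

# Print every device from the argument list that is a RAID member,
# regardless of its partition type ID (0xfd, 0x83, ...).
list_raid_members() {
    for part in "$@"; do
        if is_raid_member "$part"; then
            echo "$part"
        fi
    done
}

# Usage (as root): list_raid_members /dev/sd[a-z][0-9]*
```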
Comment 1 m.cencora 2008-11-15 08:28:20 EST
*** Bug 471739 has been marked as a duplicate of this bug. ***
Comment 2 m.cencora 2008-11-15 08:30:10 EST
To sum up, there is a race condition (between kernel and anaconda) when assembling RAID devices.
Comment 3 Chris Lumens 2008-11-20 17:42:28 EST
Can you attach /tmp/anaconda.log and /tmp/syslog to this bug report?  Thanks.
Comment 4 m.cencora 2008-11-20 18:12:42 EST
I'm sorry I can't. I've already installed Fedora on normal partition, then migrated to RAID so I can't test it anymore.
Comment 5 Bill Nottingham 2008-11-21 13:26:18 EST
AFAICT, this is mdadm auto-assembly failing when there's no mdadm.conf, or the device isn't listed there. I'll attach the mdadm runs on boot from an example case of this. In this case, I have a two-partition RAID-1 array (sda2 and sda3). After booting, /proc/mdstat just shows:

Personalities:
md_d0: inactive sda3[1](S)

Attempting to (post-boot) add the other partition yields:

# /sbin/mdadm -I --auto=yes /dev/sda2
mdadm: failed to open /dev/md/d0: File exists
Comment 6 Bill Nottingham 2008-11-21 13:27:04 EST
Created attachment 324331 [details]
strace of incremental add of /dev/sda2 on boot
Comment 7 Bill Nottingham 2008-11-21 13:27:35 EST
Created attachment 324332 [details]
strace of incremental add of /dev/sda3 on boot
Comment 8 Bill Nottingham 2008-11-21 13:29:39 EST
This is with mdadm-2.6.7.1-1.fc10.
Comment 9 Bug Zapper 2008-11-26 00:27:00 EST
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 10 Leo Bergolth 2008-12-13 11:04:27 EST
The md_d[0-n] devices are created by the udev-rule 70-mdadm.rules:

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
        RUN+="/sbin/mdadm -I --auto=yes $root/%k"

If you change auto=yes to auto=md, mdadm will autocreate md[0-n] style devices and automatically start the array using this device name.
Having applied this modification, installation from a f10-live-cd works as expected. (At least on my system.)

--leo
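
A minimal sketch of applying the change described in this comment, for anyone who wants to try it by hand. The function name use_plain_md_names and the rules file location in the usage line are assumptions, so check where 70-mdadm.rules lives on your system. As noted later in this bug, this change breaks assembly of partitionable arrays.

```shell
#!/bin/sh
# Switch the mdadm udev rule from --auto=yes to --auto=md so that
# md[0-n] style device names are created instead of md_d[0-n].
# $1 is the rules file; a .bak copy of the original is kept next to it.
use_plain_md_names() {
    rules_file=$1
    cp "$rules_file" "$rules_file.bak" &&
        sed 's/--auto=yes/--auto=md/' "$rules_file.bak" > "$rules_file"
}

# Usage (path is an assumption, run as root):
#   use_plain_md_names /etc/udev/rules.d/70-mdadm.rules
```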
Comment 11 Leo Bergolth 2008-12-13 11:07:52 EST
Created attachment 326826 [details]
Patch to make udev create md[0-9] style raid devices
Comment 12 Doug Ledford 2009-03-18 13:01:06 EDT
While this patch will make normal arrays auto assemble correctly, it will break assembly of partitionable arrays.  I'm looking into a fix that will solve both cases.
Comment 13 Phil Anderson 2009-05-09 05:18:54 EDT
Please let me know if this should be reported as a different bug.

I recently installed rawhide, and reused my existing FC 10 software RAID partitions:
Raid-1 for /boot & / (reformatted during installation)
Raid-0 /data

Installation went fine, however it didn't boot.  After a lot of screwing around, I worked out that it was because mkinitrd didn't include the raid1 driver so it couldn't find the root filesystem.  I booted from a recovery CD and re-ran mkinitrd, then everything was fine.

btw, I found it extremely difficult to track this problem down, as the new boot process seems to hide error messages, even if you remove quiet & rhgb from the kernel command line.  Very frustrating indeed.  Another bug report.
Comment 14 Doug Ledford 2009-06-29 16:15:53 EDT
There are two causes for split array hysteresis that we found during the F11 development cycle:

1) Specific to the install environment: running mdadm -I on array members fails because /var/run/mdadm/mdadm.map can't be written to.  Unless mdadm can write to that file, it cannot confirm that the first device already allocated to /dev/md0 and the device being added belong to the same array, so it fails with "device already exists".

2) There is a race condition between mdadm -A and mdadm -I in the initscripts and udev files.  In this situation mdadm -I is trying to start the array one device at a time, while mdadm -A is trying to start the array all in one go.  mdadm -I will manage to get one or more of the devices prior to mdadm -A being able to lock them all down, and you end up with partially assembled arrays.

I've built a new mdadm (2.6.9) with a new udev rules file that should address the second problem.  However, the first problem would require respun install images to be solved and is beyond the scope of what I can do.
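
To check whether an environment is exposed to the first problem, a rough diagnostic sketch follows. It only tests whether the directory holding the map file exists and is writable; it does not reproduce what mdadm itself does. The function name map_dir_writable is hypothetical, and the default path is the one quoted above.

```shell
#!/bin/sh
# Report whether mdadm could create or update its map file in the given
# directory. Default directory: the path quoted in this comment.
map_dir_writable() {
    map_dir=${1:-/var/run/mdadm}
    [ -d "$map_dir" ] && [ -w "$map_dir" ]
}

if map_dir_writable; then
    echo "map directory is writable"
else
    echo "map directory is missing or not writable"
fi
```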
Comment 15 Fedora Update System 2009-06-29 16:21:27 EDT
mdadm-2.6.9-1.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/mdadm-2.6.9-1.fc10
Comment 16 Fedora Update System 2009-07-02 01:52:02 EDT
mdadm-2.6.9-1.fc10 has been pushed to the Fedora 10 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update mdadm'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F10/FEDORA-2009-7263
Comment 17 Fedora Update System 2009-07-22 18:04:36 EDT
mdadm-2.6.9-1.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 18 Bill McGonigle 2009-07-22 18:10:34 EDT
Should problem #1 from comment #14 be filed as a separate bug to be addressed for F12?
Comment 19 Doug Ledford 2009-07-25 10:10:07 EDT
No, that problem was addressed already in mdadm-3.0 (problem + patch sent upstream prior to 3.0's release, slightly different implementation that still works was included upstream).
Comment 20 Bill McGonigle 2009-07-26 20:37:42 EDT
Super, thanks Doug.