Bug 471741 - Can't install Fedora on precreated software RAID
Summary: Can't install Fedora on precreated software RAID
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: mdadm
Version: 10
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Doug Ledford
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 471739
Depends On:
Blocks:
Reported: 2008-11-15 13:25 UTC by m.cencora
Modified: 2009-07-27 00:37 UTC (History)

Fixed In Version: 2.6.9-1.fc10
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-07-22 22:04:56 UTC
Type: ---
Embargoed:


Attachments
strace of incremental add of /dev/sda2 on boot (242.47 KB, text/plain)
2008-11-21 18:27 UTC, Bill Nottingham
strace of incremental add of /dev/sda3 on boot (10.55 KB, text/plain)
2008-11-21 18:27 UTC, Bill Nottingham
Patch to make udev create md[0-9] style raid devices (323 bytes, patch)
2008-12-13 16:07 UTC, Leo Bergolth

Description m.cencora 2008-11-15 13:25:28 UTC
Description of problem:
I'm unable to install Fedora on precreated software RAID1.

Steps to Reproduce:
1. Create software RAID devices before starting Fedora installation.
2. Run installer.
  
I created software RAID1 devices in Ubuntu, then ran the Fedora installation. At first it didn't even recognize those devices. I found out that I need to mark the partitions that are part of these RAID devices as type 0xfd to make them show up in anaconda, but then another problem arose.

When partitions are marked as 0xfd (Linux software RAID autodetection), the kernel automatically sets up the RAID devices as /dev/md_d[0-n], but anaconda tries to use /dev/md[0-n] names.

I tried disassembling the kernel-created md devices, then selected a custom layout in anaconda and chose a RAID partition as root. After I confirm the layout, anaconda rereads the partition tables (even though the partition layout didn't change), which causes the kernel to recreate and assemble the RAID devices under /dev/md_d[0-n] names.
At the same time, anaconda tries to assemble the same RAID devices under /dev/md[0-n] names.

Example:
my RAID partitions (created before running installation):
/dev/md0 RAID1 (/dev/sda1, /dev/sdb1)
/dev/md1 RAID1 (/dev/sda2, /dev/sdb2)
/dev/md2 RAID1 (/dev/sda5, /dev/sdb5)
/dev/md3 RAID1 (/dev/sda6, /dev/sdb6)

After confirming the custom layout in anaconda it looks like this:
/dev/md0 RAID1 (/dev/sda1, missing)
/dev/md1 RAID1 (/dev/sda2, missing)
/dev/md2 RAID1 (/dev/sda5, missing)
/dev/md3 RAID1 (/dev/sda6, missing)

/dev/md_d0 RAID1 (missing, /dev/sdb1)
/dev/md_d1 RAID1 (missing, /dev/sdb2)
/dev/md_d2 RAID1 (missing, /dev/sdb5)
/dev/md_d3 RAID1 (missing, /dev/sdb6)

So the anaconda-assembled RAID devices are degraded and the installation is aborted.
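
One way to spot this split state from a shell is to scan /proc/mdstat for arrays whose status map contains an underscore. The following is only a sketch: it assumes the usual /proc/mdstat layout, where each array starts with a line such as `md0 : active raid1 ...` followed by a status map like `[UU]` or `[U_]`. The function name `degraded_arrays` is hypothetical and not part of mdadm.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every md array whose status map
# (e.g. [U_] or [_U]) shows at least one missing member.
# Reads mdstat-formatted text from the file given as $1 (default /proc/mdstat).
degraded_arrays() {
    awk '
        /^md/             { name = $1 }    # remember the current array name
        /\[[U_]*_[U_]*\]/ { print name }   # status map contains "_": a member is missing
    ' "${1:-/proc/mdstat}"
}
```

Called without arguments, `degraded_arrays` reads the live /proc/mdstat. In the situation described above it would be expected to list both the md[0-n] and the md_d[0-n] halves of each array.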

I tried adding raid=noautodetect to the kernel boot options, but the kernel still created /dev/md_d[0-n] devices automatically. And when I mark the partitions that are part of RAID devices as e.g. 0x83 or 0x82, they don't even show up in anaconda during custom partition layout creation. So in the end I'm unable to install Fedora on software RAID devices.

Expected behaviour:
1. If kernel RAID autoassembly is enabled, anaconda should use the device names given by the kernel.
2. Anaconda should allow choosing RAID devices during custom layout even when the partitions that are part of those RAID devices are not marked as 0xfd (it can detect whether they belong to a RAID device with mdadm -E /dev/sd[a-z][0-n]).
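
The detection suggested in point 2 could look roughly like the sketch below. It assumes that `mdadm --examine` (the long form of `-E`) exits with a non-zero status when a device carries no md superblock. The helper names and the `MDADM` override variable are hypothetical and exist only so the sketch can be exercised without real disks.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the device carries an md superblock,
# regardless of its partition type. MDADM may be overridden for testing;
# that variable is an assumption of this sketch, not an mdadm feature.
is_raid_member() {
    "${MDADM:-mdadm}" --examine "$1" >/dev/null 2>&1
}

# Print the RAID members among the devices given as arguments.
list_raid_members() {
    for dev in "$@"; do
        if is_raid_member "$dev"; then
            echo "$dev"
        fi
    done
}
```

For example, `list_raid_members /dev/sd[a-z][0-9]*` would print only those partitions that mdadm recognizes as array members, independent of whether they are typed 0xfd, 0x83 or 0x82.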

Comment 1 m.cencora 2008-11-15 13:28:20 UTC
*** Bug 471739 has been marked as a duplicate of this bug. ***

Comment 2 m.cencora 2008-11-15 13:30:10 UTC
To sum up, there's a race condition (between the kernel and anaconda) when assembling RAID devices.

Comment 3 Chris Lumens 2008-11-20 22:42:28 UTC
Can you attach /tmp/anaconda.log and /tmp/syslog to this bug report?  Thanks.

Comment 4 m.cencora 2008-11-20 23:12:42 UTC
I'm sorry I can't. I've already installed Fedora on normal partition, then migrated to RAID so I can't test it anymore.

Comment 5 Bill Nottingham 2008-11-21 18:26:18 UTC
AFAICT, this is mdadm auto-assembly failing when there's no mdadm.conf, or the device isn't listed there. I'll attach the mdadm runs on boot from an example case of this. In this case, I have a two-partition RAID-1 array (sda2 and sda3). After booting, /proc/mdstat just shows:

Personalities:
md_d0: inactive sda3[1](S)

Attempting to (post-boot) add the other partition yields:

# /sbin/mdadm -I --auto=yes /dev/sda2
mdadm: failed to open /dev/md/d0: File exists

Comment 6 Bill Nottingham 2008-11-21 18:27:04 UTC
Created attachment 324331
strace of incremental add of /dev/sda2 on boot

Comment 7 Bill Nottingham 2008-11-21 18:27:35 UTC
Created attachment 324332
strace of incremental add of /dev/sda3 on boot

Comment 8 Bill Nottingham 2008-11-21 18:29:39 UTC
This is with mdadm-2.6.7.1-1.fc10.

Comment 9 Bug Zapper 2008-11-26 05:27:00 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 10 Leo Bergolth 2008-12-13 16:04:27 UTC
The md_d[0-n] devices are created by the udev-rule 70-mdadm.rules:

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
        RUN+="/sbin/mdadm -I --auto=yes $root/%k"

If you change auto=yes to auto=md, mdadm will autocreate md[0-n] style devices and automatically start the array using this device name.
With this modification applied, installation from an F10 live CD works as expected (at least on my system).
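
The edit described above can be sketched as a one-line sed substitution. The helper name `rules_with_auto_md` is hypothetical, and the location of 70-mdadm.rules on a given system is not stated in this report, so the file is passed in as an argument. The sketch prints the modified rules to standard output and leaves the input file untouched.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of a udev rules file with mdadm's
# --auto=yes switched to --auto=md. The input file is not modified.
rules_with_auto_md() {
    sed 's/--auto=yes/--auto=md/' "$1"
}
```

Redirecting the output to a new file and comparing it with the original before installing it keeps the change easy to revert. As noted later in this report, this change affects assembly of partitionable arrays, so it is best treated as a workaround.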

--leo

Comment 11 Leo Bergolth 2008-12-13 16:07:52 UTC
Created attachment 326826
Patch to make udev create md[0-9] style raid devices

Comment 12 Doug Ledford 2009-03-18 17:01:06 UTC
While this patch will make normal arrays auto assemble correctly, it will break assembly of partitionable arrays.  I'm looking into a fix that will solve both cases.

Comment 13 Phil Anderson 2009-05-09 09:18:54 UTC
Please let me know if this should be reported as a different bug.

I recently installed rawhide and reused my existing FC 10 software RAID partitions:
Raid-1 for /boot & / (reformatted during installation)
Raid-0 /data

Installation went fine, however it didn't boot.  After a lot of screwing around, I worked out that it was because mkinitrd didn't include the raid1 driver so it couldn't find the root filesystem.  I booted from a recovery CD and re-ran mkinitrd, then everything was fine.

btw.... I found it extremely difficult to track this problem down, as the new boot process seems to hide error messages, even if you remove quiet & rhgb from the kernel command line.  Very frustrating indeed.  Another bug report.

Comment 14 Doug Ledford 2009-06-29 20:15:53 UTC
There are two causes for split array hysteresis that we found during the F11 development cycle:

1) Specific to the install environment: running mdadm -I on array members fails because /var/run/mdadm/mdadm.map can't be written to.  Unless mdadm can write to that file, it cannot confirm that the first device already allocated to /dev/md0 and the device being added belong to the same array, so it fails with device already exists.

2) There is a race condition between mdadm -A and mdadm -I in the initscripts and udev files.  In this situation mdadm -I is trying to start the array one device at a time, while mdadm -A is trying to start the array all in one go.  mdadm -I will manage to get one or more of the devices before mdadm -A is able to lock them all down, and you end up with partially assembled arrays.

I've built a new mdadm (2.6.9) with a new udev rules file that should address the second problem.  However, the first problem would require respun install images to be solved and is beyond the scope of what I can do.
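
For the first cause, a quick check of whether the map file can be written might look like the sketch below. The function name `mdadm_map_writable` is hypothetical. The default path is the one named above, and the test only approximates what mdadm itself needs, since it checks file and directory permissions rather than running mdadm.

```shell
#!/bin/sh
# Hypothetical check: can the mdadm map file be created or updated?
# Succeeds if the map file exists and is writable, or if it does not
# exist yet but its directory exists and is writable.
mdadm_map_writable() {
    map=${1:-/var/run/mdadm/mdadm.map}
    if [ -e "$map" ]; then
        [ -w "$map" ]
    else
        [ -d "$(dirname "$map")" ] && [ -w "$(dirname "$map")" ]
    fi
}
```

Running `mdadm_map_writable || echo "map file not writable"` in the install environment would show whether incremental assembly is likely to hit this problem.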

Comment 15 Fedora Update System 2009-06-29 20:21:27 UTC
mdadm-2.6.9-1.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/mdadm-2.6.9-1.fc10

Comment 16 Fedora Update System 2009-07-02 05:52:02 UTC
mdadm-2.6.9-1.fc10 has been pushed to the Fedora 10 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update mdadm'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F10/FEDORA-2009-7263

Comment 17 Fedora Update System 2009-07-22 22:04:36 UTC
mdadm-2.6.9-1.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 18 Bill McGonigle 2009-07-22 22:10:34 UTC
Should problem #1 from comment #14 be filed as a separate bug to be addressed for F12?

Comment 19 Doug Ledford 2009-07-25 14:10:07 UTC
No, that problem was addressed already in mdadm-3.0 (problem + patch sent upstream prior to 3.0's release, slightly different implementation that still works was included upstream).

Comment 20 Bill McGonigle 2009-07-27 00:37:42 UTC
Super, thanks Doug.

