Bug 151653

Summary: anaconda fails to bring up raid device whose members have moved
Product: Fedora
Reporter: Alexandre Oliva <oliva>
Component: anaconda
Assignee: Peter Jones <pjones>
Status: CLOSED RAWHIDE
QA Contact: Mike McLean <mikem>
Severity: medium
Priority: medium
Version: rawhide
CC: arequipeno, cje, cra, cweyl, gilboad, jarod, jburgess777, k.georgiou, mattdm, tmraz, wtogami
Keywords: Reopened
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2007-05-27 20:05:18 UTC
Bug Blocks: 150226

Description Alexandre Oliva 2005-03-21 14:38:55 UTC
Although anaconda finds all raid members correctly, it uses a deprecated ioctl
to start raid arrays.  It appears that this deprecated ioctl is very dumb, in
that it takes a single member name, and uses information from the superblock to
locate the other members, without verification.  As a result, it may bring up
incomplete arrays, fail to bring them up at all, or even bring up unrelated
arrays.

As an example, I had two raid 1 arrays, one with two members (say md2) and one
with a single member (say md1).  This is just a simplified scenario; I ran into
the error with multiple 2+-component devices.  md2 had say hda5 and sda5 as
components; md1 had say sdb5 as the only component.  mdadm --examine confirmed
the array memberships.
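
To double-check memberships like these, something as simple as

    mdadm --examine /dev/hda5 /dev/sda5 /dev/sdb5

works; each member's superblock reports the UUID of the array it belongs to
(device names taken from the simplified scenario above).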

As it turned out, I recabled the box such that sda became sdb and vice-versa.
From that point on, anaconda refused to re-install the box because one of the
arrays was degraded.  When I issued `raidstart /dev/md2' from rescue mode,
starting without any running arrays, it brought up not only a degraded md2 with
hda5 only, but also md1, which was very puzzling.

The reason appears to be that the kernel (or anaconda C-level isys; I haven't
completed my investigation) almost-blindly follows the information it finds in
the superblock of the named member to locate the other members.  So, when the
dev nodes for partitions change, the kernel doesn't notice the change; it just
goes ahead and brings up all the arrays whose members are listed in the named
block device's superblock.  Neat, huh?

I suppose that's why this ioctl is said to be deprecated.  Could we perhaps
change anaconda to use mdadm to bring up raid devices, instead of duplicating
this functionality incorrectly in its own code?  I may try to code this up if
you agree with the general idea.
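
Concretely, I'm thinking of something along these lines (just a sketch; the
UUID placeholder would come from the members' superblocks):

    # assemble md2 from whichever listed members carry a matching
    # superblock UUID, instead of trusting the stale dev node names
    # recorded in one member's superblock
    mdadm --assemble /dev/md2 --uuid=<uuid-of-md2> /dev/hda5 /dev/sda5 /dev/sdb5

mdadm would then skip sdb5, whose superblock UUID doesn't match, rather than
dragging an unrelated array up.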

Version-Release number of selected component (if applicable):
anaconda-10.2.0.28-1

Comment 1 Peter Jones 2005-03-28 19:18:49 UTC
The idea doesn't sound awful.  If you're volunteering to do it, go right ahead ;)

Comment 2 Christian Iseli 2007-01-22 11:08:12 UTC
This report targets the FC3 or FC4 products, which have now been EOL'd.

Could you please check that it still applies to a current Fedora release, and
either update the target product or close it?

Thanks.

Comment 3 Matthew Miller 2007-04-06 15:15:53 UTC
Fedora Core 3 and Fedora Core 4 are no longer supported. If you could retest
this issue on a current release or on the latest development / test version, we
would appreciate that. Otherwise, this bug will be marked as CANTFIX one month
from now. Thanks for your help and for your patience.


Comment 4 Ian Pilcher 2007-04-18 21:26:03 UTC
This sounds very much like what I am seeing when I try to install Fedora 7
test 3 on my home system.  I have a number of software RAID devices spread
across 4 IDE drives.  From Fedora Core 6 to Fedora 7 test 2, these drives
have been renamed as follows:

    hde --> sda
    hdg --> sdb
    hdi --> sdc
    hdk --> sdd

As a result, anaconda is unable to see any of my software RAID devices, and
I have been unable to install any Fedora 7 test release.  This is going to
bite anyone with software RAID devices on IDE drives very hard.

Comment 5 Tomas Mraz 2007-04-19 07:32:19 UTC
Seems like it is still an issue.


Comment 6 Will Woods 2007-05-15 21:36:52 UTC
*** Bug 238926 has been marked as a duplicate of this bug. ***

Comment 7 Gilboa Davara 2007-05-16 12:33:20 UTC
Does it work when you manually start the MD device?

- Gilboa

Comment 8 Ian Pilcher 2007-05-16 16:03:53 UTC
I can start the devices with the --uuid= and --scan options.  Trying to start
them by specifying the partitions gives a "no devices found" error.
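
Concretely, something like this (array name illustrative, UUID and member
partitions elided):

    # works:
    mdadm --assemble --scan --uuid=<array-uuid>

    # fails with "no devices found":
    mdadm --assemble /dev/md0 <member-partitions>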

Comment 9 Jeremy Katz 2007-05-21 17:39:53 UTC
This should be better now that we've switched to using mdadm for
starting/stopping arrays.
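
Roughly, that means invocations along these lines (illustrative; the exact
arguments anaconda passes may differ):

    # start: assemble by superblock UUID
    mdadm --assemble --scan --uuid=<array-uuid>

    # stop: tear the array down
    mdadm --stop /dev/mdN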

Comment 10 Ian Pilcher 2007-05-21 18:22:11 UTC
(In reply to comment #9)
> This should be better now that we've switched to using mdadm for
> starting/stopping arrays

Is there installation media that can be used to test this?


Comment 11 Ian Pilcher 2007-05-23 17:31:10 UTC
I just tried installing today's Rawhide.  Some progress has been made, but
this is still not completely fixed.

What works:

* anaconda is able to use mdadm to start the software RAID devices.
* All of the software RAID devices are listed in Disk Druid, and their sizes
  are shown correctly.
* Software RAID devices that are LVM PVs are correctly identified as such,
  and the correct VG is listed in the "Mount Point/RAID/Volume" column.

What doesn't work:

* RAID members are not listed.
* Software RAID devices with an ext3 filesystem on them are not correctly
  identified as such.  Instead they are listed as "software RAID" in the
  "Type" column.  Disk Druid will not allow me to assign a mount point to
  one of these devices unless I also choose to format it.

Since my RAID-1 /boot device is shared between multiple installations, a
reformat is obviously unacceptable.  Net result, I'm still unable to install
on this system.

I disagree with closing this bug.

Comment 12 Peter Jones 2007-05-23 21:25:36 UTC
Should be fixed in anaconda-11.2.0.62-1.

Comment 13 Peter Jones 2007-05-23 21:35:14 UTC
Sorry, that should be anaconda-11.2.0.64-1.

Comment 14 cje 2007-05-24 08:40:09 UTC
*** Bug 240952 has been marked as a duplicate of this bug. ***

Comment 15 cje 2007-05-24 08:43:52 UTC
Bug 240952 was regarding an error placed in /etc/mdadm.conf by anaconda during a
fresh install of F7T4, not an upgrade, so I'm not 100% certain it really is a dup.
Can someone confirm?  I'm a bit lost in the comments on this bug and bug 238926!
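
For comparison, the ARRAY lines anaconda writes to /etc/mdadm.conf ought to
match what

    mdadm --examine --scan

prints from the on-disk superblocks; diffing the two is an easy way to tell
whether the generated file is wrong.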

Comment 16 Ian Pilcher 2007-05-25 19:12:51 UTC
Hal-ee-loo-yah!  This appears to be truly fixed in the RC.  Disk Druid
correctly detected ext3 filesystems on pre-existing software RAID devices
and allows me to use them without a reformat.

The component partitions are still not shown for software RAID devices, but
that's certainly not a blocker.

Comment 17 Will Woods 2007-05-27 20:05:18 UTC
(In reply to comment #15)
> Bug 240952 was regarding an error placed in /etc/mdadm.conf by anaconda
> during a fresh install of F7T4, not an upgrade, so I'm not 100% certain it
> really is a dup.

Creating mdadm.conf files incorrectly is a different problem from this one.  If
we're still creating mdadm.conf files incorrectly, then bug 240952 should be
reopened, but this particular bug is fixed.