Bug 596227 - Anaconda swaps names of md devices
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: anaconda
Version: 6.0
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Hans de Goede
QA Contact: Release Test Team
Depends On:
Blocks:
Reported: 2010-05-26 08:24 EDT by Alexander Todorov
Modified: 2010-07-02 16:50 EDT (History)
8 users

See Also:
Fixed In Version: anaconda-13.21.48-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 596224
Environment:
Last Closed: 2010-07-02 16:50:21 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments

  None
Description Alexander Todorov 2010-05-26 08:24:14 EDT
+++ This bug was initially created as a clone of Bug #596224 +++

+++ This bug was initially created as a clone of Bug #591970 +++

Description of problem:
Anaconda seems to hang when detecting storage devices if the disk contains RAID partitions that are members of a non-existent RAID array (I believe). If you wait longer than 5 minutes, the UI shows the available devices/partitions.

Version-Release number of selected component (if applicable):
anaconda-13.21.39-1.el6 / -0512.0 compose

How reproducible:
Always

Steps to Reproduce:
1. Install a KVM domU as described in https://bugzilla.redhat.com/show_bug.cgi?id=587442#c12
2. Anaconda will hit the above bug.
3. Start a second install for KVM domU with only one of the disks (i.e. vda)
  
Actual results:
Anaconda seems to hang while discovering storage devices. It actually takes too long (more than 5 minutes). 

Expected results:
Anaconda sees /dev/vda right away and lets the user partition the disk

Additional info:
After waiting long enough I was able to proceed to the partitioning UI. There I selected Custom partitioning and saw the following devices available:
md0 - type unknown
md1 - type unknown
vda1 - ext4 (this was /boot)
vda2 - md1 - raid member
vda3 - md0 - raid member
vda4 - extended
vda5 - swap


Why does anaconda take so long to find the vda disk (no other disks present), and why does it show the RAID devices when only one RAID partition is available?

Also, vda2 and vda3 seem to belong to the wrong arrays, or the md devices have swapped order.

--- Additional comment from atodorov@redhat.com on 2010-05-13 18:12:48 EEST ---

In storage.log you can see the 5-minute delays between log messages.

14:42:23,239 DEBUG   : md1 state is inactive
14:42:23,256 DEBUG   :               MDRaidArrayDevice.teardown: md1 ; status: False ;
14:42:23,264 DEBUG   : md1 state is inactive
14:42:23,446 DEBUG   :                 PartitionDevice.teardown: vda2 ; status: True ;
14:42:23,487 DEBUG   :                  MDRaidMember.teardown: device: /dev/vda2 ; status: False ; type: mdmember ;
14:42:23,557 DEBUG   :                  MDRaidMember.teardown: device: /dev/vda2 ; status: False ; type: mdmember ;
14:42:23,560 DEBUG   :                  PartitionDevice.teardown: vda2 ; status: True ;
14:42:23,569 DEBUG   :                   MDRaidMember.teardown: device: /dev/vda2 ; status: False ; type: mdmember ;
14:42:23,577 DEBUG   :                   MDRaidMember.teardown: device: /dev/vda2 ; status: False ; type: mdmember ;
14:47:25,129 DEBUG   :                    DiskDevice.teardown: vda ; status: True ;
14:47:25,159 DEBUG   :                     DiskLabel.teardown: device: /dev/vda ; status: False ; type: disklabel ;
14:47:25,190 DEBUG   :                     DiskLabel.teardown: device: /dev/vda ; status: False ; type: disklabel ;
14:52:28,039 DEBUG   :                  Ext4FS.supported: supported: True ;
14:52:28,188 DEBUG   :              PartitionDevice.setup: vda1 ; status: True ; orig: False ;
14:52:28,202 DEBUG   :               PartitionDevice.setupParents: kids: 0 ; name: vda1 ; orig: False ;
14:52:28,214 DEBUG   :                DiskDevice.setup: vda ; status: True ; orig: False ;
14:52:28,235 DEBUG   :               DiskLabel.setup: device: /dev/vda ; status: False ; type: disklabel ;
14:52:28,249 DEBUG   :                DiskLabel.setup: device: /dev/vda ; status: False ; type: disklabel ;
14:52:28,276 INFO    : set SELinux context for mountpoint /mnt/sysimage to None
14:52:28,439 INFO    : set SELinux context for newly mounted filesystem root at /mnt/sysimage to None
14:52:28,496 DEBUG   :              PartitionDevice.teardown: vda1 ; status: True ;
14:52:28,629 DEBUG   :               PartitionDevice.teardown: vda1 ; status: True ;
14:57:31,356 DEBUG   :                 DiskDevice.teardown: vda ; status: True ;
14:57:31,382 DEBUG   :                  DiskLabel.teardown: device: /dev/vda ; status: False ; type: disklabel ;
14:57:31,397 DEBUG   :                  DiskLabel.teardown: device: /dev/vda ; status: False ; type: disklabel ;

--- Additional comment from pm-rhel@redhat.com on 2010-05-13 18:19:49 EEST ---

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

--- Additional comment from hdegoede@redhat.com on 2010-05-14 10:58:53 EEST ---

Please reproduce and attach complete logs.

--- Additional comment from atodorov@redhat.com on 2010-05-14 11:18:05 EEST ---

Created an attachment (id=413966)
logs from the system

Oops, thanks for reminding me to attach those.

--- Additional comment from hdegoede@redhat.com on 2010-05-14 11:48:51 EEST ---

Looking at the logs, it is stuck for 5 minutes in a udev_settle call, which is likely caused by some mdadm udev rules taking ages in this scenario; moving this over to mdadm.

--- Additional comment from dledford@redhat.com on 2010-05-26 03:01:06 EEST ---

OK, there were multiple questions brought up in the original bug, so first to address the "why is this taking so long" question (which is the only real bug present; more on that later):

This happens when anaconda attempts to tear down the raid devices.  As far as I can tell it is not a udev rule at all; it seems to be a kernel issue instead.  However, debugging the issue is slow because the installer environment contains so few things that would be useful in figuring it out (strace, for example).

Anyway, here's what I know so far. I'm detailing it here in case any of this rings a bell for Hans; if it does, he might know something I don't and be able to shortcut the time to solve this issue:

- Anaconda brings up the md devices to inspect them, then tries to stop them before proceeding.
- When anaconda calls mdadm to stop the device, the device is stopped immediately.
- Anaconda then calls udevadm settle, which waits 5 minutes because things never actually settle.
- Immediately after the md device is stopped, a continuous loop of kernel add/remove actions and subsequent udev add/remove events starts happening.
- If you run ls /sys/block over and over, you will see devices popping into and out of existence repeatedly. Unless udev has the ability to force the kernel to actually create kernel structures and devices, I don't think this can be attributed to udev rules. udev is simply trying to process the endless stream of kernel events, so we see udev looping forever and blame udev when in fact the kernel is driving this issue.
- The installed system kernel does not exhibit this behaviour.

That's my analysis so far, I'll continue working on it.
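Doug's "ls /sys/block over and over" observation can be sketched as a small watcher. This is an illustrative Python sketch, not anything from the installer; `watch_block_devices` and `diff_snapshots` are hypothetical helpers:

```python
import os
import time

def diff_snapshots(prev, cur):
    """Return (added, removed) device names between two /sys/block snapshots."""
    return sorted(cur - prev), sorted(prev - cur)

def watch_block_devices(seconds=10, interval=0.2, sysfs="/sys/block"):
    """Poll sysfs and report block devices popping into or out of existence."""
    prev = set(os.listdir(sysfs))
    deadline = time.time() + seconds
    while time.time() < deadline:
        time.sleep(interval)
        cur = set(os.listdir(sysfs))
        added, removed = diff_snapshots(prev, cur)
        if added or removed:
            print("added: %s removed: %s" % (added, removed))
        prev = cur

# a device appearing between two snapshots shows up as "added"
print(diff_snapshots({"vda"}, {"vda", "md0"}))  # (['md0'], [])
```

On a machine exhibiting the bug described here, `watch_block_devices()` would print a steady stream of md devices being added and removed.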

As for "why do the raid devices show up when not all member disks are present?"  Each device in an array contains a complete superblock that details the array.  Anaconda simply notices that the device has a superblock and pulls the raid array info from that superblock.  There is no attempt to verify that there are sufficient members present for the array to be successfully started before the array info is added into the list of devices to install on.
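The missing sufficiency check Doug describes could look something like this (a hypothetical helper, not anaconda code; the per-level thresholds are the usual md minimums):

```python
def array_is_startable(level, total_devices, members_found):
    """Crude check: have we found enough members to start the array?

    raid0 needs every member; raid1 can start degraded with one;
    raid5 tolerates one missing member, raid6 two.
    """
    needed = {0: total_devices,
              1: 1,
              5: total_devices - 1,
              6: total_devices - 2}.get(level, total_devices)
    return members_found >= needed
```

With a check like this, a two-member raid0 with only one member present would not be offered as an install target.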

As for "md0 and md1 seem to be the wrong order" that's because they *are* in the wrong order.  The arrays are added to the device list when we first find a device with a superblock.  Which device we find first (/dev/vda2 or /dev/vda3) is random because we find them in the order that udev processes them and udev can process the partitions on a drive in any order.  According to the storage.log, /dev/vda3 was found first, and according to the blkid information it has a raid superblock that clearly indicates that it *wants* the name /dev/md1, but because it was found first, anaconda blindly renamed it to /dev/md0.  And when /dev/vda2 was processed, the logs show it has a superblock that clearly indicated it wanted the name /dev/md0, but because it was already taken, it was blindly renamed to /dev/md1.
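The discovery-order renumbering described above can be illustrated with a toy model (`blind_renumber` and `honor_preferred` are hypothetical sketches, not anaconda code):

```python
def blind_renumber(discovered):
    """Name arrays md0, md1, ... purely in discovery order, ignoring the
    preferred name recorded in each member's superblock."""
    return {member: "md%d" % i for i, (member, _pref) in enumerate(discovered)}

def honor_preferred(discovered):
    """Use the superblock's preferred name when it is free; otherwise fall
    back to the first unused minor."""
    names = {}
    taken = set()
    for member, pref in discovered:
        name = pref if pref not in taken else None
        if name is None:
            i = 0
            while "md%d" % i in taken:
                i += 1
            name = "md%d" % i
        names[member] = name
        taken.add(name)
    return names

# udev happened to process vda3 (which wants md1) before vda2 (which wants md0)
order = [("vda3", "md1"), ("vda2", "md0")]
print(blind_renumber(order))   # {'vda3': 'md0', 'vda2': 'md1'} -- swapped!
print(honor_preferred(order))  # {'vda3': 'md1', 'vda2': 'md0'}
```

The blind scheme produces exactly the swap seen in this bug whenever udev delivers the partitions out of order.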

I consider this a *SERIOUS* shortcoming in how anaconda processes md raid devices, but I was told that changing anaconda so that it didn't renumber arrays in the order that it finds them was too drastic a change for f13/rhel6 and would have to wait for f14 (or later) and rhel7.  All I know is that since we switched to letting udev find devices for us, we can no longer count on devices being found "in order", so letting anaconda renumber raid devices like this is a recipe for disastrous data loss.  Somewhere, someone is going to blindly think they are preserving their /home filesystem when they select /dev/md0 for upgrade and leave /dev/md1 unformatted, and then we are going to wipe out everything they care about.  Personally, I think this is important enough to either A) open a new bug about this behaviour and make it an RC blocker, or B) disable upgrades on md raid arrays entirely.  Under option B, if any md raid arrays exist and are to be preserved, the install could only happen on newly created partitions.  We could then tell users to boot the DVD in rescue mode, hand-select the raid arrays they want to upgrade, and zero their superblocks by hand.  Those devices would no longer appear in any md device listings and their space would be free, so anaconda could not confuse the user by putting arrays in the wrong order, and the user could do the install simply by reusing the now empty partitions for a brand new md raid array.

--- Additional comment from dledford@redhat.com on 2010-05-26 03:05:05 EEST ---

*** Bug 589981 has been marked as a duplicate of this bug. ***

--- Additional comment from atodorov@redhat.com on 2010-05-26 15:21:45 EEST ---

Hi Doug and others,
let's use this bug to track the issue where anaconda is swapping the names of md devices. 

As Doug says, this can confuse the user and cause data loss, which is a bad thing.
Comment 1 Alexander Todorov 2010-05-26 08:24:51 EDT
Cloning for RHEL6 (some update) or RHEL7.
Comment 2 RHEL Product and Program Management 2010-05-26 08:36:47 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.
Comment 3 Hans de Goede 2010-05-26 08:52:54 EDT
(In reply to comment #0)
> As for "md0 and md1 seem to be the wrong order" that's because they *are* in
> the wrong order.  The arrays are added to the device list when we first find a
> device with a superblock.  Which device we find first (/dev/vda2 or /dev/vda3)
> is random because we find them in the order that udev processes them and udev
> can process the partitions on a drive in any order.  According to the
> storage.log, /dev/vda3 was found first, and according to the blkid information
> it has a raid superblock that clearly indicates that it *wants* the name
> /dev/md1, but because it was found first, anaconda blindly renamed it to
> /dev/md0.  And when /dev/vda2 was processed, the logs show it has a superblock
> that clearly indicated it wanted the name /dev/md0, but because it was already
> taken, it was blindly renamed to /dev/md1.
> 
> I consider this a *SERIOUS* shortcoming in how anaconda processes md raid
> devices, but I was told that changing anaconda so that it didn't renumber
> arrays in the order that it finds them was too drastic of a change for
> f13/rhel6 and it would have to wait for f14 (or later) and rhel7.

I'm afraid this is a misunderstanding. What we wanted to delay until Fedora 14 /
a later RHEL is allowing the user to specify a name for an mdraid array, as we allow for volume groups or logical volumes. We do, however, try to take the current superblock device name into account when it matches the md# scheme, and if we fail to do that, we have a bug.

Here is the code in question:

            # try to name the array based on the preferred minor
            md_info = devicelibs.mdraid.mdexamine(device.path)
            md_path = md_info.get("device", "")
            md_name = devicePathToName(md_info.get("device", ""))
            if md_name:
                try:
                    minor = int(md_name[2:])     # strip off leading "md"
                except (IndexError, ValueError):
                    minor = None  
                    md_name = None
                else:
                    array = self.getDeviceByName(md_name)
                    if array and array.uuid != md_uuid:
                        md_name = None

The essence is the call to devicelibs.mdraid.mdexamine(device.path),
which runs:
mdadm --examine --brief /dev/vda#

We then parse its output. We seem to have a bug somewhere in
this code, and I fully agree this is something we should fix for RHEL-6.
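The parsing step can be sketched roughly like this (an assumed, simplified reading of an `mdadm --examine --brief` ARRAY line, not the actual mdexamine implementation; the sample line mirrors the one quoted in the next comment):

```python
def parse_examine_brief(line):
    """Parse 'ARRAY <device> key=val ...' into a dict, mdexamine-style."""
    fields = line.split()
    if not fields or fields[0] != "ARRAY":
        return {}
    info = {"device": fields[1]}
    for token in fields[2:]:
        key, _, value = token.partition("=")
        info[key.lower()] = value
    return info

print(parse_examine_brief("ARRAY /dev/md/1 metadata=1.1 UUID=abcd"))
# {'device': '/dev/md/1', 'metadata': '1.1', 'uuid': 'abcd'}
```

Note that the device field comes back exactly as mdadm printed it, which is where the `/dev/md/1` vs `/dev/md1` trouble below enters.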
Comment 4 Hans de Goede 2010-05-26 08:56:15 EDT
And here we have our problem, from program.log from bug 591970:

ARRAY /dev/md/1 metadata=1.1 UUID=

Notice the /dev/md/1 rather than /dev/md1; that is not what anaconda expects. One patch coming up.
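A sketch of the kind of normalization such a patch needs (hypothetical code, not the actual fix): extract the preferred minor whether the superblock records the name as `/dev/md1` or `/dev/md/1`:

```python
import os
import re

def md_minor_from_path(md_path):
    """Return the preferred minor from an md device path, or None.

    Handles both '/dev/md1' (basename 'md1') and '/dev/md/1' (basename '1').
    """
    name = os.path.basename(md_path)
    m = re.match(r"(?:md)?(\d+)$", name)
    return int(m.group(1)) if m else None

print(md_minor_from_path("/dev/md1"))   # 1
print(md_minor_from_path("/dev/md/1"))  # 1
```

Non-md paths such as `/dev/vda2` yield None, so the caller can fall back to its existing "no preferred name" handling.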
Comment 5 Hans de Goede 2010-05-26 09:31:28 EDT
A patch which should fix this has been sent to the mailing list for review; adding a devel-ack.
Comment 6 Hans de Goede 2010-05-26 10:25:11 EDT
This is fixed in anaconda-13.21.48-1, moving to modified.
Comment 7 Doug Ledford 2010-05-26 12:31:56 EDT
Thanks Hans, the idea of devices silently swapping around was a scary one ;-)
Comment 9 Alexander Todorov 2010-06-28 09:19:21 EDT
Tested with anaconda-13.21.50-9.el6 (0622.1 tree) and steps to reproduce from comment #0. Now anaconda correctly shows that vda2 belongs to md0 and vda3 belongs to md1. Moving to VERIFIED.
Comment 10 releng-rhel@redhat.com 2010-07-02 16:50:21 EDT
Red Hat Enterprise Linux Beta 2 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.
