Bug 645283 - livecd-tools should create a /etc/mdadm.conf so that Intel BIOS RAID arrays get auto started
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: livecd-tools
Version: 14
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Brian Lane
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: RejectedBlocker
Depends On:
Blocks:
Reported: 2010-10-21 08:35 UTC by Hans de Goede
Modified: 2011-02-17 18:13 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-02-17 18:13:52 UTC
Type: ---
Embargoed:



Description Hans de Goede 2010-10-21 08:35:30 UTC
Hi,

As discussed in various other places, the livecd currently fails to activate Intel BIOS RAID sets on boot. This is a rather serious problem, because while a set is not activated
Nautilus may mount partitions on the raw member disk(s), which will make the set go out of sync and could potentially corrupt data.

I've investigated this a bit this morning and the solution is simple.

The activation (assembling) of the raid sets happens from udev rules and
we have 3 ways of booting Fedora which are relevant here:
1) Start the installation DVD. In this case we don't want the arrays to
   be automatically started, and anaconda's init disables the udev rules
   in question before starting udev -> ok

2) Start an installed system. The udev rules are active here, but they
   depend on there being an mdadm.conf with the following line in it:
   AUTO +imsm +1.x -all

   That is not a problem either since anaconda writes out an mdadm.conf
   with this in there when installing the system -> ok

3) Start a livecd. The udev rules are active here, but there is no mdadm.conf


Case 3) is the problem we are seeing here. The fix is to teach livecd-tools to
   write an /etc/mdadm.conf with the following contents when creating the
   livecd:

# mdadm.conf for livecd
MAILADDR root
AUTO +imsm +1.x -all
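
For illustration only, here is a minimal Python sketch of what writing this file into the image root at build time could look like. The helper name `write_mdadm_conf` and the `install_root` parameter are hypothetical and are not taken from livecd-tools; only the file contents come from the proposal above.

```python
import os

# Contents proposed in the description above.
MDADM_CONF = (
    "# mdadm.conf for livecd\n"
    "MAILADDR root\n"
    "AUTO +imsm +1.x -all\n"
)


def write_mdadm_conf(install_root):
    """Write etc/mdadm.conf under install_root and return its path.

    Hypothetical helper for illustration; not part of livecd-tools.
    """
    etc_dir = os.path.join(install_root, "etc")
    # The etc directory normally exists already; create it if it does not.
    os.makedirs(etc_dir, exist_ok=True)
    path = os.path.join(etc_dir, "mdadm.conf")
    with open(path, "w") as conf:
        conf.write(MDADM_CONF)
    return path
```

Such a helper would be called once with the path of the unpacked image root, after packages are installed and before the image is compressed.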


Regards,

Hans

Comment 1 Hans de Goede 2010-10-21 10:50:39 UTC
Hmm,

Further testing has revealed that the sets do get started from the livecd without an mdadm.conf. Still it would be good to have an mdadm.conf as described above present.

Comment 2 Fedora Admin XMLRPC Client 2010-10-21 17:56:22 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 3 Adam Williamson 2010-10-21 18:00:22 UTC
So I built a live image with the recommended mdadm.conf and the entire mdadm line taken out of rc.sysinit. Doesn't change the behaviour on my test system: I still get /dev/md126 and /dev/md127 with no partitions. /var/log/messages looks the same AFAICT, I still get the Buffer I/O Error lines and stuff.

The /etc/mdadm.conf on the installed system (which works, remember) has UUID lines for each device.



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 4 Adam Williamson 2010-10-21 19:17:28 UTC
I don't think this is a blocker, given the practical experience we have with RAID testing. I also don't think it should be NTH, given the uncertainty we see with comment #1. It's so late we should probably leave things as they are unless we find an actual case of data loss or broken installation.




Comment 5 James Laska 2010-10-21 19:30:54 UTC
(In reply to comment #4)
> I don't think this is a blocker, given the practical experience we have with
> RAID testing. I also don't think it should be NTH, given the uncertainty we see
> with comment #1. It's so late we should probably leave things as they are
> unless we find an actual case of data loss or broken installation.

Certainly sounds like an area of concern we should consider investing in.  However, I agree with Adam, I don't think we have enough understanding of the core issue yet, and therefore, not enough to determine whether this is of high enough impact to users.

Does the creation/presence of the mdadm.conf have any impact to non-intel BIOS RAID systems that may have previously not had this issue?

As a workaround, can the liveuser manually create the mdadm.conf as described in comment#0 and rerun a udev scan?

Comment 6 Hans de Goede 2010-10-22 07:27:15 UTC
(In reply to comment #3)
> So I built a live image with the recommended mdadm.conf and the entire mdadm
> line taken out of rc.sysinit . Doesn't change the behaviour on my test system:
> I still get /dev/md126 and /dev/md127 with no partitions. /var/log/messages
> looks the same AFAICT, I still get the Buffer I/O Error lines and stuff.
> 

Ok, so I guess that without an AUTO line in mdadm.conf, mdadm will auto assemble anything, which is not quite what I expected.
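
To make the AUTO semantics concrete, here is a simplified Python sketch of the first-match-wins rule as described in mdadm.conf(5): each word is a metadata name prefixed with + or -, `all` matches anything, and if nothing matches, auto assembly is allowed. The function name is made up for this example, `metadata` is assumed to be one of the names the man page lists (0.90, 1.x, ddf, imsm), and other AUTO words such as `homehost` are deliberately ignored.

```python
def auto_assembly_allowed(auto_line, metadata):
    """Return True if auto_line permits auto assembly for this metadata type.

    Simplified model of the AUTO line in mdadm.conf; illustration only.
    """
    words = auto_line.split()
    # Drop the leading keyword if the full config line was passed in.
    if words and words[0].upper() == "AUTO":
        words = words[1:]
    for word in words:
        sign, name = word[0], word[1:]
        if sign not in "+-":
            continue  # unsigned words (e.g. "homehost") are not modelled here
        if name == "all" or name == metadata:
            return sign == "+"  # the first match decides
    # No match, or no AUTO line at all: auto assembly is allowed.
    return True
```

With "AUTO +imsm +1.x -all" this allows imsm and 1.x and rejects ddf and 0.90, while an empty line allows everything, which matches the behaviour observed above.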

Could you perhaps make this specific livecd (with the mdadm.conf) available to red_alert, to see if it fixes his issue of the set not being activated by the livecd at all?

> The /etc/mdadm.conf on the installed system (which works, remember) has UUID
> lines for each device.

I don't think the UUID lines are relevant (you could try putting them in though). I think this (problems reading the partition table) is a timing issue which is caused by certain things running slower / faster from the livecd. It is weird though.


(In reply to comment #4)
> I don't think this is a blocker, given the practical experience we have with
> RAID testing. I also don't think it should be NTH, given the uncertainty we see
> with comment #1. It's so late we should probably leave things as they are
> unless we find an actual case of data loss or broken installation.

Agreed. It would be nice to have a proper fix for this, but given the amount of remaining time I think we should CommonBugs 645293, and leave it at that.


(In reply to comment #5)
> Does the creation/presence of the mdadm.conf have any impact to non-intel BIOS
> RAID systems that may have previously not had this issue?

No

> As a workaround, can the liveuser manually create the mdadm.conf as described
> in comment#0 and rerun a udev scan?

Well it seems that having the mdadm.conf (although still the right thing to do) does not help with the issues we are seeing.

Comment 7 James Laska 2010-10-22 16:12:52 UTC
Thanks for the feedback Hans.  We discussed this during the 2010-10-22 F-14-Final blocker bug review meeting.  For reasons discussed in previous comments, the group decided that this bug shouldn't qualify as a Blocker or nice-to-have bug.

I'm adding CommonBugs keyword so we can appropriately document this issue for users.  Thanks!

Comment 8 James Laska 2010-11-01 15:47:08 UTC
I have documented this issue at https://fedoraproject.org/wiki/Common_F14_bugs#intel_bios_raid.  Please feel free to adjust the documentation as needed.

Comment 9 Adam Williamson 2010-11-01 16:47:08 UTC
I don't think the note is accurate. Testing indicated that the presence or absence of such a config file didn't actually make any difference to whether or not the live environment attempts to construct the array. It fails to construct certain arrays correctly, but this has nothing to do with the presence or absence of mdadm.conf.

(moral: don't always trust the initial bug description. :>)




Comment 10 James Laska 2010-11-01 16:58:21 UTC
(In reply to comment #9)
> I don't think the note is accurate. Testing indicated that the presence or
> absence of such a config file didn't actually make any difference to whether or
> not the live environment attempts to construct the array. 
>
> <snip>
> 
> (moral: don't always trust the initial bug description. :>)

Thanks for the feedback, I wasn't sure about this issue so I definitely wanted to confirm with folks on this bug.

> It fails to construct
> certain arrays correctly, but this has nothing to do with the presence or
> absence of mdadm.conf.

Indeed, it seems I missed that bit of feedback from Hans.  If someone can point out a workaround, I'll be happy to update the common bugs link.  Thanks!

Comment 11 Adam Williamson 2010-11-01 17:13:27 UTC
I don't think there's really any need to have a common bugs entry related to this bug at all, in fact. The bug the common bugs entry should relate to is https://bugzilla.redhat.com/show_bug.cgi?id=645293, and I see you already wrote an entry for that, so concentrate on that one and forget this one.




Comment 12 Brian Lane 2011-02-17 18:13:52 UTC
Closing as WONTFIX for now, since it doesn't look like we have a good idea of the problem or the solution.

