105598 – installer and rescue refuse to bring up degraded raid 1 devices

Summary: installer and rescue refuse to bring up degraded raid 1 devices

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	anaconda
Sub Component:
Version:	rawhide
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Jeremy Katz
QA Contact:	Mike McLean
Docs Contact:
URL:
Whiteboard:
Duplicates (3):	177894 452441 570865
Depends On:
Blocks:

Reported:	2003-09-25 21:36 UTC by Alexandre Oliva
Modified:	2011-02-01 14:27 UTC
CC List:	3 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2005-09-21 20:40:49 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
Support for --raid-devices option (5.30 KB, patch) 2011-02-01 14:25 UTC, Grigory Batalov
Support for --raid-devices option (pykickstart) (1.88 KB, patch) 2011-02-01 14:27 UTC, Grigory Batalov

Description Alexandre Oliva 2003-09-25 21:36:26 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703

Description of problem:
If any of the replicas of a raid 1 filesystem is missing, anaconda refuses to
bring up the corresponding raid device.  This makes it tricky (impossible?) to
install when an external disk enclosure isn't properly detected, and makes
rescue mode far less useful.

Comment 1 Jeremy Katz 2003-09-25 23:39:00 UTC

We're not going to support doing things with degraded raid1 devices, at least
not at any point in the foreseeable future.

Comment 2 Alexandre Oliva 2003-09-26 01:26:56 UTC

See, the array is not degraded because I wanted it to be.  It was another
installer's problem that caused it to be so.  Still, the usefulness of the
rescue mode is highly limited for RAID systems if it falls apart just because
some raid device is missing one of its replicas.  You can't even use it to get
things back into a usable shape.  Mind if I keep this open, and assigned to
myself?  I'd really like to have this feature, and I hope there isn't any reason
for this feature to actually be undesirable for Fedora Core to the point of
being outright rejected.  Is there?

Comment 3 Jeremy Katz 2003-09-29 20:53:20 UTC

If it's done without significant code complications, I'm not against adding it.
 I just know that that area is already pretty fragile.

Comment 4 Alexandre Oliva 2003-10-19 22:11:09 UTC

For the rescue CD, it seems to me that it would be enough to disable the test
for len(Devices) < totalDevices, but how should this test be disabled?  Only in
expert mode?  In expert or rescue mode?  With a new keyword that says it's ok to
start arrays even in degraded mode?

BTW, wouldn't it be nice if class flags was able to self-initialize with keys
present in the kernel command line?  Then even commands started from the command
line, such as raidstart, would be able to key on say `expert' or `rescue' mode
to decide how to proceed.
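
A minimal sketch of the self-initializing flags idea above, assuming the
keywords arrive as bare words on the kernel command line and can be read from
/proc/cmdline.  The `Flags` class, its `KNOWN` tuple, and the `cmdline`
parameter are hypothetical illustrations, not anaconda's actual flags
implementation.

```python
class Flags:
    """Hypothetical stand-in for an installer flags object."""

    # Keywords recognized by this sketch; the real set is an assumption.
    KNOWN = ("expert", "rescue")

    def __init__(self, cmdline=None):
        # Fall back to the running kernel's command line when none is given.
        if cmdline is None:
            cmdline = self._read_cmdline()
        words = cmdline.split()
        # Each known keyword becomes a boolean attribute, e.g. flags.rescue.
        for key in self.KNOWN:
            setattr(self, key, key in words)

    @staticmethod
    def _read_cmdline(path="/proc/cmdline"):
        # Return an empty string if the file is unavailable (e.g. no /proc).
        try:
            with open(path) as f:
                return f.read()
        except OSError:
            return ""
```

With something like this, a standalone tool could construct `Flags()` on its
own and check `flags.rescue` or `flags.expert` without being told the mode by
its caller.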

Comment 5 Alexandre Oliva 2003-10-19 22:51:37 UTC

FWIW, I seem to have confirmation that this is enough for rescue, and that it
might be enough for update and maybe kickstart installs.  It's definitely not
enough for disk druid to accept degraded raid, but this is good enough for me.
So...  can we take out that test, or at least disable it when in expert mode?
Here's the exact test I'm talking about:

        if len(devices) < totalDisks:
            log("missing components of raid device md%d.  The "
                "raid device needs %d drive(s) and only %d (was/were) found. "
                "This raid device will not be started.", mdMinor,
                totalDisks, len(devices))
            continue
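
One way the test above could be gated rather than removed, sketched under
assumptions: `should_start_raid1` and its `allow_degraded` parameter are
hypothetical names, not anaconda code, and the rule only makes sense for
RAID 1, where any single member holds a complete copy of the data.

```python
def should_start_raid1(found, total_disks, allow_degraded=False):
    """Decide whether a RAID 1 array should be started.

    found          -- number of member devices actually detected
    total_disks    -- number of members the array was created with
    allow_degraded -- hypothetical switch, e.g. set only in rescue mode
    """
    if found >= total_disks:
        # All members present: always start.
        return True
    # Degraded: start only if explicitly allowed and at least one
    # member survives to serve the data.
    return allow_degraded and found >= 1
```

A caller in rescue mode could pass allow_degraded=True and skip the array
only when this returns False, leaving the default install path unchanged.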

Comment 6 Jeremy Katz 2003-10-21 23:27:23 UTC

No.  If it doesn't work in disk druid, then people are going to file bugs.
People boot with 'linux expert' all the time because it makes them feel like
they're going to get more out of their install, and I then spend days trying to
figure out what's going on before they mention they booted with 'linux expert'.

Comment 7 Alexandre Oliva 2003-10-22 02:44:11 UTC

Well, it doesn't work in that the installer explicitly says it can't proceed,
and why.  It's still *very* useful for the rescue disk.  Is there any way to tell?

Comment 8 Alexandre Oliva 2005-04-19 22:22:13 UTC

One of the two disks holding raid 1 members for /boot and / failed the other
day.  The system would no longer boot, because the other disk didn't have its
MBR properly set up.  Fine, just boot into the rescue cd, run grub, and I'm all
set.  Right?

Wrong.  Rescue mode wouldn't bring up the raid devices because they were
degraded, so the /-containing volume group wasn't detected, so nothing came up.
And since grub isn't in the rescue disk, what was supposed to be a 2-minute job
took me several minutes of doing by hand what the rescue mode could have done
for me in order to offer me a full-featured root filesystem.

Just having grub in the rescue cd would have done wonders but I still think
bringing up degraded raid devices for rescue is a very reasonable idea. 
(Personally, I find it a reasonable idea for install as well, but I know we
differ on this :-)

Comment 9 Jeremy Katz 2005-09-21 20:40:49 UTC

We're not currently going to add support for this.  If you want to write a
patch, you can send it to anaconda-devel-list for hashing it out.  But as it
stands, this is being done deliberately and we're not looking to actively change
the behavior.

Comment 10 Alexandre Oliva 2006-01-13 15:46:22 UTC

I noticed that anaconda managed to use degraded RAID 1 physical volumes for an
install, but was surprised when rescue mode failed to recognize the same volume
groups after the install.  Is this intentional?  IMHO degraded RAID 1 at rescue
time is *way* more important than at install time, especially given that, with
bug 158426, if your primary /boot replica dies, you're toast.

Comment 11 Jeremy Katz 2006-01-18 17:06:52 UTC

*** Bug 177894 has been marked as a duplicate of this bug. ***

Comment 12 Dan Hollis 2006-01-23 12:18:39 UTC

I can understand maybe the installer refusing to install to degraded raid, but
having _rescue_ refuse to bring up degraded raid is astonishing.

Comment 13 Chris Lumens 2008-06-23 11:30:01 UTC

*** Bug 452441 has been marked as a duplicate of this bug. ***

Comment 14 Chris Lumens 2010-03-05 20:31:40 UTC

*** Bug 570865 has been marked as a duplicate of this bug. ***

Comment 15 Grigory Batalov 2011-02-01 14:25:49 UTC

Created attachment 476406
Support for --raid-devices option

As for me, I'm interested in support for degraded arrays in kickstart scripts. I suggest simply adding a --raid-devices=N option to the raid command, similar to the mdadm program. With my patches applied, I can now use the following settings:

clearpart --all --initlabel

partition raid.01 --asprimary --size=1024 --onbiosdisk=80
partition raid.03 --asprimary --size=20480 --onbiosdisk=80

raid /boot --level=RAID1 --device=md0 --raid-devices=2 raid.01
raid pv.01 --level=RAID1 --device=md1 --raid-devices=2 raid.03

volgroup vg01 pv.01

logvol /    --vgname=vg01 --size=4096 --fstype=ext3 --name=root
logvol /usr --vgname=vg01 --size=4096 --fstype=ext3 --name=usr
logvol swap --vgname=vg01 --size=4096 --fstype=swap --name=swap
logvol /var --vgname=vg01 --size=8192 --fstype=ext3 --name=var

After installing on the first hard drive, I got two degraded arrays (one with LVM) and added the second drive later, manually.
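
A rough sketch of how a raid line with --raid-devices might be interpreted, to show where the degraded state comes from: the declared device count exceeds the number of members listed. This is not the attached patch and does not use pykickstart; `parse_raid_line` and the keys of the dictionary it returns are hypothetical, and only the --key=value option form used in the settings above is handled.

```python
import shlex


def parse_raid_line(line):
    """Parse one kickstart-style 'raid' line into a plain dict (sketch)."""
    words = shlex.split(line)
    if not words or words[0] != "raid":
        raise ValueError("not a raid command: %r" % line)

    options, positionals = {}, []
    for word in words[1:]:
        if word.startswith("--"):
            # Only the --key=value form is handled in this sketch.
            key, _, value = word[2:].partition("=")
            options[key] = value
        else:
            positionals.append(word)

    if len(positionals) < 2:
        raise ValueError("need a mount point and at least one member")
    mountpoint, members = positionals[0], positionals[1:]

    # Without --raid-devices, assume the array is exactly its members.
    total = int(options.get("raid-devices", len(members)))
    if total < len(members):
        raise ValueError("more members listed than --raid-devices allows")

    return {
        "mountpoint": mountpoint,
        "level": options.get("level"),
        "device": options.get("device"),
        "members": members,
        "raid_devices": total,
        # Slots left empty at creation time; > 0 means a degraded array.
        "missing": total - len(members),
    }
```

For the /boot line above, this reports two raid devices with one member listed, so one slot is missing and the array would be created degraded.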

Comment 16 Grigory Batalov 2011-02-01 14:27:34 UTC

Created attachment 476407
Support for --raid-devices option (pykickstart)

The pykickstart module should be updated as well.
