Red Hat Bugzilla – Bug 105598
installer and rescue refuse to bring up degraded raid 1 devices
Last modified: 2011-02-01 09:27:34 EST
Description of problem:
If any of the replicas of a raid 1 filesystem is missing, anaconda refuses to
bring up the corresponding raid device. This makes it tricky (impossible?) to
install when an external disk enclosure isn't properly detected, and makes
rescue mode far less useful.
We're not going to support doing things with degraded raid1 devices, at least
not at any point in the foreseeable future.
See, the array is not degraded because I wanted it to be. It was another
installer's problem that caused it to be so. Still, the usefulness of the
rescue mode is highly limited for RAID systems if it falls apart just because
some raid device is missing one of its replicas. You can't even use it to get
things back into a usable shape. Mind if I keep this open, and assigned to
myself? I'd really like to have this feature, and I hope there isn't any reason
for this feature to actually be undesirable for Fedora Core to the point of
being outright rejected. Is there?
If it's done without significant code complications, I'm not against adding it.
I just know that that area is already pretty fragile.
For the rescue CD, it seems to me that it would be enough to disable the test
for len(Devices) < totalDevices, but how should this test be disabled? Only in
expert mode? In expert or rescue mode? With a new keyword that says it's ok to
start arrays even in degraded mode?
BTW, wouldn't it be nice if class flags was able to self-initialize with keys
present in the kernel command line? Then even commands started from the command
line, such as raidstart, would be able to key on say `expert' or `rescue' mode
to decide how to proceed.
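The idea of flags that initialize themselves from the kernel command line could look roughly like the sketch below. It is a minimal illustration, not anaconda's actual flags class: the `Flags` name, its attributes, and the set of recognized keywords are assumptions. The only real interface it relies on is that Linux exposes the boot arguments in /proc/cmdline.

```python
class Flags:
    """Hypothetical flag holder keyed on boot keywords such as 'expert' or 'rescue'."""

    KNOWN = ("expert", "rescue")  # assumed keyword set, for illustration only

    def __init__(self, cmdline=None):
        # Fall back to the running kernel's command line when none is given.
        if cmdline is None:
            cmdline = self._read_cmdline()
        words = cmdline.split()
        # A flag is set only when the keyword appears as a whole word.
        for name in self.KNOWN:
            setattr(self, name, name in words)

    @staticmethod
    def _read_cmdline(path="/proc/cmdline"):
        try:
            with open(path) as f:
                return f.read()
        except OSError:
            return ""  # no /proc available: treat as no keywords given
```

With something like this, a standalone command such as raidstart could construct `Flags()` itself and check `flags.rescue` or `flags.expert` without the installer passing anything in.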
FWIW, I seem to have confirmation that this is enough for rescue, and that it
might be enough for update and maybe kickstart installs. It's definitely not
enough for disk druid to accept degraded raid, but this is good enough for me.
So... can we take out that test, or at least disable it when in expert mode?
Here's the exact test I'm talking about:
if len(devices) < totalDisks:
    log("missing components of raid device md%d. The "
        "raid device needs %d drive(s) and only %d (was/were) found. "
        "This raid device will not be started.", mdMinor,
No. If it doesn't work in disk druid, then people are going to file bugs:
people boot with 'linux expert' all the time because it makes them feel like
they're going to get more out of their install, and I then spend days trying to
figure out what's going on before they mention they booted with 'linux expert'.
Well, it doesn't work in that the installer explicitly says it can't proceed,
and why. It's still *very* useful for the rescue disk. Is there any way to tell?
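One way the check quoted earlier in this thread (refusing to start an array when fewer members are found than expected) could be relaxed for rescue mode only is sketched below. This is an illustration under assumptions, not a patch against anaconda: `can_start_array` and its `allow_degraded` parameter are hypothetical names, and the sketch deliberately covers only RAID 1, the case in this report.

```python
def can_start_array(found, total, level=1, allow_degraded=False):
    """Return True if an array with `found` of `total` members may be started."""
    if found >= total:
        return True   # complete array: always start
    if not allow_degraded:
        return False  # existing behaviour: refuse any incomplete array
    # Degraded start is only considered for RAID 1 in this sketch,
    # and it needs at least one surviving member.
    return level == 1 and found >= 1
```

A rescue-mode caller would pass `allow_degraded=True`, while the install paths keep the default, so disk druid would never be handed a degraded array.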
One of the two disks holding raid 1 members for /boot and / failed the other
day. The system would no longer boot, because the other disk didn't have its
MBR properly set up. Fine, just boot into the rescue cd, run grub, and I'm all set.
Wrong. Rescue mode wouldn't bring up the raid devices because they were
degraded, so the /-containing volume group wasn't detected, so nothing came up.
And since grub isn't in the rescue disk, what was supposed to be a 2-minute job
took me several minutes, doing by hand what the rescue mode could have done for
me in order to offer me a full-featured root filesystem.
Just having grub in the rescue cd would have done wonders, but I still think
bringing up degraded raid devices for rescue is a very reasonable idea.
(Personally, I find it a reasonable idea for install as well, but I know we
differ on this :-)
We're not currently going to add support for this. If you want to write a
patch, you can send it to anaconda-devel-list for hashing out. But as it
stands, this is being done deliberately and we're not looking to actively change it.
I noticed that anaconda managed to use degraded RAID 1 physical volumes for an
install, but was surprised when rescue mode failed to recognize the same volume
groups after the install. Is this intentional? IMHO degraded RAID 1 at rescue
time is *way* more important than at install time, especially given that, with
bug 158426, if your primary /boot replica dies, you're toast.
*** Bug 177894 has been marked as a duplicate of this bug. ***
I can understand maybe installer refusing to install to degraded raid, but
having _rescue_ refuse to bring up degraded raid is astonishing.
*** Bug 452441 has been marked as a duplicate of this bug. ***
*** Bug 570865 has been marked as a duplicate of this bug. ***
Created attachment 476406
Support for --raid-devices option
As for me, I'm interested in support for degraded arrays in kickstart scripts. I suggest simply adding a --raid-devices=N option to the raid command, similar to the mdadm program. See my patches; with them I can use the following settings:
clearpart --all --initlabel
partition raid.01 --asprimary --size=1024 --onbiosdisk=80
partition raid.03 --asprimary --size=20480 --onbiosdisk=80
raid /boot --level=RAID1 --device=md0 --raid-devices=2 raid.01
raid pv.01 --level=RAID1 --device=md1 --raid-devices=2 raid.03
volgroup vg01 pv.01
logvol / --vgname=vg01 --size=4096 --fstype=ext3 --name=root
logvol /usr --vgname=vg01 --size=4096 --fstype=ext3 --name=usr
logvol swap --vgname=vg01 --size=4096 --fstype=swap --name=swap
logvol /var --vgname=vg01 --size=8192 --fstype=ext3 --name=var
After installing onto the first hard drive I got two degraded arrays (one with LVM) and added the second drive later (manually).
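For comparison, the manual equivalent of this workflow with mdadm can be sketched as below. mdadm accepts the word `missing` in place of a member device when creating an array, and `--add` attaches a device to a running array. The helper function names and the device paths in the usage note are illustrative assumptions, and the functions only build argument lists without running anything.

```python
def degraded_create_cmd(md_device, members, raid_devices, level=1):
    """Build an mdadm command that creates an array, marking absent slots 'missing'."""
    if not 1 <= len(members) <= raid_devices:
        raise ValueError("need between 1 and raid_devices members")
    # Fill the slots that have no device yet with the literal word 'missing'.
    slots = list(members) + ["missing"] * (raid_devices - len(members))
    return ["mdadm", "--create", md_device,
            "--level=%d" % level,
            "--raid-devices=%d" % raid_devices] + slots


def add_member_cmd(md_device, new_member):
    """Build the mdadm command that adds a device to an existing array."""
    return ["mdadm", md_device, "--add", new_member]
```

For example, `degraded_create_cmd("/dev/md0", ["/dev/sda1"], 2)` yields the arguments for a two-slot RAID 1 with one real member, and `add_member_cmd("/dev/md0", "/dev/sdb1")` yields the later step of adding the second drive.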
Created attachment 476407
Support for --raid-devices option (pykickstart)
The pykickstart module should be updated as well.