Bug 500004 - preupgrade/anaconda can't use install.img (stage2) on RAID
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: preupgrade
Version: 16
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Richard Hughes
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 504826 608193
Depends On:
Blocks: 494832
Reported: 2009-05-10 02:41 UTC by Gerry Reno
Modified: 2014-01-21 23:09 UTC
CC List: 47 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-14 02:35:42 UTC
Type: ---
Embargoed:



Description Gerry Reno 2009-05-10 02:41:51 UTC
Description of problem:
Ran preupgrade.  It downloaded all packages and the kernel, initrd, stage2.img.  After reboot, the installer complains that it cannot find the iso9660 image and it gives you a list of devices to choose from (the only devices in the list are basic disk partitions) and a path box.  Since /boot is mounted on /dev/md0 there is no way to tell the installer how to mount the array as only disk partitions are available as choices.  If you choose one of the array member partitions, it will fail since the array is probably active in the background which prevents any member elements from being directly mounted.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. see description
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Gerry Reno 2009-05-10 16:52:03 UTC
The workaround is to download the DVD install iso, burn it and run a regular anaconda upgrade from the DVD.  That at least gets you by the roadblock.

Comment 2 Will Woods 2009-05-10 19:15:52 UTC
Using preupgrade with /boot on RAID doesn't work anymore. Preupgrade 1.1.x will notify you of this fact and offer a workaround - downloading install.img from a wired, DHCP internet connection after reboot.

If you'd like, you could try preupgrade-1.1.0pre3 from here to confirm this:
http://koji.fedoraproject.org/koji/buildinfo?buildID=100963

There are also some non-network workarounds for this problem. For example, you should be able to burn boot.iso (or netinst.iso, or CD1) to a CD, and then remove the "stage2=http://..." argument from the preupgrade boot commandline. Anaconda should autodetect the installer media and use that. (You could do a similar thing with install.img on a USB stick, if you change the stage2=XXX parameter to the correct value). 

In the future these workarounds will be documented and supported, but right now that's still a work in progress. If you're not comfortable trying untested code and helping to document and test these new workarounds, the simplest path forward is just to do a standard upgrade from the DVD image, as Gerry suggests.
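
The boot-argument edit described above can be sketched in a few lines. This is only an illustration, assuming the preupgrade boot entry carries a single `stage2=` token on the kernel command line; the helper name `drop_stage2` and the sample command line are made up for the example and are not taken from preupgrade.

```python
def drop_stage2(cmdline):
    """Return the kernel command line with any stage2=... token removed."""
    kept = [tok for tok in cmdline.split() if not tok.startswith("stage2=")]
    return " ".join(kept)


# Hypothetical boot line; the real one is whatever preupgrade wrote to grub.conf.
example = "preupgrade stage2=http://mirror.example/images/install.img quiet"
print(drop_stage2(example))
```

With the token gone, stage1 is left to look for installer media on its own, which is the behaviour the CD workaround relies on.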

Comment 3 Gerry Reno 2009-05-10 21:33:20 UTC
With preupgrade we're talking about an existing installation and a simple "mount /boot" should show the installer how /boot is mounted.  I don't understand why this is so hard to support when it's trivial to figure out from the command line.

Comment 4 Will Woods 2009-05-11 16:41:52 UTC
It's not that simple. The installer starts up in two "stages". The kernel and initrd.img (stage1) have just enough code to find and load install.img (stage2), which contains all the mdraid/dmraid/LVM code, the installer, a shell, etc.

Stage1 does *not* know how to handle LVM or RAID partitions. So if stage2 is on a RAID partition, we can't load it, so the installer doesn't start.

Previous versions of preupgrade could direct anaconda stage1 to mount a single member of a mirrored mdraid array, copy install.img to a ramdisk, and then unmount the RAID member. But this required a *lot* of extra RAM and caused the mdraid array to become degraded. 

I believe current anaconda will instead mount the install.img in-place - which means the RAID member does not get unmounted, which means anaconda can't assemble your RAID set when it starts up, which means you can't upgrade it.

Much easier - and safer - to just burn boot.iso and let stage1 fetch install.img from the CD instead.

Comment 5 Gerry Reno 2009-05-11 17:26:40 UTC
Having /boot mounted on RAID is getting fairly common.  How about adding the mdadm to stage1 tools?  It doesn't take up all that much space.  And it would definitely simplify the process.

Comment 6 Will Woods 2009-05-11 17:56:32 UTC
We use the same initrd for new installs and upgrades, and that would *only* be useful for upgrades that also have /boot on RAID, which is very much a non-default setup. So now we're bloating the initrd with code that's useless for the vast majority of cases.

Furthermore, the tools are still useless without mdadm.conf, which is on the RAID set. So now you need some extra code to fetch mdadm.conf from the system *before* rebooting and put it into the initrd somehow.

That's a lot of extra code across two different packages just to handle your special case - and I don't believe that's the right way to solve the general problem.

You chose a special setup, so for now you're going to have to do some extra work. Sorry about that.

Comment 7 Gerry Reno 2009-05-12 18:53:18 UTC
(In reply to comment #6)
> We use the same initrd for new installs and upgrades, and that would *only* be
> useful for upgrades that also have /boot on RAID, which is very much a
> non-default setup. So now we're bloating the initrd with code that's useless
> for the vast majority of cases.

We're talking a very small amount of code here.  And anyone who wants to still have a bootable system even after a disk failure is going to put /boot on RAID.

> 
> Furthermore, the tools are still useless without mdadm.conf.. which is on the
> RAID set. So now you need some extra code to fetch mdadm.conf from the system
> *before* rebooting and put it into the initrd somehow. 

As I pointed out in another anaconda bug, you do not need mdadm.conf.
In fact, you can generate mdadm.conf like this:
mdadm -E -s > /etc/mdadm.conf

You only need 'mdadm' and you do not need special tools to read/copy/whatever mdadm.conf.  As I showed mdadm can generate an mdadm.conf for you.

If you don't want to look into this that's fine.  But it really would be rather simple to add.
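
For anyone who wants to script the scan shown above, here is a minimal sketch. It assumes `mdadm` is installed and that the caller has the privileges it needs; `--examine --scan` are the long forms of `-E -s`. The function name `scan_arrays` and the injectable `run` parameter are inventions for this example so the logic can be tried without touching real disks.

```python
import subprocess


def scan_arrays(run=subprocess.check_output):
    """Return the ARRAY lines printed by `mdadm --examine --scan`.

    `run` is injectable so the function can be exercised without mdadm
    installed; by default it shells out to the real tool.
    """
    out = run(["mdadm", "--examine", "--scan"])
    text = out.decode() if isinstance(out, bytes) else out
    # Keep only the array definitions, which is what mdadm.conf needs.
    return [line for line in text.splitlines() if line.startswith("ARRAY")]
```

Writing the returned lines to a file would be the scripted equivalent of the shell redirect into /etc/mdadm.conf.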

Comment 8 Axel Thimm 2009-06-30 08:03:18 UTC
I'm reopening for two reasons, first it isn't CANTFIX, it would be WONTFIX. :)
And second I think this is a valid request in a wider range of setups.

The people using preupgrade instead of a regular DVD reinstall/upgrade are usually the ones that value the absence of a DVD in the first place. My use case, for example, is several hosted Fedora servers that could be easily preupgraded w/ minimal downtime instead of going the cobbler/koan way.

Comment 9 Kevin R. Page 2009-06-30 10:20:36 UTC
Firstly to note that, yes, I value the absence of a DVD (which is why I'm trying to use preupgrade), and that my /boot is on RAID 1 (and has been for many years, along with all my partitions).

Using preupgrade-1.1.0-2.fc10 from koji to try and go from F10 to F11 I now get the message:

"/boot is on RAID device md0
The installer can download this file once it starts, but this requires a wired
network connection during installation.
If you do not have a wired network connection available, you should quit now
Continue / Quit"

(I didn't get the message dialog at all before upgrading to the koji version)

However, when I press continue the UI stalls, no network traffic, and:

  File "/usr/lib/python2.5/site-packages/preupgrade/dev.py", line 87, in bootpath_to_anacondapath
    raise PUError, "/boot is on RAID device %s" % bootdev
preupgrade.error.PUError: /boot is on RAID device md0


on the xterm. I can cancel out of preupgrade.

Re-starting preupgrade and resuming, I get the same dialog; but clicking continue closes the UI with the same error on the xterm.

preupgrade-cli also fails with the same error.

Workarounds welcome, including getting to the point of the second stage installer being downloaded over the network on install.

Comment 10 Will Woods 2009-07-10 16:19:19 UTC
Bug 504826 is the bug for the *traceback* when attempting to use preupgrade on a system with /boot on RAID.

You're supposed to just get a warning dialog and preupgrade will set up anaconda to fetch install.img from the internet during the install.

F11 anaconda *cannot* complete an upgrade with install.img on a RAID /boot. This is not a bug that can be fixed in preupgrade.
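
The intended behaviour described in this comment (warn, then point anaconda at a network copy of install.img instead of failing) can be sketched as below. This is not preupgrade's code: the function name, its parameters, and the simple `md` prefix test are all assumptions made for illustration, and the local location is treated as an opaque string supplied by the caller.

```python
def stage2_argument(bootdev, local_location, mirror_url):
    """Pick the stage2= boot argument for the upgrade entry.

    bootdev        -- kernel name of the device holding /boot, e.g. "sda1" or "md0"
    local_location -- whatever location string works for a plain partition
    mirror_url     -- where install.img can be fetched over the network
    """
    if bootdev.startswith("md"):
        # stage1 cannot assemble the array, so fall back to the network copy.
        return "stage2=" + mirror_url
    return "stage2=" + local_location


print(stage2_argument("md0", "LOCAL", "http://mirror.example/install.img"))
```

The point of the sketch is only that the RAID case selects a different source rather than raising an error.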

Comment 11 Ray Todd Stevens 2009-07-10 16:31:53 UTC
Actually I am finding that f11 can't deal with a /boot on a /dev/md0 device even if run off of a cdrom.

I am finding huge numbers of problems and bugs in fc11 with the use of md0 if it is running RAID1.   Interestingly enough I did find that these seem to go away with RAID5.

Comment 12 Bug Zapper 2009-07-14 18:21:36 UTC
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 13 Axel Thimm 2009-07-15 10:42:50 UTC
(In reply to comment #12)
> Fedora 9 changed to end-of-life (EOL) status

We had moved to discussing F11 and I see that in rawhide as well, so I'll put this to rawhide, everyone OK with that?

Comment 14 Ray Todd Stevens 2009-07-15 13:01:47 UTC
I am wondering if the solution here is to in some manner use a flash drive.  USB ports are pretty standard now and have been for a while.   I wonder how complicated it would be to have preupgrade be able to place the image and stage2 on a flash drive if /boot doesn't have enough room, OR if it is an md0 drive.

Comment 15 seth vidal 2009-07-15 13:28:29 UTC
Doing so would be unreliable I suspect.

Comment 16 Ray Todd Stevens 2009-07-15 13:41:52 UTC
Probably right.   Actually I was thinking, and I am involved in several preupgrade bug reports.   Anyway maybe what we need as a start is to have access to the ability to modify the "command line" that runs preupgrade for things like where to find these files, and also to manage static ip addresses.   Then give some guidance in how to do this but specify that it is an experts only solution.   Generally only experts will have most of these problems anyway.

Comment 17 seth vidal 2009-07-15 13:50:01 UTC
I don't really think it makes sense for experts to use preupgrade. If they need precision and control then they should just use kickstart.

Comment 18 Greg Morgan 2009-07-19 03:53:40 UTC
Long story short:  I don't think "expert" or "non-expert" should be the criterion here.  If they are lazy, then they will use it.  ;-)

* I am a home user that had an NFS server on 3 400gig pata drives. The lead drive with the boot sector wouldn't always boot.
Moved all the data to 4 750gig drives using software raid because of the first scare.

* Being lazy I saw that you could put the /boot partition on raid and won't have to carve out a special /boot partition on ext3.  Moreover, I felt it would protect me from the pata drive experience.

* If I wanted less upgrades, then I'd go with centos. Being lazy, I thought preupgrade was a good deal that handles the frequent fedora upgrade issue to make wifie happy i.e. it would take less time on my part and her web surfing wouldn't be interrupted.

* I've seen many people on the mailing list that I would consider as experts using preupgrade.

* Hence, I am missing how software mirroring is a special case.  I am also missing how the f10 anaconda let me install raid-1 yet both the f11 anaconda and the f11 preupgrade, as executed on the f10 system going to f11, use the same images, based on comment 1, comment 2, comment 3, comment 4, and comment 5.  If preupgrade is passing the stuff to anaconda, then why would a DVD upgrade or install be any different?

* I believe it is a great compliment to all the developers out there that the tools are more mature for both expert and non-expert to use.  I am thinking that things like initrd.img and install.img will have to grow to support the "general os" design target of Fedora.


yum update # for the f10 system with software raid-1



F10 to F11 for cli first attempt
preupgrade-cli "Fedora 11 (Leonidas)"
...
treeinfo timestamp: Tue Jun  2 15:15:53 2009
.treeinfo                                                | 1.2 kB     00:00
vmlinuz                                                  | 3.0 MB     00:03
initrd.img                                               |  19 MB     00:18
install.img                                              | 111 MB     01:36
Traceback (most recent call last):
## see the same errors below.


F10 to F11 for cli second attempt.
preupgrade-cli "Fedora 11 (Leonidas)"
...
treeinfo                                                | 1.2 kB     00:00
/boot/upgrade/vmlinuz checksum OK
/boot/upgrade/initrd.img checksum OK
/boot/upgrade/install.img checksum OK
Traceback (most recent call last):
  File "/usr/share/preupgrade/preupgrade-cli.py", line 305, in <module>
    pu.main(myrelease)
  File "/usr/share/preupgrade/preupgrade-cli.py", line 206, in main
    bootdevpath = bootpath_to_anacondapath(stage2_abs,UUID=True)
  File "/usr/lib/python2.5/site-packages/preupgrade/dev.py", line 86, in bootpath_to_anacondapath
    raise PUError, "/boot is on RAID device %s" % bootdev
NameError: global name 'PUError' is not defined
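
The last line of that traceback is a different failure from the one in comment 9: the intended `PUError` was never raised because the name itself could not be resolved in that module. A minimal sketch of that class of bug follows, written for Python 3 rather than the Python 2 syntax in the traceback. The `PUError` class here is a stand-in, and `UndefinedPUError` is a deliberately undefined name used to mimic a forgotten import.

```python
class PUError(Exception):
    """Stand-in for preupgrade's error class (name taken from the traceback)."""


def check_with_missing_name(bootdev):
    # The exception class is looked up under a name this module never
    # defines, so Python raises NameError before anything else happens.
    raise UndefinedPUError("/boot is on RAID device %s" % bootdev)  # noqa: F821


def check_with_defined_name(bootdev):
    raise PUError("/boot is on RAID device %s" % bootdev)


for check in (check_with_missing_name, check_with_defined_name):
    try:
        check("md0")
    except NameError as exc:
        print("NameError:", exc)
    except PUError as exc:
        print("PUError:", exc)
```

The first call reports the unresolved name and the second reports the RAID message, which matches the difference between the tracebacks in this comment and in comment 9.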

Comment 19 Bug Zapper 2009-11-16 09:58:42 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 20 Fedora Admin XMLRPC Client 2010-04-16 14:35:19 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 21 Bug Zapper 2010-11-04 11:15:38 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 22 Andrew Meredith 2010-11-04 19:32:38 UTC
I can confirm that the Fedora 13 version of preupgrade seems to suffer from exactly the same issues as the Fedora 12 version. I am not empowered to up-version this report, so could the maintainer or the reporter please take this up to Fedora 13.

Thanks

Comment 23 Ray Todd Stevens 2010-11-04 20:17:38 UTC
Sorry to have to point this out, but it is also a problem on 14.

Comment 24 Ray Todd Stevens 2010-11-22 19:40:29 UTC
By the way I just reread comment 2.   Would it be possible to automate this process?   Frankly for almost all of us the advantage of preupgrade is it downloads all of the packages needed for the upgrade and in doing so cuts the time off line needed to upgrade to about 1/4 of what it otherwise would be.

So would it be possible to have preupgrade, as a part of the process, detect the RAID 1 problem and, instead of just giving the "you must have a network to actually do this" message, offer an option to have the CD in the drive and then have the installer use that for the stage 2 file?   I would suspect that would meet 99% of the needs here.

Comment 25 Ray Todd Stevens 2010-11-22 19:45:15 UTC
This seems to be the same bug as 608193

Comment 26 Ray Todd Stevens 2010-11-25 19:09:26 UTC
OK I just tried method from comment 2.   This actually works great.   For me it certainly solves the problems involved here.

How about this as a real solution to this and the other bug mentioned here?  When stage2 can't be loaded, preupgrade generates two grub.conf entries.   The first one is exactly as it is right now; the second is of the form mentioned in the comment and is labeled "upgrade with help from cdrom drive" or something like that.   Those of us who need to do this upgrade and don't want to take the time it takes to download over the net can hit the space bar to bring up the menu, then insert the cd, select the second option, and do our upgrades this way.

Then just remember to remove both entries as a part of the completion process of the upgrade.
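
A rough sketch of the two-entry idea, for illustration only. It assumes GRUB-legacy style stanzas (`title`, `kernel`, `initrd`); the function name, the entry titles, and every argument value in the usage line are hypothetical, and nothing here reflects how preupgrade actually writes grub.conf.

```python
def upgrade_entries(kernel, initrd, base_args, stage2_url):
    """Return two GRUB-legacy style stanzas as one string.

    The first stanza keeps the network stage2= argument; the second omits
    it so the installer can look for media in the CD drive instead.
    """
    network_args = "%s stage2=%s" % (base_args, stage2_url)
    stanzas = [
        ("Upgrade (fetch installer over the network)", network_args),
        ("Upgrade (with help from cdrom drive)", base_args),
    ]
    lines = []
    for title, args in stanzas:
        lines.append("title %s" % title)
        lines.append("    kernel %s %s" % (kernel, args))
        lines.append("    initrd %s" % initrd)
    return "\n".join(lines) + "\n"


print(upgrade_entries("/upgrade/vmlinuz", "/upgrade/initrd.img",
                      "preupgrade", "http://mirror.example/install.img"))
```

Keeping both stanzas generated from one place would also make it easy to remove both of them once the upgrade completes.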

Comment 27 Pavel Urban 2010-11-25 20:07:49 UTC
Well, after having my notebook's system effectively destroyed by preupgrade, I prefer yum-way (http://fedoraproject.org/wiki/YumUpgradeFaq). Preupgrade has one veeery nice issue - when it doesn't like just one package you have on your system, it is able to crash just in the middle of installation. Just imagine - it is installing let's say 1500 packages and crashes right before 'cleanup' phase - so you are left with a mess of duplicate packages both old and new, no postinstall actions passed (no modules.dep, for example), no statefile for yum (so forget about yum-complete-transaction) and so on. Thank you very much, I no longer think this whole thing is useful.

Comment 28 Ray Todd Stevens 2010-11-25 22:06:02 UTC
I don't think this is a preupgrade bug, but an upgrade bug.   I have several systems hosed this way with a normal upgrade. 

But yes a method for a restart would be nice.   Why don't you fill out a RFE requesting this?

Comment 29 Bug Zapper 2011-06-02 18:06:06 UTC
This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 30 udo 2011-06-04 15:56:30 UTC
preupgrade.error.PUError: Not enough space in /boot/upgrade to download initrd.img.

That is what I get when I run preupgrade-cli on Fedora 14.
Yes, there was room for vmlinuz.
No, the initrd.img must be retrieved later.
The tip from http://fedoraproject.org/wiki/PreUpgrade#Method_2:_Trick_preupgrade_into_downloading_the_installer doesn't have any influence.
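
To see in advance whether a download of a given size would fit, a free-space check along these lines can help. It is a sketch using the standard library's `shutil.disk_usage`, not the check preupgrade itself performs; the function name `has_room` is made up, and the 19 MB figure is simply the initrd.img size shown in comment 18.

```python
import shutil


def has_room(directory, needed_bytes):
    """True if the filesystem holding `directory` has at least needed_bytes free."""
    return shutil.disk_usage(directory).free >= needed_bytes


# Point this at /boot on the machine being upgraded; "." is used here only
# so the example runs anywhere.
print(has_room(".", 19 * 1024 * 1024))
```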

Comment 31 udo 2011-06-04 16:03:54 UTC
I see this happen on two x86_64 F14 boxes.
With preupgrade and preupgrade-cli.
The result is that we cannot upgrade this way, which used to be the easiest method after previous issues were fixed.

Comment 32 aake 2011-06-25 08:26:27 UTC
I tried to upgrade 14 -> 15. Not successful:

Traceback (most recent call last):
  File "/usr/share/preupgrade/preupgrade-cli.py", line 329, in <module>
    pu.main(release)
  File "/usr/share/preupgrade/preupgrade-cli.py", line 219, in main
    extra_args += " ks=%s" % self.generate_kickstart(extra_cmds=self.kickstart_cmds)
  File "/usr/lib/python2.7/site-packages/preupgrade/__init__.py", line 601, in generate_kickstart
    return dev.bootpath_to_anacondapath(targetfile, UUID=True)
  File "/usr/lib/python2.7/site-packages/preupgrade/dev.py", line 91, in bootpath_to_anacondapath
    raise PUError, "/boot is on RAID device %s" % bootdev
preupgrade.error.PUError: /boot is on RAID device dm-1

Error is misleading because /boot is not in RAID.

I have also RAID but it is not related to /boot partition. RAID is also using different disks than Fedora (/boot etc).

/dev/mapper/pdc_eaijiiddp1
                        495844    166791    303453  36% /boot


I'm new here (fedora) so let's see if I understood this correctly.
- This problem is related to RAID and LVM (many people here are only 
  talking about RAID).
- Fedora by default created /boot to LVM
  I think very small amount of users change default configuration.
--> Something like 99.9% of the users can not use preupgrade?

If this is true then something should be done before preupgrade is totally forgotten....
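
One detail that may explain the confusing message: a kernel name of the form `dm-N` is a device-mapper node, and device-mapper backs several different things (LVM volumes, dmraid sets, dm-crypt), so the name alone does not say which one is in use. The sketch below shows a classification by kernel name only. It is an assumption-laden illustration, not the logic in preupgrade's dev.py, and it ignores md arrays that are not named in the plain `mdN` form.

```python
import re


def classify(kernel_name):
    """Rough classification of a block device by its kernel name."""
    if re.fullmatch(r"md\d+", kernel_name):
        return "md (software RAID)"
    if re.fullmatch(r"dm-\d+", kernel_name):
        # Could be LVM, dmraid, dm-crypt, ...; the name does not tell.
        return "device-mapper"
    return "plain"


for name in ("md0", "dm-1", "sda1"):
    print(name, "->", classify(name))
```

A check that reports "RAID" for every non-plain device would produce exactly the kind of message seen above for `dm-1`.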

Comment 33 udo 2011-06-25 09:05:25 UTC
The LVM thing is new, I guess, to this bug.

The whole upgrade approach is broken.
Aside from the refusal to fix it, please do think out of the box.
If upgrading a RAID system is so hard using the reboot-into-fedora-upgrade-gizmo-setup, then why NOT reboot?
I.e.: go to single user with the running kernel, while keeping all disks mounted, and do the upgrade.
At the end we could reboot and voila: done.

Maybe this can be done by mounting install.img somewhere and starting anaconda from there?

Comment 34 aake 2011-06-25 09:39:14 UTC
(In reply to comment #33)
> The LVM thing is new, I guess, to this bug.
> 
I think it was referred in comment 4:
"Stage1 does *not* know how to handle LVM or RAID partitions."

Comment 35 udo 2011-06-25 09:50:44 UTC
Hmm. preupgrade did work well enough on my non-raid LVM box... (going to F15)

Comment 36 Bill McGonigle 2011-08-13 20:32:57 UTC
/boot on RAID-1 is a 'special case'?  Heck, every machine, aside from laptops, that I've set up with Redhat 6-9 and Fedora 1-15 in the past 10 years has had /boot on RAID.  Do others get more reliable hard drives than I do?  Can we query smolt to see just how much of an outlier this problem is?

Anyway, confirming this is still present in f14:

  File "/usr/lib/python2.7/site-packages/preupgrade/dev.py", line 91, in bootpath_to_anacondapath
    raise PUError, "/boot is on RAID device %s" % bootdev
preupgrade.error.PUError: /boot is on RAID device md0


BTW, can somebody please mark bug 504826 and bug 608193 as duplicates?

Comment 37 Andrew Meredith 2011-08-21 14:18:50 UTC
This is just to confirm comment #36 above.

The only machines I have built in the last decade or so that *didn't* have /boot on a kernel RAID 1 volume were dataless workstations, VMs (where the drives were already RAIDed), high end machines with decent h/w RAID cards and disposable test servers. Anything else (ie most of them) had kernel RAID1 volumes as /boot.

It has only been a problem more recently. Before that it was considered standard practice. It was even taught by Red Hat education ... I know, I was one of the instructors.

I know it would be very convenient to declare it as cush, but this is not a rare corner case. Outside of big iron data centres, in SME-land, this is the norm.

Comment 38 udo 2011-09-30 13:21:36 UTC
How can we make the Fedora crew see the inevitable?
I.e.: that features that were on expensive setups in the past now are commonplace at modest setups for normal people?

If the Fedora crew are afraid of RAID with their kernels, why can't we work around the issue by making the preupgrade thing work with the kernel that was already running? I.e.: not booting into the preupgrade kernel at all because that is not necessary. (!)
We just stop all processes except a few, go to runlevel 1 or whatever and do the usual routine. All hardware will be usable. All storage reachable.
Then the normal upgrade can be done, after which we /do/ reboot.

Comment 39 Paul Howarth 2011-10-01 07:29:09 UTC
I always have /boot on RAID1 too, for what it's worth.

Comment 40 udo 2011-10-01 08:58:03 UTC
w.r.t. comment #4: 
"I believe current anaconda will instead mount the install.img in-place - which
means the RAID member does not get unmounted, which means anaconda can't
assemble your RAID set when it starts up, which means you can't upgrade it."

Why then do we (re)boot? Everything is already up and running after the  download of the updates.
The excuse you explained is not valid as we can remove the need for a reboot.
Please explain why this has not been done.

In this era rebooting twice for an update like this looks excessive. A one time boot can be explained well enough, though.

W.r.t. comment #6:
"You chose a special setup,"
See my comment #38:
"How can we make the Fedora crew see the inevitable?
I.e.: that features that were on expensive setups in the past now are
commonplace at modest setups for normal people?"
I.o.w.: these setups are not special anymore.

If you consider us experts (ignoring the reply to comment #6) why can't we start the upgrade manually without booting? This would make us responsible, give a workaround and buy you time.

Comment 41 udo 2011-10-01 10:32:53 UTC
We need to change the title of this bug.
There is no stage2 anymore.
initrd and install image are one file now: https://bugzilla.redhat.com/show_bug.cgi?id=736318#c2

Comment 42 udo 2011-10-04 13:40:18 UTC
Please explain why even thinking of a solution to this very basic problem is allowed to take up more time than it currently has?

Why not decide about a solution, schedule an implementation, have it implemented, tested, released and be done with it?

Comment 43 Richard Hughes 2011-10-04 13:48:30 UTC
(In reply to comment #42)
> Why not decide about a solution, schedule an implementation, have it
> implemented, tested, released and be done with it?

As always, patches welcome.

Comment 44 Ray Todd Stevens 2011-10-04 14:56:54 UTC
Well I would start with supporting RAID1 simply by treating the first drive of a RAID 1 pair holding the boot volume as a drive and reading the data from there.   As far as I can tell, nothing needs to be changed on boot until well after stage2 is running.

I do have to wonder why RAID is not considered a standard configuration in more cases.   Certainly in this day and age with the cost of drives, RAID is a smart idea on about any system.   Over the years I have used the RAID system in linux quite extensively and it works very well, in many cases better than hardware raid, and certainly much better than DMRAID.   But the system almost never seems to be tested with it.

Comment 45 Adam Williamson 2011-10-05 21:16:09 UTC
RAID is tested as part of Fedora validation, but the test cases specifically ensure that /boot is not made a part of the RAID set, and we specifically exclude /boot on RAID from the release criteria:

"The installer must be able to create and install to software, hardware or BIOS RAID-0, RAID-1 or RAID-5 partitions for anything except /boot "

on advice from the anaconda team.

preupgrade is just another moving part here; it has to be understood that, looked at from within Fedora, preupgrade is a kind of 'third party addon' to the install/upgrade process. preupgrade is a separate tool from anaconda, maintained by a different person, which post-dates anaconda and was written as a kind of convenience additional feature. So anaconda team don't necessarily consider the impact on preupgrade in difficult RAID cases when deciding how to improve anaconda's design.

Comment 46 Bill McGonigle 2011-10-05 21:55:29 UTC
Yeah, that criteria might be worded too tersely, but I think what it's trying to get at is that there are some environments in which a real initramfs is needed to get some RAID's online, and that anaconda will support those but won't go to Herculean efforts to try to support those on /boot.  Perhaps a reasonable thing to say is that RAID on boot may be supported (say md RAID on simple storage) but that nobody expects that boot on every type of RAID will be supported.  Maybe we can improve the experience for the common 95%.

Also, some of the original rationale has changed from early on in this bug (stage1/stage2, dracut, rarity, etc.)  Perhaps it would be useful for somebody well-versed to summarize the current work required so the right people would be able to offer patches, or find the talent required to offer patches, if interested.

And yes, I understand that Fedora still officially resists any attempts at live upgrades, despite all of its users who find it a tremendously useful thing to do.  I realize complexity has to be preserved and nothing comes for free.

Comment 47 Ray Todd Stevens 2011-10-06 02:45:28 UTC
This excluding of the /boot partition would seem to me to be becoming more and more unwise.    In the real world high reliability is becoming more and more important.   Not only is fedora free, but it is also in general more reliable than the pay alternatives.   So it is going to be used in more and more high reliability installations.

Part of high reliability is to make the boot redundant.   Frankly the part about the system not being able to reboot if boot is on raid1 and one of the pair is dead is also unwise.

I would think that going for /boot on raid 5 would be undoable in most cases, and not needed.   But RAID1/Mirroring is a good option.   It also should be pretty easy to support.   By definition a RAID1 Boot could be dealt with as read only by simply treating it as a standard drive and reading from it in that manner.

I would think that this is something somebody in the management of the fedora project should be thinking about.
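
A caveat on reading a RAID1 member as a standard drive: whether that works depends on where the md superblock sits. With metadata formats 0.90 and 1.0 the superblock is at the end of the member, so the filesystem starts at the beginning of the partition; with 1.1 and 1.2 it is at or near the start, so the member does not look like a plain filesystem. The sketch below encodes only that rule. The function name is hypothetical, and it says nothing about the array-assembly problem described in comment 4.

```python
# md superblock formats that live at the END of a member device.
SUPERBLOCK_AT_END = {"0.90", "1.0"}


def member_readable_as_plain_fs(level, metadata):
    """True if a single member can plausibly be read like an ordinary partition."""
    return level == "raid1" and metadata in SUPERBLOCK_AT_END


print(member_readable_as_plain_fs("raid1", "0.90"))
print(member_readable_as_plain_fs("raid1", "1.2"))
```

The metadata version of an existing array is shown by `mdadm --detail` on the array device.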

Comment 48 Dmitry S. Makovey 2011-11-05 04:42:33 UTC
since there's no voting turned on here posting ME_TOO

other than laptops all of the systems I set up with >1 disk have boot on RAID (MD or HW). This issue does need to get some more attention. Maybe even having some alternative images for RAID folks to download with the option of using those would be a big bonus. Not sure whether that should be addressed by anaconda people or preupgrade. 

It is worth mentioning that preupgrade is listed as a preferred method of in-place upgrade which does lead people to this bug in the end.

Comment 49 Will Woods 2011-11-11 15:01:08 UTC
*** Bug 504826 has been marked as a duplicate of this bug. ***

Comment 50 Will Woods 2011-11-11 15:01:36 UTC
*** Bug 608193 has been marked as a duplicate of this bug. ***

Comment 51 bart.kus 2011-12-13 01:37:29 UTC
Trying to upgrade from 15 -> 16 with /boot on RAID1 md0:


treeinfo timestamp: Wed Nov  2 20:10:12 2011
/boot/upgrade/vmlinuz checksum OK
/boot/upgrade/initrd.img checksum OK
Traceback (most recent call last):
  File "/usr/share/preupgrade/preupgrade-gtk.py", line 259, in on_assistant_apply
    self._do_main()
  File "/usr/share/preupgrade/preupgrade-gtk.py", line 278, in _do_main
    self.main_preupgrade()
  File "/usr/share/preupgrade/preupgrade-gtk.py", line 497, in main_preupgrade
    extra_args += " ks=%s" % self.pu.generate_kickstart()
  File "/usr/lib/python2.7/site-packages/preupgrade/__init__.py", line 607, in generate_kickstart
    return dev.bootpath_to_anacondapath(targetfile, UUID=True)
  File "/usr/lib/python2.7/site-packages/preupgrade/dev.py", line 91, in bootpath_to_anacondapath
    raise PUError, "/boot is on RAID device %s" % bootdev
preupgrade.error.PUError: /boot is on RAID device md0


I guess the only other recommended method now is to boot from physical media?  I would like to request that RAID1 be supported in future.  As others have noted here, it's a pretty common thing for anyone who wants a reliable computer.

Does anyone know what is the "needinfo" requested for this bug?  I don't seem to be finding a clear request in the history.

Comment 52 Neil Bird 2012-03-24 10:09:39 UTC
I wasn't going to me-too this bug, but I find myself stuck now as I tried the alternative of upgrading from DVD and I'm getting a kernel crash before it really gets going, so that's not an option.

What's likely to happen if I just comment out that check in preupgrade as it stands?

[F14 -> F16, RAID1, preupgrade 1.1.10]

Comment 53 udo 2012-03-24 13:18:33 UTC
@Neil: your kernel crash is one more reason to at least consider the approach as described in Comment 33 for preupgrade.
Rebooting is not necessary until the upgrade is done.

Comment 54 Neil Bird 2012-03-24 17:33:13 UTC
I have to suppose that anaconda presumes its custom initramfs as an environment  and then that it can mount the system to the usual /mnt/sysimage, and works on that.

I really don't fancy my chances “simply” unpacking the installer's initramfs and trying to run anaconda out of that from the already-booted system, even if it *is* only at run level 1.

Have you tried this yourself?

If I get desperate, I suppose I could recreate the installer initramfs from my working kernel and arrange to net boot it or something.  I may end up there, I suppose.


Why is this NEEDINFO again?

Comment 55 udo 2012-03-25 04:29:02 UTC
I recall that unpacking the initramfs was not possible due to the fact that the kernel ramfs and the install ramfs were packed in one archive. Even rawhide did not provide standard tools to unpack that.
Manual hacks did not work reliably.
Perhaps the situation changed by now?

Comment 56 udo 2012-03-25 07:45:53 UTC
See https://bugzilla.redhat.com/show_bug.cgi?id=736318 for the xz situation. Not even a version/solution for Fedora 15 or 16 is released so we can at least (more easily) experiment.
Bonus question: what should we think of a distro that uses solutions that cannot be handled with the tools in the distro?

In one other bug (yes, sorry) it was explained to me that with/after F17 some solution would be introduced that would make the RAID situation less of an issue for upgrades.
Is that still true? If so: what are the technicalities involved?

Comment 57 udo 2012-03-25 07:55:45 UTC
xz summary solution:

- download latest GIT version (i.e. NO alpha, NO RPM, but git clone
http://git.tukaani.org/xz.git with autoconf, configure (I used --disable-shared
to get static xz binary), make,...).

- ( xz -dc --single-stream > initramfs.cpio && cat > install.cpio ) < initrd.img

(according to the info in the bug I mentioned)

This enables you to unpack the single initrd that has two ramdisks inside.
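
For anyone who would rather not build xz from git, the same split can be sketched with Python's standard lzma module. This is only a sketch under assumptions: the function name split_first_xz_stream is made up here, and it assumes the image starts with one complete xz stream followed by further data, as described above. It returns the remainder as raw bytes, just like the `cat` half of the shell command does.

```python
import lzma


def split_first_xz_stream(blob):
    """Decompress the first xz stream in `blob`.

    Returns (decompressed_first_stream, remaining_raw_bytes).
    The remainder is NOT decompressed; it is whatever followed the first stream.
    """
    decomp = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    first = decomp.decompress(blob)
    if not decomp.eof:
        # The end-of-stream marker was never reached, so the input is truncated.
        raise ValueError("input ended before the first xz stream was complete")
    # Bytes found after the end of the first stream are kept in unused_data.
    return first, decomp.unused_data


# Self-contained demonstration on made-up data (not a real initrd).
demo = lzma.compress(b"first ramdisk") + b"second ramdisk, left as-is"
first_part, remainder = split_first_xz_stream(demo)
```

On a real image one would read initrd.img in binary mode and write the two results to initramfs.cpio and install.cpio; the whole file is held in memory, which should be acceptable for an initrd-sized input.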

The idea is to disable any checks about RAID, etc, and have it download the rpms in the usual places.
Then we have the system go to runlevel 1 and/or kill as many processes as we safely can. Storage remains mounted where it was.
Then we fire up the actual upgrade procedure which does what it should do...

Comment 58 Gilboa Davara 2012-05-29 15:28:34 UTC
Stupid question 101:
Is F16->F17 preupgrade still impossible on F16 machines w/ /boot on RAID1?

- Gilboa

Comment 59 Chris Schanzle 2012-05-29 18:48:46 UTC
It would appear F16 -> F17 upgrades are still broken. From one of my users:

# preupgrade
...
treeinfo timestamp: Tue May 22 16:55:30 2012
/boot/upgrade/vmlinuz checksum OK
/boot/upgrade/initrd.img checksum OK
Traceback (most recent call last):
  File "/usr/share/preupgrade/preupgrade-gtk.py", line 259, in on_assistant_apply
    self._do_main()
  File "/usr/share/preupgrade/preupgrade-gtk.py", line 278, in _do_main
    self.main_preupgrade()
  File "/usr/share/preupgrade/preupgrade-gtk.py", line 497, in main_preupgrade
    extra_args += " ks=%s" % self.pu.generate_kickstart()
  File "/usr/lib/python2.7/site-packages/preupgrade/__init__.py", line 607, in generate_kickstart
    return dev.bootpath_to_anacondapath(targetfile, UUID=True)
  File "/usr/lib/python2.7/site-packages/preupgrade/dev.py", line 91, in bootpath_to_anacondapath
    raise PUError, "/boot is on RAID device %s" % bootdev
preupgrade.error.PUError: /boot is on RAID device md0
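
To see in advance whether a machine will hit this error, one can check which device backs /boot. The following is a rough sketch, not preupgrade's actual code: the helper names are invented for illustration, and it only recognises arrays by the usual mdN device naming (so /dev/md/name style paths and hardware RAID are not detected).

```python
import os
import re


def device_for_mountpoint(mounts_text, mountpoint):
    """Return the device mounted at `mountpoint`, or None.

    `mounts_text` is text in /proc/mounts format. The last matching
    line wins, because later mounts shadow earlier ones.
    """
    device = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == mountpoint:
            device = fields[0]
    return device


def is_md_device(device):
    """Name-based heuristic: True for devices such as /dev/md0 or /dev/md127."""
    return re.match(r"md\d+$", os.path.basename(device)) is not None


def boot_on_software_raid(mounts_text):
    """True if /boot (or / when /boot is not a separate mount) is on an md array."""
    device = (device_for_mountpoint(mounts_text, "/boot")
              or device_for_mountpoint(mounts_text, "/"))
    return device is not None and is_md_device(device)


# Made-up sample in /proc/mounts format, matching the layout in this bug.
sample_mounts = "/dev/md1 / ext4 rw 0 0\n/dev/md0 /boot ext4 rw 0 0\n"
affected = boot_on_software_raid(sample_mounts)
```

On a live system the input would be the contents of /proc/mounts instead of the sample string.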

Comment 60 I. Piasecki 2012-05-29 22:59:57 UTC
:0 What crap! I can't upgrade my Fedora 16 to Fedora 17 with preupgrade, because I have /boot on RAID1.

This bug is from year 2009 and fedora 10/11. 

I can add that there was no dialog box saying that preupgrade had hit any errors; I only saw it in the shell that preupgrade was run from.

!!!!!

Comment 61 Aas 2012-05-31 08:14:36 UTC
@Piasecki: The only info I've found is one note in the wiki: https://fedoraproject.org/wiki/How_to_use_PreUpgrade

Now that upgrading to F17 using yum directly (http://fedoraproject.org/wiki/YumUpgradeFaq) is highly discouraged, how can I upgrade my remote systems? I can't imagine arranging access to the data center for every box upgrade. Oh dear :'-(

Comment 62 Adam Williamson 2012-05-31 15:59:11 UTC
aas: yum upgrade works fine so long as you follow the instructions, but if the machines are remote you may have trouble forcing a reboot after the 'yum upgrade' step (it's known that clean shutdown fails at that point).

Presumably you have a way to boot the install media directly, remotely (PXE?), so you could just do that and run an upgrade from the DVD or netinst ISOs.

Comment 63 John Glotzer 2012-07-06 13:40:13 UTC
I have done preupgrade successfully twice with /boot on a *Hardware* RAID partition (vmware RAID card). I think if one reads carefully one can figure out
that this issue only pertains to Software Managed RAID but I think the distinction could be made clearer or more explicit. Thanks.

Comment 64 Guido Grazioli 2012-09-12 18:32:53 UTC
Will this be fixed for F18 release? Just asking...

Comment 65 Aas 2012-09-13 14:09:53 UTC
@Guido Grazioli: given my experience with Fedora, I doubt it, but I desperately hope for it at the same time.

Comment 66 Bryan Burke 2012-09-29 19:39:05 UTC
So, this has been a problem on my machine for a while, though I've just noticed it. Somehow, I missed the salient points of the Python stack trace which showed why preupgrade failed to work. I did this in Fedora 14, trying to go to 16. Frustrated, I just backed up my libvirt XML files (since my backups are in a KVM virtual machine on the physical machine that I was trying to update), did a careful clean install, got KVM/libvirt working again, brought up my backups VM, and finished restoring the system to its previous state. This did, however, take several hours longer than a simple in-place upgrade would have taken.

Now, I'm preparing to upgrade my VMs, but 17 is out now, so I try to preupgrade again, it fails, and after trying several things (including modifying grub directly and putting the preupgrade boot images on a TFTP/DHCP/PXE server), I finally noticed the actual problem: /boot is on an MD device (RAID1).

After reading over all comments on this bug report, it really is amazing to me that this has been open (more or less) for 3ish years, and there's no real resolution. For the record, this machine has no optical drive of any kind: it is a custom built machine, for hosting VMs, with nothing more than the minimum needed to do that (both in hardware and software). The plan was to PXE boot the machine to install it the first time (and occasionally for clean installs, as I like to alternate upgrade with clean installs) from network, then in-place upgrade.

Is the only answer really that I have to go out and buy blank DVDs and either an internal or external optical drive?

P.S. I, too, am baffled at the "NEEDINFO" status of this bug.

Comment 67 Bryan Burke 2012-09-29 19:51:57 UTC
Also, is this really even an issue anymore? Now that the stage1 and stage2 images are one, and the default grub for Fedora is grub2, which seems natively (well, through a module that comes with it) to be able to understand Linux MD devices, can we just try removing the check from the preupgrade program for /boot being on an MD device? Seems like it should "just work" going forward...

Comment 68 Sergio Pascual 2012-09-30 16:54:25 UTC
This bug was the reason I moved my system from sw raid to bios raid... but I was hit by this bug #834245 and I can't reboot safely anymore

Comment 69 Adam Williamson 2012-10-01 23:36:57 UTC
"The plan was to PXE boot the machine to install it the first time..."

"Is the only answer really that I have to go out and buy blank DVDs and either an internal or external optical drive?"

Well obviously not, no, because if you can PXE boot to do a clean install you can PXE boot to do an upgrade. This seems self-evident.

preupgrade is deprecated from F18 onwards, so we should probably just close out all/most preupgrade bugs as WONTFIX or something anyways. I don't think anyone's planning on doing any major maintenance to the preupgrade in f16/f17. richard, will?

Comment 70 udo 2012-10-02 04:52:19 UTC
what will take the place of preupgrade? (if any)
we need this to know if we should argue against wontfix.

Comment 71 Adam Williamson 2012-10-02 05:57:22 UTC
the new upgrade tool wwoods is writing.

Comment 72 udo 2012-10-02 06:25:05 UTC
wwoods is author?
the new tool will fix most if not all of preupgrades bugs like this one?

Comment 73 Bill McGonigle 2012-10-02 18:26:15 UTC
Found here:

http://fedoraproject.org/wiki/Anaconda/Work_List

>From wwoods: So the code is currently here: https://github.com/wgwoods/fedup.
> Plan for F-18 Beta is to have an F-17 based upgrade tool that fetches packages
> *or* sets up an upgrade using a local DVD/USB image.
>
> It might not have any real GUI for beta, but it will for final.
>
>It will involve a special upgrade image, but it's just a dracut-built initramfs.
>This could be built by lorax, or it could just get built by running 'dracut'
>inside a mock chroot. This might require rel-eng involvement.
>
>I might need some assistance with: i18n stuff, GUI polish, a special plymouth
>theme for the upgrade, and network monitoring (e.g. running sshd in dracut,
>like we do for s390 cmdline installs).

Comment 74 Ricky 2013-01-16 03:43:00 UTC
I had trouble upgrading F16 to F17. It took me 3 days to figure out a solution.

My boot partition is in raid0. I could use preupgrade to upgrade F14->F15 and F15->F16. I believe preupgrade didn't use a 2nd stage image at that time. That's why, when upgrading, it always said the kickstart file couldn't be found (ks kernel parameter in grub configuration file).

Preupgrade from F16 to F17 uses a 2nd stage image. The Linux kernel in preupgrade doesn't support RAID, although grub2 supports RAID.

So my solution was to use my thousand-year-old 256MB SD card. I installed grub2 on my SD card and copied /boot/upgrade to it.

I modified grub2.cfg as follows:

menuentry 'Upgrade to Fedora 17 (Beefy Miracle)' --class gnu-linux --class gnu --class os {
linux /boot/upgrade/vmlinuz preupgrade nouveau.noaccel=1 repo=hd::/var/cache/yum/preupgrade ks=hd:UUID=C706-D245:/boot/upgrade/ks.cfg stage2=hd:UUID=C706-D245:/boot/upgrade/squashfs.img
initrd /boot/upgrade/initrd.img
}

Note: UUID above is from my SD card. You can use blkid to figure it out. My gfx card is GTX 590

After booting from SD card, it can find repo in my hard disk. Installation works like a charm again.
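
Since the only machine-specific part of the entry in this comment is the filesystem UUID, it can be templated. The sketch below is illustrative only: the function name and parameters are made up, the paths are copied from the entry quoted above, and the nouveau.noaccel=1 option is left out by default because it is specific to that graphics card. It has not been tested against a real upgrade.

```python
def preupgrade_menuentry(uuid, title="Upgrade to Fedora 17 (Beefy Miracle)",
                         extra_args="preupgrade"):
    """Build a grub2 menuentry that boots the preupgrade kernel from the
    filesystem identified by `uuid` (as reported by blkid)."""
    kernel_args = " ".join([
        extra_args,
        # The package cache stays on the hard disk, as in the original entry.
        "repo=hd::/var/cache/yum/preupgrade",
        # Kickstart file and stage2 image are read from the non-RAID device.
        "ks=hd:UUID=%s:/boot/upgrade/ks.cfg" % uuid,
        "stage2=hd:UUID=%s:/boot/upgrade/squashfs.img" % uuid,
    ])
    return "\n".join([
        "menuentry '%s' --class gnu-linux --class gnu --class os {" % title,
        "    linux /boot/upgrade/vmlinuz %s" % kernel_args,
        "    initrd /boot/upgrade/initrd.img",
        "}",
    ])


# Example with the UUID quoted in the comment; substitute your own.
entry = preupgrade_menuentry("C706-D245")
```

Extra kernel options, such as a driver workaround, can be passed through extra_args.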

Comment 75 Fedora End Of Life 2013-02-14 02:35:55 UTC
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 76 Dusty Mabe 2013-05-05 22:17:32 UTC
Hey guys.. If using a software RAID1 I think you can get anaconda to mount a member device (not the actual raid device) by zeroing out the superblock on the member device. This allows /bin/mount within the stage 1 to autodetect the ext filesystem and successfully mount the device. 

I have written a post with some more details and an example for anyone that may be interested:
http://dustymabe.com/2013/05/05/booting-anaconda-from-software-raid1-device/

Hope this helps,

Dusty

