Red Hat Bugzilla – Bug 735785
Upgrade Skip Bootloader broken
Last modified: 2011-09-13 17:04:15 EDT
Created attachment 521481
Description of problem:
I followed this test case:
I was upgrading Fedora 15 to Fedora 16 Beta TC1.
After installation is complete and system reboots, I receive an error message:
> Error 15: File not found
> Press any key to continue...
After pressing a key, I'm in a GRUB menu (see screenshot). Only the Fedora 15 item is present. It doesn't boot (again the "Error 15: File not found" message).
I am not quite sure how it was supposed to happen (there should have been a Fedora 16 boot item I suppose?), but it clearly violates the test case, therefore I suppose there's a bug somewhere.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install Fedora 15, update
2. Upgrade to Fedora 16 and choose "Skip bootloader updating"
Proposing as Beta blocker:
The installer must be able to successfully complete an upgrade installation from a clean, fully updated default installation (from any official install medium) of the previous stable Fedora release, either via preupgrade or by booting to the installer manually. The upgraded system must meet all release criteria.
Created attachment 521485
I used system rescue to retrieve upgrade logs from /root directory.
Created attachment 521486
Created attachment 521487
upgraded grub config
*** This bug has been marked as a duplicate of bug 730357 ***
This was marked as a duplicate because of:
grubby fatal error: unable to find a suitable template
grubby: doing this would leave no kernel entries. Not writing out new config.
Do you see the same error message in future kernel installations?
Do you have a /etc/grub.conf -> ../boot/grub/grub.conf symlink?
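For anyone who wants to check that symlink from a rescue chroot, here is a minimal Python sketch. The function name and the `root` parameter are made up for illustration; the only thing taken from this bug is the expected link target.

```python
import os

# Link target as quoted in the question above.
EXPECTED_TARGET = "../boot/grub/grub.conf"


def check_grub_conf_symlink(root="/"):
    """Return True if <root>/etc/grub.conf is a symlink to the expected target.

    `root` lets you point the check at a mounted system (for example the
    directory a rescue environment mounted the installation under).
    """
    link = os.path.join(root, "etc", "grub.conf")
    if not os.path.islink(link):
        # Missing, or a regular file instead of a symlink.
        return False
    return os.readlink(link) == EXPECTED_TARGET
```

Calling `check_grub_conf_symlink("/mnt/sysimage")` would check a system mounted there; that path is only an example.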
> Do you see the same error message in future kernel installations?
I used rescue mode to chroot into the filesystem and upgrade the kernel. I see "grubby fatal error: unable to find a suitable template" during the kernel upgrade. The second error line is not printed.
grub.conf is not correctly populated.
> Do you have a /etc/grub.conf -> ../boot/grub/grub.conf symlink?
Did/does the old kernel still exist?
$ ls -l /boot/vmlinuz-126.96.36.199-27.fc15.i686
No, the old kernel is deleted. Only the new kernel 3.1.0-0.rc3 (upgraded by anaconda) and 3.1.0-0.rc4 updated by me are there.
Ok, that explains it.
Grubby will only use valid kernel entries and will thus ignore the only one you have.
You were updating from a non-working system and can't expect that it works after the update. This is thus not a blocker.
$ grubby --title=new --add-kernel=123456789 --copy-default --grub --config-file=grub.cfg -o-
grubby fatal error: unable to find a suitable template
$ grubby --title=new --add-kernel=123456789 --copy-default --grub --config-file=grub.cfg -o- --bad-image-okay
kernel 6789 ro root=/dev/mapper/VolGroup-lv_root rd_LVM_LV=VolGroup/lv_root rd_LVM_LV=VolGroup/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rhgb quiet
title Fedora (188.8.131.52-27.fc15.i686)
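To illustrate the point about valid kernel entries, here is a rough Python model of the idea. It is not grubby's real logic, just a sketch: the function names are invented, and it assumes a GRUB legacy config whose kernel paths are relative to a separate /boot partition.

```python
import os


def kernel_entries(grub_conf_text):
    """Return (title, kernel_path) pairs found in a GRUB legacy config."""
    entries = []
    title = None
    for line in grub_conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("title "):
            title = stripped[len("title "):]
        elif stripped.startswith("kernel ") and title is not None:
            # First word after "kernel" is the image path.
            entries.append((title, stripped.split()[1]))
    return entries


def usable_templates(grub_conf_text, boot_dir):
    """Keep only the entries whose kernel image still exists under boot_dir.

    Assumption: kernel paths in the config are relative to boot_dir
    (i.e. /boot is its own partition).
    """
    usable = []
    for title, path in kernel_entries(grub_conf_text):
        image = os.path.join(boot_dir, path.lstrip("/"))
        if os.path.isfile(image):
            usable.append((title, path))
    return usable
```

If the only entry in the config points at a kernel that has already been removed, `usable_templates` returns an empty list, which corresponds to the "unable to find a suitable template" situation in this report.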
I was upgrading from a working Fedora 15. The Beta release criteria say that the upgraded system must be able to boot.
The old 184.108.40.206-27.fc15.i686 kernel was removed by anaconda, before the new kernel was installed and without updating grub.cfg?
(I assume that it shouldn't be taken literally when the update log says "Upgrading kernel-3.1.0-0.rc3.git0.0.fc16.i686", but "doing this would leave no kernel entries" looks a lot like a kernel removal ... but it happens after it failed to find a template first time.)
That does sound like an anaconda bug to me.
(In reply to comment #12)
> The old 220.127.116.11-27.fc15.i686 kernel was removed by anaconda, before the new
> kernel was installed and without updating grub.cfg?
> (I assume that it shouldn't be taken literally when the update log says
> "Upgrading kernel-3.1.0-0.rc3.git0.0.fc16.i686", but "doing this would leave no
> kernel entries" looks a lot like a kernel removal ... but it happens after it
> failed to find a template first time.)
> That does sound like an anaconda bug to me.
That's why it is reported against anaconda. Reopening bug.
Discussed at the 2011-09-09 blocker review meeting, accepted as a blocker under the criterion "The installer must be able to successfully complete an upgrade installation from a clean, fully updated default installation (from any official install medium) of the previous stable Fedora release, either via preupgrade or by booting to the installer manually. The upgraded system must meet all release criteria". We're aware this case is a slight variation on a 'stock' upgrade, but the test case has been around for many releases and the option exists in anaconda for some kind of use case, or else it wouldn't be there.
If there's some wrinkle to this which would make the test case not a great exercise of what the anaconda option actually exists for, please alert us, and we'll re-evaluate.
I just tested net upgrade from F15 to F16 Beta TC2
and it does not offer to upgrade boot loader.
"The installer is unable to detect the boot
loader currently in use on your system."
The choices offered are then:
* Skip boot loader updating
* Create new boot loader configuration
but if I choose the latter I can only install
boot loader in /dev/sdb (usb drive) or /dev/sda1. :-(
Maybe the choices are better from cdrom?
(do people still use those?;) It is strange to me
that we don't officially support net install/upgrade
via USB drive yet). So I lose.
I can boot the upgrade with grub1 by editing kernel and initramfs filenames.
I can concur that when upgrading fc15 to fc16 and selecting skip bootloader, the resulting new Fedora system will not boot without manually editing the grub1 entry to allow fc16 to boot. After the manual edit the system does boot. I question whether anaconda is broken because it does not perform according to the test case, or whether the test case is flawed.
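For reference, the manual edit amounts to pointing the existing grub1 entry at the new kernel and initramfs filenames. A minimal Python sketch of that edit follows. The function name is made up, and it assumes the old and new files differ only by the version string, so check what is actually in /boot before relying on it.

```python
def retarget_entry(grub_conf_text, old_version, new_version):
    """Swap the version string on title/kernel/initrd lines of a grub1 config.

    Other lines (comments, root, default, etc.) are left untouched.
    Assumption: filenames differ only in the version string.
    """
    out = []
    for line in grub_conf_text.splitlines():
        words = line.split()
        if words and words[0] in ("title", "kernel", "initrd"):
            line = line.replace(old_version, new_version)
        out.append(line)
    return "\n".join(out) + "\n"
```

The function only returns the edited text; writing it back to grub.conf (after making a backup) is left to the caller.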
Yeah, I think Bob may be on the right track. I believe this option is intended to be used in the case where you're using some non-Fedora-handled bootloader and want anaconda to just leave it alone and let you deal with it, and the test case wouldn't actually be expected to lead to success.
I suspect the option that should lead to success in a straightforward F15->F16 upgrade scenario is "install new bootloader configuration". I'm testing this out ATM.
anaconda team is pretty sure this option is working as intended: it's just not what you want when doing a simple f15->f16 upgrade.
so I think there's no blocker bug here exactly, but two issues: this option probably should not be the default choice on the 'bootloader choices' screen for f16 upgrades, and we (QA) should fix our test case, because it's clearly broken as described.
I propose we change this to RejectedBlocker. note that https://bugzilla.redhat.com/show_bug.cgi?id=735730 is an accepted blocker, and is tracking the serious consequences here: the 'install new boot loader configuration' option doesn't work, no other options are available, and this 'skip boot loader configuration' option is the default when it probably shouldn't be. dropping 'AcceptedBlocker' so this gets noticed in searches etc.
We don't have another blocker review till Friday but we need to decide on blocker status of all uncertain bugs by tomorrow, so can people please vote here? I vote -1 blocker -1 nth for *this* bug, now: the key issues can be addressed in 735730.
spamming some CCs to get votes. I hope.
I wondered in the past why the "skip bootloader" test case expects a working system in the end. Now I know: the test case is broken. If the two issues (bad default, and broken installation of new bootloader) are going to be fixed in #735730, then I think this one can be closed as NOTABUG. Definitely -1 blocker.
Is it known _why_ it doesn't work when boot loader config is skipped?
AFAIK an upgrade with skipped bootloader _should_ work. The kernel's scripts will call new-kernel-pkg, which will call grubby, which will patch the existing config file for the(!) existing boot loader.
I have tested that there is no problem with a plain yum distro-sync upgrade from f15 (with grub) to f16.
The new bootloader package should not be (and is not) activated before the sysadmin (or anaconda) installs it, so that shouldn't be a problem. (The EFI boot loader at /boot/efi/EFI/redhat/grub.efi will however be activated immediately which IMHO is unfortunate but a different issue.)
So _why_ does grubby fail to update the boot loader config file when anaconda is used?
f15 grub and grub2 packages didn't conflict, so "obviously" it will not be possible to update a system where both were installed but grub was used without installing the new bootloader. I assume that isn't the issue here.
Does anaconda (or something else) create /etc/grub2.cfg (and whatever it links to)? I don't see any other possible explanation.
Maybe I am misunderstanding something, but I got the impression that the "Skip bootloader" option shouldn't make *any* changes to the bootloader, meaning it also adds no new kernels. Thus it seems normal that the boot is broken, because the older kernel is removed and the new kernel is not added to the bootloader.
I think it would be of great help if these options were described in detail inside anaconda. That would clear up a lot of confusion. This currently seems more like a guessing game.
I will clarify and repeat myself so it is easier to prove me wrong ;-)
(In reply to comment #24)
> Maybe I am misunderstanding something, but I got the impression that "Skip
> bootloader" option shouldn't do *any* changes to the bootloader,
> meaning also adding no new kernels.
I think there is a clear separation between "installing bootloader" and "update boot loader configuration".
> Thus it seems normal that the boot is broken, because
> the older kernel is removed
It shouldn't. We should always keep the previous N kernels.
> and the new kernel is not added to the bootloader.
The boot loader configuration will always be updated when a kernel is installed. Ok ... apparently it isn't - but it should be, without any exception.
I think the only way to debug it is to run grubby in a debugger - using the right grub config file and the right environment.
Created attachment 522932
bootloader update screen
> I think there is a clear separation between "installing bootloader" and "update
> boot loader configuration.
There are 3 options in Anaconda. My understanding is (was):
* Create new boot loader configuration - wipe out MBR and grub.cfg, install completely new bootloader (possibly grub2 in this case), populate grub.cfg
* Update boot loader configuration - don't touch MBR, add new items to grub.cfg, don't touch grub itself (e.g. don't upgrade from grub1 to grub2)
* Skip boot loader updating - don't touch MBR, don't touch grub.cfg, don't touch grub itself
See screenshot for Anaconda's explanation.
What actually happens in Anaconda when you select skip bootloader is that
anaconda will not install the bootloader. It skips the bootloader and
installbootloader steps. But the kernel package will still run new-kernel-pkg which will
run grubby which will set up the new kernel.
So in the case where you already have grub installed skipping the bootloader
*should* still result in a booting system.
If you were using non-grub (eg. extlinux or lilo or ?) you won't have a booting
system until you edit the bootloader yourself.
So I think this is +1 Blocker, *if* you are using a supported bootloader and
you skip re-installing it, grubby should be updating things correctly and you
should still be able to boot.
And this isn't an anaconda bug, it is operating as expected. The problem lies
with grubby not updating the grub and/or grub2 config correctly.
Anaconda can't stop new-kernel-pkg from updating the config files, it can only skip re-installing the bootloader to the stage1/stage2 devices (MBR, biosboot, whatever).
(In reply to comment #26)
> bootloader update screen
It seems to me like the descriptions could be improved. Without knowing Anaconda I guess it would be more descriptive with something like:
* Install new version of boot loader and update configuration
* Skip boot loader installation - update configuration and keep using the old version
* Install new boot loader overwriting old configuration
(In reply to comment #27)
> And this isn't an anaconda bug, it is operating as expected.
According to comment 9 the old kernel was removed during the update. That could explain some of the errors. That would be something that should be under anaconda's control - but it sounds very unlikely.
> The problem lies
> with grubby not updating the grub and/or grub2 config correctly.
I uploaded a patch on bug 725185 that perhaps can explain some of it.
(In reply to comment #20)
> We don't have another blocker review till Friday but we need to decide on
> blocker status of all uncertain bugs by tomorrow, so can people please vote
> here? I vote -1 blocker -1 nth for *this* bug, now: the key issues can be
> addressed in 735730.
I'm also -1 blocker, -1 nth on this since it appears as if the major issues are being handled in other bugs.
My count is -2 blocker and -2 NTH right now. It would be nice to get to -3 before rejecting as a blocker.
tflink: well, bcl's comment does make me re-consider a bit, especially since 'skip bootloader' is the *default* action for upgrades currently. it might be worth keeping both bugs as blockers so we can co-ordinate an acceptable resolution for beta.
mads: btw, when I tested a yum upgrade from f15 to f16, i'm fairly sure I got the f16 kernel in the bootloader configuration. That would mean there's something unique to the anaconda environment here. I'll re-test that to be sure.
so, pjones says it's actually expected that the kernel %post doesn't update the bootloader config in anaconda, because:
<pjones> dlehman: the kernel %post won't add one in this case because the old config will be empty (because we've upgraded kernel packages rather than install + prune old
<pjones> because on upgrades we don't leave the previous OS version's package there
so this explains the existence of both 'skip' and 'update existing' options, and what they do - 'update existing' adds some kind of anaconda operation to update the existing bootloader configuration since the kernel scripts won't do it, 'skip' just doesn't change anything, as we thought.
so, indeed, this option is doing what it was designed to do. it's just not an option that was expected to be useful in very many cases, and certainly not to be the default. it's really there as a 'i have an odd bootloader configuration and i know what the hell i'm doing, leave me to do it' option. The install guide says "If your machine uses another boot loader, such as BootMagic, System Commander, or the loader installed by Microsoft Windows, then the Fedora installation system cannot update it. In this case, select Skip boot loader updating."
Proposing a revision to the test case so that it checks: 1) a third-party bootloader, if used, was not disturbed by the install of Fedora; 2) if this option is used with the Fedora bootloader, there should be no old version entries remaining; 3) when using the Fedora bootloader, the new Fedora should boot normally after manual editing.
Should be an NTH test case, not a blocker.
I also vote -1 blocker.
so we have 3 -1s now, and I think we can say bcl's +1 was based on a misapprehension of the situation. So, marking as rejectedblocker. the plan for Beta, as I understand it, is to make 'install new boot loader configuration' the default for upgrades, make it work, and document clearly that if you upgrade from f15 to f16 beta you can either get a brand new auto-generated grub2 config or leave everything completely untouched and fix it up yourself. this will be tracked in https://bugzilla.redhat.com/show_bug.cgi?id=735730 .
should we just close this as NOTABUG now? that seems to be the appropriate thing to do.