Bug 1474042 - Install to new partition on system with existing F26 /boot leaves broken boot options
Status: NEW
Product: Fedora
Classification: Fedora
Component: anaconda
Version: 26
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Anaconda Maintenance Team
QA Contact: Fedora Extras Quality Assurance
Reported: 2017-07-23 07:21 EDT by "FeRD" (Frank Dana)
Modified: 2017-07-23 07:25 EDT (History)
Type: Bug
Description "FeRD" (Frank Dana) 2017-07-23 07:21:44 EDT
Description of problem:

When attempting to install F26 from the LiveCD environment onto a system which contains an existing installation of F26 (in a separate LV), Anaconda made rather a mess of the bootloader configuration. It left me booted into an unusable system, because the kernel that booted (though present on /boot) had never actually been installed to the new root volume. As a result, no modules were present in /lib/modules/`uname -r`/ (in fact, that directory did not exist at all), which meant I couldn't even enable the WiFi adapter to reach the network and repair the problem with dnf.


Background/Rationale:
Due to the issues I encountered after upgrading F25->F26 (bug 1470038, though nothing there is of any relevance to this bug), I decided to attempt a fresh install of F26 from the Workstation Live image environment (on bootable DVD), alongside the non-functional F26 upgrade install I'd previously done. 

I realize this is an esoteric case, and that very few people will ever be affected by this issue. But I feel that it's worth addressing, because if Anaconda is able to make the right decisions and properly account for this type of setup, that'll probably make it a smarter, safer, and more robust installer in general.


Steps Taken/Detail:
This is a legacy-BIOS / MBR-disk laptop from ca. 2009; there's no EFI, GPT, or other modern sorcery involved. Just a single 2.5" HDD with an MBR partition table, containing an NTFS partition (Windows), an ext4 partition (/boot), and an LVM partition. The LVM partition contains a single VG, with LVs for / and /home and some free extents.

Using the advanced Blivet partitioning option, I...
1. Created a separate, secondary root LV on available VG extents. 
2. Left my previous (F25-upgraded-to-)F26 root volume untouched and unconfigured.
3. Configured the existing /boot and /home into the new install, by setting their mount points accordingly.

The install process itself went fine. The new root volume was formatted and all of the packages were installed to it, the root password was set, and my user account was created as specified. But on first boot, as I said in the introduction, the system was unusable due to the lack of kernel modules.

As I discovered...
1. `uname -r` showed I was booted into 4.11.10-300.fc26.x86_64
2. `rpm -q kernel` returned ONLY kernel-4.11.8-300.fc26.x86_64
3. `rpm -q kernel-core`, the source of /lib/modules/<version>/, also returned only kernel-core-4.11.8-300.fc26.x86_64
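
For anyone wanting to reproduce the check, the observations above boil down to comparing the running kernel release against what exists under /lib/modules. Here is a minimal Python sketch of that comparison. The function name `missing_modules_dir` and its `modules_root` parameter are illustrative names of my own, not anything taken from anaconda.

```python
import os
import platform


def missing_modules_dir(release=None, modules_root="/lib/modules"):
    """Return True if no module directory exists for the given kernel release.

    release defaults to the running kernel (the same string `uname -r` prints).
    modules_root is a hypothetical parameter, handy for testing against a
    scratch directory instead of the live system.
    """
    if release is None:
        release = platform.release()
    return not os.path.isdir(os.path.join(modules_root, release))


if __name__ == "__main__":
    rel = platform.release()
    state = "MISSING" if missing_modules_dir(rel) else "present"
    print(f"/lib/modules/{rel}: {state}")
```

On the system described here, this would presumably have reported the directory for 4.11.10-300.fc26.x86_64 as missing.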

What anaconda had apparently done was...

1. Installed the 4.11.8 kernel packages distributed on the LiveCD

2. Rewrote ALL of the previous F26 install's menuentry items in grub.cfg, replacing the previous root LV with the new one so that they booted into the new install. Because I had used `dnf upgrade` to update the previous install post-upgrade, there were three of these: 4.11.10-300, 4.11.9-300, and 4.11.8-300. All three kernels were present on /boot, but only one (4.11.8-300) was installed to the new F26 root LV.

3. Perhaps because 4.11.10-300 is newer than 4.11.8-300, anaconda left the 4.11.10-300 menuentry as the default, pointing to the newly-installed root LV, despite the packages for that kernel not being installed there.

By rebooting the laptop and selecting the 4.11.8-300.fc26.x86_64 boot option, I was able to reach a running system with all of the hardware drivers present in /lib/modules/4.11.8-300.fc26.x86_64
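
The mismatch described above (boot entries for kernels that have no modules on the root volume they point to) could in principle be detected mechanically. The following Python sketch shows one way to do it, under several assumptions: that each menuentry has a `linux16`, `linux`, or `linuxefi` line naming a `vmlinuz-<release>` file, and that the set of installed releases is supplied by the caller. The sample grub.cfg fragment and the `vg-newroot` volume name are made up for illustration, and none of this reflects how anaconda itself handles boot entries.

```python
import re

# Matches the kernel line inside a menuentry. Assumption: the kernel file is
# named vmlinuz-<release>, loaded via "linux16", "linux", or "linuxefi".
KERNEL_LINE = re.compile(r"^\s*linux(?:16|efi)?\s+\S*vmlinuz-(\S+)", re.MULTILINE)


def kernels_in_grub_cfg(cfg_text):
    """Return the kernel releases referenced by grub.cfg, in menu order."""
    return KERNEL_LINE.findall(cfg_text)


def broken_entries(cfg_text, installed_releases):
    """Return releases that have a boot entry but are not installed."""
    installed = set(installed_releases)
    return [rel for rel in kernels_in_grub_cfg(cfg_text) if rel not in installed]


# Trimmed, hypothetical grub.cfg fragment mirroring the situation in this report.
SAMPLE_CFG = """
menuentry 'Fedora (4.11.10-300.fc26.x86_64) 26' {
    linux16 /vmlinuz-4.11.10-300.fc26.x86_64 root=/dev/mapper/vg-newroot ro
}
menuentry 'Fedora (4.11.9-300.fc26.x86_64) 26' {
    linux16 /vmlinuz-4.11.9-300.fc26.x86_64 root=/dev/mapper/vg-newroot ro
}
menuentry 'Fedora (4.11.8-300.fc26.x86_64) 26' {
    linux16 /vmlinuz-4.11.8-300.fc26.x86_64 root=/dev/mapper/vg-newroot ro
}
"""

if __name__ == "__main__":
    # In practice the installed list could come from os.listdir("/lib/modules").
    print(broken_entries(SAMPLE_CFG, ["4.11.8-300.fc26.x86_64"]))
```

With only 4.11.8-300 installed, the 4.11.10-300 and 4.11.9-300 entries are the ones flagged, which matches what I saw.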


Expected Results:
A bootable system.

I'm undecided myself on what anaconda should've done with the existing F26 entries in /boot/grub2/grub.cfg, when upgrading. The way I see it, either of these would be acceptable:

1. Leave the existing entries alone (or possibly renamed somehow), so that the option is still available to boot into either install. They use separate root volumes, so there's no reason that shouldn't be a valid config.

2. Delete the entries for the previous F26 install from grub.cfg completely. 

Effectively it's doing #2 anyway, since by rewriting the old entries with the new root volume it turned them into invalid boot options. Leaving those entries in place, rewritten into effectively-corrupted form, only causes confusion and problems. (bug 1456353 and bug 1456404, which relate to anaconda's management of boot entries on EFI systems, probably also figure in here.)

Above all else, anaconda DEFINITELY should've set the menuentry corresponding to the kernel _it_just_installed_ as the default boot entry, regardless of whatever else might be present in the boot menu, or on the /boot partition. (Or even present on the system, for that matter.)
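
As a rough sketch of that rule (not a claim about anaconda's actual code), the default could be chosen by walking the menu in order and taking the first entry whose kernel is installed on the target root volume. The function name `pick_default` and both of its parameters are hypothetical.

```python
def pick_default(menu_releases, installed_releases):
    """Return the index of the first menu entry whose kernel is installed on
    the target root volume, or None if no entry qualifies.

    menu_releases: kernel releases in boot-menu order (hypothetical input).
    installed_releases: releases installed on the new root (hypothetical input).
    """
    installed = set(installed_releases)
    for index, release in enumerate(menu_releases):
        if release in installed:
            return index
    return None


if __name__ == "__main__":
    menu = [
        "4.11.10-300.fc26.x86_64",
        "4.11.9-300.fc26.x86_64",
        "4.11.8-300.fc26.x86_64",
    ]
    print(pick_default(menu, ["4.11.8-300.fc26.x86_64"]))
```

For the menu on this laptop, that selects the third entry (index 2, the 4.11.8-300 kernel) rather than the newest one.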
Comment 1 "FeRD" (Frank Dana) 2017-07-23 07:25:19 EDT
*sigh* I knew there'd be at least one error:

"I'm undecided myself on what anaconda should've done with the existing F26 entries in /boot/grub2/grub.cfg, when upgrading."

I meant when side-installing, specifically NOT upgrading (but also not reinstalling). Anaconda doesn't do upgrades anymore anyway, which is why it seems especially strange that upgrade-esque logic, such as the rewriting of existing boot entries, is still in there.
