Bug 1099627 - grub.cfg broken after live install
Summary: grub.cfg broken after live install
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: grubby
Version: 22
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Peter Jones
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-05-20 19:07 UTC by Gene Czarcinski
Modified: 2016-07-19 19:02 UTC (History)

Fixed In Version: grubby-8.35-1.fc21
Clone Of:
Environment:
Last Closed: 2016-07-19 19:02:00 UTC
Type: Bug
Embargoed:


Attachments
snapshot of boot with bad grub.cfg (20.98 KB, image/png)
2014-05-20 19:07 UTC, Gene Czarcinski
the bad grub.cfg (3.62 KB, text/plain)
2014-05-20 19:07 UTC, Gene Czarcinski
good grub.cfg created by grub2-mkconfig (3.61 KB, text/plain)
2014-05-20 19:08 UTC, Gene Czarcinski
sorted output of rpm -qa (34.99 KB, text/plain)
2014-05-20 19:09 UTC, Gene Czarcinski


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1100504 0 unspecified CLOSED rescue boot entry is listed first 2021-02-22 00:41:40 UTC

Internal Links: 1100504

Description Gene Czarcinski 2014-05-20 19:07:12 UTC
Created attachment 897696 [details]
snapshot of boot with bad grub.cfg

Description of problem:
I created my own ks file for lxde and created a livecd.  I then booted the created iso on a qemu-kvm virtual system and ran liveinst.

Anaconda went through its dance and then I rebooted.  Attached is a snapshot of what I got at bootup.  I am also attaching the kickstart file used to create the livecd, sorted output of rpm -qa on the installed system, the "bad" grub.cfg, and a "good" grub.cfg created by running:
     grub2-mkconfig -o /boot/grub2/grub.cfg

This happens with or without my updated grubby and with or without the updated anaconda.

This happens with partition, lvm or btrfs in any combination.

This only happens for livecd and liveinst.  A regular distribution or netinstall has a good grub.cfg.

Version-Release number of selected component (if applicable):
rawhide

How reproducible:
Every time

Comment 1 Gene Czarcinski 2014-05-20 19:07:54 UTC
Created attachment 897697 [details]
the bad grub.cfg

Comment 2 Gene Czarcinski 2014-05-20 19:08:44 UTC
Created attachment 897698 [details]
good grub.cfg created by grub2-mkconfig

Comment 3 Gene Czarcinski 2014-05-20 19:09:52 UTC
Created attachment 897699 [details]
sorted output of rpm -qa

Comment 4 Gene Czarcinski 2014-05-21 19:57:06 UTC
I believe I have found the problem and it involves the order in which new-kernel-pkg and grub2-mkconfig are run:

Here is a fragment from anaconda.program.log for a Fedora 20 LXDE liveinst:
-----------------------------------------------------------------------
14:55:19,704 INFO program: Running... umount /mnt/install/source
14:55:19,866 DEBUG program: Return code: 0
14:55:20,453 INFO program: Running... new-kernel-pkg --rpmposttrans 3.11.10-301.fc20.x86_64
14:56:19,107 INFO program: Initializing machine ID from KVM UUID.
14:56:19,107 DEBUG program: Return code: 0
14:56:49,406 INFO program: Running... grub2-install --no-floppy /dev/vda
14:56:51,491 INFO program: Installation finished. No error reported.
14:56:51,494 DEBUG program: Return code: 0
14:56:51,568 INFO program: Running... grub2-set-default Fedora Linux, with Linux 3.11.10-301.fc20.x86_64
14:56:51,653 DEBUG program: Return code: 0
14:56:51,654 INFO program: Running... grub2-mkconfig -o /boot/grub2/grub.cfg
14:56:52,610 INFO program: Generating grub.cfg ...
14:56:52,610 INFO program: Found linux image: /boot/vmlinuz-3.11.10-301.fc20.x86_64
14:56:52,611 INFO program: Found linux image: /boot/vmlinuz-0-rescue-7893fd1c74c144c99626c234a33cfd09
14:56:52,611 INFO program: Found initrd image: /boot/initramfs-0-rescue-7893fd1c74c144c99626c234a33cfd09.img
14:56:52,611 INFO program: done
14:56:52,612 DEBUG program: Return code: 0
----------------------------------------------------------------

That resulted in a grub.cfg that worked.

Now here is what I got from live LXDE rawhide install:
-----------------------------------------------------------------
15:18:46,490 INFO program: Running... rsync -pogAXtlHrDx --exclude /dev/ --exclude /proc/ --exclude /sys/ --exclude /run/ --exclude /boot/*rescue* --exclude /etc/machine-id /mnt/install/source/ /mnt/sysimage
15:21:35,798 DEBUG program: Return code: 0
15:21:37,658 INFO program: Running... grub2-install --no-floppy /dev/vda
15:21:44,876 INFO program: Installing for i386-pc platform.
15:21:44,878 INFO program: Installation finished. No error reported.
15:21:44,879 DEBUG program: Return code: 0
15:21:44,984 INFO program: Running... /usr/sbin/rhcrashkernel-param
15:21:45,120 DEBUG program: Return code: 0
15:21:45,138 INFO program: Running... grub2-set-default Fedora Linux, with Linux 3.15.0-0.rc5.git2.10.fc21.x86_64
15:21:45,445 DEBUG program: Return code: 0
15:21:45,446 INFO program: Running... grub2-mkconfig -o /boot/grub2/grub.cfg
15:21:50,697 INFO program: Generating grub configuration file ...
15:21:50,700 INFO program: Found linux image: /boot/vmlinuz-3.15.0-0.rc5.git2.10.fc21.x86_64
15:21:50,701 INFO program: done
15:21:50,701 DEBUG program: Return code: 0
15:21:50,781 INFO program: Running... umount /mnt/install/source
15:21:52,736 DEBUG program: Return code: 0
15:21:52,749 INFO program: Running... new-kernel-pkg --rpmposttrans 3.15.0-0.rc5.git2.10.fc21.x86_64
15:25:09,673 INFO program: Initializing machine ID from KVM UUID.
-----------------------------------------------------------------

Something changed!
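
The ordering difference between the two fragments above can be checked mechanically. What follows is a minimal sketch, assuming the program.log line format shown in this comment ("HH:MM:SS,mmm INFO program: Running... <command>"). The function names command_order and runs_before are hypothetical and are not part of anaconda; the sample logs are trimmed copies of the fragments above.

```python
import re

# Matches lines such as:
#   14:55:20,453 INFO program: Running... new-kernel-pkg --rpmposttrans ...
RUNNING = re.compile(
    r"^(\d{2}:\d{2}:\d{2},\d{3}) INFO program: Running\.\.\. (\S+)"
)


def command_order(log_text):
    """Return the executable names from 'Running...' lines, in log order."""
    order = []
    for line in log_text.splitlines():
        m = RUNNING.match(line.strip())
        if m:
            # Keep only the basename, so /usr/sbin/foo is reported as foo.
            order.append(m.group(2).rsplit("/", 1)[-1])
    return order


def runs_before(log_text, first, second):
    """True if `first` is logged before `second`; None if either is absent."""
    order = command_order(log_text)
    if first not in order or second not in order:
        return None
    return order.index(first) < order.index(second)


F20_LOG = """\
14:55:20,453 INFO program: Running... new-kernel-pkg --rpmposttrans 3.11.10-301.fc20.x86_64
14:56:49,406 INFO program: Running... grub2-install --no-floppy /dev/vda
14:56:51,654 INFO program: Running... grub2-mkconfig -o /boot/grub2/grub.cfg
"""

RAWHIDE_LOG = """\
15:21:37,658 INFO program: Running... grub2-install --no-floppy /dev/vda
15:21:44,984 INFO program: Running... /usr/sbin/rhcrashkernel-param
15:21:45,446 INFO program: Running... grub2-mkconfig -o /boot/grub2/grub.cfg
15:21:52,749 INFO program: Running... new-kernel-pkg --rpmposttrans 3.15.0-0.rc5.git2.10.fc21.x86_64
"""

print(runs_before(F20_LOG, "new-kernel-pkg", "grub2-mkconfig"))      # True
print(runs_before(RAWHIDE_LOG, "new-kernel-pkg", "grub2-mkconfig"))  # False
```

The check only looks at the first occurrence of each command, which is enough for the fragments shown here but would need refining for logs where a command runs more than once.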

Comment 5 Peter Jones 2014-05-21 21:34:03 UTC
So that first one is unequivocally wrong, and probably was hiding something going very wrong in the first new-kernel-pkg call.

That said, what appears to be going wrong now is that grubby is getting called with "--copy-default --add-kernel=somekernel --initrd someinitrd", but it's being called *between* when the default stanza is added and when its initrd gets added.

When that happens, grubby isn't finding the initrd line in its initial template, because it doesn't exist yet, and is adding it at the wrong place with regard to the LT_END line (i.e. "}" ).

I've added a test case and a fix to grubby-8.34.
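
To illustrate the placement problem described in this comment: when the template stanza has no initrd line yet, a new one has to go inside the stanza, before the closing "}", not after it. The sketch below is not grubby's code (grubby is written in C); it is a hypothetical Python illustration, and the function name add_initrd, the linux16/initrd16 keywords, and the sample root= argument are assumptions made for the example.

```python
def add_initrd(stanza_lines, initrd_path, keyword="initrd16"):
    """Return a copy of a menuentry stanza with an initrd line added.

    The new line is placed right after the last kernel ("linux...") line,
    or directly before the closing "}" when there is no kernel line.
    A stanza that already has an initrd line is returned unchanged.
    """
    lines = list(stanza_lines)
    first_words = [l.split()[0] if l.split() else "" for l in lines]

    if any(w.startswith("initrd") for w in first_words):
        return lines

    closers = [i for i, l in enumerate(lines) if l.strip() == "}"]
    if not closers:
        raise ValueError("stanza has no closing '}' line")
    close = closers[-1]

    kernels = [i for i in range(close) if first_words[i].startswith("linux")]
    if kernels:
        k = kernels[-1]
        # Reuse the kernel line's leading whitespace for the new line.
        indent = lines[k][: len(lines[k]) - len(lines[k].lstrip())]
        pos = k + 1
    else:
        indent = "\t"
        pos = close

    lines.insert(pos, indent + keyword + " " + initrd_path)
    return lines


# Hypothetical stanza without an initrd line (root= value is made up).
STANZA = [
    "menuentry 'Fedora, with Linux 3.15.0-0.rc5.git2.10.fc21.x86_64' {",
    "\tlinux16 /vmlinuz-3.15.0-0.rc5.git2.10.fc21.x86_64 root=/dev/vda2 ro",
    "}",
]

for line in add_initrd(STANZA, "/initramfs-3.15.0-0.rc5.git2.10.fc21.x86_64.img"):
    print(line)
```

The point of the example is only the insertion position: the closing brace stays the last line of the stanza.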

Comment 6 Gene Czarcinski 2014-05-22 09:22:58 UTC
OK, if the change to grubby fixes the problem, that scratches my itch.  However, the problem was really caused by anaconda commit f433850099e98eee50ae995d0a864b86619ee84f

which says:

    install: Move Payload postInstall() after bootloader
    
    None of the current Payload subclasses are sensitive to ordering with
    respect to the bootloader.  The forthcoming OSTreePayload class will
    require postprocessing of the bootloader, so let's just swap the
    ordering.

The assumption was incorrect that nothing depended  on the order ... Live Install did.

Comment 7 Gene Czarcinski 2014-05-22 14:24:50 UTC
OK, the updated grubby does produce a working grub.cfg.  But, the output looks strange with the rescue menuentry first and the real boot second.

I am closing this

Comment 8 Chris Murphy 2014-05-22 17:01:22 UTC
(In reply to Peter Jones from comment #5)

> but it's being called *between* when the default stanza is added and when
> its initrd gets added.

There's a bigger bug here. Since anaconda-20, with certain configurations (even without btrfs involved), it calls grub2-mkconfig before the initramfs creation has even started. This results in a grub.cfg without an initrd line. Grubby then cleans this up if the file system isn't btrfs. I think this has to do with anaconda (or maybe blivet) threading which was disabled I think during F18 testing due to other problems resulting from threading.

With F19, it always initiates grub2-mkconfig after the initramfs exists. With F20, certain configurations (including non-btrfs ones) always have grub2-mkconfig invoked by anaconda before the initramfs exists. This is only masked when the fs is not btrfs, because grubby seems to clean up the mess by adding the missing initrd line, whereas on btrfs it doesn't.
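
A grub.cfg that was generated before the initramfs existed can be spotted by looking for menuentry stanzas that have a kernel line but no initrd line. A minimal sketch, assuming stanzas start with "menuentry", end with a line containing only "}", and use keywords beginning with "linux" and "initrd" (for example linux16/initrd16); the function name entries_missing_initrd and the sample configuration are hypothetical and not taken from the attachments.

```python
import re

# Title of a menuentry, quoted with either ' or ".
TITLE = re.compile(r"""^menuentry\s+(['"])(.*?)\1""")


def entries_missing_initrd(cfg_text):
    """Return titles of menuentry stanzas with a kernel line but no initrd."""
    missing = []
    title = None
    has_linux = has_initrd = False
    for raw in cfg_text.splitlines():
        line = raw.strip()
        m = TITLE.match(line)
        if m:
            title, has_linux, has_initrd = m.group(2), False, False
        elif title is not None:
            word = line.split(None, 1)[0] if line else ""
            if word.startswith("linux"):
                has_linux = True
            elif word.startswith("initrd"):
                has_initrd = True
            elif line == "}":
                if has_linux and not has_initrd:
                    missing.append(title)
                title = None
    return missing


# Hypothetical configuration: the first entry lacks its initrd line.
SAMPLE_CFG = """\
menuentry 'Fedora, with Linux 3.15.0-0.rc5.git2.10.fc21.x86_64' {
	linux16 /vmlinuz-3.15.0-0.rc5.git2.10.fc21.x86_64 ro
}
menuentry 'Fedora, with Linux 0-rescue-7893fd1c74c144c99626c234a33cfd09' {
	linux16 /vmlinuz-0-rescue-7893fd1c74c144c99626c234a33cfd09 ro
	initrd16 /initramfs-0-rescue-7893fd1c74c144c99626c234a33cfd09.img
}
"""

print(entries_missing_initrd(SAMPLE_CFG))
```

This is a line-based scan, not a real GRUB script parser, so it would miss stanzas whose closing brace shares a line with other text.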

Comment 9 Gene Czarcinski 2014-05-22 20:08:33 UTC
Chris, I am not completely happy with the solution to kludge up grubby.  The change only happened recently.  A patch was submitted around the end of April:

0002-install-Move-Payload-postInstall-after-bootloader.patch

This patch changes the order so that "post-installation setup tasks" take place AFTER the bootloader is installed (grub2-install & grub2-mkconfig) with the claim that this does not impact anything.  Well, it did impact something.

I would say "go look at the code in the git repository" but, IIRC, you do not do code ;)

I also wonder if anything is broken with respect to extlinux as the bootloader.

Comment 10 Gene Czarcinski 2014-05-22 20:21:38 UTC
While I can now boot with a grub.cfg, the wrong kernel is the default: the rescue kernel.  The problem is that /boot/grub2/grubenv does not have a valid value.

I am going to raise the question again about the change of when the bootloader install is performed.

BTW, the initrd16 line is added just fine to /boot on btrfs ... with my patches.  To make it easy for you, I have a new website with rpms in a yum repository and update.img files for anaconda.  See http://czarc.org/
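
For the grubenv problem described in this comment, one way to check whether the saved default refers to an existing entry is to compare saved_entry against the menuentry titles in grub.cfg. This is a rough sketch under stated assumptions: it reads the plain "key=value" lines of the environment block, treats saved_entry as either a title or a numeric index, and does not handle entry ids or submenu paths. The function names and both sample inputs are hypothetical; the sample titles are not taken from the attached files.

```python
import re


def saved_entry(grubenv_text):
    """Return the saved_entry value from grubenv text, or None if unset."""
    for line in grubenv_text.splitlines():
        if line.startswith("#"):
            continue  # header line and '#' padding
        key, sep, value = line.partition("=")
        if sep and key == "saved_entry":
            return value
    return None


def menuentry_titles(cfg_text):
    """Return all menuentry titles found in grub.cfg text, in file order."""
    pattern = re.compile(r"""^\s*menuentry\s+(['"])(.*?)\1""", re.M)
    return [m.group(2) for m in pattern.finditer(cfg_text)]


def saved_entry_is_valid(grubenv_text, cfg_text):
    """True if saved_entry names an existing title or a valid index."""
    entry = saved_entry(grubenv_text)
    titles = menuentry_titles(cfg_text)
    if entry is None:
        return False
    if entry.isdigit():
        return int(entry) < len(titles)
    return entry in titles


# Hypothetical inputs for illustration only.
GRUBENV = (
    "# GRUB Environment Block\n"
    "saved_entry=Fedora Linux, with Linux 3.15.0-0.rc5.git2.10.fc21.x86_64\n"
    "####\n"
)
CFG = """\
menuentry 'Fedora, with Linux 0-rescue-7893fd1c74c144c99626c234a33cfd09' {
}
menuentry 'Fedora, with Linux 3.15.0-0.rc5.git2.10.fc21.x86_64' {
}
"""

print(saved_entry(GRUBENV))
print(saved_entry_is_valid(GRUBENV, CFG))
```

With these made-up inputs the saved title does not match any menuentry title, which is one way a grubenv value can end up pointing at nothing.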

Comment 11 Colin Walters 2014-05-22 20:40:38 UTC
I haven't closely followed this bug yet, but you think there is still a regression in Anaconda?  If so can you clone/file a new bug against anaconda and CC me please?

Comment 12 Brian Lane 2014-05-22 20:45:14 UTC
(In reply to Chris Murphy from comment #8)

> Grubby then cleans this up if the file system isn't btrfs. I think this has
> to do with anaconda (or maybe blivet) threading which was disabled I think
> during F18 testing due to other problems resulting from threading.

None of that is threaded, post install and bootloader setup, etc. are all serial. See pyanaconda/install.py for the sequence.

Comment 13 Chris Murphy 2014-05-22 22:19:39 UTC
It's bug 1012646. Relevant comments are the original description and comments 8, 18, and 21. Clearly the grub.cfg is created before new-kernel-pkg is run, which is why the initrd line is missing from all the grub.cfg files. I don't have direct evidence that grubby fixes this; I'm just not sure what else would.

Comment 14 Chris Murphy 2014-05-22 22:31:00 UTC
OK, I think I know why this only sometimes happens, and it's only with live installs. Those installs rsync the kernel file, but not the initramfs, for obvious reasons. Anaconda calls grub2-mkconfig before new-kernel-pkg, so the grub.cfg doesn't contain an initrd line. And it's grubby, called within new-kernel-pkg, that ends up fixing the problem.

Whereas with non-lives, the kernel is installed with an RPM, which itself contains new-kernel-pkg. So the initramfs is created before grub2-mkconfig gets called. This is a guess. I haven't tested it yet.

Comment 15 satellitgo 2014-05-23 06:43:11 UTC
The workstation-20140522 i686 and x86_64 iso files work with liveusb-creator installed in workstation-20140522-live-(rawhide)64bit (installed to HD with 21.37-1 from DVD). The USB sticks boot correctly.
https://fedoraproject.org/wiki/Test_Results:Fedora_21_Rawhide_2014_05_Install#USB_stick

Comment 16 Chris Murphy 2014-05-23 16:08:04 UTC
(In reply to bcl from comment #12)
> None of that is threaded, post install and bootloader setup, etc. are all
> serial. See pyanaconda/install.py for the sequence.

install.py
"Generating initramfs"
"Running post-installation scripts"
"Performing post-installation setup tasks"
"Installing bootloader"

anaconda.log
17:26:29,829 INFO anaconda: Installing bootloader
17:27:02,433 INFO anaconda: Performing post-installation setup tasks
17:34:23,611 INFO anaconda: Thread Done: AnaInstallThread (140020449306368)
17:34:30,430 INFO anaconda: Generating initramfs
17:39:28,524 INFO anaconda: Running post-installation scripts



program.log
17:26:29,594 INFO program: rsync: rsync_xal_set: lsetxattr(""/mnt/sysimage/boot/efi/EFI/fedora/.grubx64.efi.OwWMu4"","security.selinux") failed: Operation not supported (95)
17:26:39,296 INFO program: Running... grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
17:34:30,432 INFO program: Running... new-kernel-pkg --mkinitrd --dracut --depmod --update 3.15.0-0.rc5.git4.1.fc21.x86_64


I don't see the correlation between install.py and what actually happens. The bootloader and its config clearly get done well before the initramfs, while the install.py says the bootloader is done last.
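
The mismatch described above can be made explicit by sorting the anaconda.log step messages by timestamp and comparing them with the sequence listed from install.py. A minimal sketch, assuming the "HH:MM:SS,mmm INFO anaconda: <message>" format shown in this comment and a log that does not cross midnight; the function name observed_steps is hypothetical.

```python
import re

LINE = re.compile(r"^(\d{2}):(\d{2}):(\d{2}),(\d{3}) INFO anaconda: (.+)$")

# Step order as listed from install.py in this comment.
EXPECTED = [
    "Generating initramfs",
    "Running post-installation scripts",
    "Performing post-installation setup tasks",
    "Installing bootloader",
]


def observed_steps(log_text, known_steps):
    """Return the known step messages found in the log, sorted by timestamp."""
    found = []
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if m and m.group(5) in known_steps:
            h, mi, s, ms = (int(g) for g in m.groups()[:4])
            millis = ((h * 60 + mi) * 60 + s) * 1000 + ms
            found.append((millis, m.group(5)))
    return [step for _, step in sorted(found)]


ANACONDA_LOG = """\
17:26:29,829 INFO anaconda: Installing bootloader
17:27:02,433 INFO anaconda: Performing post-installation setup tasks
17:34:23,611 INFO anaconda: Thread Done: AnaInstallThread (140020449306368)
17:34:30,430 INFO anaconda: Generating initramfs
17:39:28,524 INFO anaconda: Running post-installation scripts
"""

actual = observed_steps(ANACONDA_LOG, EXPECTED)
for step in actual:
    print(step)
print("matches install.py listing:", actual == EXPECTED)
```

Messages that are not in the expected list, such as the "Thread Done" line, are ignored by the comparison.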

Comment 17 Gene Czarcinski 2014-05-26 13:46:52 UTC
I found the problem and have a fix.  A patch will be submitted "real soon now."

Comment 18 Jaroslav Reznik 2015-03-03 17:13:37 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 19 Fedora End Of Life 2016-07-19 19:02:00 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

