Red Hat Bugzilla – Bug 1002891
kernel %post script are too fragile in case of impromptu reboot
Last modified: 2016-07-26 00:55:05 EDT
At $dayjob, we found that if someone reboots, for whatever reason, during a large update, there is a 100% chance that the next boot fails right at the kernel stage, which prevents any remote troubleshooting or fixing.
We have several non-technical users, and while we tell them "do not shut down during an upgrade", there are statistically always a few accidents (either the user forgot about the upgrade running in the background, or the power ran out, etc.). And since not all of them are Linux-savvy, they may not know how to boot an older kernel, and they start to panic because of the error message.
So, in order to reduce support calls, I looked into it and found the root cause.
The reason is the following:
- kernel %post updates grub/lilo/etc.
- kernel %posttrans creates the initrd.
It makes sense to create the initrd last, since we want it to contain the latest version of every library. But if there is a problem after the kernel is installed and before the end of the update, the grub configuration points to an invalid entry, since there is no working initrd.
So I think it would be more robust if the grub configuration were updated only once the initrd has been created.
While this bug is filed against F19, once the fix is tested I will likely also ask for the same fix in RHEL 6.
We also have a workaround for it: disabling the automatic grub update in /etc/sysconfig/kernel (UPDATEDEFAULT=no) and updating the bootloader from a script in /etc/kernel/postinst.d/ instead. But it would be better to do the right thing by default.
So what we should probably do is make %post add the new kernel's stanza, make %posttrans create the initrd, and then at the end of %posttrans, make it the default.
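To make the ordering argument concrete, here is a toy model in Python. It is only a sketch: it does not touch grubby, kernel.spec, or any real boot files, and every name in it (BootState, add_entry, create_initrd, make_default, survives_interruption) is invented for the illustration. It interrupts each sequence after every step and checks whether the default boot entry still has an initrd. In a real transaction the gap between %post and %posttrans is where all the other packages get installed, so the unsafe window can be long.

```python
from dataclasses import dataclass, field


@dataclass
class BootState:
    """Toy model of the bootloader state; the old kernel is fully set up."""
    entries: set = field(default_factory=lambda: {"old"})
    initrds: set = field(default_factory=lambda: {"old"})
    default: str = "old"

    def default_bootable(self) -> bool:
        # The default entry boots only if it exists and has an initrd.
        return self.default in self.entries and self.default in self.initrds


def add_entry(state: BootState, kernel: str) -> None:
    state.entries.add(kernel)


def create_initrd(state: BootState, kernel: str) -> None:
    state.initrds.add(kernel)


def make_default(state: BootState, kernel: str) -> None:
    state.default = kernel


# Order described in the report: the default is switched before the initrd exists.
CURRENT_ORDER = [add_entry, make_default, create_initrd]
# Order proposed above: add the stanza, create the initrd, then switch the default.
PROPOSED_ORDER = [add_entry, create_initrd, make_default]


def survives_interruption(steps, kernel: str = "new") -> bool:
    """True if the default entry is bootable after every possible interruption."""
    for cut in range(len(steps) + 1):
        state = BootState()
        for step in steps[:cut]:
            step(state, kernel)
        if not state.default_bootable():
            return False
    return True


print(survives_interruption(CURRENT_ORDER))   # False
print(survives_interruption(PROPOSED_ORDER))  # True
```

With the proposed order, an interrupted update leaves the old kernel as the default at every point, so the machine still comes up and can be fixed remotely.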
Peter was working on this a while ago. It requires changes in grubby, and a small change in kernel.spec once grubby is built and ready. Moving to grubby for now.
Just to add my $0.02:
I ran into this on a fedup --net 20 - there was some kind of issue during the reboot process and the whole upgrade stopped after installing ~1200 packages of ~2700. The computer was unresponsive, there was no remote access, so I did a hard reboot. I discovered that there were only F19 kernels in the Grub menu on reboot, and the fedup upgrade did not finish (the "upgrade" Grub menu item had disappeared on reboot).
So I regenerated grub.cfg with grub2-mkconfig -o /boot/grub2/grub.cfg, which promptly broke everything because there were Grub entries for F20 kernels with no corresponding initrd.
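For anyone in the same situation, a quick check before regenerating grub.cfg can show which kernels would end up with a broken entry. The sketch below is only an illustration and not part of any Fedora tool: the function name kernels_missing_initrd is made up, and it assumes the usual Fedora naming in /boot, where vmlinuz-VERSION is paired with initramfs-VERSION.img.

```python
from pathlib import Path


def kernels_missing_initrd(boot_dir="/boot"):
    """Return the versions of kernels in boot_dir that have no matching initramfs.

    Assumes Fedora-style names: vmlinuz-<version> and initramfs-<version>.img.
    """
    boot = Path(boot_dir)
    prefix = "vmlinuz-"
    missing = []
    for kernel in sorted(boot.glob(prefix + "*")):
        version = kernel.name[len(prefix):]
        if not (boot / f"initramfs-{version}.img").exists():
            missing.append(version)
    return missing
```

Calling kernels_missing_initrd() with no argument inspects /boot. Any version it returns needs its initramfs rebuilt, or its kernel package removed, before the configuration is regenerated.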
It took a long time for me to get a reasonable config, and I still have lingering issues.
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.
More information and reason for this action is here:
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this bug.
Thank you for reporting this bug and we are sorry it could not be fixed.
This bug appears to have been reported against 'rawhide' during the Fedora 25 development cycle.
Changing version to '25'.