Bug 1955099
| Summary: | Updating the kernel on leapp'ed systems doesn't create the initramfs depending on Grub BLS state | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Renaud Métrich <rmetrich> |
| Component: | leapp-repository | Assignee: | Leapp Notifications Bot <leapp-notifications-bot> |
| Status: | CLOSED MIGRATED | QA Contact: | upgrades-and-conversions |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.9 | CC: | cbesson, fkrska, jeperez, pstodulk |
| Target Milestone: | rc | Keywords: | MigratedToJIRA |
| Target Release: | --- | Flags: | pm-rhel: mirror+ |
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-09-12 11:09:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1818077, 1818088 | ||
For information: before or after removing the kernel and kernel-workaround RPMs, the user currently needs to remove the old kernel entry from the bootloader manually. E.g.
/bin/kernel-install remove 3.10.0-1160.25.1.el7.x86_64 /lib/modules/3.10.0-1160.25.1.el7.x86_64/vmlinuz
The documentation is going to be updated to cover the problem until we fix it.
FYI, the documentation has been updated.
Hi Petr,
There is a second scenario where the issue can happen: when /etc/default/grub doesn't end with a newline **prior** to executing "leapp upgrade".
This makes the reboot fail with this:
~~~
[ 266.092340] upgrade[609]: ============================================================
[ 266.093274] upgrade[609]: ERRORS
[ 266.093995] upgrade[609]: ============================================================
[ 266.094790] upgrade[609]: 2021-06-05 15:20:04.141285 [ERROR] Actor: kernelcmdlineconfig
[ 266.095526] upgrade[609]: Message: Failed to append extra arguments to kernel command line.
[ 266.096381] upgrade[609]: Summary:
[ 266.096944] upgrade[609]: Details: Command ['grubby', '--update-kernel=/boot/vmlinuz-4.18.0-305.3.1.el8_4.x86_64', '--args=net.ifnames=0'] failed with exit code 1.
[ 266.098461] upgrade[609]: ============================================================
[ 266.099562] upgrade[609]: END OF ERRORS
[ 266.100421] upgrade[609]: ============================================================
~~~
This results in a broken Grub stanza in /etc/default/grub, which is equivalent to not having GRUB_ENABLE_BLSCFG=true:
~~~
...
GRUB_DISABLE_RECOVERY="true"GRUB_ENABLE_BLSCFG=true
~~~
I'll be writing a KCS on this asap.
Hi Petr, comment #5 is actually a dup of BZ #1937383, linking the KCS there as well.
Hi Renaud, thanks for the info and the KCS. I have already pinged people about that. The second issue will probably be caught: most likely we will write an upgrade inhibitor for the case where the grub file is invalid (the trailing LF is missing).
A puppet-agent pushing an old default grub file can be a problem afterwards, on RHEL 8.
But that does not explain why GRUB_ENABLE_BLSCFG was not added during the upgrade step, as there is no puppet-agent in this sequence.
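For the missing-newline scenario described above, a manual pre-upgrade check could look like the following minimal sketch. The function names `ends_with_newline` and `fix_trailing_newline` are made up for illustration; this is not what leapp does, only a way to verify the file by hand before running "leapp upgrade".

```shell
#!/bin/bash
# Hypothetical helpers (not part of leapp or grub2-tools).

# Return 0 if the file is empty or its last byte is a newline, 1 otherwise.
ends_with_newline() {
    local file=$1
    # An empty (or missing) file has nothing to terminate.
    [ -s "$file" ] || return 0
    # Command substitution strips a trailing newline, so the result is
    # empty exactly when the last byte of the file is a newline.
    [ -z "$(tail -c 1 "$file")" ]
}

# Append a newline only when the last line is unterminated.
fix_trailing_newline() {
    local file=$1
    if ! ends_with_newline "$file"; then
        echo >> "$file"
    fi
}
```

E.g. `ends_with_newline /etc/default/grub || echo "missing trailing newline"`. Once the file is terminated properly, a later `echo 'GRUB_ENABLE_BLSCFG=true' >> /etc/default/grub` lands on its own line instead of being glued to the previous one.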
Adding a customer case with the same symptoms: the RHEL 8 initrd has not been generated and the system ends up in the emergency shell.
What happened initially
- the DNF transaction was "successful" but the RHEL 8 kernel was only partly installed, because dracut was not executed (the post-script silently failed).
- due to this, subsequent grubby commands failed, leading to the emergency shell, and a broken upgrade.
- the BLS entries were not there in /boot/loader/entries and /etc/default/grub was unchanged.
Why?
Because "something" prevented `/etc/default/grub` from being modified during the real upgrade (during the reboot step, in the dedicated "RHEL-Upgrade-Initramfs" boot entry).
I'm indeed able to reproduce the *very same behaviour* by setting an immutable bit (`chattr +i`) on /etc/default/grub* files.
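To spot this condition on a system before upgrading, something like the sketch below could be used. It assumes `lsattr` (from e2fsprogs) is available and that the file system supports attributes; `has_immutable_flag` and `list_immutable` are hypothetical helper names, not existing tools.

```shell
#!/bin/bash
# Hypothetical helpers for checking the immutable bit.

# Return 0 if the attribute field printed by lsattr (its first column)
# contains the lowercase "i" (immutable) flag.
has_immutable_flag() {
    case "$1" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Print every existing file among the arguments whose immutable bit is set.
list_immutable() {
    local f attrs
    for f in "$@"; do
        [ -e "$f" ] || continue
        # lsattr prints "<flags> <path>"; keep only the flags column.
        attrs=$(lsattr -d "$f" 2>/dev/null | awk '{print $1}')
        if has_immutable_flag "$attrs"; then
            echo "$f"
        fi
    done
}
```

E.g. `list_immutable /etc/default/grub*` prints the files that `chattr -i` would need to be run on before the upgrade.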
* grubby is upgraded, hence the old `/sbin/new-kernel-pkg` script is erased.
~~~
Jun 16 17:22:35 localhost upgrade[1912]: Upgrading : grubby-8.40-47.el8.x86_64 341/2351
~~~
* grub2-tools is upgraded
~~~
Jun 16 17:22:48 localhost upgrade[1912]: Upgrading : grub2-tools-1:2.02-148.el8.x86_64 399/2351
Jun 16 17:22:48 localhost upgrade[1912]: Running scriptlet: grub2-tools-1:2.02-148.el8.x86_64 399/2351
~~~
* so its post-script executes `/sbin/grub2-switch-to-blscfg`:
~~~
if [ "$1" = 2 ]; then
/sbin/grub2-switch-to-blscfg --backup-suffix=.rpmsave &>/dev/null || :
fi
~~~
* this script adds `GRUB_ENABLE_BLSCFG=true` in `/etc/default/grub` if it's not there (line 280):
~~~
269 GENERATE=0
270 if grep '^GRUB_ENABLE_BLSCFG=.*' "${etcdefaultgrub}" \
271 | grep -vq '^GRUB_ENABLE_BLSCFG="*true"*\s*$' ; then
272 if ! sed -i"${backupsuffix}" \
273 -e 's,^GRUB_ENABLE_BLSCFG=.*,GRUB_ENABLE_BLSCFG=true,' \
274 "${etcdefaultgrub}" ; then
275 gettext_printf "Updating %s failed\n" "${etcdefaultgrub}"
276 exit 1
277 fi
278 GENERATE=1
279 elif ! grep -q '^GRUB_ENABLE_BLSCFG=.*' "${etcdefaultgrub}" ; then
280 if ! echo 'GRUB_ENABLE_BLSCFG=true' >> "${etcdefaultgrub}" ; then
281 gettext_printf "Updating %s failed\n" "${etcdefaultgrub}"
282 exit 1
283 fi
284 GENERATE=1
285 fi
~~~
It didn't happen, and the error went unnoticed because the scriptlet deliberately suppresses it ( &>/dev/null || : ).
That's why you had no BLS entries.
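For comparison, here is a minimal sketch of the same "set or append" logic with the failure surfaced to the caller instead of being discarded. It is not the code of grub2-switch-to-blscfg: `ensure_blscfg` is a hypothetical name, GNU `sed -i` is assumed, and the newline guard is an addition of mine covering the unterminated-last-line case.

```shell
#!/bin/bash
# Hypothetical helper: make sure a defaults file has GRUB_ENABLE_BLSCFG=true.
# Prints "unchanged" or "updated"; returns 1 if the file cannot be written.
ensure_blscfg() {
    local file=$1 prefix=""
    # Already enabled (optionally quoted): nothing to do.
    if grep -Eq '^GRUB_ENABLE_BLSCFG="?true"?[[:space:]]*$' "$file" 2>/dev/null; then
        echo "unchanged"
        return 0
    fi
    if grep -q '^GRUB_ENABLE_BLSCFG=' "$file" 2>/dev/null; then
        # Present with another value: rewrite it in place.
        if ! sed -i -e 's,^GRUB_ENABLE_BLSCFG=.*,GRUB_ENABLE_BLSCFG=true,' "$file"; then
            echo "Updating $file failed" >&2
            return 1
        fi
    else
        # Absent: append it, adding a newline first if the last line is
        # unterminated so the two settings are not glued together.
        if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
            prefix=$'\n'
        fi
        if ! printf '%sGRUB_ENABLE_BLSCFG=true\n' "$prefix" >> "$file"; then
            echo "Updating $file failed" >&2
            return 1
        fi
    fi
    echo "updated"
}
```

With an immutable /etc/default/grub, `ensure_blscfg /etc/default/grub` returns non-zero and prints "Updating /etc/default/grub failed", which is the information that was lost during the upgrade.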
* After that the `kernel-workaround` package is installed. It only contains an **empty** `/sbin/new-kernel-pkg` script, in order to prevent a conflict while upgrading grubby, since the RHEL 7 kernel package **requires** this script (it is used by the RPM postscript).
~~~
Jun 16 17:24:42 localhost upgrade[1912]: Installing : kernel-workaround-0.1-1.el8.noarch 1066/2351
~~~
* And finally, at the very end, the `kernel-core` postscript is executed and fails silently.
Its postscript calls:
~~~
/bin/kernel-install add 4.18.0-477.13.1.el8_8.x86_64 /lib/modules/4.18.0-477.13.1.el8_8.x86_64/vmlinuz || exit $?
~~~
* The `kernel-install` script executes sequentially the files installed by grub/grubby/dracut in `/usr/lib/kernel/install.d`, in particular **`20-grub.install`**:
~~~
88 if [[ "x${GRUB_ENABLE_BLSCFG}" = "xtrue" ]] || [[ ! -f /sbin/new-kernel-pkg ]]; then
~~~
Due to the lack of `GRUB_ENABLE_BLSCFG` **and** the presence of `/sbin/new-kernel-pkg`, this code block is not entered and `/sbin/new-kernel-pkg` is called instead, but it no longer does anything since the script is empty!
~~~
146 /sbin/new-kernel-pkg --package "kernel${flavor}" --install "$KERNEL_VERSION" || exit $?
147 /sbin/new-kernel-pkg --package "kernel${flavor}" --mkinitrd --dracut --depmod --update "$KERNEL_VERSION" || exit $?
~~~
This leads to no initrd being generated for the RHEL 8 kernel...
* Since the kernel was not properly installed, grubby failed while removing the "enforcing=0" parameter from the kernel cmdline, and you fell into the emergency shell.
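The decision taken by `20-grub.install` in the sequence above can be replayed without installing anything. The sketch below is only an approximation: `grub_install_path` is a hypothetical name, both paths are passed as parameters so it can be tried on copies, and it assumes the variable reaches the plugin by sourcing the defaults file.

```shell
#!/bin/bash
# Hypothetical dry-run of the condition quoted from 20-grub.install.
# Prints "bls" when the BLS code block would be entered, or
# "new-kernel-pkg" when the plugin would fall through to the legacy script.
# The body runs in a subshell so sourcing the file cannot leak variables.
grub_install_path() (
    defaults=$1        # e.g. /etc/default/grub
    legacy_script=$2   # e.g. /sbin/new-kernel-pkg
    unset GRUB_ENABLE_BLSCFG
    # Assumption: the setting is picked up by sourcing the defaults file.
    if [ -f "$defaults" ]; then
        . "$defaults"
    fi
    if [[ "x${GRUB_ENABLE_BLSCFG:-}" = "xtrue" ]] || [[ ! -f "$legacy_script" ]]; then
        echo "bls"
    else
        echo "new-kernel-pkg"
    fi
)
```

On an affected system, `grub_install_path /etc/default/grub /sbin/new-kernel-pkg` prints "new-kernel-pkg", i.e. the empty script from kernel-workaround is what ends up being called. The glued `GRUB_DISABLE_RECOVERY="true"GRUB_ENABLE_BLSCFG=true` line gives the same result, because sourcing it never sets `GRUB_ENABLE_BLSCFG`.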
Why did adding `GRUB_ENABLE_BLSCFG=true` before the reboot help?
* In short this time:
- grub2-switch-to-blscfg didn't convert the existing entries to BLS because the variable was already set (lines 270-271). So it didn't call `grub2-mkconfig` again, leaving a grub.cfg that was not updated for BLS configurations and contained only RHEL 7 entries.
- due to the presence of the variable, the kernel-core postscript later entered the code block which creates the BLS entry (line 96), and then 50-dracut.install was executed, so the initramfs was created.
- the kernel being properly installed, grubby worked again.
* The solution here is to simply execute `grub2-mkconfig -o /boot/grub2/grub.cfg` (the grub2-switch-to-blscfg is really useful only if you want to keep el7 kernels).
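To tell whether a given grub.cfg still needs to be regenerated, a rough check could look like this. It rests on an assumption: that a grub.cfg generated with BLS enabled contains a standalone `blscfg` command, which as far as I know is what the RHEL 8 template emits. `grub_cfg_uses_bls` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical check: return 0 if the given grub.cfg contains a line that
# starts with the "blscfg" command (an "insmod blscfg" line alone does not
# count), 1 otherwise.
grub_cfg_uses_bls() {
    grep -Eq '^[[:space:]]*blscfg([[:space:]]|$)' "$1"
}
```

E.g. `grub_cfg_uses_bls /boot/grub2/grub.cfg || echo "grub.cfg is not BLS-aware, run grub2-mkconfig"`.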
Hello,
I have found a new scenario where this issue happens:
If the package grub2-tools is installed but is not present in the RPM database, the IPU will proceed as normal until the point where it fails with:
~~~
Jul 13 11:37:32 localhost upgrade[38298]: ============================================================
Jul 13 11:37:32 localhost upgrade[38298]: ERRORS
Jul 13 11:37:32 localhost upgrade[38298]: ============================================================
Jul 13 11:37:32 localhost upgrade[38298]: 2023-07-13 13:37:31.928060 [ERROR] Actor: kernelcmdlineconfig
Jul 13 11:37:32 localhost upgrade[38298]: Message: Failed to append extra arguments to kernel command line.
Jul 13 11:37:32 localhost upgrade[38298]: Summary:
Jul 13 11:37:32 localhost upgrade[38298]: Details: Command ['grubby', '--update-kernel=/boot/vmlinuz-4.18.0-477.15.1.el8_8.x86_64', '--args', 'net.ifnames=0', '--remove-args', 'enforcing=0'] failed with exit code 1.
Jul 13 11:37:32 localhost upgrade[38298]: ============================================================
Jul 13 11:37:32 localhost upgrade[38298]: END OF ERRORS
Jul 13 11:37:32 localhost upgrade[38298]: ============================================================
~~~
The error is not really caused by the package grub2-tools not being installed during the IPU; on the contrary, it will be installed. However, checking the post-install script:
~~~
if [ "$1" = 2 ]; then
/sbin/grub2-switch-to-blscfg --backup-suffix=.rpmsave &>/dev/null || :
fi
~~~
grub2-switch-to-blscfg will only be called when the transaction is an upgrade. In this case, as the package was not present in the RPM database, it's marked as a new installation instead of an upgrade and grub2-switch-to-blscfg is not executed.
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker.
Description of problem:
When upgrading a system to RHEL8 using leapp, a **kernel-workaround** package is installed that ships an empty /usr/sbin/new-kernel-pkg script.
If the customer is **not** using BLS (by not adding GRUB_ENABLE_BLSCFG=true in /etc/default/grub, likely because the customer uses Puppet with outdated templates from RHEL7), this ends up not generating the initramfs when updating kernels.
The exact reason is on line 80 in /usr/lib/kernel/install.d/20-grub.install:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
80 if [[ "x${GRUB_ENABLE_BLSCFG}" = "xtrue" ]] || [[ ! -f /sbin/new-kernel-pkg ]]; then
81 eval "$(grub2-get-kernel-settings)" || true
82 [[ -d "$BLS_DIR" ]] || mkdir -m 0700 -p "$BLS_DIR"
:
138 /sbin/new-kernel-pkg --package "kernel${flavor}" --install "$KERNEL_VERSION" || exit $?
139 /sbin/new-kernel-pkg --package "kernel${flavor}" --mkinitrd --dracut --depmod --update "$KERNEL_VERSION" || exit $?
140 /sbin/new-kernel-pkg --package "kernel${flavor}" --rpmposttrans "$KERNEL_VERSION" || exit $?
141 # If grubby is used there's no need to run other installation plugins
142 exit 77
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
Above, the code block on lines 81+ is not entered because of the lack of "GRUB_ENABLE_BLSCFG" in /etc/default/grub and the presence of /sbin/new-kernel-pkg shipped by **kernel-workaround**.
It then goes to lines 138-140, which do nothing since the script is empty.
Version-Release number of selected component (if applicable):
leapp-repository-0.13.0-2.el7_9.noarch
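The symptom described above (no initramfs generated when updating kernels) can be checked after a kernel update with a sketch like the following. It assumes the usual layout, with the kernel image in /lib/modules/<version>/vmlinuz and the initramfs in /boot/initramfs-<version>.img; `kernels_missing_initramfs` is a hypothetical name and both directories are parameters so it can be tried on a copy.

```shell
#!/bin/bash
# Hypothetical check: print every kernel version that has a vmlinuz under
# the modules directory but no matching initramfs image in the boot directory.
kernels_missing_initramfs() {
    local modules_dir=${1:-/lib/modules} boot_dir=${2:-/boot}
    local dir kver
    for dir in "$modules_dir"/*/; do
        [ -d "$dir" ] || continue
        kver=$(basename "$dir")
        # Skip leftover module directories that carry no kernel image.
        [ -f "$dir/vmlinuz" ] || continue
        [ -f "$boot_dir/initramfs-$kver.img" ] || echo "$kver"
    done
}
```

Running `kernels_missing_initramfs` with no arguments on an affected system should list the freshly installed kernel; an empty output means every installed kernel has its initramfs.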