Bug 2166233
| Summary: | grubby fails to add the kernel entry when upgrading from RHEL6 using redhat-upgrade-tool | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Renaud Métrich <rmetrich> |
| Component: | kernel | Assignee: | Denys Vlasenko <dvlasenk> |
| kernel sub component: | Packaging | QA Contact: | zhijwang <zhijwang> |
| Status: | POST --- | Docs Contact: | |
| Severity: | high | ||
| Priority: | high | CC: | bmader, bwelterl, dvlasenk, hkrzesin, jstancek, kernel-qe, lilu, mkluson, mreznik, nmurray, ppaddhar, prjagtap, pstodulk, ptalbert, rhandlin, tmeszaro, zhijwang |
| Version: | 7.9 | Keywords: | Triaged |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Type: | Bug | ||
| Bug Blocks: | 2108243 | ||
|
Description
Renaud Métrich
2023-02-01 08:37:47 UTC
I'm setting the Priority/Severity to HIGH because this prevents customers from upgrading their RHEL6 systems. Using the RHEL 7.9 DVD level for the upgrade instead of the latest bits is usually not possible when additional repositories (optional, supplementary, etc.) are in use, because many newer packages in these repositories require more recent components than the RHEL 7.9 DVD level provides.

Hi, just confirming that Renaud is right. I've investigated the issue (https://bugzilla.redhat.com/show_bug.cgi?id=2108243#c10) and I see that the valid fix - and the best way to fix the issue - is to fix the scriptlet - either by providing a more robust script or just reverting the change. As far as I am informed, we have customers who are currently upgrading or preparing for the upgrade from RHEL 6 to RHEL 7, and if they use up-to-date packages, as officially required, they will hit this crucial issue.

The bug has been introduced by the fix for the following BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1893756

Testing the original fix proposed in MR 313.

Interrupted install still works:

    # yum install kernel-3.10.0-1160.89.1.el7.kpq.test.x86_64.rpm strace tcpdump mc gimp bzip2 traceroute gdb gcc firefox
    ...
    Installing : kernel-3.10.0-1160.89.1.el7.kpq.test.x86_64 39/40
    Installing : 1:mc-4.8.7-11.el7.x86_64 40/40
    ^Z
    [1]+ Stopped yum install kernel-3.10.0-1160.89.1.el7.kpq.test.x86_64.rpm strace tcpdump mc gimp bzip2 traceroute gdb gcc firefox
    # reboot
    ...
    # uname -sr
    Linux 3.10.0-1160.89.1.el7.kpq.test.x86_64

Testing rhel6->rhel7 upgrade:

Install latest rhel6 (server, not client).

Get rhel7 install ISO image: rhel-server-7.9-x86_64-dvd.iso (sha256sum:2cb36122a74be084c551bc7173d2d38a1cfb75c8ffbc1489c630c916d1b31b25 size:4526702592)

Get these packages (for example from https://access.redhat.com/downloads/content/69/ver=/rhel---6/6.10/x86_64/packages):

    preupgrade-assistant-2.6.2-1.el6.noarch.rpm
    preupgrade-assistant-el6toel7-0.8.0-3.el6.noarch.rpm
    preupgrade-assistant-el6toel7-data-0.20200704-1.el6.noarch.rpm
    redhat-upgrade-tool-0.8.0-9.el6.noarch.rpm

Install them together with createrepo:

    yum -y install *.rpm createrepo

Get the test kernel, in my case kernel-3.10.0-1160.89.1.el7.kpq.test.x86_64.rpm.

    createrepo /path/to/test_kernel # a dir with kernel-3.10.0-1160.89.1.el7.kpq.test.x86_64.rpm

Run a local http server which exports /path/to/test_kernel on http://127.0.0.1/

Run "preupg"; it should finish with no errors precluding rhel6->rhel7 migration.

The final step is to run "redhat-upgrade-tool", then reboot when prompted, and watch the boot process to check that the grub menu is not broken. (Note that a failed test makes the machine unbootable.)

    redhat-upgrade-tool --nogpgcheck --iso rhel-server-7.9-x86_64-dvd.iso --cleanup-post
    # ^^^ this should work - old kernel with no %posttrans changes is used, from ISO image

    redhat-upgrade-tool --nogpgcheck --iso rhel-server-7.9-x86_64-dvd.iso --addrepo=latest='http://rhsm-pulp.corp.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os' --cleanup-post
    # ^^^ this should FAIL - kernel with buggy %posttrans change used, from rhsm-pulp

    redhat-upgrade-tool --nogpgcheck --iso rhel-server-7.9-x86_64-dvd.iso --addrepo=latest='http://127.0.0.1/' --cleanup-post
    # ^^^ this works in my testing (and I verified that the kernel used is indeed the test one)

*** Bug 2108243 has been marked as a duplicate of this bug. ***

(In reply to Petr Stodulka from comment #5)
> Hi, just confirming that Renaud is right. I've investigated the issue
> (https://bugzilla.redhat.com/show_bug.cgi?id=2108243#c10) and I see that the
> valid fix - and the best way to fix the issue - is to fix the scriptlet -
> either by providing a more robust script or just reverting the change.

Creating boot entries before there is an initrd is guaranteed, and over time proven, to cause issues for customers. I'd rather see systemd (new-kernel-pkg) or grubby be made more robust - for example by storing kernel parameters somewhere, if it is the last kernel being uninstalled.

> Creating boot entries before there is an initrd is guaranteed, and over time proven, to cause issues for customers.
> I'd rather see systemd (new-kernel-pkg) or grubby be made more robust - for example by storing kernel parameters somewhere, if it is the last kernel being uninstalled.
Has there been any situation in the original bug when the kernel posttrans scriptlet was not executed? If the scriptlet was always executed, nothing should prevent the kernel from dealing with the situation. Fixing the issue anywhere other than in the kernel scriptlet seems to me like too much work for RHEL 7.9, especially as this is a corner case of a corner case, which we know people could hit:
* if they in-place upgrade 6 -> 7 (in 100% of cases on Intel)
* if they boot to a rescue kernel / live OS, manually remove all installed kernel packages from there, and then install a kernel again (which I would consider an unsupported action)
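
For illustration only, the idea quoted at the top of this comment (keeping the kernel parameters somewhere so they survive removal of the last kernel) could look roughly like the POSIX shell sketch below. Everything in it is an assumption made for the example: the file `/etc/kernel-saved-cmdline` and the function names `filter_cmdline`, `save_kernel_args` and `pick_kernel_args` are hypothetical, and nothing in new-kernel-pkg, grubby or the kernel scriptlets is known to behave this way.

```shell
# Hypothetical storage location; no RHEL 7 tool reads or writes this file.
SAVED_ARGS_FILE="${SAVED_ARGS_FILE:-/etc/kernel-saved-cmdline}"

# Print a kernel command line without the tokens that belong to one
# specific boot entry (BOOT_IMAGE=, initrd=). Runs in a subshell so that
# "set -f" (no glob expansion while splitting) does not leak out.
filter_cmdline() (
    set -f
    out=""
    for tok in $1; do
        case "$tok" in
            BOOT_IMAGE=*|initrd=*) ;;
            *) out="${out:+$out }$tok" ;;
        esac
    done
    printf '%s\n' "$out"
)

# Save the running kernel's arguments, e.g. before the last kernel
# package is removed. The source file defaults to /proc/cmdline.
save_kernel_args() {
    cmdline_file="${1:-/proc/cmdline}"
    filter_cmdline "$(cat "$cmdline_file")" > "$SAVED_ARGS_FILE"
}

# Print the arguments to use for a new boot entry: the saved copy if one
# exists, otherwise the running kernel's command line.
pick_kernel_args() {
    cmdline_file="${1:-/proc/cmdline}"
    if [ -s "$SAVED_ARGS_FILE" ]; then
        cat "$SAVED_ARGS_FILE"
    else
        filter_cmdline "$(cat "$cmdline_file")"
    fi
}
```

The fallback to the running kernel's command line would cover the in-place upgrade case only if the RHEL 6 arguments are still valid for the RHEL 7 kernel, which this sketch does not check.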
(In reply to Petr Stodulka from comment #11)
> > Creating boot entries before there is initrd is guaranteed and over time proven to cause issues for customers.
> > I'd rather see systemd (new-kernel-pkg) or grubby be made more robust - for example by storing kernel parameters somewhere, if it is last kernel being uninstalled.
>
> Has there been any situation in the original bug, when the kernel posttrans
> scriptlet has not been executed?

Yes, indeed. The typical scenario in which this happens in the real world is when an admin simply runs "yum update". This tries to update many packages, and if any package's update script is buggy in a way that makes "yum update" hang, the admin has little choice but to kill it. In this case, if a newer kernel was already installed, there will be a new grub entry for it, but no initramfs. On the next reboot, grub will not be able to find the initramfs, and boot will fail. I think we had about 15 user complaints about this happening.

Hi Denys, thanks for the info. This is the first time I have heard about such issues on RHEL, but it's true that real systems also contain a lot of custom and 3rd-party content which could affect this, not to mention all the possible configurations of real systems.

(In reply to Petr Stodulka from comment #6)
> The bug has been introduced by the fix for the following BZ:
> https://bugzilla.redhat.com/show_bug.cgi?id=1893756

@zhijwang

Hi Zhijun,

Can you also take this bug, as it's a follow-up for 1893756?
Let us know if you need a hand.

Thanks!

(In reply to Linqing Lu from comment #15)
> Hi Zhijun,
>
> Can you also take this bug as it's a follow up for 1893756?
> Let us know if you need a hand.
>
> Thanks!

Sure, I will take it. Thanks Linqing!
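
The failure mode described earlier in this thread (an interrupted "yum update" leaving a kernel installed without its initramfs) can be checked for with a small script. The following is only a sketch: the function name `find_kernels_without_initramfs` is made up for this example, and it assumes the usual RHEL 7 naming in /boot, where `vmlinuz-<version>` is paired with `initramfs-<version>.img`.

```shell
# List kernel versions in a boot directory that have no non-empty
# initramfs image next to them. The directory defaults to /boot.
find_kernels_without_initramfs() {
    bootdir="${1:-/boot}"
    for kernel in "$bootdir"/vmlinuz-*; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$kernel" ] || continue
        version="${kernel##*/vmlinuz-}"
        if [ ! -s "$bootdir/initramfs-$version.img" ]; then
            printf '%s\n' "$version"
        fi
    done
}
```

Running `find_kernels_without_initramfs /boot` before a reboot prints one version per line for every kernel that would fail to find its initramfs at boot; empty output means every kernel image has one. It does not look at the grub configuration, so it cannot tell whether a boot entry exists for those kernels.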