Bug 1925078
Summary: | RHOSP13-16.1 FFU: Overcloud upgrade hangs in controller after failed attempt with reference to wrong ceph image. | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Shravan Kumar Tiwari <shtiwari> |
Component: | openstack-tripleo-heat-templates | Assignee: | Lukas Bezdicka <lbezdick> |
Status: | CLOSED ERRATA | QA Contact: | Jason Grosso <jgrosso> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 16.1 (Train) | CC: | apetrich, astupnik, fj-lsoft-ofuku, gfidente, igallagh, jjoyce, jkreger, jpretori, jschluet, kthakre, lbezdick, mburns, msufiyan, slinaber, spower, tvignaud, vgrosu |
Target Milestone: | z4 | Keywords: | Triaged |
Target Release: | 16.1 (Train on RHEL 8.2) | ||
Hardware: | All | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openstack-tripleo-heat-templates-11.3.2-1.20210104205662.el8ost.2 | Doc Type: | Known Issue |
Doc Text: |
Systems that use UEFI boot and a UEFI bootloader in OSP13 might run into a UEFI issue that results in:
* /etc/fstab not being updated
* grub-install used incorrectly on EFI system
If your systems use UEFI, contact Red Hat Technical Support. For more information, see the Red Hat Knowledgebase solution https://access.redhat.com/solutions/5861031[FFU 13 to 16.1: Leapp fails to update the kernel on UEFI based systems and /etc/fstab does not contain the EFI partition]
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2021-03-17 15:36:38 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1768952 |
Description
Shravan Kumar Tiwari
2021-02-04 11:20:09 UTC
Breakdown of the problem:

1) The customer used the wrong Ceph image and needed to update it in the systemd service file for Ceph to stop the recurring log message about podman failing to pull the image. This was irrelevant to the stuck upgrade.
2) The stuck upgrade came from the mysql upgrade container getting stuck, with podman reporting it as running but nothing happening.
3) The container was stuck because of the wrong kernel: after Leapp, the system booted into the old 3.10.0-1160 instead of 4...
4) This was due to Leapp failing to update the kernel, because the system itself was EFI based but /etc/fstab does not contain the EFI partition.

To break this down, during deployment Ironic runs:

    ...
    Jan 11 05:22:25 host-192-168-0-200 ironic-python-agent[2108]: 2021-01-11 05:22:25.019 2108 DEBUG oslo_concurrency.processutils [-] CMD "mount /dev/sda1 /tmp/tmpFpRF6S/boot/efi" returned: 0 in 0.076s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:409
    Jan 11 05:22:25 host-192-168-0-200 ironic-python-agent[2108]: 2021-01-11 05:22:25.281 2108 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): chroot /tmp/tmpFpRF6S /bin/sh -c "grub2-install /dev/sda" execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:372
    Jan 11 05:22:27 host-192-168-0-200 ironic-python-agent[2108]: 2021-01-11 05:22:27.290 2108 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): chroot /tmp/tmpFpRF6S /bin/sh -c "grub2-mkconfig -o /boot/grub2/grub.cfg" execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:372
    ...
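Points 3 and 4 describe two conditions that can be checked mechanically: the host still runs a 3.10 kernel after Leapp, and an EFI-booted host has no /boot/efi entry in /etc/fstab. The following is a minimal sketch of such a check, not part of the fix shipped in the advisory. The function names (fstab_has_mount, is_rhel7_kernel, find_problems) are hypothetical and chosen only for illustration.

```python
def fstab_has_mount(fstab_text, mount_point):
    """Return True if an uncommented fstab entry uses `mount_point`."""
    wanted = mount_point.rstrip("/")
    for line in fstab_text.splitlines():
        # Drop comments, then split into the whitespace-separated fstab fields:
        # device, mount point, fs type, options, dump, pass.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[1].rstrip("/") == wanted:
            return True
    return False


def is_rhel7_kernel(release):
    """True for 3.10.x release strings, such as the 3.10.0-1160 seen in the report."""
    return release.startswith("3.10.")


def find_problems(fstab_text, kernel_release, efi_booted):
    """Collect the two symptoms described in points 3 and 4 (hypothetical helper)."""
    problems = []
    if efi_booted and not fstab_has_mount(fstab_text, "/boot/efi"):
        problems.append("EFI boot but no /boot/efi entry in fstab")
    if is_rhel7_kernel(kernel_release):
        problems.append("still running a 3.10 kernel: " + kernel_release)
    return problems
```

On a live host the inputs could come from the contents of /etc/fstab, platform.release(), and os.path.isdir("/sys/firmware/efi"); the sketch keeps them as plain arguments so that it can also be run against files extracted from a sosreport.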
On the controller we found signs of grub-install having been used on an EFI system, which creates setups that are not Secure Boot compatible:

    Boot0018* red HD(1,GPT,be5dd387-fc63-4d37-b5a8-68ccca72b172,0x800,0x64000)/File(\EFI\red\grubx64.efi)

Here we can see that the partitions are present on the disk, so if the system boots via EFI, it does so through an unmounted and not updated partition:

0080-sosreport-oscar02ctr001-2021-01-14-keewgqn.tar.xz/sosreport-oscar02ctr001-2021-01-14-keewgqn/sos_commands/block/fdisk_-l_.dev.sda

    WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

    Disk /dev/sda: 300.0 GB, 299966136320 bytes, 585871360 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 262144 bytes / 262144 bytes
    Disk label type: gpt
    Disk identifier: 7954D191-CD2D-4DC3-A3A2-2696AB9E3634

    #         Start          End    Size  Type             Name
     1         2048       411647    200M  EFI System       primary
     2       411648       413695      1M  Microsoft basic  primary
     3       413696    585871325  279.2G  Microsoft basic  primary

0080-sosreport-oscar02ctr001-2021-01-14-keewgqn.tar.xz/sosreport-oscar02ctr001-2021-01-14-keewgqn/etc/fstab

    LABEL=img-rootfs / xfs defaults 0 1

Issues:
1) /etc/fstab was not updated
2) grub-install was incorrectly used on an EFI system

blkid output:

    /dev/sda1: SEC_TYPE="msdos" LABEL="efi-part" UUID="1930-AFD0" TYPE="vfat" PARTLABEL="primary" PARTUUID="c5f32f78-0c85-469c-8649-1bfb1f56d116"

Add /etc/fstab record:

    UUID="1930-AFD0" /boot/efi vfat umask=0077 0 1

    mount /boot/efi
    dnf/yum reinstall grub2-efi-x64 shim-x64
    efibootmgr -c --disk /dev/sda -p 1 -w -L RHEL -l "\\EFI\\redhat\\grubx64.efi"
    grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/system_administrators_guide/ch-working_with_the_grub_2_boot_loader

*** Bug 1906681 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.4 director bug fix advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0817

*** Bug 1936523 has been marked as a duplicate of this bug. ***
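The /etc/fstab record in the recovery steps from the description is built from the UUID that blkid reports for the EFI system partition. As an illustration only, that derivation could be sketched as below, assuming blkid's default `device: KEY="value" ...` output format as quoted in the report. The helper names parse_blkid_line and efi_fstab_record are hypothetical, and the mount options are copied from the record in the description.

```python
import shlex


def parse_blkid_line(line):
    """Split one line of blkid output into (device, {KEY: value})."""
    device, _, rest = line.partition(":")
    attrs = {}
    # shlex.split strips the double quotes around each value.
    for token in shlex.split(rest):
        key, _, value = token.partition("=")
        attrs[key] = value
    return device.strip(), attrs


def efi_fstab_record(blkid_line, mount_point="/boot/efi"):
    """Build the fstab record for the EFI partition (hypothetical helper)."""
    _, attrs = parse_blkid_line(blkid_line)
    if attrs.get("TYPE") != "vfat":
        raise ValueError("expected a vfat EFI system partition")
    # Options, dump and pass fields match the record given in the description.
    return 'UUID="{}" {} vfat umask=0077 0 1'.format(attrs["UUID"], mount_point)
```

Fed the /dev/sda1 line quoted in the description, efi_fstab_record returns the same record that the recovery steps add by hand. The sketch only produces the text; appending it to /etc/fstab and reinstalling the bootloader remain the manual steps listed in the description.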