Bug 2058717 - fstab entry for EFI partition written with partition UUID instead of uuid - deployed node does not boot
Summary: fstab entry for EFI partition written with partition UUID instead of uuid ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ironic-python-agent
Version: 17.0 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 17.0
Assignee: Julia Kreger
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-02-25 17:11 UTC by Harald Jensås
Modified: 2022-09-21 12:19 UTC (History)

Fixed In Version: openstack-ironic-python-agent-7.0.3-0.20220315051950.881015a.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-21 12:19:32 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 831029 0 None MERGED Create fstab entry with appropriate label 2022-03-14 18:32:01 UTC
Red Hat Issue Tracker OSP-13197 0 None None None 2022-02-25 17:12:23 UTC
Red Hat Product Errata RHEA-2022:6543 0 None None None 2022-09-21 12:19:50 UTC

Description Harald Jensås 2022-02-25 17:11:33 UTC
Description of problem:
The deployed node does not boot: it fails during switchroot and drops to an emergency shell.
Console log:
 Timed out waiting for device dev-di…5c\x2da829\x2d37e4267c9978.device.

It is failing to mount the EFI partition.

On working node deployment:
Feb 24 13:55:37 host-192-168-24-34 ironic-python-agent[669]: 2022-02-24 13:55:37.379 669 DEBUG ironic_python_agent.extensions.image [-] Added entry to /etc/fstab for EFI partition auto-mount with uuid 481E-F2FD _append_uefi_to_fstab /usr/lib/python3.6/site-packages/ironic_python_agent/extensions/image.py:909

On failing node deployment:
Feb 24 13:55:37 host-192-168-24-8 ironic-python-agent[670]: 2022-02-24 13:55:37.075 670 DEBUG ironic_python_agent.extensions.image [-] Added entry to /etc/fstab for EFI partition auto-mount with uuid 68fe0417-0a56-445c-a829-37e4267c9978 _append_uefi_to_fstab /usr/lib/python3.6/site-packages/ironic_python_agent/extensions/image.py:909


The difference is the identifier written to fstab. On the failing node, the EFI partition's "partuuid" (as listed by lsblk) is used. On the working node, the EFI partition's "uuid" (as listed by lsblk) is used.


Version-Release number of selected component (if applicable):
rhosp-director-images-ipa-x86_64-17.0-20220216.2.el8ost.noarch

How reproducible:
10%

Steps to Reproduce:
1. Attempt to deploy many nodes in parallel
2. Most nodes deploy successfully
3. 

Actual results:
fstab entry written:

 UUID=68fe0417-0a56-445c-a829-37e4267c9978 /boot/efi       vfat    umask=0077      0       1

Console logs:
  Timed out waiting for device dev-di…5c\x2da829\x2d37e4267c9978.device.


Expected results:
fstab entry written:

 UUID=46A5-0817  /boot/efi       vfat    umask=0077      0       1

Additional info:
qemu-nbd --connect=/dev/nbd0 /root/debug_switchroot_issue/compute-3-disk1.qcow2
Using `lsblk --output-all --json /dev/nbd0 | jq .`, the EFI partition shows:
 "partuuid": "68fe0417-0a56-445c-a829-37e4267c9978" 
 "uuid": "482E-AAF9" 

It seems IPA is not consistently using "uuid".
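
A minimal sketch of the mismatch described above, assuming the identifier is checked against lsblk's JSON output before the fstab line is built. This is not the IPA code and not the merged fix; the function names are hypothetical, and the sample JSON is trimmed to the two fields quoted in this report. It shows that a partition UUID needs the `PARTUUID=` tag in fstab, while `UUID=` only matches the filesystem UUID.

```python
import json

# Trimmed sample in the shape of `lsblk --json` output. Only the two fields
# quoted in this report are kept; real output carries many more keys.
SAMPLE = """
{"blockdevices": [
  {"uuid": null, "partuuid": null, "children": [
    {"uuid": "482E-AAF9",
     "partuuid": "68fe0417-0a56-445c-a829-37e4267c9978"}
  ]}
]}
"""


def iter_devices(nodes):
    """Yield every device dict, walking nested "children" lists."""
    for node in nodes:
        yield node
        yield from iter_devices(node.get("children") or [])


def fstab_tag_for(identifier, lsblk_json):
    """Return "UUID" or "PARTUUID" depending on which field matches."""
    for dev in iter_devices(json.loads(lsblk_json)["blockdevices"]):
        if dev.get("uuid") == identifier:
            return "UUID"
        if dev.get("partuuid") == identifier:
            return "PARTUUID"
    raise LookupError(f"{identifier} not found in lsblk output")


def efi_fstab_line(identifier, lsblk_json):
    """Build a /boot/efi fstab line (hypothetical helper) with the matching tag."""
    tag = fstab_tag_for(identifier, lsblk_json)
    return f"{tag}={identifier}\t/boot/efi\tvfat\tumask=0077\t0\t1"


if __name__ == "__main__":
    print(efi_fstab_line("482E-AAF9", SAMPLE))
    print(efi_fstab_line("68fe0417-0a56-445c-a829-37e4267c9978", SAMPLE))
```

With the values from the failing node, the second call yields a line starting with `PARTUUID=` rather than `UUID=`, which is the distinction the broken fstab entry was missing.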

Comment 2 Harald Jensås 2022-02-25 17:19:53 UTC
OK DEPLOY
#########
Feb 24 13:55:32 host-192-168-24-34 ironic-python-agent[669]: 2022-02-24 13:55:32.694 669 DEBUG ironic_lib.utils [-] Command stdout is: "KNAME="vda" UUID="" PARTUUID="" TYPE="disk" LABEL=""
                                                             KNAME="vda1" UUID="481E-F2FD" PARTUUID="b7dac7a5-eefa-4966-bbcd-358f074618f3" TYPE="part" LABEL="efi-part"
                                                             KNAME="vda2" UUID="2022-02-24-18-48-59-00" PARTUUID="6be3281a-34e7-4ab0-b30a-0417a06f101b" TYPE="part" LABEL="config-2"
                                                             KNAME="vda3" UUID="e8bf71d2-a8ef-4176-a277-fd26220ef3fb" PARTUUID="3d0b48e3-00e4-40c1-99f0-653f5f694c01" TYPE="part" LABEL="img-rootfs"
                                                             " _log /usr/lib/python3.6/site-packages/ironic_lib/utils.py:99
Feb 24 13:55:32 host-192-168-24-34 ironic-python-agent[669]: 2022-02-24 13:55:32.704 669 DEBUG ironic_lib.utils [-] Command stderr is: "" _log /usr/lib/python3.6/site-packages/ironic_lib/utils.py:100
Feb 24 13:55:32 host-192-168-24-34 ironic-python-agent[669]: 2022-02-24 13:55:32.707 669 DEBUG ironic_python_agent.extensions.image [-] Partition 481E-F2FD found on device /dev/vda _get_partition /usr/lib/python3.6/site-packages/ironic_python_agent/extensions/image.py:99

FAILED DEPLOY
#############
Feb 24 13:55:32 host-192-168-24-8 ironic-python-agent[670]: 2022-02-24 13:55:32.632 670 DEBUG oslo_concurrency.processutils [-] CMD "lsblk -PbioKNAME,UUID,PARTUUID,TYPE,LABEL /dev/vda" returned: 0 in 0.010s execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:423
Feb 24 13:55:32 host-192-168-24-8 ironic-python-agent[670]: 2022-02-24 13:55:32.636 670 DEBUG ironic_lib.utils [-] Command stdout is: "KNAME="vda" UUID="" PARTUUID="" TYPE="disk" LABEL=""
                                                            KNAME="vda1" UUID="482E-AAF9" PARTUUID="68fe0417-0a56-445c-a829-37e4267c9978" TYPE="part" LABEL="efi-part"
                                                            KNAME="vda2" UUID="2022-02-24-18-49-07-00" PARTUUID="2a41400b-765e-4fa3-98e8-b6d19f86ae25" TYPE="part" LABEL="config-2"
                                                            KNAME="vda3" UUID="e8bf71d2-a8ef-4176-a277-fd26220ef3fb" PARTUUID="3bec0aa9-f641-4a58-b5ba-14bb2147b2e5" TYPE="part" LABEL="img-rootfs"
                                                            " _log /usr/lib/python3.6/site-packages/ironic_lib/utils.py:99
Feb 24 13:55:32 host-192-168-24-8 ironic-python-agent[670]: 2022-02-24 13:55:32.645 670 DEBUG ironic_lib.utils [-] Command stderr is: "" _log /usr/lib/python3.6/site-packages/ironic_lib/utils.py:100
Feb 24 13:55:32 host-192-168-24-8 ironic-python-agent[670]: 2022-02-24 13:55:32.647 670 DEBUG ironic_python_agent.extensions.image [-] Partition 68fe0417-0a56-445c-a829-37e4267c9978 found on device /dev/vda _get_partition /usr/lib/python3.6/site-packages/ironic_python_agent/extensions/image.py:103
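
For anyone triaging similar logs, here is a small sketch that parses the `lsblk -P` key="value" output shown above and reports which column an identifier matched. The helper names are hypothetical and this is not how IPA's `_get_partition` is implemented; the sample text is the failed node's stdout copied from the log.

```python
import shlex

# stdout of `lsblk -PbioKNAME,UUID,PARTUUID,TYPE,LABEL /dev/vda` on the
# failed node, copied from the log above.
LSBLK_PAIRS = '''KNAME="vda" UUID="" PARTUUID="" TYPE="disk" LABEL=""
KNAME="vda1" UUID="482E-AAF9" PARTUUID="68fe0417-0a56-445c-a829-37e4267c9978" TYPE="part" LABEL="efi-part"
KNAME="vda2" UUID="2022-02-24-18-49-07-00" PARTUUID="2a41400b-765e-4fa3-98e8-b6d19f86ae25" TYPE="part" LABEL="config-2"
KNAME="vda3" UUID="e8bf71d2-a8ef-4176-a277-fd26220ef3fb" PARTUUID="3bec0aa9-f641-4a58-b5ba-14bb2147b2e5" TYPE="part" LABEL="img-rootfs"
'''


def parse_pairs(output):
    """Turn each KEY="value" line into a dict (one dict per device)."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # shlex strips the quotes, leaving tokens such as KNAME=vda1
        rows.append(dict(tok.split("=", 1) for tok in shlex.split(line)))
    return rows


def find_partition(identifier, rows):
    """Return (kname, matched_field), or None when nothing matches."""
    if not identifier:
        return None  # an empty string would match the whole-disk row
    for row in rows:
        for field in ("UUID", "PARTUUID"):
            if row.get(field) == identifier:
                return row["KNAME"], field
    return None


if __name__ == "__main__":
    rows = parse_pairs(LSBLK_PAIRS)
    print(find_partition("68fe0417-0a56-445c-a829-37e4267c9978", rows))
    print(find_partition("482E-AAF9", rows))
```

Both identifiers resolve to `vda1`, but through different columns, so a caller that only gets the device back cannot tell which kind of identifier it was handed.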

Comment 3 Julia Kreger 2022-02-25 17:52:47 UTC
Failed:

Feb 24 13:55:25 host-192-168-24-8 ironic-python-agent[670]: 2022-02-24 13:55:25.881 670 DEBUG root [-] Executing command: image.install_bootloader with args: {'root_uuid': 'e8bf71d2-a8ef-4176-a277-fd26220ef3fb', 'efi_system_part_uuid': '68fe0417-0a56-445c-a829-37e4267c9978', 'prep_boot_part_uuid': None, 'target_boot_mode': 'uefi'} execute_command /usr/lib/python3.6/site-packages/ironic_python_agent/extensions/base.py:255

Working:

Feb 24 13:55:01 host-192-168-24-41 ironic-python-agent[678]: 2022-02-24 13:55:01.853 678 DEBUG root [-] Executing command: image.install_bootloader with args: {'root_uuid': 'e8bf71d2-a8ef-4176-a277-fd26220ef3fb', 'efi_system_part_uuid': '46E9-A0BD', 'prep_boot_part_uuid': None, 'target_boot_mode': 'uefi'} execute_command /usr/lib/python3.6/site-packages/ironic_python_agent/extensions/base.py:255

Looks like we're selecting the partuuid instead of the uuid, and that is why the mount is failing. Interestingly enough, this should work with just the partuuid AIUI.

That value comes from ironic_lib's disk_utils.

Failed node:

Feb 24 13:54:59 host-192-168-24-8 ironic-python-agent[670]: 2022-02-24 13:54:59.050 670 DEBUG ironic_lib.disk_utils [-] Falling back to partition UUID as the block device UUID was not found while examining /dev/vda3 block_uuid /usr/lib/python3.6/site-packages/ironic_lib/disk_utils.py:563

A total of two partitions failed the UUID lookup on the node that failed, versus only one (the root fs) on the node that succeeded.

The code explicitly tries to return the UUID field and then falls back to PARTUUID.
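
The fallback described above can be sketched roughly as follows. This is an illustration only, not the actual ironic_lib `block_uuid` code: the function name is hypothetical, and returning the kind of identifier alongside the value is an assumption about one way a caller could avoid writing a partition UUID under a `UUID=` tag.

```python
import shlex


def pick_identifier(lsblk_pairs_line):
    """Prefer UUID; fall back to PARTUUID when UUID is empty.

    Takes one line of `lsblk --pairs --output UUID,PARTUUID` output and
    returns (value, kind) so the caller knows which identifier it received.
    """
    fields = dict(
        tok.split("=", 1) for tok in shlex.split(lsblk_pairs_line)
    )
    if fields.get("UUID"):
        return fields["UUID"], "UUID"
    return fields.get("PARTUUID", ""), "PARTUUID"


if __name__ == "__main__":
    # Output seen on the failed node: UUID is empty, so the fallback applies.
    print(pick_identifier(
        'UUID="" PARTUUID="68fe0417-0a56-445c-a829-37e4267c9978"'))
    # Output in the shape seen on a working node (values from this report).
    print(pick_identifier(
        'UUID="481E-F2FD" PARTUUID="b7dac7a5-eefa-4966-bbcd-358f074618f3"'))
```

If the fallback value is later treated as a filesystem UUID, the result is the fstab entry shown in the description.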

Comment 4 Julia Kreger 2022-02-25 20:35:10 UTC
Feb 24 13:54:59 host-192-168-24-8 ironic-python-agent[670]: 2022-02-24 13:54:59.072 670 DEBUG oslo_concurrency.processutils [-] CMD "lsblk /dev/vda1 --pairs --bytes --ascii --nodeps --output UUID,PARTUUID" returned: 0 in 0.016s execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:423
Feb 24 13:54:59 host-192-168-24-8 ironic-python-agent[670]: 2022-02-24 13:54:59.079 670 DEBUG ironic_lib.utils [-] Command stdout is: "UUID="" PARTUUID="68fe0417-0a56-445c-a829-37e4267c9978"

Comment 7 Julia Kreger 2022-03-14 18:32:26 UTC
Fix cherry-picked downstream and in downstream review.

Comment 17 errata-xmlrpc 2022-09-21 12:19:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:6543

