Description of problem:
When debug logging is enabled in ironic, /var/log/containers/ironic/journal records the following exception:

~~~
DEBUG ironic_python_agent.extensions.image [-] Exception encountered while attempting to setup the EFI loader from a root filesystem. Error: 'utf-8' codec can't encode characters in position 2623-2624: surrogates not allowed _try_preserve_efi_assets /usr/lib/python3.6/site-packages/ironic_python_agent/extensions/image.py:755
~~~

This error is caused by the following UEFI boot entry:

mvutcovi@supportshell-1:~/03465225/0220-efi-firmware-variables-hv57$ hexdump -C ./sys/firmware/efi/efivars/Boot0021-8be4df61-93ca-11d2-aa0d-00e098032b8c
00000000  07 00 00 00 00 01 00 00  2c 00 54 00 72 00 69 00  |........,.T.r.i.|
00000010  67 00 67 00 65 00 72 00  20 00 72 00 65 00 ff 00  |g.g.e.r. .r.e...|
00000020  64 00 79 00 2d 00 74 00  6f 00 2d 00 62 00 6f 00  |d.y.-.t.o.-.b.o.|
00000030  6f 00 74 00 20 00 65 00  76 00 65 00 6e 00 74 00  |o.t. .e.v.e.n.t.|
00000040  00 00 04 07 14 00 35 7b  bb cd 33 68 d6 4e 9a b2  |......5{..3h.N..|
00000050  57 d2 ac dd f6 f0 04 06  14 00 b0 aa ff 4a 76 13  |W............Jv.|
00000060  b4 44 9c 6e e9 23 88 75  1b c6 7f ff 04 00        |.D.n.#.u......|
0000006e
[supportshell-1.sush-001.prod.us-west-2.aws.redhat.com] [17:20:55+0000]

The command "efibootmgr -v":

mvutcovi@supportshell-1:~/03465225$ cat 0200-efibootmgr-v-hv57-hexdump.txt|sed -r 's/ \|.*//;s/ / /g'|grep -v 000009c8|xxd -r|grep Boot0021
Boot0021 Trigger reÿdy-to-boot event FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(4affaab0-1376-44b4-9c6e-e92388751bc6)

The problem is that ironic_python_agent tries to run _efi_boot_setup(device, efi_system_part_uuid), which fails not because efibootmgr has failed, but because Python could not parse its output. ironic_python_agent then continues with the GRUB installation, which leads to more errors.
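The failure mode can be reproduced in isolation. The following is a minimal sketch, assuming efibootmgr emitted the entry label with a raw 0xff byte and that the output was decoded with the surrogateescape error handler (the handler os.fsdecode uses on non-Windows systems). The sample bytes are an assumption derived from the hexdump above, not captured command output.

```python
# Sample bytes are an assumption based on the hexdump of Boot0021,
# not real "efibootmgr -v" output.
raw = b"Boot0021  Trigger re\xffdy-to-boot event"

# Under a UTF-8 locale, os.fsdecode() is equivalent to this call:
# undecodable bytes become lone surrogates instead of raising an error.
text = raw.decode("utf-8", errors="surrogateescape")

# The decoded string looks fine until something has to encode it again,
# for example when it is written to a log or sent over the wire.
error = None
try:
    text.encode("utf-8")
except UnicodeEncodeError as exc:
    error = exc
    # exc.start and exc.end give the position of the offending characters.
    print("encode failed at position %d-%d: %s" % (exc.start, exc.end, exc))
```

The decode step never fails, which is why the exception only surfaces later and points at a position in the decoded text rather than at the efibootmgr call itself.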
The solution would be to either notify the user about the broken UEFI boot entry, or catch the error properly and log a warning message with details about the entry and the position of the problematic character in it. Also please note that in the future we might see more and more UEFI boot entries that are internationalized.

https://github.com/openstack/ironic-python-agent/blob/0bf579c955477da9a43e546703146b8b2b24d05f/ironic_python_agent/extensions/image.py#L227

    efi_preserved = _try_preserve_efi_assets(
        device, path, efi_system_part_uuid,
        efi_partition, efi_partition_mount_point)
    if efi_preserved:
        _append_uefi_to_fstab(path, efi_system_part_uuid)
        # Success preserving efi assets
        return
    else:
        # Failure, either via exception or not found
        # which in this case the partition needs to be
        # remounted.
        LOG.debug('No EFI assets were preserved for setup or the '
                  'ramdisk was unable to complete the setup. '
                  'falling back to bootloader installation from '
                  'deployed image.')
        _mount_partition(root_partition, path)

https://github.com/openstack/ironic-python-agent/blob/0bf579c955477da9a43e546703146b8b2b24d05f/ironic_python_agent/extensions/image.py#L471

    try:
        # Since we have preserved the assets, we should be able
        # to call the _efi_boot_setup method to scan the device
        # and add loader entries
        efi_preserved = _efi_boot_setup(device, efi_system_part_uuid)
        # Executed before the return so we don't return and then begin
        # execution.
        return efi_preserved
    except Exception as e:
        # Remount the partition and proceed as we were.
        LOG.debug('Exception encountered while attempting to '
                  'setup the EFI loader from a root '
                  'filesystem. Error: %s', e)

https://github.com/openstack/ironic-python-agent/blob/a1670753a23a79b6536f67eae9cca154e0ed2e65/ironic_python_agent/efi_utils.py#L273

    def get_boot_records():
        """Executes efibootmgr and returns boot records.

        :return: an iterator yielding pairs (boot number, boot record).
        """
        efi_output = utils.execute('efibootmgr', '-v')
        for line in efi_output[0].split('\n'):
            match = _ENTRY_LABEL.match(line)
            if match is not None:
                yield (match[1], match[2])

Version-Release number of selected component (if applicable):

How reproducible:
All the time, until the boot entry with untranslatable Unicode characters is deleted

Steps to Reproduce:
1. Create a boot entry on a physical machine that contains the "ÿ" character https://www.compart.com/en/unicode/U+00FF
2. Try to provision this machine as a compute node
3.
This issue appears to be rooted in Python 2/Python 3 differences. The underlying call was written for Python 2 compatibility and coerces the data into the UTF-8 character set by running os.fsdecode on the standard-output and standard-error content before returning it to the caller. This occurs in the oslo.concurrency library. Adding the binary=True option to the calls where we may get UTF-16 content seems reasonable and appears to be the path forward. I'm going to upload a work-in-progress patch into CI to see whether making the overall change results in things working as expected. Having an automated low-level test of this is not really possible, but the underlying library *does* actually have a test for the option, so it is more a question of whether the agent code is compatible with the library when the different option is used.
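To illustrate what the agent side could look like once the call returns bytes, here is a minimal sketch and not the merged patch. It assumes the raw standard output is available as a bytes object (as it would be with binary=True). The names parse_boot_records and ENTRY_LABEL, the regular expression, and the default codec are all hypothetical; the agent's real _ENTRY_LABEL pattern is not shown in this report, and which codec suits efibootmgr's output is not established here.

```python
import logging
import re

LOG = logging.getLogger(__name__)

# Hypothetical pattern for illustration only; not the agent's _ENTRY_LABEL.
ENTRY_LABEL = re.compile(r'Boot([0-9a-fA-F]+)\*?\s+(.*)$')


def parse_boot_records(raw_output, encoding='utf-8'):
    """Yield (boot number, boot record) pairs from raw efibootmgr bytes.

    The encoding default is an assumption, not a documented property
    of efibootmgr.
    """
    # errors='replace' turns every undecodable byte into U+FFFD, so the
    # resulting string can always be encoded again later (e.g. for logging).
    text = raw_output.decode(encoding, errors='replace')
    for line in text.split('\n'):
        # A genuine U+FFFD in the input would be reported here as well.
        positions = [i for i, ch in enumerate(line) if ch == '\ufffd']
        if positions:
            LOG.warning('UEFI boot entry contains undecodable characters '
                        'at positions %s: %s', positions, line)
        match = ENTRY_LABEL.match(line)
        if match is not None:
            yield match[1], match[2]


# Sample bytes are an assumption based on the Boot0021 entry in this report.
sample = b"BootCurrent: 0021\nBoot0021* Trigger re\xffdy-to-boot event\n"
records = list(parse_boot_records(sample))
```

Decoding explicitly keeps the choice of error handling in the agent rather than in the library, and the warning covers the request in the description to report the entry and the character position instead of failing silently at debug level.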
The fix has merged downstream and is in the build pipeline. It should be available in our next z-stream release.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.2.6 (Train) bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:6307