In BZ1840222, we discovered an issue where Ironic forces machines off. The IPA image is missing efibootmgr, as can be seen after Ironic writes the image to disk:

```
2020-05-28 13:43:11.525 1242 ERROR root [-] Command execution error: [Errno 2] No such file or directory: 'efibootmgr': 'efibootmgr': FileNotFoundError: [Errno 2] No such file or directory: 'efibootmgr': 'efibootmgr'
2020-05-28 13:43:11.525 1242 ERROR root Traceback (most recent call last):
2020-05-28 13:43:11.525 1242 ERROR root   File "/usr/lib/python3.6/site-packages/ironic_python_agent/extensions/base.py", line 252, in execute_command
2020-05-28 13:43:11.525 1242 ERROR root     result = ext.execute(command_part, **kwargs)
2020-05-28 13:43:11.525 1242 ERROR root   File "/usr/lib/python3.6/site-packages/ironic_python_agent/extensions/base.py", line 205, in execute
2020-05-28 13:43:11.525 1242 ERROR root     return cmd(**kwargs)
2020-05-28 13:43:11.525 1242 ERROR root   File "/usr/lib/python3.6/site-packages/ironic_python_agent/extensions/base.py", line 321, in wrapper
2020-05-28 13:43:11.525 1242 ERROR root     result = func(self, **command_params)
2020-05-28 13:43:11.525 1242 ERROR root   File "/usr/lib/python3.6/site-packages/ironic_python_agent/extensions/image.py", line 520, in install_bootloader
2020-05-28 13:43:11.525 1242 ERROR root     utils.execute('efibootmgr', '--version')
2020-05-28 13:43:11.525 1242 ERROR root   File "/usr/lib/python3.6/site-packages/ironic_python_agent/utils.py", line 74, in execute
2020-05-28 13:43:11.525 1242 ERROR root     return ironic_utils.execute(*cmd, **kwargs)
2020-05-28 13:43:11.525 1242 ERROR root   File "/usr/lib/python3.6/site-packages/ironic_lib/utils.py", line 99, in execute
2020-05-28 13:43:11.525 1242 ERROR root     result = processutils.execute(*cmd, **kwargs)
2020-05-28 13:43:11.525 1242 ERROR root   File "/usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py", line 391, in execute
2020-05-28 13:43:11.525 1242 ERROR root     env=env_variables)
2020-05-28 13:43:11.525 1242 ERROR root   File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
2020-05-28 13:43:11.525 1242 ERROR root     restore_signals, start_new_session)
2020-05-28 13:43:11.525 1242 ERROR root   File "/usr/lib64/python3.6/subprocess.py", line 1364, in _execute_child
2020-05-28 13:43:11.525 1242 ERROR root     raise child_exception_type(errno_num, err_msg, err_filename)
2020-05-28 13:43:11.525 1242 ERROR root FileNotFoundError: [Errno 2] No such file or directory: 'efibootmgr': 'efibootmgr'
2020-05-28 13:43:11.525 1242 ERROR root
2020-05-28 13:43:11.568 1242 INFO root [-] Command image.install_bootloader completed: Command name: image.install_bootloader, params: {'root_uuid': '0x00000000', 'efi_system_part_uuid': None, 'prep_boot_part_uuid': None}, status: FAILED, result: None.
fd00:1101::2 - - [28/May/2020 13:43:11] "POST /v1/commands?wait=true HTTP/1.1" 200 402
```

The working theory is that this error in IPA prevents Ironic from ever getting the message that the host rebooted, so it ends up force powering off the host after RHCOS has already booted.

The missing efibootmgr package was fixed by BZ1810604, but the fixed images are missing from 4.5.0-ci builds. The 4.5.0 nightlies have them. CI builds are not rebuilt automatically; a rebuild happens only when it is triggered manually or a change is pushed to them.
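The FileNotFoundError at the bottom of the traceback is what Python's subprocess module raises when the requested executable cannot be found. The following is a minimal sketch of that failure mode, not the agent's actual code; `run_version` and the tool name are hypothetical and exist only for illustration.

```python
import subprocess

# Hypothetical tool name, chosen so the lookup fails on any machine.
MISSING_TOOL = "definitely-not-installed-efibootmgr"


def run_version(tool):
    """Run `<tool> --version` and return a short status string."""
    try:
        subprocess.run(
            [tool, "--version"],
            check=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except FileNotFoundError as exc:
        # Same exception type as in the log: the binary is not on PATH.
        return "missing (errno %d)" % exc.errno
    except subprocess.CalledProcessError as exc:
        # The binary exists but exited with a non-zero status.
        return "failed (exit %d)" % exc.returncode
    return "ok"


if __name__ == "__main__":
    print(run_version(MISSING_TOOL))
```

Calling `run_version("efibootmgr")` on a host where the package is installed should take the "ok" path instead.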
I'm using this bug to force ironic-ipa-downloader to rebuild, and leaving https://bugzilla.redhat.com/show_bug.cgi?id=1840222 to possibly pick up https://review.opendev.org/#/c/731575/ which is a fix to ironic to prevent forcing the host off.
Could you please provide instructions on how we can verify this fix?
I think we can verify this by checking the IPA package version, and/or verifying that the previously missing efibootmgr file is present. E.g.:

Make a note of the provisioningNetworkCIDR from the install-config; this determines the IP to use below. fd00:1101::2 is the default provisioning IP for the bootstrap VM in a dev-scripts environment, or you could use fd00:1101::3 to download the files from a running cluster.

  mkdir ipa_tmp; cd ipa_tmp
  wget http://[fd00:1101::2]/images/ipa-ramdisk-pkgs.info
  wget http://[fd00:1101::2]/images/ironic-python-agent.initramfs
  cat ipa-ramdisk-pkgs.info

This will show e.g.

  rhosp-director-images-ipa-x86_64 16.0 20200513.1.el8ost noarch

which can be verified to be equal to (or newer than) the fixed-in version for BZ1810604.

To be completely sure, we can also check for the existence of the previously missing file, e.g.:

  zcat ironic-python-agent.initramfs | cpio -ivd
  find . -name efibootmgr

This should show ./usr/sbin/efibootmgr
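The two checks above could also be scripted. This is only a rough sketch under several assumptions: the column layout of ipa-ramdisk-pkgs.info (name, version, release, arch) and the date.build release format are inferred from the single sample line above, the minimum release is a placeholder because the real fixed-in version for BZ1810604 is not stated here, and all function names are made up for illustration.

```python
import os
import re

# Sample line copied from the comment above; the column layout
# (name, version, release, arch) is inferred from this one example.
SAMPLE_LINE = "rhosp-director-images-ipa-x86_64 16.0 20200513.1.el8ost noarch"

# Placeholder only: the real fixed-in release for BZ1810604 is not given here.
PLACEHOLDER_MIN_RELEASE = "20200101.1.el8ost"


def parse_release(line):
    """Return the release column of a 'name version release arch' line."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError("unexpected package line: %r" % line)
    return fields[2]


def release_key(release):
    """Turn '20200513.1.el8ost' into (20200513, 1) for numeric comparison."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unexpected release format: %r" % release)
    return int(match.group(1)), int(match.group(2))


def is_at_least(line, minimum_release):
    """True if the package line's release is equal to or newer than the minimum."""
    return release_key(parse_release(line)) >= release_key(minimum_release)


def find_efibootmgr(root):
    """Walk an extracted initramfs tree and return every path named efibootmgr."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "efibootmgr" in filenames:
            hits.append(os.path.join(dirpath, "efibootmgr"))
    return sorted(hits)
```

After extracting the initramfs into ipa_tmp, `find_efibootmgr("ipa_tmp")` plays the role of the find command, and `is_at_least(line, minimum)` replaces the manual version comparison once the real fixed-in release is substituted for the placeholder.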
Both recommended verifications passed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409