Bug 1879034 - Ironic boot_mode capabilities can end up empty when using bootMode: UEFI on baremetal host
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: Doug Hellmann
QA Contact: Lubov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-15 09:18 UTC by Asher Shoshan
Modified: 2020-10-27 16:41 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:41:13 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | metal3-io baremetal-operator pull 645 | 0 | None | closed | ensure the boot mode is set in ironic before starting inspection | 2021-02-17 13:05:54 UTC
Github | openshift baremetal-operator pull 104 | 0 | None | closed | Bug 1879034: Ensure the boot mode is set in ironic before starting inspection | 2021-02-17 13:05:54 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:41:30 UTC

Description Asher Shoshan 2020-09-15 09:18:06 UTC
Description of problem:
Deployed a 4.6 cluster with "provisioningNetwork: Disabled". The control plane was created, but the worker nodes failed to provision.

Version-Release number of the following components:

How reproducible:

Steps to Reproduce:
1. Deploy a 4.6 cluster with "provisioningNetwork: Disabled" and a redfish-virtualmedia BMC address in install-config.yaml.

Actual results:
Workers are not provisioned and remain in the "inspecting" state.

Expected results:
workers to be provisioned

Additional info:
Excerpt from the metal3-ironic-conductor container log of the metal3 pod (ns machine-config-api):

2020-09-15 09:14:02.019 1 ERROR ironic.common.images [req-6fe71de1-2f0c-4980-bbff-a29f3a8cac39 ironic-user - - - -] Creating the filesystem root failed.: FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib/syslinux/isolinux.bin'
2020-09-15 09:14:02.019 1 ERROR ironic.common.images Traceback (most recent call last):
2020-09-15 09:14:02.019 1 ERROR ironic.common.images   File "/usr/lib/python3.6/site-packages/ironic/common/images.py", line 269, in create_isolinux_image_for_bios
2020-09-15 09:14:02.019 1 ERROR ironic.common.images     _create_root_fs(tmpdir, files_info)
2020-09-15 09:14:02.019 1 ERROR ironic.common.images   File "/usr/lib/python3.6/site-packages/ironic/common/images.py", line 65, in _create_root_fs
2020-09-15 09:14:02.019 1 ERROR ironic.common.images     shutil.copyfile(src_file, target_file)
2020-09-15 09:14:02.019 1 ERROR ironic.common.images   File "/usr/lib64/python3.6/shutil.py", line 120, in copyfile
2020-09-15 09:14:02.019 1 ERROR ironic.common.images     with open(src, 'rb') as fsrc:
2020-09-15 09:14:02.019 1 ERROR ironic.common.images FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib/syslinux/isolinux.bin'
2020-09-15 09:14:02.019 1 ERROR ironic.common.images 
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector [req-6fe71de1-2f0c-4980-bbff-a29f3a8cac39 ironic-user - - - -] Unable to start managed inspection for node f7842608-7caa-42fe-81c2-4d5c8e8c9c80: Creating iso image failed: [Errno 2] No such file or directory: '/usr/lib/syslinux/isolinux.bin': ironic.common.exception.ImageCreationFailed: Creating iso image failed: [Errno 2] No such file or directory: '/usr/lib/syslinux/isolinux.bin'
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector Traceback (most recent call last):
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector   File "/usr/lib/python3.6/site-packages/ironic/common/images.py", line 269, in create_isolinux_image_for_bios
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector     _create_root_fs(tmpdir, files_info)
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector   File "/usr/lib/python3.6/site-packages/ironic/common/images.py", line 65, in _create_root_fs
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector     shutil.copyfile(src_file, target_file)
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector   File "/usr/lib64/python3.6/shutil.py", line 120, in copyfile
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector     with open(src, 'rb') as fsrc:
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib/syslinux/isolinux.bin'
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector 
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector During handling of the above exception, another exception occurred:
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector 
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector Traceback (most recent call last):
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/inspector.py", line 204, in _start_managed_inspection
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector     task.driver.boot.prepare_ramdisk(task, ramdisk_params=params)
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/redfish/boot.py", line 895, in prepare_ramdisk
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector     iso_ref = _prepare_deploy_iso(task, ramdisk_params, mode)
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/redfish/boot.py", line 644, in _prepare_deploy_iso
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector     return prepare_iso_image()
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/redfish/boot.py", line 562, in _prepare_iso_image
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector     base_iso=base_iso)
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector   File "/usr/lib/python3.6/site-packages/ironic/common/images.py", line 620, in create_boot_iso
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector     kernel_params=params, configdrive=configdrive_path)
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector   File "/usr/lib/python3.6/site-packages/ironic/common/images.py", line 273, in create_isolinux_image_for_bios
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector     raise exception.ImageCreationFailed(image_type='iso', error=e)
2020-09-15 09:14:02.389 1 ERROR ironic.drivers.modules.inspector ironic.common.exception.ImageCreationFailed: Creating iso image failed: [Errno 2] No such file or directory: '/usr/lib/syslinux/isolinux.bin'

Comment 4 Stephen Benjamin 2020-09-15 11:03:06 UTC
This isn't related to the disabled provisioning network. For some reason, baremetal-operator isn't setting capabilities correctly to tell the host we're doing a UEFI boot.

BIOS-based virtualmedia provisioning won't work due to https://bugzilla.redhat.com/show_bug.cgi?id=1862608#
 

Here's the properties field of the Ironic node:

$ curl -g -X GET --user ironic-user:XXXXXXXX http://X.X.X.X:6385/v1/nodes/c4ccee33-20c3-471e-bf97-b20a8c730633 -H "Accept: application/json" -H "Content-Type: application/json" -H "X-Openstack-Ironic-API-Version: 1.67"  | jq ".properties"
shows:
{
  "capabilities": ""
}

However, the BMH correctly shows bootMode: UEFI.


@Doug: could there be some kind of race here and somehow capabilities ends up empty?

Comment 5 Doug Hellmann 2020-09-15 19:56:11 UTC
It looks like the problem is that we only set the boot mode in ironic before we provision, and not before we start inspection. See https://github.com/metal3-io/baremetal-operator/pull/635
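
To make the gap concrete, here is a minimal sketch of the kind of update that has to reach ironic before inspection starts. It is not the operator's actual code (baremetal-operator is written in Go); the helper names with_boot_mode and boot_mode_patch are hypothetical. It only builds the JSON Patch body that a PATCH to /v1/nodes/<node> would carry, assuming capabilities is a comma-separated list of key:value pairs.

```python
import json


def with_boot_mode(raw_capabilities, boot_mode):
    """Return a capabilities string with boot_mode set, keeping other entries.

    Hypothetical helper: assumes the 'key:value,key:value' capabilities format.
    """
    pairs = [p.strip() for p in raw_capabilities.split(",") if p.strip()]
    # Drop any stale boot_mode entry so the new value is the only one.
    kept = [p for p in pairs if not p.startswith("boot_mode:")]
    return ",".join(kept + ["boot_mode:" + boot_mode])


def boot_mode_patch(raw_capabilities, boot_mode):
    """Build a JSON Patch document that sets /properties/capabilities."""
    return [{
        "op": "add",
        "path": "/properties/capabilities",
        "value": with_boot_mode(raw_capabilities, boot_mode),
    }]


# The node in comment 4 had an empty capabilities string.
print(json.dumps(boot_mode_patch("", "uefi")))
```

The point of the sketch is ordering: the same patch has to be applied before the inspection request as well as before provisioning.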

Comment 6 Stephen Benjamin 2020-09-15 23:42:58 UTC
That's really odd because the e2e-metal-ipi-virtualmedia job was passing, using UEFI virtual media, did something change in Ironic or BMO?

Comment 7 Doug Hellmann 2020-09-16 20:08:00 UTC
(In reply to Stephen Benjamin from comment #6)
> That's really odd because the e2e-metal-ipi-virtualmedia job was passing,
> using UEFI virtual media, did something change in Ironic or BMO?

I had the impression that the boot mode setting for a VM didn't matter as much as it might for a physical host.

Regardless, the patch mentioned in comment 5 closes a gap that has existed since the original implementation.

Comment 9 Lubov 2020-09-30 15:46:53 UTC
@Doug I ran a deployment with redfish-virtualmedia, provisioningNetwork: Disabled, and bootMode UEFI. It passed with no problems.

But I understand that's not enough to verify this BZ

Could you please suggest what else should be verified?

Comment 10 Doug Hellmann 2020-09-30 16:06:17 UTC
The original issue was with the timing of when we passed the boot mode to ironic. Before the fix, we only told ironic which boot mode to use when we were provisioning. That meant the host could fail to boot properly for inspection. To verify that ironic has the correct boot mode during inspection you could look at the node settings in ironic while the host is being inspected and verify that the value of /properties/capabilities includes a boot_mode value that matches the setting in the BareMetalHost resource.
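
The check described above could be scripted roughly as follows. This is a sketch under assumptions: the mapping from BareMetalHost bootMode values to ironic boot_mode values (UEFI to uefi, legacy to bios) is assumed here, the function name boot_mode_matches is hypothetical, and the node properties are passed in as a dict rather than fetched from the API.

```python
# Assumed mapping from BareMetalHost bootMode values to ironic boot_mode values.
BMH_TO_IRONIC = {"UEFI": "uefi", "legacy": "bios"}


def boot_mode_matches(bmh_boot_mode, node_properties):
    """Return True if the node's capabilities carry the boot mode the BMH asks for."""
    expected = BMH_TO_IRONIC.get(bmh_boot_mode)
    raw = node_properties.get("capabilities") or ""
    actual = None
    # Capabilities are assumed to be comma-separated key:value pairs.
    for item in raw.split(","):
        key, sep, value = item.strip().partition(":")
        if sep and key == "boot_mode":
            actual = value
    return expected is not None and actual == expected


# Empty capabilities, as reported in comment 4.
print(boot_mode_matches("UEFI", {"capabilities": ""}))
# Capabilities that include a boot mode.
print(boot_mode_matches("UEFI", {"capabilities": "boot_mode:uefi"}))
```

Run against the properties of a node while it is in the inspecting state, a False result would indicate the original problem.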

Comment 11 Lubov 2020-10-06 09:53:53 UTC
Verified on 4.6.0-0.nightly-2020-10-05-234751

While the node is being inspected:
(openstack-cli) [kni@provisionhost-0-0 ~]$ baremetal node show openshift-worker-0-2  
| properties             | {'capabilities': 'boot_mode:uefi'}

Comment 15 errata-xmlrpc 2020-10-27 16:41:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

