Bug 1973303 - Introspection not providing boot for UEFI x86_64 node in a mixed architecture configuration, ipxe disabled
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-ironic
Version: 16.2 (Train)
Hardware: x86_64
OS: Linux
Severity: high
Priority: high
Target Milestone: z2
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Julia Kreger
QA Contact: Paras Babbar
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-06-17 15:35 UTC by James E. LaBarre
Modified: 2022-03-23 22:11 UTC (History)

Fixed In Version: puppet-ironic-15.5.0-2.20220109005702.3c15a73.el8ost
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-23 22:10:48 UTC
Target Upstream Version:
Embargoed:


Attachments
DNSmasq configuration file (617 bytes, text/plain)
2021-06-17 15:35 UTC, James E. LaBarre
Nodes definition file for import (1.64 KB, text/plain)
2021-06-17 15:36 UTC, James E. LaBarre
additional introspection rules (713 bytes, text/plain)
2021-06-17 15:37 UTC, James E. LaBarre
sample log file for dnsmasq (411.63 KB, text/plain)
2021-06-17 15:38 UTC, James E. LaBarre
"baremetal node show" for Power9 compute node (2.21 KB, text/plain)
2021-06-17 17:58 UTC, James E. LaBarre
"baremetal node show" for x86_64 UEFI compute node (1.97 KB, text/plain)
2021-06-17 18:05 UTC, James E. LaBarre
"baremetal node show" for x86_64 UEFI node (earlier run w/different parameters) (2.47 KB, text/plain)
2021-06-17 18:11 UTC, James E. LaBarre


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 803969 0 None MERGED Fix ppc64le support to coexist with PXE preference 2021-11-09 13:21:43 UTC
Red Hat Issue Tracker OSP-5279 0 None None None 2021-11-10 20:30:57 UTC
Red Hat Product Errata RHBA-2022:1001 0 None None None 2022-03-23 22:11:15 UTC

Description James E. LaBarre 2021-06-17 15:35:19 UTC
Created attachment 1791852 [details]
DNSmasq configuration file

Product: Red Hat OpenStack
Reporter: jlabarre
Component: python-ironic-inspector-client
Version: 16.2 (Train)
Severity: high
Hardware: x86_64
OS: Linux
Summary: Introspection not providing boot for UEFI x86_64 node in a mixed architecture configuration, ipxe disabled



Description of problem:
Deploying an OpenStack 16.2 environment with mixed-architecture nodes (x86_64 and Power/ppc64le).  Because the Power nodes cannot use iPXE, the undercloud needs to be deployed with "ipxe_enabled = False".  The controller node is a VM and is defined as a BIOS-based system.

The entire cluster is defined as follows:
    1)  Director:     x86_64 VM
    2)  Controller:   x86_64 VM
    3)  Compute1:     Power8
    4)  Compute2:     Power9, OpenBMC
    5)  Compute3:     x86_64, set for UEFI boot only

This configuration should still be able to support iPXE systems, just not as the default.  Nodes 2-4 are defined with ["boot_interface": "pxe"] in the nodes.json file used to import the node definitions, while node 5 has ["boot_interface": "ipxe"].  Additionally, the "capabilities:" lines for the nodes have "boot_mode:bios" or "boot_mode:uefi" as appropriate.  ipmi_disable_boot_timeout=False is also set for the OpenBMC Power node.

After importing the node definitions, the Controller VM and the two Power nodes introspect correctly.  The x86_64 UEFI node does not.
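
For illustration only, the per-node settings described above might look roughly like the excerpt below. This is a hypothetical sketch, not the attached nodes.json: the node names are placeholders, only the Controller and the x86_64 UEFI node are shown, and the power-management fields a real definition needs are left out.

```shell
# Hypothetical excerpt, NOT the attached nodes.json: node names are
# placeholders and the power-management fields are intentionally omitted.
nodes_json=$(mktemp)
cat > "$nodes_json" <<'EOF'
{
  "nodes": [
    {
      "name": "controller-vm",
      "arch": "x86_64",
      "boot_interface": "pxe",
      "capabilities": "boot_mode:bios"
    },
    {
      "name": "compute3-x86-uefi",
      "arch": "x86_64",
      "boot_interface": "ipxe",
      "capabilities": "boot_mode:uefi"
    }
  ]
}
EOF
# Quick check: show the boot interface chosen for each node.
grep -E '"(name|boot_interface)"' "$nodes_json"
```

The point of the sketch is that the boot interface is chosen per node, so a single undercloud with iPXE disabled by default would still be asked to serve one iPXE node.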


Version-Release number of selected component (if applicable):
OpenStack 16.2 (RHOS-16.2-RHEL-8-20210614.n.1)

openstack-ironic-python-agent-builder-2.8.0-2.20210529034815.b133d4d.el8ost.2.noarch
puppet-ironic-15.5.0-2.20210601011633.d553541.el8ost.2.noarch
puppet-nova-15.8.0-2.20210601013941.99789e3.el8ost.2.noarch
python3-ironicclient-3.1.2-2.20210528013403.1220d76.el8ost.1.noarch
python3-ironic-inspector-client-3.7.1-2.20210528020511.3a41127.el8ost.1.noarch
python3-novaclient-15.1.1-2.20210528065428.79959ab.el8ost.1.noarch
rhosp-director-images-all-16.2-20210614.1.el8ost.noarch
rhosp-director-images-base-16.2-20210614.1.el8ost.noarch
rhosp-director-images-ipa-ppc64le-16.2-20210614.1.el8ost.noarch
rhosp-director-images-ipa-x86_64-16.2-20210614.1.el8ost.noarch
rhosp-director-images-metadata-16.2-20210614.1.el8ost.noarch
rhosp-director-images-minimal-16.2-20210614.1.el8ost.noarch
rhosp-director-images-ppc64le-16.2-20210614.1.el8ost.noarch
rhosp-director-images-x86_64-16.2-20210614.1.el8ost.noarch


How reproducible:
always

Steps to Reproduce:
1.  deploy OSP undercloud, with ipxe_enabled=False

2.  upload x86_64 and ppc64le images
    a. openstack overcloud image upload --image-path ~/images/ppc64le --architecture ppc64le \
        --whole-disk --http-boot /var/lib/ironic/tftpboot/ppc64le
    b. openstack overcloud image upload --image-path ~/images/x86_64 --architecture x86_64 \
       --http-boot /var/lib/ironic/tftpboot

3.  import node definitions from .json file
      openstack overcloud node import --http-boot /var/lib/ironic/tftpboot ~/nodes.json
    3a.  if the .json file for nodes definition did not already have "boot_interface" defined, 
         set boot interface for nodes
           openstack baremetal node set --boot-interface ...

4.  set OpenBMC node ipmi option
        openstack baremetal node set {{ name }} --driver-info ipmi_disable_boot_timeout=False

5.  create custom traits for nodes
      openstack --os-placement-api-version 1.6 trait create [trait]
    for traits:      
        - CUSTOM_HW_CPU_PPC64LE_POWER8
        - CUSTOM_HW_CPU_PPC64LE_POWER9
        - CUSTOM_HW_CONTROLLER

6.  Import introspection rules
      openstack baremetal introspection rule import ~stack/introspection-rules.json

7.  Run introspection against nodes
      openstack overcloud node introspect [node name]
    The nodes were introspected individually, to watch the status/output of each from its console.
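
Steps 5 and 7 above can be sketched as a small dry-run script. This is only an illustration under stated assumptions: the node names are placeholders, the helper names (run, create_traits, introspect_each) are made up for this sketch, and the "introspection status" query is an addition that is not part of the original steps.

```shell
# Dry-run sketch of steps 5 and 7: commands are printed, not executed.
# Set RUN=1 to execute them (assumes the undercloud credentials are sourced).
run() {
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Step 5: one "trait create" call per custom trait listed in the report.
create_traits() {
  for trait in CUSTOM_HW_CPU_PPC64LE_POWER8 \
               CUSTOM_HW_CPU_PPC64LE_POWER9 \
               CUSTOM_HW_CONTROLLER; do
    run openstack --os-placement-api-version 1.6 trait create "$trait"
  done
}

# Step 7: introspect nodes one at a time so each console can be watched.
introspect_each() {
  for node in "$@"; do
    run openstack overcloud node introspect "$node"
    # Addition to the original steps: ask the inspector for the node's status.
    run openstack baremetal introspection status "$node"
  done
}

create_traits
# Placeholder node names; substitute the names from your own nodes.json.
introspect_each controller compute1 compute2 compute3
```

Printing the commands first makes it easy to confirm the order and arguments before running anything against real hardware.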



Actual results:

The Controller (VM) and the Power/ppc64le compute nodes ran introspection successfully (all set as "pxe").  The x86_64 UEFI compute node (configured for ipxe) fails to even get a network boot when it requests one.


Expected results:

All nodes, whether configured as pxe or ipxe, should run introspection successfully.  The later overcloud deploy would be expected to install the nodes successfully as well (that point has not been reached in this configuration).


Additional info:

I had previously tried setting up these same systems with "ipxe_enabled: True".  Under that configuration, both the pxe Controller and the ipxe x86_64 Compute node were able to run introspection correctly.  The Power8 and Power9 nodes both required an alternative TFTP URL to be provided manually in order to boot for introspection (PXE autoconfiguration failed), and neither could run the overcloud install.  Apparently "ipxe_enabled" is still supposed to be set to "False" for these configurations, but UEFI/iPXE boot needs to work for x86_64, because a customer with mixed x86_64 and ppc64le Compute nodes has new x86_64 systems that can only do UEFI/iPXE boot.
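
When comparing the two undercloud deployments described above, it may help to confirm which value is actually in effect. The sketch below assumes the option is written as an "ipxe_enabled = ..." line in an undercloud.conf-style file; the helper name ipxe_setting and the sample file are hypothetical.

```shell
# Print the ipxe_enabled value from an undercloud.conf-style file.
# Assumption: the option appears as an uncommented "ipxe_enabled = VALUE" line.
ipxe_setting() {
  # If the option appears more than once, report the last occurrence;
  # prints nothing when the option is not set.
  sed -n 's/^[[:space:]]*ipxe_enabled[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# Example against a throwaway sample file (not a real undercloud.conf).
sample_conf=$(mktemp)
printf '[DEFAULT]\n#ipxe_enabled = True\nipxe_enabled = False\n' > "$sample_conf"
ipxe_setting "$sample_conf"
```

On a real undercloud, the function would be pointed at the configuration file used for the deploy instead of the sample file.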

Comment 1 James E. LaBarre 2021-06-17 15:36:58 UTC
Created attachment 1791853 [details]
Nodes definition file for import

Comment 2 James E. LaBarre 2021-06-17 15:37:55 UTC
Created attachment 1791854 [details]
additional introspection rules

Comment 3 James E. LaBarre 2021-06-17 15:38:55 UTC
Created attachment 1791855 [details]
sample log file for dnsmasq

Comment 4 James E. LaBarre 2021-06-17 17:58:13 UTC
Created attachment 1791881 [details]
"baremetal node show" for Power9 compute node

Baremetal node information for one of the Power/ppc64le nodes, after introspection (JSON format).

Comment 5 James E. LaBarre 2021-06-17 18:05:26 UTC
Created attachment 1791882 [details]
"baremetal node show" for x86_64 UEFI compute node

Undercloud information for the x86_64 UEFI compute node.  The node was imported, but introspection failed to run against it.  A sample from an earlier undercloud deploy is in the next file.

Comment 6 James E. LaBarre 2021-06-17 18:11:15 UTC
Created attachment 1791883 [details]
"baremetal node show" for x86_64 UEFI node (earlier run w/different parameters)

In an earlier test, I had set "ipxe_enabled = True".  Under this setting the x86_64 UEFI node was able to run introspection, but the Power nodes needed manual intervention and could not install the overcloud.  This is the x86_64 output from "openstack baremetal node show", which shows what should be discovered in introspection (but it is from an undercloud that doesn't work with Power systems).

Supplied for comparison purposes, if it would be useful.

Comment 18 errata-xmlrpc 2022-03-23 22:10:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.2), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1001

