Bug 2114968

Summary: 4.12-nightly payloads blocked by metal jobs failing with "Still creating ..." when creating nodes
Product: OpenShift Container Platform
Reporter: Dennis Periquet <dperique>
Component: Bare Metal Hardware Provisioning
Assignee: Riccardo Pittau <rpittau>
Bare Metal Hardware Provisioning sub component: OS Image Provider
QA Contact: Wenxin Wei <wwei>
Status: CLOSED ERRATA
Severity: high
Priority: high
CC: rpittau, zbitter
Version: 4.12
Keywords: Triaged
Target Release: 4.12.0
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2023-01-17 19:54:14 UTC
Type: Bug
Bug Blocks: 2114976

Description Dennis Periquet 2022-08-03 15:30:11 UTC
Description of problem:

In 4.12-nightly payloads starting with 4.12.0-0.nightly-2022-08-02-195033
we get these two jobs failing:

metal-ipi-ovn-ipv6
metal-ipi-sdn

with:

: Run multi-stage test e2e-metal-ipi-ovn-ipv6 - e2e-metal-ipi-ovn-ipv6-baremetalds-devscripts-setup container test
              1h45m15s
                {  l creating... [1h32m21s elapsed]
level=debug msg=ironic_node_v1.openshift-master-host[0]: Still creating... [1h32m31s elapsed]
level=debug msg=ironic_node_v1.openshift-master-host[2]: Still creating... [1h32m31s elapsed]
level=debug msg=ironic_node_v1.openshift-master-host[1]: Still creating... [1h32m31s elapsed]
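
When triaging a job log like the one above, it can help to pull out the longest "Still creating..." time reported for each resource. The following is a minimal Python sketch, not part of the installer or the CI tooling. The function names and the 30-minute threshold are assumptions chosen for illustration.

```python
import re
from datetime import timedelta

# Matches installer debug lines of the form shown in this report, e.g.
#   level=debug msg=ironic_node_v1.openshift-master-host[0]: Still creating... [1h32m31s elapsed]
STILL_CREATING = re.compile(
    r"msg=(?P<resource>\S+): Still creating\.\.\. \[(?P<elapsed>[0-9hms]+) elapsed\]"
)
# Durations appear as optional hours, minutes and seconds, e.g. "1h32m31s" or "13m0s".
DURATION = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$")


def parse_duration(text):
    """Convert a duration such as '1h32m31s' into a timedelta."""
    match = DURATION.match(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def stalled_resources(lines, threshold=timedelta(minutes=30)):
    """Return {resource: longest elapsed time} for resources past the threshold.

    The 30-minute default is a hypothetical value, not a documented limit.
    """
    longest = {}
    for line in lines:
        found = STILL_CREATING.search(line)
        if found is None:
            continue  # not a recognisable "Still creating..." line
        elapsed = parse_duration(found.group("elapsed"))
        resource = found.group("resource")
        if elapsed > threshold and elapsed > longest.get(resource, timedelta(0)):
            longest[resource] = elapsed
    return longest
```

Fed the three complete lines above, this sketch reports all three openshift-master-host resources at 1h32m31s. The truncated first line of the excerpt is skipped because it no longer contains a resource name.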



Version-Release number of selected component (if applicable): 4.12


How reproducible: reproducible


Steps to Reproduce:
1. run 4.12-nightly payload jobs

Actual results:
see the above "Still creating ..." messages

Expected results:
Bare metal nodes are created in a timely manner


Additional info:

also see https://issues.redhat.com/browse/TRT-444
and possible fix https://github.com/openshift/ironic-agent-image/pull/59

Comment 1 Riccardo Pittau 2022-08-03 16:36:13 UTC
Since the pin to the 4.11 ironic images was removed in https://issues.redhat.com/browse/TRT-444, all the images are now using RHEL 9, even the untested ones.
From the logs we can see that the inspector does not have any info on the BMC address of the nodes.
During investigation, Derek Higgins found that the ip command was missing in the new image, since iproute is not installed by default in RHEL 9 base images.

A revert of https://issues.redhat.com/browse/TRT-444 has been created as a workaround: https://github.com/openshift/ocp-build-data/pull/1820

The actual fix is https://github.com/openshift/ironic-agent-image/pull/59
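
A missing binary like this can be caught early by checking for required commands when the image is built or started. The sketch below is an illustration only and is not taken from the linked pull request. The function name and the choice of commands to check are assumptions.

```python
import shutil


def missing_commands(required, path=None):
    """Return the commands from `required` that cannot be found on the search path.

    `path` uses the same format as the PATH environment variable; None means
    "use the current PATH".
    """
    return [name for name in required if shutil.which(name, path=path) is None]
```

Calling missing_commands(["ip"]) inside an image returns ["ip"] when the command is absent and an empty list when it is present, so a build step could fail fast on a non-empty result.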

Comment 3 Pedro Amoedo 2022-08-05 11:40:04 UTC
*** Bug 2114976 has been marked as a duplicate of this bug. ***

Comment 4 Wenxin Wei 2022-08-07 15:54:09 UTC
Verified on 4.12.0-0.nightly-2022-08-07-091910.

ironic_node creation now completes on time.

Output below:

08-07 22:51:16.530  level=debug msg=ironic_node_v1.openshift-master-host[1]: Creation complete after 12m52s [id=4a2d2f41-9f5d-4870-8a99-e576cf4fdf9a]
08-07 22:51:26.514  level=debug msg=ironic_node_v1.openshift-master-host[2]: Still creating... [13m0s elapsed]
08-07 22:51:26.514  level=debug msg=ironic_node_v1.openshift-master-host[0]: Still creating... [13m0s elapsed]
08-07 22:51:36.493  level=debug msg=ironic_node_v1.openshift-master-host[2]: Still creating... [13m10s elapsed]
08-07 22:51:36.493  level=debug msg=ironic_node_v1.openshift-master-host[0]: Still creating... [13m10s elapsed]
08-07 22:51:46.428  level=debug msg=ironic_node_v1.openshift-master-host[2]: Still creating... [13m20s elapsed]
08-07 22:51:46.428  level=debug msg=ironic_node_v1.openshift-master-host[0]: Still creating... [13m20s elapsed]
08-07 22:51:56.408  level=debug msg=ironic_node_v1.openshift-master-host[2]: Still creating... [13m30s elapsed]
08-07 22:51:56.408  level=debug msg=ironic_node_v1.openshift-master-host[0]: Still creating... [13m30s elapsed]
08-07 22:52:06.366  level=debug msg=ironic_node_v1.openshift-master-host[2]: Still creating... [13m40s elapsed]
08-07 22:52:06.366  level=debug msg=ironic_node_v1.openshift-master-host[0]: Still creating... [13m40s elapsed]
08-07 22:52:18.169  level=debug msg=ironic_node_v1.openshift-master-host[2]: Still creating... [13m50s elapsed]
08-07 22:52:18.169  level=debug msg=ironic_node_v1.openshift-master-host[0]: Still creating... [13m50s elapsed]
08-07 22:52:18.169  level=debug msg=ironic_node_v1.openshift-master-host[2]: Creation complete after 13m51s [id=9acba7e3-c1ba-4089-bf87-126e9a6ddd9a]
08-07 22:52:18.170  level=debug msg=ironic_node_v1.openshift-master-host[0]: Creation complete after 13m51s [id=0c0f7621-bd73-48b6-b01f-e931c881fadd]

Comment 7 errata-xmlrpc 2023-01-17 19:54:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform
4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399