Bug 1957980 - [Webscale] Ironic fails to deploy OpenShift Baremetal IPI 4.7.9 on
Summary: [Webscale] Ironic fails to deploy OpenShift Baremetal IPI 4.7.9 on
Keywords:
Status: CLOSED DUPLICATE of bug 1957976
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Bare Metal Hardware Provisioning
Version: 4.7
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
: ---
Assignee: Tomas Sedovic
QA Contact: Amit Ugol
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-05-06 21:48 UTC by Marc Methot
Modified: 2024-10-01 18:07 UTC (History)
1 user (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-06 22:03:13 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Marc Methot 2021-05-06 21:48:09 UTC
The deployment fails with the following error:
~~~
Error: Agent API for node 32cbf50b-3cfd-49f9-a166-c5776a704529 returned HTTP status code 409 with error: Agent is busy: executing command execute_deploy_step.
~~~

The same deployment (same hosts, same install config) works when deploying 4.6.17

Version-Release number of selected component (if applicable):
4.6.9

How reproducible:
Fails every time, tested with both 4.7.6 and 4.7.9.


Actual results:
Checking on the half deployed host we eventually get to
~~~
May 06 13:48:21 master-0 ironic-python-agent[3208]: 2021-05-06 13:48:15.422 3208 DEBUG ironic_python_agent.ironic_api_client [-] Heartbeat: announcing callback URL https://172.22.0.15:9999, API version is 1.68 heartbeat /usr/lib/python3.6/site-packages/ironic_python_agent/ironic_api_client.py:150
May 06 13:48:21 master-0 ironic-python-agent[3208]: 2021-05-06 13:48:21.039 3208 INFO root [-] Asynchronous command execute_deploy_step completed: {'deploy_result': {'result': 'prepare_image: image (rhcos-47.83.202103251640-0-compressed.x86_64.qcow2) written to device /dev/sda root_uuid=0x00000000'}, 'deploy_step': 
{'interface': 'deploy', 'step': 'write_image', 'args': {'image_info': {'id': 'rhcos-47.83.202103251640-0-compressed.x86_64.qcow2', 'urls': ['http://172.22.0.2:80/images/rhcos-47.83.202103251640-0-openstack.x86_64.qcow2/rhcos-47.83.202103251640-0-compressed.x86_64.qcow2'], 'disk_format': None, 'container_format': Non
e, 'stream_raw_images': True, 'checksum': 'http://172.22.0.2:80/images/rhcos-47.83.202103251640-0-openstack.x86_64.qcow2/rhcos-47.83.202103251640-0-compressed.x86_64.qcow2.md5sum', 'node_uuid': 'a7046d77-3d11-400f-99cb-c1a20cb86678'}, 'configdrive': '<...>'}}}
May 06 13:48:21 master-0 rsyslogd[3212]: message too long (312102) with configured size 8096, begin of message is: 2021-05-06 13:47:18.864 3208 INFO root [-] Deploy step completed: {'interface':  [v8.1911.0-6.el8 try https://www.rsyslog.com/e/2445 ]
May 06 13:48:21 master-0 ironic-python-agent[3208]: 2021-05-06 13:48:21.154 3208 ERROR ironic_python_agent.agent [-] error sending heartbeat to http://172.22.0.2:6385: ironic_python_agent.errors.HeartbeatError: Error heartbeating to agent API: Error 400: Invalid or missing agent token received.
                                                    2021-05-06 13:48:21.154 3208 ERROR ironic_python_agent.agent Traceback (most recent call last):
                                                    2021-05-06 13:48:21.154 3208 ERROR ironic_python_agent.agent   File "/usr/lib/python3.6/site-packages/ironic_python_agent/agent.py", line 109, in do_heartbeat
                                                    2021-05-06 13:48:21.154 3208 ERROR ironic_python_agent.agent     generated_cert=self.agent.generated_cert,
                                                    2021-05-06 13:48:21.154 3208 ERROR ironic_python_agent.agent   File "/usr/lib/python3.6/site-packages/ironic_python_agent/ironic_api_client.py", line 163, in heartbeat
                                                    2021-05-06 13:48:21.154 3208 ERROR ironic_python_agent.agent     raise errors.HeartbeatError(error)
                                                    2021-05-06 13:48:21.154 3208 ERROR ironic_python_agent.agent ironic_python_agent.errors.HeartbeatError: Error heartbeating to agent API: Error 400: Invalid or missing agent token received.
                                                    2021-05-06 13:48:21.154 3208 ERROR ironic_python_agent.agent 
May 06 13:48:21 master-0 ironic-python-agent[3208]: 2021-05-06 13:48:21.237 3208 DEBUG ironic_python_agent.ironic_api_client [-] Heartbeat: announcing callback URL https://172.22.0.15:9999, API version is 1.68 heartbeat /usr/lib/python3.6/site-packages/ironic_python_agent/ironic_api_client.py:150
May 06 13:48:21 master-0 ironic-python-agent[3208]: 2021-05-06 13:48:21.500 3208 INFO ironic_python_agent.agent [-] sleeping before next heartbeat, interval: 156.01437867497847
May 06 13:48:21 master-0 ironic-python-agent[3208]: 2021-05-06 13:48:21.581 3208 ERROR ironic_python_agent.agent [-] error sending heartbeat to http://172.22.0.2:6385: ironic_python_agent.errors.HeartbeatError: Error heartbeating to agent API: Error 400: Invalid or missing agent token received.
                                                    2021-05-06 13:48:21.581 3208 ERROR ironic_python_agent.agent Traceback (most recent call last):
                                                    2021-05-06 13:48:21.581 3208 ERROR ironic_python_agent.agent   File "/usr/lib/python3.6/site-packages/ironic_python_agent/agent.py", line 109, in do_heartbeat
                                                    2021-05-06 13:48:21.581 3208 ERROR ironic_python_agent.agent     generated_cert=self.agent.generated_cert,
                                                    2021-05-06 13:48:21.581 3208 ERROR ironic_python_agent.agent   File "/usr/lib/python3.6/site-packages/ironic_python_agent/ironic_api_client.py", line 163, in heartbeat
                                                    2021-05-06 13:48:21.581 3208 ERROR ironic_python_agent.agent     raise errors.HeartbeatError(error)
                                                    2021-05-06 13:48:21.581 3208 ERROR ironic_python_agent.agent ironic_python_agent.errors.HeartbeatError: Error heartbeating to agent API: Error 400: Invalid or missing agent token received.
                                                    2021-05-06 13:48:21.581 3208 ERROR ironic_python_agent.agent 
May 06 13:48:21 master-0 ironic-python-agent[3208]: 2021-05-06 13:48:21.799 3208 INFO ironic_python_agent.agent [-] sleeping before next heartbeat, interval: 169.92832458453464
~~~

Expected results:
That it works just the same as 4.6 deployment.


Additional info:
bootstrap logs/data (`./openshift-install gather bootstrap --dir=<installation_directory>`) and while host is booted off ramdisk logs all attached to the case.

Comment 1 Marc Methot 2021-05-06 21:50:00 UTC
The hosts PXE boot normally using RedFish, then the cleanup is successful, and then it fails right after writing the image. Complains that the token is then no longer valid.

Comment 2 Bob Fournier 2021-05-06 21:59:54 UTC
we had actually just created a bug for same issue https://bugzilla.redhat.com/show_bug.cgi?id=1957976

Comment 3 Bob Fournier 2021-05-06 22:03:13 UTC
Closing this as a dup as same issue tracked in other one.

*** This bug has been marked as a duplicate of bug 1957976 ***


Note You need to log in before you can comment on or make changes to this bug.