Bug 1435591 - Node registration command hangs on incorrect IPMI credentials
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 10.0 (Newton)
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: beta
Target Release: 13.0 (Queens)
Assignee: Dmitry Tantsur
QA Contact: Alexander Chuzhoy
URL:
Whiteboard:
Duplicates: 1440959
Depends On:
Blocks:
Reported: 2017-03-24 10:18 UTC by Joe Talerico
Modified: 2018-06-27 13:31 UTC (History)
CC List: 12 users

Fixed In Version: openstack-tripleo-common-8.6.1-4.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 13:29:27 UTC
Target Upstream Version:




Links
System | ID | Priority | Status | Summary | Last Updated
Launchpad | 1667776 | None | None | None | 2017-03-24 10:18:31 UTC
OpenStack gerrit | 559314 | None | MERGED | Fix error handling in set_provision_state/set_power_state workflows | 2020-09-17 09:01:43 UTC
Red Hat Product Errata | RHEA-2018:2086 | None | None | None | 2018-06-27 13:31:00 UTC

Description Joe Talerico 2017-03-24 10:18:32 UTC
Description of problem:
The underlying issue was that ironic uses the ADMINISTRATOR privilege level by default, while in this environment we had an OPERATOR account. However, the ironic baremetal import shouldn't just hang forever. There should be a timeout, or, if we fail to get the power status, we should break out with an error to the user.

Version-Release number of selected component (if applicable):
OSP10

How reproducible:
100%

Steps to Reproduce:
1. Import an instackenv.json with the wrong account type

Actual results:
Import hangs

Expected results:
Import exits with status 1 and an error message.
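
As an illustration of the expected behavior only, here is a minimal Python sketch of a bounded wait that turns a missing reply into exit status 1. It is not the tripleo-common or tripleoclient code: `wait_for_result`, `run_import`, the in-process `queue.Queue`, the 600-second default, and the "SUCCESS" status value are all assumptions made for this example.

```python
import queue
import sys


def wait_for_result(messages, timeout_s):
    """Wait for one workflow message, giving up after timeout_s seconds."""
    try:
        return messages.get(timeout=timeout_s)
    except queue.Empty:
        # Synthesize a failure so the caller never waits forever.
        return {"status": "FAILED",
                "message": "no response within %s seconds" % timeout_s}


def run_import(messages, timeout_s=600.0):
    """Return a process exit code: 0 on success, 1 on failure or timeout."""
    result = wait_for_result(messages, timeout_s)
    # "SUCCESS" is an assumed status value; only "FAILED" appears in this bug.
    if result.get("status") != "SUCCESS":
        print("Import failed: %s" % result.get("message"), file=sys.stderr)
        return 1
    return 0
```

A caller would pass the return value of `run_import` to `sys.exit`, so that scripts wrapping the import can detect the failure.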

Additional info:

Comment 1 Dmitry Tantsur 2017-04-03 15:21:38 UTC
Could you please fetch ironic and mistral logs?

Comment 2 Joe Talerico 2017-04-03 16:02:52 UTC
The CNCF lab is gone but this should be easily reproduced.

Comment 3 Dmitry Tantsur 2017-04-25 10:34:48 UTC
*** Bug 1440959 has been marked as a duplicate of this bug. ***

Comment 4 Dmitry Tantsur 2018-03-13 14:47:19 UTC
I can confirm this: providing a wrong password makes 'openstack overcloud node import' hang. Ironic itself correctly puts the node back to enroll, so we should probably stop retrying in this case.
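
A rough Python sketch of the "stop retrying" idea follows. It is not the actual Mistral workflow change: `wait_for_state`, `NodeFailed`, the `get_node` callable, and the timeout and interval defaults are hypothetical. It only assumes that a node is a dict carrying ironic's `provision_state` and `last_error` fields.

```python
import time


class NodeFailed(Exception):
    """The node fell back to a failure state and reported an error."""


def wait_for_state(get_node, target, failed_states=("enroll",),
                   timeout_s=60.0, interval_s=2.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll get_node() until the node reaches target, fails, or times out."""
    deadline = clock() + timeout_s
    while True:
        node = get_node()
        state = node["provision_state"]
        if state == target:
            return node
        # A node back in "enroll" with last_error set will not recover on
        # its own, so stop polling instead of retrying forever.
        if state in failed_states and node.get("last_error"):
            raise NodeFailed('state is "%s", error: %s'
                             % (state, node["last_error"]))
        if clock() >= deadline:
            raise TimeoutError('node did not reach state "%s" in %s seconds'
                               % (target, timeout_s))
        sleep(interval_s)
```

Checking `last_error` as well as the state matters here because a freshly enrolled node is also in enroll before the transition starts; the state alone does not distinguish "not started" from "failed".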

Comment 5 Dmitry Tantsur 2018-04-06 11:25:15 UTC
OSP 13 backport is https://review.openstack.org/#/c/559314/

Comment 6 Bob Fournier 2018-04-16 13:03:05 UTC
Moving to OSP-13.

Comment 11 Alexander Chuzhoy 2018-05-04 20:42:24 UTC
Verified:

Environment:
openstack-tripleo-common-8.6.1-6.el7ost.noarch

Set a wrong password for one node and ran the import. It did not get stuck and completed within 1 minute with the error shown below:

(undercloud) [stack@undercloud-0 ~]$ openstack overcloud node import instackenv.json 
Started Mistral Workflow tripleo.baremetal.v1.register_or_update. Execution ID: 9757f55d-c829-4b92-8c10-f8283be1f414
Waiting for messages on queue 'tripleo' with no timeout.

[{u'result': u'Node 16c3e4ab-fa68-4306-b1c9-ba71b7ca338f did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 16c3e4ab-fa68-4306-b1c9-ba71b7ca338f. Error: IPMI call failed: power status.'}, {}, {}, {}, {}, {}, {}, {}, {}, {}]
{u'status': u'FAILED', u'message': [{u'result': u'Node 16c3e4ab-fa68-4306-b1c9-ba71b7ca338f did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 16c3e4ab-fa68-4306-b1c9-ba71b7ca338f. Error: IPMI call failed: power status.'}, {}, {}, {}, {}, {}, {}, {}, {}, {}], u'result': None}
Exception registering nodes: {u'status': u'FAILED', u'message': [{u'result': u'Node 16c3e4ab-fa68-4306-b1c9-ba71b7ca338f did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 16c3e4ab-fa68-4306-b1c9-ba71b7ca338f. Error: IPMI call failed: power status.'}, {}, {}, {}, {}, {}, {}, {}, {}, {}], u'result': None}
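
For readers who want to pull the failing node out of a payload shaped like the one above, here is a small Python sketch. `failed_results` is a hypothetical helper, not part of tripleoclient, and it assumes (from this output alone) that the 'message' list holds one dict per node and that an empty dict means no error was reported for that node.

```python
def failed_results(payload):
    """Return the non-empty 'result' strings from a workflow payload."""
    entries = payload.get("message") or []
    if not isinstance(entries, list):
        # Assumption: a non-list message is a plain error string.
        return [str(entries)]
    return [entry["result"] for entry in entries
            if isinstance(entry, dict) and entry.get("result")]
```

With the payload from this run, the helper would return a single string, the error for node 16c3e4ab-fa68-4306-b1c9-ba71b7ca338f.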




All nodes except the one with the wrong password are manageable (as expected):
(undercloud) [stack@undercloud-0 ~]$ openstack baremetal node list
+--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name         | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
| 16c3e4ab-fa68-4306-b1c9-ba71b7ca338f | ceph-0       | None          | None        | enroll             | False       |
| 08606c14-eac6-4e29-a9e0-e8c3c4925679 | ceph-1       | None          | power on    | manageable         | False       |
| 6b5514db-ef09-4366-9131-89587749b4a2 | ceph-2       | None          | power on    | manageable         | False       |
| 1836f58e-4c44-4e43-9e5c-17a8adc02c58 | compute-0    | None          | power on    | manageable         | False       |
| 4a5ba3d1-241d-40a4-ae13-d148e924dc2a | compute-1    | None          | power on    | manageable         | False       |
| 9ea43886-fac5-4054-9169-75b41f45064a | controller-0 | None          | power on    | manageable         | False       |
| 9bc0eed1-6e78-4b27-a05a-386549a6efb6 | controller-1 | None          | power on    | manageable         | False       |
| 97f01df8-ff61-4bac-902d-dc0dd89451a7 | controller-2 | None          | power on    | manageable         | False       |
| 90bdc756-ab3e-48b2-b791-fce071de0033 | ironic-0     | None          | power off   | manageable         | False       |
| e4748770-7e91-40f5-b4cf-e1b067a84b4f | ironic-1     | None          | power off   | manageable         | False       |
+--------------------------------------+--------------+---------------+-------------+--------------------+-------------+

Comment 13 errata-xmlrpc 2018-06-27 13:29:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

