Bug 1263470 - configure boot failed when there is one misbehaving node
Summary: configure boot failed when there is one misbehaving node
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 7.0 (Kilo)
Assignee: chris alfonso
QA Contact: yeylon@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-09-15 21:21 UTC by bigswitch
Modified: 2016-04-18 06:52 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-09-15 21:28:01 UTC
Target Upstream Version:
Embargoed:



Description bigswitch 2015-09-15 21:21:14 UTC
Description of problem:
After importing the node JSON and then running configure boot, if one node is unreachable, configure boot fails for the rest of the nodes as well.

 ironic node-list
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provision State | Maintenance |
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
| f87b7487-0165-4e25-bb00-879c60a38d3e | None | None          | power off   | available       | False       |
| d58185b5-655d-4fa4-b7a6-b43b8adc7e34 | None | None          | power off   | available       | False       |
| db97f50c-3317-411a-a357-d9d8ff819945 | None | None          | power off   | available       | False       |
| 53542138-77bb-4ecf-9e78-b801024e7f44 | None | None          | power off   | available       | False       |
| a43aa6d2-8d5f-4c4b-8814-cfc93f99437d | None | None          | None        | available       | False       |
| 931c801b-235d-4d3c-a894-27d59cda0d9f | None | None          | None        | available       | True        |
| ad79ff94-530b-4a6f-a9b2-3e3b173c7756 | None | None          | None        | available       | True        |
+--------------------------------------+------+---------------+-------------+-----------------+-------------+

The IPMI interface of node a43aa6d2-8d5f-4c4b-8814-cfc93f99437d is unreachable, and because of that none of the subsequent nodes are updated. Even after this node is removed, the other nodes have to be set manually.
Re-running configure boot fails:

 ironic node-list
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provision State | Maintenance |
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
| f87b7487-0165-4e25-bb00-879c60a38d3e | None | None          | power off   | available       | False       |
| d58185b5-655d-4fa4-b7a6-b43b8adc7e34 | None | None          | power off   | available       | False       |
| db97f50c-3317-411a-a357-d9d8ff819945 | None | None          | power off   | available       | False       |
| 53542138-77bb-4ecf-9e78-b801024e7f44 | None | None          | power off   | available       | False       |
| 931c801b-235d-4d3c-a894-27d59cda0d9f | None | None          | None        | available       | True        |
| ad79ff94-530b-4a6f-a9b2-3e3b173c7756 | None | None          | None        | available       | True        |
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
[stack@c5220-01 ~]$ openstack baremetal configure boot
WARNING: rdomanager_oscplugin.v1.baremetal.ConfigureBaremetalBoot Node 931c801b-235d-4d3c-a894-27d59cda0d9f power state is in transition. Waiting up to 120 seconds for it to complete.
ERROR: openstack Timed out waiting for node 931c801b-235d-4d3c-a894-27d59cda0d9f power state.

 ironic node-show 931c801b-235d-4d3c-a894-27d59cda0d9f
+------------------------+-------------------------------------------------------------------------+
| Property               | Value                                                                   |
+------------------------+-------------------------------------------------------------------------+
| target_power_state     | None                                                                    |
| extra                  | {}                                                                      |
| last_error             | During sync_power_state, max retries exceeded for node 931c801b-235d-   |
|                        | 4d3c-a894-27d59cda0d9f, node state None does not match expected state   |
|                        | 'None'. Updating DB state to 'None' Switching node to maintenance mode. |
| updated_at             | 2015-09-15T20:47:57+00:00                                               |
| maintenance_reason     | During sync_power_state, max retries exceeded for node 931c801b-235d-   |
|                        | 4d3c-a894-27d59cda0d9f, node state None does not match expected state   |
|                        | 'None'. Updating DB state to 'None' Switching node to maintenance mode. |
| provision_state        | available                                                               |
| uuid                   | 931c801b-235d-4d3c-a894-27d59cda0d9f                                    |
| console_enabled        | False                                                                   |
| target_provision_state | None                                                                    |
| maintenance            | True                                                                    |
| inspection_started_at  | None                                                                    |
| inspection_finished_at | None                                                                    |
| power_state            | None                                                                    |
| driver                 | pxe_ipmitool                                                            |
| reservation            | None                                                                    |
| properties             | {u'memory_mb': u'8192', u'cpu_arch': u'x86_64', u'local_gb': u'500',    |
|                        | u'cpus': u'2'}                                                          |
| instance_uuid          | None                                                                    |
| name                   | None                                                                    |
| driver_info            | {u'ipmi_address': u'10.8.68.135', u'ipmi_username': u'root',            |
|                        | u'ipmi_password': u'******'}                                            |
| created_at             | 2015-09-15T20:44:22+00:00                                               |
| driver_internal_info   | {}                                                                      |
| chassis_uuid           |                                                                         |
| instance_info          | {}                                                                      |
+------------------------+-------------------------------------------------------------------------+
[stack@c5220-01 ~]$ ironic node-list
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provision State | Maintenance |
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
| f87b7487-0165-4e25-bb00-879c60a38d3e | None | None          | power off   | available       | False       |
| d58185b5-655d-4fa4-b7a6-b43b8adc7e34 | None | None          | power off   | available       | False       |
| db97f50c-3317-411a-a357-d9d8ff819945 | None | None          | power off   | available       | False       |
| 53542138-77bb-4ecf-9e78-b801024e7f44 | None | None          | power off   | available       | False       |
| 931c801b-235d-4d3c-a894-27d59cda0d9f | None | None          | None        | available       | True        |
| ad79ff94-530b-4a6f-a9b2-3e3b173c7756 | None | None          | None        | available       | True        |
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
[stack@c5220-01 ~]$  ipmitool -I lanplus -H 10.8.68.135 -U root -P bsn123 power status
Chassis Power is off
[stack@c5220-01 ~]$


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Import a JSON file with a misbehaving node in the middle of the file.
2. Run configure boot; it fails for all nodes listed after that node.

Actual results:


Expected results:
Configure boot should be able to recover and continue with the remaining nodes.

Additional info:

Comment 3 bigswitch 2015-09-15 21:28:01 UTC
Not a bug: the IPMI password is wrong for the rest of the nodes as well.
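
Since the root cause here was wrong IPMI credentials, it can help to check every node's credentials before running configure boot. The following is a minimal sketch, not part of the director tooling. It assumes the node file follows the instackenv.json layout (a top-level "nodes" list whose entries carry pm_addr, pm_user and pm_password), that ipmitool is installed, and that Python 3.7 or later is available. The function names are hypothetical. It runs the same ipmitool power status command shown in the description against each node and reports which ones fail.

```python
import json
import subprocess


def load_nodes(path):
    """Read the node list from a file assumed to use the instackenv.json layout."""
    with open(path) as fh:
        return json.load(fh)["nodes"]


def ipmi_status_cmd(node):
    """Build the ipmitool power-status command for one node entry."""
    return [
        "ipmitool", "-I", "lanplus",
        "-H", node["pm_addr"],
        "-U", node["pm_user"],
        "-P", node["pm_password"],
        "power", "status",
    ]


def check_nodes(nodes, run=subprocess.run, timeout=15):
    """Return one (pm_addr, ok, detail) tuple per node.

    `run` is injectable so the check can be exercised without real hardware.
    """
    results = []
    for node in nodes:
        try:
            proc = run(
                ipmi_status_cmd(node),
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            ok = proc.returncode == 0
            detail = (proc.stdout or proc.stderr).strip()
        except (subprocess.TimeoutExpired, OSError) as exc:
            # Unreachable BMC or missing ipmitool binary.
            ok, detail = False, str(exc)
        results.append((node["pm_addr"], ok, detail))
    return results


def report(path):
    """Print a per-node summary; every node is checked even if earlier ones fail."""
    for addr, ok, detail in check_nodes(load_nodes(path)):
        print(f"{addr}: {'OK' if ok else 'FAILED'} - {detail}")
```

Calling report("instackenv.json") (the file name is an assumption) prints one line per node, so a bad password or an unreachable BMC shows up for each affected node instead of only the first one.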

