Bug 1383627

Summary: `openstack baremetal import` fails with node locked error on "manage" transition
Product: Red Hat OpenStack Reporter: Marius Cornea <mcornea>
Component: openstack-tripleo-commonAssignee: Dmitry Tantsur <dtantsur>
Status: CLOSED ERRATA QA Contact: Omri Hochman <ohochman>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 10.0 (Newton)CC: athomas, dbecker, dhill, dsneddon, dtantsur, jcoufal, jschluet, mburns, mcornea, mlopes, morazi, racedoro, rbartal, rhel-osp-director-maint, slinaber
Target Milestone: rcKeywords: AutomationBlocker, Triaged
Target Release: 10.0 (Newton)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-tripleo-common-5.3.0-5.el7ost Doc Type: Known Issue
Doc Text:
Nodes that are imported using "openstack baremetal import --json instackenv.json" should be powered off prior to attempting import. If the nodes are powered on, Ironic will not attempt to add the nodes or attempt introspection. As a workaround, power off all overcloud nodes prior to running "openstack baremetal import --json instackenv.json". As a result, if the nodes are powered off, the import should work successfully.
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-12-14 16:15:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
ironic-conductor.log
none
ironic-api.log none

Description Marius Cornea 2016-10-11 09:45:40 UTC
Description of problem:
openstack baremetal import --json instackenv.json fails when the imported nodes are powered on. The instackenv.json contains 15 nodes(VMs).

This is blocking automation for the composable roles testing.

Version-Release number of selected component (if applicable):
python-tripleoclient-5.2.1-0.1.34590ccgit.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Build instackenv.json with 15 nodes which are powered on
2. openstack baremetal import --json instackenv.json

Actual results:
Import fails:

stdout: Started Mistral Workflow. Execution ID: 8a1a9930-1b53-4ce2-9a2f-c0c29eae406f

stderr: Exception registering nodes: [{u'__task_execution': {u'id': u'88fd8862-ab81-4291-b20e-aa82957508bf', u'name': u'wait_for_provision_state'}}, {u'__task_execution': {u'id': u'684d1464-7fa0-4c3b-aeac-dce73b7fd07f', u'name': u'wait_for_provision_state'}}, {u'__task_execution': {u'id': u'bc2be6f5-b202-467d-85d7-7937255335d9', u'name': u'wait_for_provision_state'}}, {u'__task_execution': {u'id': u'c5e84d18-eab2-4495-ba28-59f9c22c5ba4', u'name': u'wait_for_provision_state'}}, {u'__task_execution': {u'id': u'8f53465e-2d71-4eb4-bc7c-5fb625c11e3c', u'name': u'wait_for_provision_state'}}, {u'result': u"Failure caused by error in tasks: set_provision_state

  set_provision_state [task_ex_id=bfc88c27-106e-46e7-a170-9f301203e35e] -> Failed to run action [action_ex_id=146b08b5-da79-4b57-9336-0a4c464cbf15, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'e392ab23-772a-4d53-ab17-d7bb85a04de4', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node e392ab23-772a-4d53-ab17-d7bb85a04de4 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
    [action_ex_id=146b08b5-da79-4b57-9336-0a4c464cbf15, idx=0]: Failed to run action [action_ex_id=146b08b5-da79-4b57-9336-0a4c464cbf15, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'e392ab23-772a-4d53-ab17-d7bb85a04de4', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node e392ab23-772a-4d53-ab17-d7bb85a04de4 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
"}, {u'result': u"Failure caused by error in tasks: set_provision_state

  set_provision_state [task_ex_id=a02fa3fe-0bb8-4607-8acd-e01a9a4dc4e4] -> Failed to run action [action_ex_id=d6a28a83-4c19-45b0-b152-907b03d45947, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'abf3f6b3-4db0-49e8-abcd-f8100620b92e', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node abf3f6b3-4db0-49e8-abcd-f8100620b92e is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
    [action_ex_id=d6a28a83-4c19-45b0-b152-907b03d45947, idx=0]: Failed to run action [action_ex_id=d6a28a83-4c19-45b0-b152-907b03d45947, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'abf3f6b3-4db0-49e8-abcd-f8100620b92e', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node abf3f6b3-4db0-49e8-abcd-f8100620b92e is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
"}, {u'result': u"Failure caused by error in tasks: set_provision_state

  set_provision_state [task_ex_id=81f9f2b4-91f6-49da-b825-cb26a13b81b6] -> Failed to run action [action_ex_id=196b6c12-cc16-4f6e-9d95-8c5e00476427, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'11921dbe-267d-4810-b603-91eed0c5ecba', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 11921dbe-267d-4810-b603-91eed0c5ecba is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
    [action_ex_id=196b6c12-cc16-4f6e-9d95-8c5e00476427, idx=0]: Failed to run action [action_ex_id=196b6c12-cc16-4f6e-9d95-8c5e00476427, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'11921dbe-267d-4810-b603-91eed0c5ecba', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 11921dbe-267d-4810-b603-91eed0c5ecba is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
"}, {u'result': u"Failure caused by error in tasks: set_provision_state

  set_provision_state [task_ex_id=07d2afef-3aea-4476-90df-e70ffcf1e466] -> Failed to run action [action_ex_id=2720cd16-8f77-4e6d-a465-7af662e75f0f, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'6f84e1ea-12f2-407d-9369-922687fd9b6a', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 6f84e1ea-12f2-407d-9369-922687fd9b6a is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
    [action_ex_id=2720cd16-8f77-4e6d-a465-7af662e75f0f, idx=0]: Failed to run action [action_ex_id=2720cd16-8f77-4e6d-a465-7af662e75f0f, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'6f84e1ea-12f2-407d-9369-922687fd9b6a', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 6f84e1ea-12f2-407d-9369-922687fd9b6a is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
"}, {u'result': u"Failure caused by error in tasks: set_provision_state

  set_provision_state [task_ex_id=331cdb96-c96c-4daa-b110-7dcd0b4a1953] -> Failed to run action [action_ex_id=c9b61246-fa5e-4b4a-b96d-53dcfccb8115, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'df75748c-9036-4ba2-80cb-1db0fb8a5484', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node df75748c-9036-4ba2-80cb-1db0fb8a5484 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
    [action_ex_id=c9b61246-fa5e-4b4a-b96d-53dcfccb8115, idx=0]: Failed to run action [action_ex_id=c9b61246-fa5e-4b4a-b96d-53dcfccb8115, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'df75748c-9036-4ba2-80cb-1db0fb8a5484', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node df75748c-9036-4ba2-80cb-1db0fb8a5484 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
"}, {u'result': u"Failure caused by error in tasks: set_provision_state

  set_provision_state [task_ex_id=12794227-534d-4d66-8177-0cdfb9aed4f4] -> Failed to run action [action_ex_id=173db264-22c6-4df3-ab4a-1e9df30637b9, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'07c42c39-23b1-4cee-8ec4-ca3a3f131012', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 07c42c39-23b1-4cee-8ec4-ca3a3f131012 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
    [action_ex_id=173db264-22c6-4df3-ab4a-1e9df30637b9, idx=0]: Failed to run action [action_ex_id=173db264-22c6-4df3-ab4a-1e9df30637b9, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'07c42c39-23b1-4cee-8ec4-ca3a3f131012', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 07c42c39-23b1-4cee-8ec4-ca3a3f131012 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
"}, {u'result': u"Failure caused by error in tasks: set_provision_state

  set_provision_state [task_ex_id=1aa19bce-4666-4c66-b27b-83b88fcd256c] -> Failed to run action [action_ex_id=afeddb1e-87b1-4b27-8a11-c101ab4008ab, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'856071f4-b14c-4edf-9282-9e152148df51', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 856071f4-b14c-4edf-9282-9e152148df51 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
    [action_ex_id=afeddb1e-87b1-4b27-8a11-c101ab4008ab, idx=0]: Failed to run action [action_ex_id=afeddb1e-87b1-4b27-8a11-c101ab4008ab, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'856071f4-b14c-4edf-9282-9e152148df51', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 856071f4-b14c-4edf-9282-9e152148df51 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
"}, {u'result': u"Failure caused by error in tasks: set_provision_state

  set_provision_state [task_ex_id=0661c6bb-fe14-4e8a-a4e0-88091b87af07] -> Failed to run action [action_ex_id=daf1d85a-0285-436b-b7fa-7088abd2b62c, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'33d10b6c-6740-44e7-b6fb-ffe37c1cca36', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 33d10b6c-6740-44e7-b6fb-ffe37c1cca36 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
    [action_ex_id=daf1d85a-0285-436b-b7fa-7088abd2b62c, idx=0]: Failed to run action [action_ex_id=daf1d85a-0285-436b-b7fa-7088abd2b62c, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'33d10b6c-6740-44e7-b6fb-ffe37c1cca36', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 33d10b6c-6740-44e7-b6fb-ffe37c1cca36 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
"}, {u'result': u"Failure caused by error in tasks: set_provision_state

  set_provision_state [task_ex_id=acc18a31-8a33-4c27-9101-49c84a840898] -> Failed to run action [action_ex_id=3e241118-d5a2-497a-9426-131a732d6926, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'9237c20c-e38a-4f74-a944-377f74d166a9', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 9237c20c-e38a-4f74-a944-377f74d166a9 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
    [action_ex_id=3e241118-d5a2-497a-9426-131a732d6926, idx=0]: Failed to run action [action_ex_id=3e241118-d5a2-497a-9426-131a732d6926, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'9237c20c-e38a-4f74-a944-377f74d166a9', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 9237c20c-e38a-4f74-a944-377f74d166a9 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
"}, {u'result': u"Failure caused by error in tasks: set_provision_state

  set_provision_state [task_ex_id=6364c269-13f9-436e-9eaa-96d9d4d88b2e] -> Failed to run action [action_ex_id=3729c810-a7f8-435c-88e9-a9f6131f1af8, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'897738ce-6cab-43eb-a37d-31214f4e8783', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 897738ce-6cab-43eb-a37d-31214f4e8783 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
    [action_ex_id=3729c810-a7f8-435c-88e9-a9f6131f1af8, idx=0]: Failed to run action [action_ex_id=3729c810-a7f8-435c-88e9-a9f6131f1af8, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'manage', u'node_uuid': u'897738ce-6cab-43eb-a37d-31214f4e8783', u'configdrive': None, u'cleansteps': None}']
 IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.Conflict'>: Node 897738ce-6cab-43eb-a37d-31214f4e8783 is locked by host undercloud-0.redhat.local, please retry after the current operation is completed.
"}]

Expected results:
Import finishes ok. 

Additional info:
The hypervisor host has enough resources to accomodate the VMs: 4 physical Xeon E7-8857 v2 with 1TB of RAM. The undercloud has assigned 4 vCPUs with 16G of RAM.

Comment 3 Marius Cornea 2016-10-17 08:24:00 UTC
With the latest build I am not able to workaround this issue by powering off the nodes before importing them.

Comment 4 Dmitry Tantsur 2016-10-17 08:37:44 UTC
Please provide openstack-ironic-conductor logs, so that we know what exactly is locking the nodes.

Comment 5 Marius Cornea 2016-10-17 08:50:41 UTC
Created attachment 1211249 [details]
ironic-conductor.log

Comment 6 Marius Cornea 2016-10-17 08:57:57 UTC
Created attachment 1211253 [details]
ironic-api.log

Adding the ironic-api.log as well.

Comment 7 Ramon Acedo 2016-10-20 11:13:35 UTC
I run into the same issue when all the nodes were powered off and they were registered as 'enroll'. 

As a workaround I set them to provision state 'manage':

for i in `ironic node-list|egrep -v "UU|\+"|awk '{print $2}'`;do ironic node-set-provision-state $i manage;done

Probably playing with the ironic config flags 'node_locked_retry_interval' and 'node_locked_retry_attempts' which default 3 and 1 respectively might help.

Comment 8 Dmitry Tantsur 2016-10-20 11:19:41 UTC
This does not seem to be strictly related to being powered on or off.

Comment 9 Dmitry Tantsur 2016-10-20 11:20:09 UTC
*** Bug 1385928 has been marked as a duplicate of this bug. ***

Comment 10 Dmitry Tantsur 2016-11-01 13:02:33 UTC
I think I know the root cause.

Comment 11 Dmitry Tantsur 2016-11-01 13:28:13 UTC
At least Marius' problem is because of we try powering off the nodes right after registration and before doing "manage". This is 1. wrong (we haven't validated power credentials yet), 2. causes a node to get locked (for a minute in Marius' case). I'm moving the power off to "provide" workflow.

Comment 15 errata-xmlrpc 2016-12-14 16:15:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html