Description of problem: Initial node introspection fails while setting nodes to available with: Error contacting Ironic server: Node 2eb0d12c-c49a-41df-aeb8-f54c03b67d55 is locked by host director-r13.rcbd.lab, please retry after the current operation is completed. Note that running introspection again seems to succeed. Version-Release number of selected component (if applicable): OSP-D 8 Beta How reproducible: Run "openstack baremetal introspection bulk start" to launch node introspection. Steps to Reproduce: 1. Run "openstack baremetal introspection bulk start" 2. Node introspection errors out Actual results: Node introspection fails. Expected results: Node introspection should successfully set all nodes to available. Additional info: The full error follows: [stack@director-r13 ~]$ openstack baremetal introspection bulk start Setting available nodes to manageable... Starting introspection of node: a34bc8c3-0968-4e60-bed0-da39ada8e7f0 Starting introspection of node: fc7e23e4-a104-4882-b3af-98b326b92a80 Starting introspection of node: 4d6a461f-a7de-439e-ab47-c46abdc158a7 Starting introspection of node: 08523959-cb92-45aa-b32f-e5e8b854ea0a Starting introspection of node: 93ae4d69-5f31-484a-84e5-1b5dbf99aced Starting introspection of node: 0a6b3382-2c67-4792-900a-05624c7251af Starting introspection of node: 5ddc6c88-bd93-4fcd-8671-cf2a9b43ca53 Starting introspection of node: 2eb0d12c-c49a-41df-aeb8-f54c03b67d55 Starting introspection of node: 80d914b9-0107-42d0-987a-83da84356982 Waiting for introspection to finish... Introspection for UUID a34bc8c3-0968-4e60-bed0-da39ada8e7f0 finished successfully. Introspection for UUID 4d6a461f-a7de-439e-ab47-c46abdc158a7 finished successfully. Introspection for UUID fc7e23e4-a104-4882-b3af-98b326b92a80 finished successfully. Introspection for UUID 08523959-cb92-45aa-b32f-e5e8b854ea0a finished successfully. Introspection for UUID 0a6b3382-2c67-4792-900a-05624c7251af finished successfully. Introspection for UUID 5ddc6c88-bd93-4fcd-8671-cf2a9b43ca53 finished successfully. Introspection for UUID 80d914b9-0107-42d0-987a-83da84356982 finished successfully. Introspection for UUID 93ae4d69-5f31-484a-84e5-1b5dbf99aced finished successfully. Introspection for UUID 2eb0d12c-c49a-41df-aeb8-f54c03b67d55 finished successfully. Setting manageable nodes to available... Node a34bc8c3-0968-4e60-bed0-da39ada8e7f0 has been set to available. Node fc7e23e4-a104-4882-b3af-98b326b92a80 has been set to available. Node 4d6a461f-a7de-439e-ab47-c46abdc158a7 has been set to available. Node 08523959-cb92-45aa-b32f-e5e8b854ea0a has been set to available. Node 93ae4d69-5f31-484a-84e5-1b5dbf99aced has been set to available. Node 0a6b3382-2c67-4792-900a-05624c7251af has been set to available. Node 5ddc6c88-bd93-4fcd-8671-cf2a9b43ca53 has been set to available. Request returned failure status. Error contacting Ironic server: Node 2eb0d12c-c49a-41df-aeb8-f54c03b67d55 is locked by host director-r13.rcbd.lab, please retry after the current operation is completed. Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142, in inner return func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 1149, in do_provisioning_action % action) as task: File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 152, in acquire driver_name=driver_name, purpose=purpose) File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 221, in __init__ self.release_resources() File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 195, in __exit__ six.reraise(self.type_, self.value, self.tb) File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 203, in __init__ self._lock() File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 242, in _lock reserve_node() File "/usr/lib/python2.7/site-packages/retrying.py", line 68, in wrapped_f return Retrying(*dargs, **dkw).call(f, *args, **kw) File "/usr/lib/python2.7/site-packages/retrying.py", line 229, in call raise attempt.get() File "/usr/lib/python2.7/site-packages/retrying.py", line 261, in get six.reraise(self.value[0], self.value[1], self.value[2]) File "/usr/lib/python2.7/site-packages/retrying.py", line 217, in call attempt = Attempt(fn(*args, **kwargs), attempt_number, False) File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 235, in reserve_node self.node_id) File "/usr/lib/python2.7/site-packages/ironic/objects/node.py", line 228, in reserve db_node = cls.dbapi.reserve_node(tag, node_id) File "/usr/lib/python2.7/site-packages/ironic/db/sqlalchemy/api.py", line 226, in reserve_node host=node['reservation']) NodeLocked: Node 2eb0d12c-c49a-41df-aeb8-f54c03b67d55 is locked by host director-r13.rcbd.lab, please retry after the current operation is completed. (HTTP 409). Attempt 1 of 6 Node 2eb0d12c-c49a-41df-aeb8-f54c03b67d55 has been set to available. Node 80d914b9-0107-42d0-987a-83da84356982 has been set to available. Introspection completed. [stack@director-r13 ~]$
I slicked and reloaded my director node, and now node introspection is failing with the same error every time even after multiple runs.
We've done a couple deployments with OSP 8 Beta 3, and have not seen this issue. Evidently a change in beta 3 resolved the problem. Closing.
Just did a beta 3 deploy, and this happened again. Reopening.
Update: Node introspection appears to succeed, but this error or warning is shown.
Lucas, any thoughts?
Chris, can we get any logs from Ironic? Lucas, can you see anything?
I just completed introspection with OSP8 beta 4, and this is still an issue, though it does appear to be more of a warning than anything else. Looks pretty bad though: [stack@director-r13 ~]$ openstack baremetal introspection bulk start Setting available nodes to manageable... Starting introspection of node: ed756c4e-5a8b-4d60-ab36-9599131da568 Starting introspection of node: 54653cb5-89ea-48a3-96a3-71a9a07ba90b Starting introspection of node: ad38de44-6d87-4d2b-b3ee-05831e7aa29f Starting introspection of node: 4b5eedfa-2319-4656-94cf-44b4ab51f63a Starting introspection of node: 3645cd3c-c77a-4dc9-9d6c-39607349a859 Starting introspection of node: 4ac88109-0ef3-48a4-abe3-79f380d7c84b Starting introspection of node: e56cdb20-183a-414c-8ca5-47703d37ed41 Starting introspection of node: 1cfc22bc-9194-4c08-a9b7-bcc3bbd25ae5 Starting introspection of node: b0a32584-4614-4a6f-b649-173d63397a14 Waiting for introspection to finish... Introspection for UUID 54653cb5-89ea-48a3-96a3-71a9a07ba90b finished successfull y. Introspection for UUID ed756c4e-5a8b-4d60-ab36-9599131da568 finished successfull y. Introspection for UUID ad38de44-6d87-4d2b-b3ee-05831e7aa29f finished successfull y. Introspection for UUID 3645cd3c-c77a-4dc9-9d6c-39607349a859 finished successfull y. Introspection for UUID 4b5eedfa-2319-4656-94cf-44b4ab51f63a finished successfull y. Introspection for UUID e56cdb20-183a-414c-8ca5-47703d37ed41 finished successfull y. Introspection for UUID 4ac88109-0ef3-48a4-abe3-79f380d7c84b finished successfull y. Introspection for UUID 1cfc22bc-9194-4c08-a9b7-bcc3bbd25ae5 finished successfull y. Introspection for UUID b0a32584-4614-4a6f-b649-173d63397a14 finished successfull y. Setting manageable nodes to available... Node ed756c4e-5a8b-4d60-ab36-9599131da568 has been set to available. Node 54653cb5-89ea-48a3-96a3-71a9a07ba90b has been set to available. Node ad38de44-6d87-4d2b-b3ee-05831e7aa29f has been set to available. Node 4b5eedfa-2319-4656-94cf-44b4ab51f63a has been set to available. Node 3645cd3c-c77a-4dc9-9d6c-39607349a859 has been set to available. Node 4ac88109-0ef3-48a4-abe3-79f380d7c84b has been set to available. Node e56cdb20-183a-414c-8ca5-47703d37ed41 has been set to available. Node 1cfc22bc-9194-4c08-a9b7-bcc3bbd25ae5 has been set to available. Request returned failure status. Error contacting Ironic server: Node b0a32584-4614-4a6f-b649-173d63397a14 is loc ked by host director-r13.rcbd.lab, please retry after the current operation is c ompleted. Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142 , in inner return func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 1150 , in do_provisioning_action % action) as task: File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 152, in acquire driver_name=driver_name, purpose=purpose) File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 221, in __init__ self.release_resources() File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 195, in _ _exit__ six.reraise(self.type_, self.value, self.tb) File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 203, in __init__ self._lock() File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 242, in _lock reserve_node() File "/usr/lib/python2.7/site-packages/retrying.py", line 68, in wrapped_f return Retrying(*dargs, **dkw).call(f, *args, **kw) File "/usr/lib/python2.7/site-packages/retrying.py", line 229, in call raise attempt.get() File "/usr/lib/python2.7/site-packages/retrying.py", line 261, in get six.reraise(self.value[0], self.value[1], self.value[2]) File "/usr/lib/python2.7/site-packages/retrying.py", line 217, in call attempt = Attempt(fn(*args, **kwargs), attempt_number, False) File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 235, in reserve_node self.node_id) File "/usr/lib/python2.7/site-packages/ironic/objects/node.py", line 228, in r eserve db_node = cls.dbapi.reserve_node(tag, node_id) File "/usr/lib/python2.7/site-packages/ironic/db/sqlalchemy/api.py", line 226, in reserve_node host=node['reservation']) NodeLocked: Node b0a32584-4614-4a6f-b649-173d63397a14 is locked by host director -r13.rcbd.lab, please retry after the current operation is completed. (HTTP 409). Attempt 1 of 6 Request returned failure status. Error contacting Ironic server: Node b0a32584-4614-4a6f-b649-173d63397a14 is loc ked by host director-r13.rcbd.lab, please retry after the current operation is c ompleted. Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142 , in inner return func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 1150 , in do_provisioning_action % action) as task: File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 152, in acquire driver_name=driver_name, purpose=purpose) File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 221, in __init__ self.release_resources() File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 195, in _ _exit__ six.reraise(self.type_, self.value, self.tb) File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 203, in __init__ self._lock() File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 242, in _lock reserve_node() File "/usr/lib/python2.7/site-packages/retrying.py", line 68, in wrapped_f return Retrying(*dargs, **dkw).call(f, *args, **kw) File "/usr/lib/python2.7/site-packages/retrying.py", line 229, in call raise attempt.get() File "/usr/lib/python2.7/site-packages/retrying.py", line 261, in get six.reraise(self.value[0], self.value[1], self.value[2]) File "/usr/lib/python2.7/site-packages/retrying.py", line 217, in call attempt = Attempt(fn(*args, **kwargs), attempt_number, False) File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 235, in reserve_node self.node_id) File "/usr/lib/python2.7/site-packages/ironic/objects/node.py", line 228, in r eserve db_node = cls.dbapi.reserve_node(tag, node_id) File "/usr/lib/python2.7/site-packages/ironic/db/sqlalchemy/api.py", line 226, in reserve_node host=node['reservation']) NodeLocked: Node b0a32584-4614-4a6f-b649-173d63397a14 is locked by host director -r13.rcbd.lab, please retry after the current operation is completed. (HTTP 409). Attempt 2 of 6 Node b0a32584-4614-4a6f-b649-173d63397a14 has been set to available. Introspection completed. [stack@director-r13 ~]$
Created attachment 1119266 [details] ironic logs from the director node Added all ironic logs from the director node.
Lucas, suspicion is that this is a non-fatal warning. is this true?
Hi @Mike, @Chris, The NodeLocked means that when the operation was tried a conductor was doing something in the node (sync'ing power state, transitioning the power state from A to B, deploying etc...). This error is retry-able as we can see in the output there (and it's being automatically retried) [0]. I think the problem here is that the CLI command is leaking the exception. Ironic will print out exceptions in case it's in debug mode. @Chris can you please check the /etc/ironic/ironic.conf configuration file and see if Ironic is running in debug mode? e.g [DEFAULT] debug=True [0] NodeLocked: Node 2eb0d12c-c49a-41df-aeb8-f54c03b67d55 is locked by host director-r13.rcbd.lab, please retry after the current operation is completed. (HTTP 409). Attempt 1 of 6
Just checked, and in /etc/ironic/ironic.conf: # If set to false, will disable INFO logging level, making # WARNING the default. (boolean value) #verbose=true verbose=False
Also from ironic.conf: # Print debugging output (set logging level to DEBUG instead # of default INFO level). (boolean value) #debug=false debug=True
@Chris, Thanks, yeah setting debug=False will stop leaking those tracebacks on the API side. It was considered a "feature" in Ironic until recently. Theorically we should not run this in debug mode in production (debug messages may contain secrets and these kind of information), but in reality people do so we introduced a new configuration option to turn on/off those exceptions in the API side even if it's in debug mode, the patch is this one https://review.openstack.org/#/c/256615/. If you think it's important to run things in debug mode but do not want the traceback I can check about back porting it into OSP.
*** Bug 1356292 has been marked as a duplicate of this bug. ***
We've rewritten this logic completely in OSP10, and now it should not show ugly warnings to users.