Bug 1287848
Summary:          Node locked error during node introspection
Product:          Red Hat OpenStack
Component:        python-tripleoclient
Version:          8.0 (Liberty)
Target Release:   8.0 (Liberty)
Hardware:         Unspecified
OS:               Unspecified
Status:           CLOSED CURRENTRELEASE
Severity:         medium
Priority:         medium
Reporter:         Chris Dearborn <christopher_dearborn>
Assignee:         RHOS Maint <rhos-maint>
QA Contact:       Arik Chernetsky <achernet>
CC:               akaris, arkady_kanevsky, cdevine, christopher_dearborn, hbrock, John_walsh, jslagle, kbasil, kurt_hey, lmartins, mburns, mgould, mlammon, morazi, rajini.karthik, randy_perryman, rhel-osp-director-maint, rsussman, sclewis, sreichar, srevivo, wayne_allen
Keywords:         Reopened, ZStream
Doc Type:         Bug Fix
Type:             Bug
Bug Blocks:       1261979
Last Closed:      2016-10-03 15:44:28 UTC
Description
Chris Dearborn
2015-12-02 20:17:03 UTC
I slicked and reloaded my director node, and now node introspection is failing with the same error every time, even after multiple runs.

We've done a couple of deployments with OSP 8 Beta 3 and have not seen this issue. Evidently a change in Beta 3 resolved the problem. Closing.

Just did a Beta 3 deploy, and this happened again. Reopening.

Update: Node introspection appears to succeed, but this error or warning is shown. Lucas, any thoughts?

Chris, can we get any logs from Ironic? Lucas, can you see anything?

I just completed introspection with OSP 8 Beta 4, and this is still an issue, though it does appear to be more of a warning than anything else. It looks pretty bad, though:

[stack@director-r13 ~]$ openstack baremetal introspection bulk start
Setting available nodes to manageable...
Starting introspection of node: ed756c4e-5a8b-4d60-ab36-9599131da568
Starting introspection of node: 54653cb5-89ea-48a3-96a3-71a9a07ba90b
Starting introspection of node: ad38de44-6d87-4d2b-b3ee-05831e7aa29f
Starting introspection of node: 4b5eedfa-2319-4656-94cf-44b4ab51f63a
Starting introspection of node: 3645cd3c-c77a-4dc9-9d6c-39607349a859
Starting introspection of node: 4ac88109-0ef3-48a4-abe3-79f380d7c84b
Starting introspection of node: e56cdb20-183a-414c-8ca5-47703d37ed41
Starting introspection of node: 1cfc22bc-9194-4c08-a9b7-bcc3bbd25ae5
Starting introspection of node: b0a32584-4614-4a6f-b649-173d63397a14
Waiting for introspection to finish...
Introspection for UUID 54653cb5-89ea-48a3-96a3-71a9a07ba90b finished successfully.
Introspection for UUID ed756c4e-5a8b-4d60-ab36-9599131da568 finished successfully.
Introspection for UUID ad38de44-6d87-4d2b-b3ee-05831e7aa29f finished successfully.
Introspection for UUID 3645cd3c-c77a-4dc9-9d6c-39607349a859 finished successfully.
Introspection for UUID 4b5eedfa-2319-4656-94cf-44b4ab51f63a finished successfully.
Introspection for UUID e56cdb20-183a-414c-8ca5-47703d37ed41 finished successfully.
Introspection for UUID 4ac88109-0ef3-48a4-abe3-79f380d7c84b finished successfully.
Introspection for UUID 1cfc22bc-9194-4c08-a9b7-bcc3bbd25ae5 finished successfully.
Introspection for UUID b0a32584-4614-4a6f-b649-173d63397a14 finished successfully.
Setting manageable nodes to available...
Node ed756c4e-5a8b-4d60-ab36-9599131da568 has been set to available.
Node 54653cb5-89ea-48a3-96a3-71a9a07ba90b has been set to available.
Node ad38de44-6d87-4d2b-b3ee-05831e7aa29f has been set to available.
Node 4b5eedfa-2319-4656-94cf-44b4ab51f63a has been set to available.
Node 3645cd3c-c77a-4dc9-9d6c-39607349a859 has been set to available.
Node 4ac88109-0ef3-48a4-abe3-79f380d7c84b has been set to available.
Node e56cdb20-183a-414c-8ca5-47703d37ed41 has been set to available.
Node 1cfc22bc-9194-4c08-a9b7-bcc3bbd25ae5 has been set to available.
Request returned failure status.
Error contacting Ironic server: Node b0a32584-4614-4a6f-b649-173d63397a14 is locked by host director-r13.rcbd.lab, please retry after the current operation is completed.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142, in inner
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 1150, in do_provisioning_action
    % action) as task:
  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 152, in acquire
    driver_name=driver_name, purpose=purpose)
  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 221, in __init__
    self.release_resources()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 195, in __exit__
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 203, in __init__
    self._lock()
  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 242, in _lock
    reserve_node()
  File "/usr/lib/python2.7/site-packages/retrying.py", line 68, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)
  File "/usr/lib/python2.7/site-packages/retrying.py", line 229, in call
    raise attempt.get()
  File "/usr/lib/python2.7/site-packages/retrying.py", line 261, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/usr/lib/python2.7/site-packages/retrying.py", line 217, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 235, in reserve_node
    self.node_id)
  File "/usr/lib/python2.7/site-packages/ironic/objects/node.py", line 228, in reserve
    db_node = cls.dbapi.reserve_node(tag, node_id)
  File "/usr/lib/python2.7/site-packages/ironic/db/sqlalchemy/api.py", line 226, in reserve_node
    host=node['reservation'])
NodeLocked: Node b0a32584-4614-4a6f-b649-173d63397a14 is locked by host director-r13.rcbd.lab, please retry after the current operation is completed. (HTTP 409). Attempt 1 of 6
Request returned failure status.
Error contacting Ironic server: Node b0a32584-4614-4a6f-b649-173d63397a14 is locked by host director-r13.rcbd.lab, please retry after the current operation is completed.
[The identical traceback is printed a second time, ending with:]
NodeLocked: Node b0a32584-4614-4a6f-b649-173d63397a14 is locked by host director-r13.rcbd.lab, please retry after the current operation is completed. (HTTP 409). Attempt 2 of 6
Node b0a32584-4614-4a6f-b649-173d63397a14 has been set to available.
Introspection completed.
[stack@director-r13 ~]$

Created attachment 1119266 [details]
ironic logs from the director node
Added all ironic logs from the director node.
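The "Attempt N of 6" lines in the traceback come from Ironic retrying the node reservation via the retrying library until the conductor's lock is released. As a rough, self-contained sketch of that retry pattern (every name here is a hypothetical stand-in, not Ironic's actual code):

```python
import time


class NodeLocked(Exception):
    """Stand-in for Ironic's NodeLocked exception (hypothetical)."""


def reserve_with_retry(reserve, node_id, attempts=6, delay=0.0):
    """Call reserve(node_id), retrying while another conductor holds the lock.

    Mirrors the "Attempt N of 6" behaviour seen in the output above: if the
    node is still locked after the final attempt, NodeLocked surfaces to the
    caller, which is roughly what the CLI printed here.
    """
    for attempt in range(1, attempts + 1):
        try:
            return reserve(node_id)
        except NodeLocked:
            if attempt == attempts:
                raise  # lock never released; let the caller see the error
            time.sleep(delay)


# Simulated conductor lock that is released before the third attempt:
calls = {"n": 0}

def fake_reserve(node_id):
    calls["n"] += 1
    if calls["n"] < 3:
        raise NodeLocked("Node %s is locked" % node_id)
    return "reserved:%s" % node_id

result = reserve_with_retry(fake_reserve, "b0a32584", delay=0)
```

The point of the sketch is that a NodeLocked failure during a transient operation (power sync, state transition, deploy) is expected and self-healing, which is why the node still ended up "available" in the output above.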
Lucas, the suspicion is that this is a non-fatal warning. Is this true?

Hi @Mike, @Chris,

NodeLocked means that when the operation was attempted, a conductor was already doing something with the node (syncing power state, transitioning the power state from A to B, deploying, etc.). This error is retry-able, as we can see in the output there, and it is being automatically retried [0]. I think the problem here is that the CLI command is leaking the exception. Ironic will print out exceptions when it is running in debug mode.

@Chris, can you please check the /etc/ironic/ironic.conf configuration file and see whether Ironic is running in debug mode? e.g.

[DEFAULT]
debug=True

[0] NodeLocked: Node 2eb0d12c-c49a-41df-aeb8-f54c03b67d55 is locked by host director-r13.rcbd.lab, please retry after the current operation is completed. (HTTP 409). Attempt 1 of 6

Just checked, and in /etc/ironic/ironic.conf:

# If set to false, will disable INFO logging level, making
# WARNING the default. (boolean value)
#verbose=true
verbose=False

Also from ironic.conf:

# Print debugging output (set logging level to DEBUG instead
# of default INFO level). (boolean value)
#debug=false
debug=True

@Chris,

Thanks. Yes, setting debug=False will stop leaking those tracebacks on the API side. This was considered a "feature" in Ironic until recently. In theory we should not run in debug mode in production (debug messages may contain secrets and that kind of information), but in reality people do, so we introduced a new configuration option to turn those API-side exceptions on or off even when debug mode is enabled; the patch is https://review.openstack.org/#/c/256615/. If you think it's important to run in debug mode but do not want the traceback, I can look into backporting it into OSP.

*** Bug 1356292 has been marked as a duplicate of this bug. ***

We've rewritten this logic completely in OSP 10, and it should no longer show ugly warnings to users.
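The fix Lucas references decouples "debug logging is on" from "return server tracebacks in API error responses". A minimal sketch of that idea follows; the function and option names are hypothetical illustrations, not the actual option names added by the patch:

```python
import traceback


def format_api_error(exc, debug=False, return_server_traceback=False):
    """Build an API error body for a caught exception.

    Hypothetical sketch: the traceback is included only when BOTH the debug
    flag and an explicit opt-in are set, so running in debug mode alone no
    longer leaks server-side tracebacks to clients.
    """
    body = {"faultstring": str(exc)}
    if debug and return_server_traceback:
        body["debuginfo"] = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__))
    return body


# With debug=True alone (the situation in this bug, after the fix), the
# client only sees the fault string, not the server-side traceback:
try:
    raise RuntimeError("Node b0a32584 is locked")
except RuntimeError as err:
    quiet = format_api_error(err, debug=True)
    loud = format_api_error(err, debug=True, return_server_traceback=True)
```

This matches the behaviour described in the comments: operators who insist on debug mode in production stop seeing scary tracebacks in otherwise successful introspection runs, while the detail remains available when explicitly requested.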