Bug 1343744 - Retry timeout when setting node state to available
Summary: Retry timeout when setting node state to available
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: 10.0 (Newton)
Assignee: Derek Higgins
QA Contact: Raviv Bar-Tal
URL:
Whiteboard:
Duplicates: 1290066
Depends On: 1284247
Blocks:
 
Reported: 2016-06-07 19:43 UTC by Jay Dobies
Modified: 2023-09-14 03:26 UTC
CC List: 20 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1284247
Environment:
Last Closed: 2016-12-14 15:36:15 UTC
Target Upstream Version:
Embargoed:




Links
Red Hat Product Errata RHEA-2016:2948 (normal, SHIPPED_LIVE): Red Hat OpenStack Platform 10 enhancement update, last updated 2016-12-14 19:55:27 UTC

Description Jay Dobies 2016-06-07 19:43:29 UTC
+++ This bug was initially created as a clone of Bug #1284247 +++

Description of problem:
When introspection retries, it throws an exception and a traceback:

[stack@puma01 ~]$ openstack baremetal introspection bulk start
Setting available nodes to manageable...
Starting introspection of node: f1651455-d1a3-4716-9818-303c55a90e89
Starting introspection of node: 404100ea-df81-448d-ab75-7cdaa3f45373
Starting introspection of node: f2d7b8b0-cbd7-4466-9aac-3d5d07af5559
Starting introspection of node: f1c24a81-34c1-4646-bcf6-92c82a8c591e
Starting introspection of node: ab1cd1a1-d52b-459e-b03a-2a4753e8692c
Starting introspection of node: 1bbc969d-479d-4f9b-becf-713be9085289
Waiting for introspection to finish...
Introspection for UUID f1651455-d1a3-4716-9818-303c55a90e89 finished successfully.
Introspection for UUID 404100ea-df81-448d-ab75-7cdaa3f45373 finished successfully.
Introspection for UUID f2d7b8b0-cbd7-4466-9aac-3d5d07af5559 finished successfully.
Introspection for UUID f1c24a81-34c1-4646-bcf6-92c82a8c591e finished successfully.
Introspection for UUID ab1cd1a1-d52b-459e-b03a-2a4753e8692c finished successfully.
Introspection for UUID 1bbc969d-479d-4f9b-becf-713be9085289 finished successfully.
Setting manageable nodes to available...
Node f1651455-d1a3-4716-9818-303c55a90e89 has been set to available.
Node 404100ea-df81-448d-ab75-7cdaa3f45373 has been set to available.
Node f2d7b8b0-cbd7-4466-9aac-3d5d07af5559 has been set to available.
Node f1c24a81-34c1-4646-bcf6-92c82a8c591e has been set to available.
Request returned failure status.
Error contacting Ironic server: Node ab1cd1a1-d52b-459e-b03a-2a4753e8692c is locked by host puma01.scl.lab.tlv.redhat.com, please retry after the current operation is completed.
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142, in inner
    return func(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 1149, in do_provisioning_action
    % action) as task:

  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 152, in acquire
    driver_name=driver_name, purpose=purpose)

  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 221, in __init__
    self.release_resources()

  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 195, in __exit__
    six.reraise(self.type_, self.value, self.tb)

  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 203, in __init__
    self._lock()

  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 242, in _lock
    reserve_node()

  File "/usr/lib/python2.7/site-packages/retrying.py", line 68, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)

  File "/usr/lib/python2.7/site-packages/retrying.py", line 229, in call
    raise attempt.get()

  File "/usr/lib/python2.7/site-packages/retrying.py", line 261, in get
    six.reraise(self.value[0], self.value[1], self.value[2])

  File "/usr/lib/python2.7/site-packages/retrying.py", line 217, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)

  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 235, in reserve_node
    self.node_id)

  File "/usr/lib/python2.7/site-packages/ironic/objects/node.py", line 228, in reserve
    db_node = cls.dbapi.reserve_node(tag, node_id)

  File "/usr/lib/python2.7/site-packages/ironic/db/sqlalchemy/api.py", line 226, in reserve_node
    host=node['reservation'])

NodeLocked: Node ab1cd1a1-d52b-459e-b03a-2a4753e8692c is locked by host puma01.scl.lab.tlv.redhat.com, please retry after the current operation is completed.
 (HTTP 409). Attempt 1 of 6
Node ab1cd1a1-d52b-459e-b03a-2a4753e8692c has been set to available.
Request returned failure status.
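
The "Attempt 1 of 6" line above comes from the client retrying the provision-state call after Ironic answers with HTTP 409 (NodeLocked). A minimal sketch of that pattern, for illustration only: it assumes ironicclient.exc.Conflict maps the 409 response, and provide_node, the attempt count, and the delay are placeholders rather than the values the tooling actually uses.

import time

from ironicclient import exc


def provide_node(ironic, node_uuid, attempts=6, delay=10):
    """Move a node from 'manageable' to 'available', retrying while it is locked."""
    for attempt in range(1, attempts + 1):
        try:
            # 'provide' is the provision-state verb for manageable -> available.
            ironic.node.set_provision_state(node_uuid, 'provide')
            return
        except exc.Conflict:
            # HTTP 409: the conductor still holds a lock on the node; wait and retry.
            if attempt == attempts:
                raise
            time.sleep(delay)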


Version-Release number of selected component (if applicable):
openstack-ironic-conductor-4.2.0-2.1.el7ost.noarch
openstack-ironic-inspector-2.2.2-1.el7ost.noarch
python-ironic-inspector-client-1.2.0-5.el7ost.noarch
python-ironicclient-0.8.1-1.el7ost.noarch
openstack-ironic-common-4.2.0-2.1.el7ost.noarch
openstack-ironic-api-4.2.0-2.1.el7ost.noarch


How reproducible:
When retries occur


Steps to Reproduce:
1. Install OSP-d puddle 2015.11.19.2 (final beta)
2. Run introspection of bare metal nodes

--- Additional comment from Red Hat Bugzilla Rules Engine on 2015-11-22 06:45:17 EST ---

Since this issue was entered in bugzilla without a release flag set, rhos-8.0? has been automatically added to ensure that it is properly evaluated for this release.

--- Additional comment from Dmitry Tantsur on 2015-11-23 04:39:36 EST ---

First of all, this bug is not directly related to ironic and/or inspector ("thanks" to the bulk start command for being so confusing). I think the root cause is that we've dropped the ironicclient patch that bumped the retry count; tripleoclient must now do this itself when trying to update the node provision state.
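
A minimal sketch of that kind of change, assuming python-ironicclient's max_retries / retry_interval keyword arguments (the same retry knobs the CLI exposes); make_ironic_client, the token/endpoint handling, and the numbers are placeholders, not the actual tripleoclient or tripleo-common code.

from ironicclient import client as ironic_client


def make_ironic_client(auth_token, ironic_endpoint):
    # Hypothetical helper: build the ironic client with a larger retry budget
    # so provision-state updates ride out transient NodeLocked (HTTP 409)
    # responses. The values below are illustrative, not the shipped fix.
    return ironic_client.get_client(
        '1',                          # Ironic API major version
        os_auth_token=auth_token,     # pre-obtained Keystone token
        ironic_url=ironic_endpoint,   # Ironic API endpoint
        max_retries=12,               # retries for 409 responses before giving up
        retry_interval=5,             # seconds to wait between retries
    )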

--- Additional comment from Dmitry Tantsur on 2015-11-23 05:46:55 EST ---

Ofer, could you explain why you changed the component? This error does not even involve discoverd. I've set it to the correct component previously.

--- Additional comment from Ofer Blaut on 2015-11-29 05:37:27 EST ---

Hi

We don't have python-rdomanager-oscplugin in OSPd8, so I moved it to ironic.

I will move it to OSPd.

Ofer

--- Additional comment from Mike Burns on 2016-02-23 08:18:14 EST ---

Marking urgent due to blocker flag.

--- Additional comment from John Skeoch on 2016-04-18 05:09:59 EDT ---

User yeylon's account has been closed

Comment 1 Dmitry Tantsur 2016-08-19 08:15:16 UTC
*** Bug 1290066 has been marked as a duplicate of this bug. ***

Comment 2 Dmitry Tantsur 2016-10-05 08:34:56 UTC
This issue was fixed upstream by bumping (again) the retry timeout when creating an ironic client for mistral workflows. I think we can close it.

Comment 4 Raviv Bar-Tal 2016-10-10 14:21:46 UTC
Hi Jay,
I tested this on the puma servers and did not reproduce the error you're getting.
If you still have this problem, please contact me and I will check your environment.

B.R
Raviv

Comment 6 errata-xmlrpc 2016-12-14 15:36:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html

Comment 7 Red Hat Bugzilla 2023-09-14 03:26:30 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

