Bug 1233452
Summary: | 'Node is locked by host' error causing overcloud deploy to fail | |
---|---|---|---
Product: | Red Hat OpenStack | Reporter: | Ronelle Landy <rlandy>
Component: | openstack-ironic | Assignee: | Dmitry Tantsur <dtantsur>
Status: | CLOSED ERRATA | QA Contact: | Marius Cornea <mcornea>
Severity: | urgent | Docs Contact: |
Priority: | high | |
Version: | 7.0 (Kilo) | CC: | calfonso, dtantsur, jslagle, mburns, mlopes, morazi, rhel-osp-director-maint, rlandy, rrosa, whayutin, yeylon
Target Milestone: | ga | Keywords: | Automation
Target Release: | 7.0 (Kilo) | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | python-ironicclient-0.5.1-8.el7ost, openstack-ironic-2015.1.0-8.el7ost | Doc Type: | Bug Fix
Doc Text: |
Prior to this update, OpenStack Bare Metal Provisioning (Ironic) operations such as 'Power off' held a lock on a node for longer than expected.
Consequently, certain operations would fail to run while the node was still considered locked.
This update adjusts the retry timeout to two minutes. As a result, no further node lock errors have been noted.
|
Story Points: | --- | |
Clone Of: | | Environment: |
Last Closed: | 2015-08-05 13:27:49 UTC | Type: | Bug
Description
Ronelle Landy
2015-06-19 01:06:36 UTC
Oh... please provide ironic conductor and API logs around failure time (sudo journalctl -u openstack-ironic-api -u openstack-ironic-conductor)

I've started an rdo-list thread to discuss the issue: https://www.redhat.com/archives/rdo-list/2015-June/msg00149.html

Will copy logs and journalctl output when we hit the error again - it's sporadic.

https://review.gerrithub.io/#/c/237471/ is an instack-undercloud patch to bump retry interval for Ironic globally. I'm still interested in logs, however.

This occurred in CI again on Dell BM. Will pull the logs from the job and post here.

Created attachment 1043486: undercloud logs

Created attachment 1043487: host0 logs
Upstream patch to bump retry interval: https://review.openstack.org/#/c/196020/ I intend to backport it asap. I also suggest backporting https://review.openstack.org/#/c/194619/ for ga or later, to make debugging such problems simpler.

I couldn't reproduce this on either a virtual or a bare-metal environment.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1548
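
For readers unfamiliar with the failure mode, the idea behind the fix (keep retrying a request while the node is still locked, up to a total budget of about two minutes) can be sketched roughly as follows. This is only an illustration, not the code from the upstream or instack-undercloud patches: `NodeLockedError`, `call_with_lock_retry` and the 2-second interval are hypothetical names and values chosen for the example; only the two-minute budget comes from the Doc Text of this bug.

```python
import time


class NodeLockedError(Exception):
    """Hypothetical stand-in for the 'Node is locked by host' error."""


def call_with_lock_retry(operation, timeout=120.0, interval=2.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Call operation() until it succeeds or the retry budget runs out.

    timeout  -- total retry budget in seconds (two minutes, per the Doc Text)
    interval -- pause between attempts in seconds (assumed value)
    sleep, clock -- injectable so the behaviour can be tested without waiting
    """
    deadline = clock() + timeout
    while True:
        try:
            return operation()
        except NodeLockedError:
            # Give up once another wait would take us past the deadline;
            # re-raise so the caller sees the original lock error.
            if clock() + interval > deadline:
                raise
            sleep(interval)
```

With a budget this long, a short-lived lock taken by something like a power-off has time to be released before the caller gives up, whereas a budget of only a few seconds surfaces the lock error to the deployment.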