Bug 2007268

Summary: power_sync failed ironic conductor when using ipmi driver
Product: Red Hat OpenStack
Reporter: Uemit Seren <uemit.seren>
Component: openstack-ironic
Assignee: Julia Kreger <jkreger>
Status: CLOSED ERRATA
QA Contact:
Severity: urgent
Docs Contact:
Priority: urgent
Version: 16.2 (Train)
CC: dsorrent, gregraka, jkreger, jparoly, jschluet, ltamagno, sbaker, slinaber, spower, sputhenp
Target Milestone: z1
Keywords: Triaged
Target Release: 16.2 (Train on RHEL 8.4)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-ironic-13.0.8-2.20210528130909.el8ost.1
Doc Type: Bug Fix
Doc Text:
Before this update, a lock handling issue prevented IPMI-based nodes from recording the hardware vendor as part of power state synchronization. This issue caused the power state synchronization to fail, and nodes that used the `ipmi` hardware type entered the `Maintenance` state. With this update, the lock is handled correctly, power state synchronization for bare metal nodes that use the `ipmi` hardware type works correctly, and no locking errors occur.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-12-09 20:41:22 UTC
Type: Bug

Description Uemit Seren 2021-09-23 12:53:32 UTC
Description of problem:

The power_sync for baremetal nodes that use the ipmi driver fails with the following error message, and the node is put into maintenance mode:

2021-09-23 13:42:33.514 7 ERROR ironic.conductor.manager [req-06c1ac49-2619-429f-9d3d-c96953a37def - - - - -] During sync_power_state, max retries exceeded for node 5a770848-5ee9-4b96-8560-d0434088779a, node state None does not match expected state 'None'. Updating DB state to 'None' Switching node to maintenance mode. Error: An exclusive lock is required, but the current context has a shared lock.: ironic.common.exception.ExclusiveLockRequired: An exclusive lock is required, but the current context has a shared lock.
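
The error above says that a write was attempted while only a shared lock was held. The following is a toy model of that pattern, not ironic code and not the upstream patch: `ToyTask`, `save_vendor`, `sync_without_upgrade`, and `sync_with_upgrade` are hypothetical names invented for illustration, and the exception class is a local stand-in for the one named in the log.

```python
class ExclusiveLockRequired(Exception):
    """Toy stand-in for the exception named in the log above."""


class ToyTask:
    """Hypothetical, simplified task that holds a shared or exclusive lock."""

    def __init__(self, shared=True):
        self.shared = shared
        self.properties = {}

    def upgrade_lock(self):
        # Switch from a shared lock to an exclusive one.
        self.shared = False

    def save_vendor(self, vendor):
        # In this model, writing to the node record needs an exclusive lock.
        if self.shared:
            raise ExclusiveLockRequired(
                "An exclusive lock is required, but the current context "
                "has a shared lock.")
        self.properties["vendor"] = vendor


def sync_without_upgrade(task, vendor):
    # Failing pattern: the write is attempted under a shared lock.
    task.save_vendor(vendor)


def sync_with_upgrade(task, vendor):
    # Working pattern: take the exclusive lock before writing.
    task.upgrade_lock()
    task.save_vendor(vendor)
```

In this sketch, `sync_without_upgrade` raises the same message seen in the conductor log, while `sync_with_upgrade` records the vendor. Whether the real fix takes this exact shape is an assumption; the linked upstream review is the authoritative source.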

Version-Release number of selected component (if applicable):


How reproducible:

always

Steps to Reproduce:
1. Enroll & manage a node
2. Wait until the node is put into maintenance mode

Actual results:

The node power state is "None" and the node is put into maintenance mode
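
To confirm which nodes hit this failure, the conductor log can be scanned for the message quoted in the description. A minimal sketch, assuming the log lines have the same wording as the one in this report; `affected_nodes` is a hypothetical helper, and the pattern is derived only from that single sample line.

```python
import re

# Pattern based on the conductor log line quoted in this report.
_PATTERN = re.compile(
    r"During sync_power_state, max retries exceeded for node "
    r"(?P<node>[0-9a-f-]{36}).*ExclusiveLockRequired"
)


def affected_nodes(log_lines):
    """Return node UUIDs, in first-seen order, that hit the lock error."""
    seen = []
    for line in log_lines:
        match = _PATTERN.search(line)
        if match and match.group("node") not in seen:
            seen.append(match.group("node"))
    return seen
```

Passing an open log file object as `log_lines` works, because the function only iterates over lines.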

Expected results:

The node power state should be "Power off" and the node should not be in maintenance mode

Additional info:

The current workaround is to set vendor=ignoreme:
openstack baremetal node set <node> --property vendor=ignoreme
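
When several nodes are affected, the workaround command can be generated per node and reviewed before it is run. A minimal sketch: `workaround_command` and `render` are hypothetical helper names, and the only CLI invocation used is the one quoted in this report. Nothing is executed; the functions only build the command lines.

```python
import shlex


def workaround_command(node):
    """Build the argv for the workaround command quoted in this report."""
    return ["openstack", "baremetal", "node", "set", node,
            "--property", "vendor=ignoreme"]


def render(nodes):
    """Return one shell-quoted command line per node, for review."""
    return [shlex.join(workaround_command(node)) for node in nodes]
```

Building the argument list separately from the rendered string keeps the node identifier as its own argument, so it cannot run into the `--property` flag.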

According to upstream developers, the following patch needs to be backported to the Train release:
https://review.opendev.org/c/openstack/ironic/+/810656

Comment 5 Steve Baker 2021-09-28 19:40:41 UTC
*** Bug 2006603 has been marked as a duplicate of this bug. ***

Comment 6 Steve Baker 2021-10-12 20:02:59 UTC
*** Bug 2011422 has been marked as a duplicate of this bug. ***

Comment 27 errata-xmlrpc 2021-12-09 20:41:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.1 (Train)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:5067