Bug 2011422

Summary: After upgrading from 16.1 to 16.2 all baremetal nodes listed in ironic are stuck in maintenance mode in Director & Overcloud
Product: Red Hat OpenStack
Reporter: Darin Sorrentino <dsorrent>
Component: openstack-ironic
Assignee: OSP Team <rhos-maint>
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Version: 16.2 (Train)
CC: sbaker
Last Closed: 2021-10-12 20:02:59 UTC
Type: Bug

Description Darin Sorrentino 2021-10-06 15:01:05 UTC
Description of problem:

After upgrading from 16.1 to 16.2, Ironic puts every registered node into maintenance mode.

Environment summary:

The environment was a 16.1 DCN deployment with Ironic in the overcloud, which I upgraded to 16.2. During the overcloud upgrade, I noticed that all of the baremetal nodes registered to director were in maintenance mode. Once the upgrade finished, I tried to deploy a baremetal instance in the overcloud and the request failed with NoValidHost.

Ironic in the overcloud had its baremetal nodes set to maintenance true. I unset maintenance on them and they were immediately switched back to maintenance true.

I checked Ironic in the undercloud and noted the same behaviour.

ironic-conductor.log shows:

2021-10-06 10:39:15.653 7 ERROR ironic.conductor.manager [req-9d2a85ef-2fca-497b-8ef2-7aa28f10386d - - - - -] During sync_power_state, max retries exceeded for node cad0d948-b462-4d11-af75-65f15ee6ef23, node state None does not match expected state 'None'. Updating DB state to 'None' Switching node to maintenance mode. Error: An exclusive lock is required, but the current context has a shared lock.: ironic.common.exception.ExclusiveLockRequired: An exclusive lock is required, but the current context has a shared lock.
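
To see how many nodes are hitting this error, the conductor log can be scanned for the message above. This is a rough sketch, not part of the original report: the regular expression is based only on the single log line quoted here, so other Ironic releases may word the message differently, and the function name nodes_with_sync_failures is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the one conductor log line quoted in this report.
# Assumption: the message wording is the same for every affected node.
SYNC_FAILURE = re.compile(
    r"During sync_power_state, max retries exceeded for node "
    r"(?P<node>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
)


def nodes_with_sync_failures(lines):
    """Count sync_power_state retry failures per node UUID.

    `lines` is any iterable of log lines, for example an open file handle
    on ironic-conductor.log.
    """
    counts = Counter()
    for line in lines:
        match = SYNC_FAILURE.search(line)
        if match:
            counts[match.group("node")] += 1
    return counts
```

Passing an open handle on ironic-conductor.log returns a mapping of node UUID to the number of times that node exceeded the power-sync retries.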



I confirmed I can use ipmitool from within the ironic conductor container on Director using the credentials registered with ironic and successfully obtain a power status on the above referenced node.
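
For anyone repeating the ipmitool check, it might look roughly like the sketch below. It is not the exact invocation I ran. It assumes the lanplus interface, and host, user and password are placeholders for the values registered with ironic. The helper names are hypothetical. Run it from wherever ipmitool is available, such as inside the conductor container.

```python
import subprocess


def ipmi_power_status_cmd(host, user, password):
    """Build an ipmitool command line for a power status query.

    Assumption: the BMC is reached over the lanplus interface.
    """
    return [
        "ipmitool", "-I", "lanplus",
        "-H", host, "-U", user, "-P", password,
        "power", "status",
    ]


def parse_power_status(output):
    """Map 'Chassis Power is on/off' to 'on'/'off'; None if unrecognized."""
    text = output.strip().lower()
    if text.endswith("is on"):
        return "on"
    if text.endswith("is off"):
        return "off"
    return None


def query_power_status(host, user, password, timeout=30):
    """Run ipmitool and return the parsed power state.

    Raises subprocess.CalledProcessError if ipmitool exits non-zero.
    """
    result = subprocess.run(
        ipmi_power_status_cmd(host, user, password),
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return parse_power_status(result.stdout)
```

If this returns "on" or "off" while the conductor still flips the node into maintenance, the BMC credentials and connectivity are unlikely to be the cause.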

Version-Release number of selected component (if applicable):
16.2

Steps to Reproduce:
1. Upgrade environment from 16.1 to 16.2


Comment 1 Darin Sorrentino 2021-10-06 17:26:30 UTC
This looks like the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=2007268

Testing workaround.

Comment 2 Steve Baker 2021-10-12 20:02:59 UTC

*** This bug has been marked as a duplicate of bug 2007268 ***